Re: [Clip] regex to find paragraphs in a document
- Don - HtmlFixIt.com wrote:
> Sheri should be proud. I actually figured out a couple of regexes!:)
I would point out that you don't need ^!Jump Doc_Start if you're using
the "W" whole document option. Also, "T" is meaningless in combination
with "R" regex option.
You can try this for your paragraphs:
^!Replace "^(?!\<h).+$" >> "<p>$0</p>" RAWS
It matches the beginning of a line (if that BOL is not followed by the
start of heading tag), and everything on that line (as long as there's
at least one character) up to the CRLF. Because of the parentheses, the
text is captured as subpattern 1. Then it replaces the matched text with
subpattern 1 surrounded by paragraph tags.
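
For anyone who wants to try the pattern outside NoteTab, here is a minimal Python sketch of the same idea. It is only an approximation: the function name wrap_paragraphs and the sample text are made up for illustration, the sample uses bare LF line endings rather than CRLF, and Python's re module is not the regex engine NoteTab uses, so details may differ.

```python
import re

# Rough Python counterpart of the clip's pattern. re.MULTILINE makes
# ^ and $ match at the start and end of every line.
# (?!<h) is a negative lookahead: skip lines that begin with "<h".
PARAGRAPH = re.compile(r"^(?!<h).+$", re.MULTILINE)


def wrap_paragraphs(text):
    """Wrap every non-empty, non-heading line in <p>...</p>.

    Hypothetical helper for illustration only. \\g<0> is Python's
    spelling of "the whole match", like $0 in the clip.
    """
    return PARAGRAPH.sub(r"<p>\g<0></p>", text)


# Sample input with LF line endings (an assumption for this sketch).
sample = "<h1>Title</h1>\nFirst paragraph.\n\nSecond paragraph.\n"
result = wrap_paragraphs(sample)
print(result)
```

The heading line and the empty line are left alone, because the lookahead rejects the first and .+ needs at least one character for the second.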
- --- In email@example.com, Sheri <silvermoonwoman@...> wrote:
> Don - HtmlFixIt.com wrote:
> > Sheri should be proud. I actually figured out a couple of regexes!
> Hi Don,
> I would point out that you don't need ^!Jump
> Doc_Start if you're using the "W" whole document
> option. Also, "T" is meaningless in combination with
> "R" regex option.
> You can try this for your paragraphs:
> ^!Replace "^(?!\<h).+$" >> "<p>$0</p>" RAWS
> It matches the beginning of a line (if that BOL is
> not followed by the start of heading tag), and
> everything on that line (as long as there's at least
> one character) up to the CRLF. Because of the
> parentheses, the text is captured as subpattern 1. Then
> it replaces the matched text with subpattern 1
> surrounded by paragraph tags.
Sorry, I sent a bit too quickly. Ignore what I said about subpattern
1. I took the parentheses out of the pattern because they were
unnecessary. The parentheses were previously around the dot plus in
the pattern. Then the replacement referred to $1 instead of $0.
Thought it might confuse you that the dot plus was $1, because of the
other parentheses in the pattern. Parentheses surrounding an assertion
do not count.
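
A quick way to check the point about assertion parentheses not counting, sketched in Python (an assumption on my part: Python's re module follows the same numbering rule here, but it is not NoteTab's regex engine). The variable names are made up for illustration.

```python
import re

# Version with parentheses around the dot plus: one capturing group.
with_group = re.compile(r"^(?!<h)(.+)$", re.MULTILINE)
# Version without them: no capturing groups at all.
without_group = re.compile(r"^(?!<h).+$", re.MULTILINE)

# The lookahead's parentheses are not counted in either pattern.
group_counts = (with_group.groups, without_group.groups)

line = "Some text"
m1 = with_group.search(line)
m2 = without_group.search(line)

# In the first pattern the dot plus is group 1, even though it is the
# second pair of parentheses in the pattern.
captured = m1.group(1)
# Group 0 is always the whole match, so it works with either pattern.
whole_match = m2.group(0)
print(group_counts, captured, whole_match)
```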
>> I would point out that you don't need ^!Jump
>> Doc_Start if you're using the "W" whole document
>> option. Also, "T" is meaningless in combination with
>> "R" regex option.
Yes, good point. When I first started I was doing just one. When I
added the W I should have deleted the jump doc start.
I am getting the T because I am using somebody's clip bar help, and I
don't think it works properly for the regex search. So I use the
Normal replace dialog.
There are actually a couple of little bugs in cc syntax that I keep
meaning to write down.
One is the regex replace as mentioned above.
If you type replace and hit the ccsyn icon (that's how I do it anyway),
you get an opportunity to select either Normal or Regular Expression.
I choose regular and get essentially this as output:
^!Replace "x" >> "y" Ignore case (can also be accomplished with (?i) in
Also, when you use iferror you get this:
^!IfError GoToLabelTrue [ELSE GoToLabelFalse]
after a nonsense option pops up. I think it should be prompting me for
the labels for the goto and else goto, but it doesn't.
Let me say again how much I love CCSYN! Thank you for your efforts in
I will try your paragraph method next. I was kind of getting them
(don't laugh) using this:
;find paragraphs after heading
^!Find "[\w[:punct:]]\r\n[\w]" TIRS
^!If "^$GetRow$" = "^$GetLineCount$" Loop2
One problem with that is that it was grabbing the return at the end of
the paragraph so then the extra replace is necessary to reverse the </p>
and the ^P.
I was also trying ^!Select paragraphs, but same issue there with
grabbing the ^P inside my paragraph tags.
- Hi Don,
Thanks for reporting those bugs, if you mention things as you come
across them I'll try to fix them up. I posted an update to Clipcode
Syntax in the files area. :)
On 4/9/2007 11:31 PM, Sheri wrote:
>I posted an update to Clipcode Syntax in the files area. :)
Nice work, Sheri, as usual. Thanks! I'll have to look this one over carefully.