Notes on Regular Expressions in Version 5 Beta - G.O.
I added more notes, marked in green, to ExpressionsV5.rtf in the files section.
HTML comments now need special notation to be found with regular expressions,
because they look similar to the way assertions are written.
Character class notation, [<][!], must be used for the
first two characters. For example:
^!find "[<][!]-- start -->[[:ascii:]]+[<][!]-- end -->" RHITS
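The same bracket trick can be tried outside NoteTab. This is only a sketch in Python, whose `re` module is assumed here as a stand-in for the version 5 engine and may behave differently. Python does not support `[[:ascii:]]`, so `[\x00-\x7F]` is substituted, and the sample text is made up.

```python
import re

# Assumption: Python's re module stands in for NoteTab's regex engine.
# [\x00-\x7F] replaces [[:ascii:]], which Python's re does not support.
COMMENT_BLOCK = re.compile(r"[<][!]-- start -->[\x00-\x7F]+[<][!]-- end -->")

# Made-up sample text with one start/end comment pair.
sample = "<p>keep</p>\n<!-- start -->\n<b>drop me</b>\n<!-- end -->\n<p>keep</p>"

match = COMMENT_BLOCK.search(sample)
found = match.group(0) if match else None
print(found)
```

Putting each of the first two characters in its own class means the pattern never contains the literal sequence `<!`, so it cannot be mistaken for part of an assertion.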
A period (.) can't be used to mean "any character" inside a character class; there it matches only a literal period.
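A quick way to see the difference, sketched in Python (an assumption; it is not NoteTab's engine, though this rule is common to most regex flavors):

```python
import re

text = "a.b"

# Outside brackets, a period matches any character except a newline.
any_char = re.findall(r".", text)

# Inside a character class, a period matches only a literal period.
literal_dot = re.findall(r"[.]", text)

print(any_char, literal_dot)
```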
Be very careful using numbered tagged matches in replace patterns,
because any number without a corresponding tagged match is written out as plain text. For
example:
^!Replace "(<td>)[\s\r\n]*|(<li>)[\s\r\n]*" >> $1$2$3 RHIATS
writes a literal "$3" after every <td> or <li>.
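For comparison, here is how Python's `re` module handles the same pattern. This is a sketch only: Python is assumed as a stand-in, it writes group references as `\1` rather than `$1`, and it behaves differently from what is described above, since it rejects a reference to a group that does not exist instead of printing it. The sample HTML is made up.

```python
import re

PATTERN = re.compile(r"(<td>)[\s\r\n]*|(<li>)[\s\r\n]*")
html = "<td>  \n  cell<li>\n item"  # made-up sample

# Groups 1 and 2 exist. Whichever one did not take part in a match
# expands to an empty string (Python 3.5 and later).
cleaned = PATTERN.sub(r"\1\2", html)

# Group 3 does not exist in the pattern, so Python refuses the template.
try:
    PATTERN.sub(r"\1\2\3", html)
    bad_reference_error = None
except re.error as exc:
    bad_reference_error = str(exc)

print(cleaned)
print(bad_reference_error)
```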
It says below that tagged matches can be used up to $65535, but only $1
through $9 work. Perhaps it would work as stated if $ had to be escaped in
replace-with statements. Also, since $0 matches the whole expression and not
just the (tagged) part, it would be nice, though not necessary, to have a generic
reference (a bare $ instead of $n) that stands for whichever numbered match was found in OR cases.
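The "whichever group matched" idea can be sketched in Python, where a replacement may be a function. This is an illustration under that assumption, not a NoteTab feature; `keep_matched_tag` is a hypothetical helper name and the input string is made up.

```python
import re

PATTERN = re.compile(r"(<td>)[\s\r\n]*|(<li>)[\s\r\n]*")

def keep_matched_tag(match):
    # lastindex is the number of the last group that took part in the match:
    # 1 for the <td> branch, 2 for the <li> branch.
    return match.group(match.lastindex)

result = PATTERN.sub(keep_matched_tag, "<td>\n\n one<li>   two")
print(result)
```

With only one tagged group per alternative, `lastindex` always points at the group that matched, so no unused group numbers appear in the replacement.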
Matching multiple line breaks is something I need frequently. Thankfully, in
version 5 the same simple statement works great in BOTH Windows and Unix
files. Why it works on Unix files is a bit of a mystery, but I like it.
IN FACT, any repetitive string can be quantified inside (), for example
( )+. This is VERY USEFUL, thanks, but it is not very clear in the
instructions. SEE ExpressionsV5.rtf in the files section.
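As an illustration of quantifying a whole group, here is a Python sketch that collapses runs of line breaks in both Windows and Unix text. Python's `re` is assumed as a stand-in, and the pattern `(?:\r?\n){2,}` is my own choice for the example, not the statement referred to above.

```python
import re

# One line break (Windows \r\n or Unix \n) inside a non-capturing group,
# repeated two or more times.
BLANK_RUN = re.compile(r"(?:\r?\n){2,}")

windows_text = "one\r\n\r\n\r\ntwo"
unix_text = "one\n\n\ntwo"

# Replace each run of line breaks with a single one of the file's own kind.
collapsed_windows = BLANK_RUN.sub("\r\n", windows_text)
collapsed_unix = BLANK_RUN.sub("\n", unix_text)

print(repr(collapsed_windows), repr(collapsed_unix))
```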
For line breaks preceded by specific text, something else is required
to match; I don't know why.
Example: <anytext>[\r\n\s]* worked for me. Even though only line breaks
followed the text, <anytext>(\r\n)+ did not match.
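Two possible explanations, offered only as guesses, are that the file used Unix line endings (\n without \r) or that whitespace sat between the text and the first line break. The Python sketch below shows that either condition makes the strict pattern fail while the looser one still matches. Python's `re` is an assumption here, `<anytext>` is treated as literal text, and the sample strings are made up.

```python
import re

strict = re.compile(r"<anytext>(\r\n)+")
loose = re.compile(r"<anytext>[\r\n\s]*")

unix_lines = "<anytext>\n\nnext"           # \n only, no \r
trailing_space = "<anytext> \r\n\r\nnext"  # a space before the first break

# The strict pattern needs \r\n immediately after the text, so both fail.
strict_on_unix = strict.search(unix_lines)
strict_on_space = strict.search(trailing_space)

# The character class accepts \r, \n, and other whitespace in any order.
loose_on_unix = loose.search(unix_lines).group(0)
loose_on_space = loose.search(trailing_space).group(0)

print(strict_on_unix, strict_on_space)
print(repr(loose_on_unix), repr(loose_on_space))
```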