22596 [Clip] Re: FIND in NTL V7pr3 still acting odd

  • flo.gehrke
    Apr 14, 2012
      --- In ntb-clips@yahoogroups.com, "John Shotsky" <jshotsky@...> wrote:
      >
      > \R is included in \s, so you need no \R if you have \s present.
      > You will never capture a \R following a \s, because the
      > \s will capture it first.

      So how do you explain that the two CRNL between 'xxx' and 'yyy' in...

      www
      xxx

      yyy
      zzz

      are both matched with '\R(\s)*\R' or '\R(\s)+\R' in NT 6.2 and in NT 7.0 alike? And why, in NT 6.2, are both CRNL also matched with '\R\s*\R' or '\R\s+\R'?

      So, IMHO, your statement is not fully convincing (this applies to Alec's #22593 too). See the PCRE documentation on "PCRE Pattern / Newline sequences":

      "Outside a character class, by default, the escape sequence \R matches any Unicode newline sequence...In non-UTF-8 mode \R is equivalent to the following: (?>\r\n|\n|\x0b|\f|\r|\x85)"

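      Since CRNL, in Windows, consists of CR + NL, the sequence of two CRNL actually represents four characters. This explains the above-mentioned matching: the first \R takes the first CR + NL, \s takes the second CR, and the final \R takes the remaining NL. So \s does not have to swallow everything before the last \R gets its turn. You can compare it with 'xxxx' being matched with 'x{1,2}xxx{1,2}'.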
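
For anyone who wants to check this outside NoteTab, here is a minimal sketch in Python. Python's re module has no \R, so the helper newline_seq() is a hypothetical stand-in built from the alternatives quoted above, with the atomic group emulated by a lookahead plus a backreference. It only illustrates the backtracking argument and says nothing about how NT 6.2 or NT 7.0 behave internally.

```python
import re

# Alternatives taken from the PCRE documentation quote (non-UTF-8 mode).
ALTERNATIVES = r"\r\n|\n|\x0b|\f|\r|\x85"


def newline_seq(name):
    # Hypothetical stand-in for PCRE's \R (Python's re does not support \R).
    # The lookahead captures one newline sequence and the backreference then
    # consumes it. Since a lookahead is never re-entered on backtracking,
    # this behaves like the atomic group (?>...).
    return rf"(?=(?P<{name}>{ALTERNATIVES}))(?P={name})"


# The sample text from this post, with Windows line ends (CR + NL).
text = "www\r\nxxx\r\n\r\nyyy\r\nzzz"

# Equivalent of \R\s*\R and \R\s+\R
star = re.search(newline_seq("a") + r"\s*" + newline_seq("b"), text)
plus = re.search(newline_seq("a") + r"\s+" + newline_seq("b"), text)

# \s first grabs CR + NL, then gives back the NL so the last newline
# sequence can still match.
print(repr(star.group()), star.span())
print(repr(plus.group()), plus.span())

# The analogy from the post: four x's against x{1,2}xxx{1,2}
analogy = re.fullmatch(r"x{1,2}xxx{1,2}", "xxxx")
print(analogy is not None)
```

Both searches should report the four characters between 'xxx' and 'yyy', which is the same give-and-take that lets 'x{1,2}xxx{1,2}' match 'xxxx'.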

      If I'm not mistaken, the crux in Joy's problem was the DIFFERENCE between the behavior of NT 6.2 and NT 7.0 (see my humble contribution in #22591). I think this difference needs some more explanation...

      Regards,
      Flo