#22606 Re: [Clip] Re: FIND in NTL V7pr3 still acting odd

  • Eric Fookes
    Apr 16, 2012
      Here's an extract of the explanation I sent to Flo in private mail:

      NoteTab embeds the PCRE engine through the DIRegEx library developed by
      Ralf Junker:

      http://www.yunqa.de/delphi/doku.php/products/regex/index

      Although bugs can appear in the NoteTab code that integrates and invokes
      DIRegEx, that does not seem to be the case with the latest issues you found.
      You can verify this by downloading and testing Ralf's DIRegEx_Workbench
      program from here (no install required):

      http://www.fookes.com/ftp/DIRegEx_Workbench.zip

      The default RE options are the same as those used in NoteTab.

      If you find a search that works in DIRegEx_Workbench but not in NoteTab,
      then the bug is probably in NoteTab. However, if a search fails in both
      programs, then the bug is probably in PCRE or DIRegEx.

      Or you have a mistake in your pattern.

      --
      Regards,

      Eric Fookes
      http://www.fookes.com/

      On 14/04/2012 13:19, flo.gehrke wrote:
      > --- In ntb-clips@yahoogroups.com, "John Shotsky"<jshotsky@...> wrote:
      >>
      >> \R is included in \s, so you need no \R if you have \s present.
      >> You will never capture a \R following a \s, because the
      >> \s will capture it first.
      >
      > So how do you explain that the two CRNL between 'xxx' and 'yyy' in...
      >
      > www
      > xxx
      >
      > yyy
      > zzz
      >
      > are both matched with '\R(\s)*\R' or '\R(\s)+\R' in NT 6.2 and NT 7.0 as well? And why, in NT 6.2, are both CRNL matched with '\R\s*\R' or '\R\s+\R'?
      >
      > So, IMHO, your statement is not fully convincing (this pertains to Alec #22593 too). See the PCRE documentation on "PCRE Pattern / Newline sequences":
      >
      > "Outside a character class, by default, the escape sequence \R matches any Unicode newline sequence...In non-UTF-8 mode \R is equivalent to the following: (?>\r\n|\n|\x0b|\f|\r|\x85)"
      >
      > Since CRNL, in Windows, consists of CR + NL, the sequence of two CRNL actually represents four characters. This explains the above-mentioned matching: the \s part can give back a character so that the final \R still finds a newline to match. You can compare it with 'xxxx' being matched by 'x{1,2}xxx{1,2}'.
      >
      > If I'm not mistaken, the crux in Joy's problem was the DIFFERENCE between the behavior of NT 6.2 and NT 7.0 (see my humble contribution in #22591). I think this difference needs some more explanation...
      >
      > Regards,
      > Flo
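
For anyone who wants to check Flo's reasoning outside NoteTab, here is a minimal sketch in Python. It is only an approximation: Python's `re` module is not PCRE and has no `\R`, so the name `R` below is a hand-written stand-in for the non-UTF-8 definition quoted from the PCRE documentation. The `\r(?!\n)` branch is my own assumption, used to imitate the atomic group so that a CR+NL pair is never split. The sample text and the `matched` helper are illustrative only.

```python
import re

# Stand-in for PCRE's \R in non-UTF-8 mode (Python's re has no \R).
# PCRE defines it as (?>\r\n|\n|\x0b|\f|\r|\x85); the lookahead on \r
# imitates the atomic group by refusing to match a lone CR that is
# immediately followed by NL.
R = r"(?:\r\n|\n|\x0b|\f|\r(?!\n)|\x85)"

# Flo's sample text with Windows line endings (CR + NL).
text = "www\r\nxxx\r\n\r\nyyy\r\nzzz"

def matched(pattern):
    """Return the text of the first match, or None if there is none."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# \s does consume newline characters greedily, but it backtracks and
# gives one back when the trailing newline pattern would otherwise fail.
results = {
    r"\R\s*\R": matched(R + r"\s*" + R),
    r"\R\s+\R": matched(R + r"\s+" + R),
    r"\R(\s)*\R": matched(R + r"(\s)*" + R),
    r"\R(\s)+\R": matched(R + r"(\s)+" + R),
}

for name, value in results.items():
    print(name, "->", repr(value))

# The same give-back effect with plain characters, as in Flo's comparison.
plain = re.fullmatch(r"x{1,2}xxx{1,2}", "xxxx")
print("x{1,2}xxx{1,2} ->", repr(plain.group(0)) if plain else None)
```

In this emulation all four patterns match the two line breaks between 'xxx' and 'yyy' (four characters in total), which is consistent with the backtracking explanation. It says nothing about why NT 6.2 and NT 7.0 differ; for that, the DIRegEx_Workbench comparison Eric describes is the relevant test.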