[Clip] Re: Using the \G assertion

  • Sheri
    Message 1 of 25 , Oct 21, 2008
      --- In ntb-clips@yahoogroups.com, "Don - HtmlFixIt.com" <don@...> wrote:
      >
      > >>>> \G(?:(?!19)\d\d\d\d)*(19\d\d)
      >
      >
      > zero or more non-greedy

      Everything you said is essentially right, but asterisk is greedy, not
      non-greedy. Doesn't matter to the success of the pattern because it
      also works with *? (non-greedy) and *+ (possessive). Using the
      possessive form might be the most efficient choice here. Also, I've
      read that it's usually more efficient to use a counted repeat like
      \d{4} than to spell out \d\d\d\d.

      Regards,
      Sheri
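
A rough way to check the point about quantifiers outside NoteTab is the Python sketch below. It is only an illustration under assumptions: Python's `re` module has no \G, so each attempt is anchored at the end of the previous match with `Pattern.match(text, pos)`, and the possessive `*+` form is only tried on Python 3.11 or later. The names `years` and `QUANTIFIERS` are made up for this example.

```python
import re
import sys

TEXT = "1719180618191902194920192050"

# Quantifier variants for the "skip groups that don't start with 19" part.
QUANTIFIERS = ["*", "*?"]          # greedy, non-greedy
if sys.version_info >= (3, 11):
    QUANTIFIERS.append("*+")       # possessive needs Python 3.11+


def years(quantifier, text=TEXT):
    """Collect 19xx years, starting each attempt where the last match ended."""
    pattern = re.compile(r"(?:(?!19)\d{4})" + quantifier + r"(19\d{2})")
    found, pos = [], 0
    while True:
        m = pattern.match(text, pos)   # anchored at pos, standing in for \G
        if m is None:
            return found
        found.append(m.group(1))
        pos = m.end()


RESULTS = {q: years(q) for q in QUANTIFIERS}
for q, result in RESULTS.items():
    print(q, result)
```

All variants should report the same two years for the test string; the choice of quantifier affects how much backtracking the engine does, not what is matched here.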
    • Don - HtmlFixIt.com
      Message 2 of 25 , Oct 21, 2008
        Just for the record I sent my ahh before you sent your can't you read
        regex dummy email ;-)

        I think I do get it.

        With doc match all, it extracts all of them.

        Otherwise you would still need to move the cursor to the end of the find
        to get the next one, correct Sheri?

        In Perl \G automatically moves to the end of the find, but we need to do
        it manually in PCRE or whatever flavor this is?


        > The asterisk makes it match zero or more occurrences of four digits
        > that don't start with 19. If string starts with 1904, the whole
        > pattern will still match: it has zero occurrences of years that don't
        > begin with 19 followed by one that does begin with 19. Get it?
        >
        > Regards,
        > Sheri
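
What the \G anchor buys can be sketched in Python, with the caveat that this is an emulation rather than the PCRE behaviour itself: the stdlib `re` module lacks \G, so the hypothetical helper `anchored_matches` restarts each attempt exactly at the end of the previous match and stops at the first failure. The unanchored `findall` call is included for contrast.

```python
import re

TEXT = "1719180618191902194920192050"
PATTERN = re.compile(r"(?:(?!19)\d{4})*(19\d{2})")


def anchored_matches(pattern, text, start=0):
    """Return (offset, year) pairs, each match starting where the last ended."""
    found, pos = [], start
    while True:
        m = pattern.match(text, pos)   # no bump-along: must match at pos
        if m is None:
            break                      # like \G, one failure ends the scan
        found.append((m.start(), m.group(1)))
        pos = m.end()
    return found


ANCHORED = [year for _, year in anchored_matches(PATTERN, TEXT)]
UNANCHORED = PATTERN.findall(TEXT)     # free to start in mid-group
ZERO_SKIPPED = anchored_matches(PATTERN, "19041806")

print(ANCHORED)      # years found on four-digit boundaries
print(UNANCHORED)    # also picks up a misaligned "1920" from 2019|2050
print(ZERO_SKIPPED)  # a leading 1904 matches with zero skipped groups
```

Without the anchor, the engine slides forward after 1949 and finds "1920" straddling 2019 and 2050, which is exactly the false hit \G prevents.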
      • Flo
        Message 3 of 25 , Oct 21, 2008
          --- In ntb-clips@yahoogroups.com, "Sheri" <silvermoonwoman@...> wrote:
          >
          > > using
          > >  
          > > \G(?:(?!19)\d\d\d\d)*(19\d\d)
          > >  (...)
          > > it cycles through the string 4 numbers at a time and captures
          > > 1902 and 1949 only.
          > > (...)
          > That's quite good! Works fine:
          >
          > ^!Toolbar New Document
          > ^!SetWordWrap off
          > 1719180618191902194920192050
          > ^!Jump Doc_Start
          > ^!Info Matches ^$GetDocMatchAll("\G(?:(?!19)\d{4})*(19\d{2})";1)$

          Jane, Sheri,

          Thanks — that's a great breakthrough!

          Foolishly, I've tested solutions like that before because it's quite
          similar to what is recommended by Jeffrey Friedl. Unfortunately, I
          placed \G in the wrong position:

          (?:(?!19)\d\d\d\d)*(\G19\d\d)

          I'm terribly sorry for this mistake! Thanks again to Jane for
          correcting this!

          Here are two more solutions mentioned by Friedl which work with
          Sheri's clip:

          ^!Info Matches ^$GetDocMatchAll("\G(?:[^1]\d\d\d|\d[^9]\d\d)*(19\d{2})";1)$

          ^!Info Matches ^$GetDocMatchAll("\G(?:\d\d\d\d)*?(19\d{2})";1)$

          For members not familiar with \G, it may help to say: we still
          have to make sure the cursor is at the start of the string. If we
          vary the clip like this...

          ^!Toolbar New Document
          ^!SetWordWrap off
          String for testing the \G assertion
          1719180618191902194920192050
          ^!Jump Doc_Start
          ^!Info Matches ^$GetDocMatchAll("\G(?:(?!19)\d{4})*(19\d{2})";1)$

          ...no year will be matched.

          So if that string occurs anywhere else in the document we have to add
          some code which brings the cursor to the start of the line containing
          that string (if I'm not missing another solution).

          Flo
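
The two points in this message (the alternative patterns agree, and a leading line of text defeats the match) can be tried in a small Python sketch. It is an approximation under assumptions: \G is emulated with `Pattern.match(text, pos)` because the stdlib `re` module does not support it, and `anchored_years` and the dictionary labels are names invented for this example. The misplaced-\G variant is not reproduced since it depends on real \G support.

```python
import re

DIGITS = "1719180618191902194920192050"

# The three "skip, then capture" patterns from the thread, minus the \G.
PATTERNS = {
    "lookahead": r"(?:(?!19)\d{4})*(19\d{2})",
    "char classes": r"(?:[^1]\d\d\d|\d[^9]\d\d)*(19\d{2})",
    "lazy": r"(?:\d\d\d\d)*?(19\d{2})",
}


def anchored_years(pattern, text, start=0):
    """Collect captures, anchoring each attempt at the previous match end."""
    regex = re.compile(pattern)
    found, pos = [], start
    while True:
        m = regex.match(text, pos)
        if m is None:
            return found
        found.append(m.group(1))
        pos = m.end()


SAME_LINE = {name: anchored_years(p, DIGITS) for name, p in PATTERNS.items()}

# A heading line before the digits: anchored at offset 0, nothing matches.
WITH_HEADING = r"String for testing the \G assertion" + "\n" + DIGITS
FROM_TOP = anchored_years(PATTERNS["lookahead"], WITH_HEADING)

# Starting at the first character of the digit line restores the matches.
LINE_2_OFFSET = WITH_HEADING.index("\n") + 1
FROM_LINE_2 = anchored_years(PATTERNS["lookahead"], WITH_HEADING, LINE_2_OFFSET)

print(SAME_LINE)
print(FROM_TOP, FROM_LINE_2)
```

The `start` argument plays the role of the cursor or selection start in the clip: the anchored scan only works if it begins on a four-digit boundary.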
           
        • hsavage
          Message 4 of 25 , Oct 21, 2008
            Sheri wrote:
            > --- In ntb-clips@yahoogroups.com, Jane Sedgewick <jane_sedgewick@...>
            > wrote:
            >
            >> I may have misunderstood what you are discussing, but if I do a
            >> Regex search on the test line
            >> 1719180618191902194920192050
            >>
            >> using
            >>
            >> \G(?:(?!19)\d\d\d\d)*(19\d\d)
            >>
            >> it cycles through the string 4 numbers at a time and captures
            >> 1902 and 1949 only.
            >>
            >> I know this is not in a clip but it should work there as well. Or
            >> am I missing the point.
            >>
            >
            >
            > That's quite good! Works fine:
            >
            > ^!Toolbar New Document
            > ^!SetWordWrap off
            > 1719180618191902194920192050
            > ^!Jump Doc_Start
            > ^!Info Matches ^$GetDocMatchAll("\G(?:(?!19)\d{4})*(19\d{2})";1)$
            >
            > Regards,
            > Sheri
            >
            Sheri,

            This example works extremely well, no time lag visible. What
            modifications would be used to match numbers in multiple lines.

          • Sheri
            Message 5 of 25 , Oct 21, 2008
              hsavage wrote:
              > Sheri wrote:
              >
              >> --- In ntb-clips@yahoogroups.com, Jane Sedgewick <jane_sedgewick@...>
              >> wrote:
              >>
              >>
              >>> I may have misunderstood what you are discussing, but if I do a
              >>> Regex search on the test line
              >>> 1719180618191902194920192050
              >>>
              >>> using
              >>>
              >>> \G(?:(?!19)\d\d\d\d)*(19\d\d)
              >>>
              >>> it cycles through the string 4 numbers at a time and captures
              >>> 1902 and 1949 only.
              >>>
              >>> I know this is not in a clip but it should work there as well. Or
              >>> am I missing the point.
              >>>
              >>>
              >>
              >>
              >> That's quite good! Works fine:
              >>
              >> ^!Toolbar New Document
              >> ^!SetWordWrap off
              >> 1719180618191902194920192050
              >> ^!Jump Doc_Start
              >> ^!Info Matches ^$GetDocMatchAll("\G(?:(?!19)\d{4})*(19\d{2})";1)$
              >>
              >> Regards,
              >> Sheri
              >>
              >>
              > Sheri,
              >
              > This example works extremely well, no time lag visible. What
              > modifications would be used to match numbers in multiple lines.
              >
              >
              Hi Harvey,

              I think I'd do it like this:

              ^!SetWordWrap off
              ^!Select All
              ^!IfMatch "(?s)^((\d{4})*+\xB6*+)+$" "^$GetDocReplaceAll("\R";"\xB6")$" Next Else Error
              ^!Jump Doc_Start
              ^!Setlistdelimiter ^P
              ^!Info Matches ^P^$GetDocMatchAll("\G(?:(?!19)\d{4}|\R+)*+\K19\d{2}";0)$
              ^!Goto End
              :Error
              ^!Prompt Error: Input Data isn't lines with multiples of four digits

              The ^!IfMatch part is just to make sure the document consists of multiples
              of 4-digits and line breaks. But ^!IfMatch doesn't work if the comparison
              string has actual line breaks, so I substituted paragraph markers in
              that test.

              In the ^$GetDocMatchAll$ pattern, what comes before a match of 19\d\d
              can be either four digits (not starting with 19) or one or more line
              breaks.

              One or the other will always match from the end of a previous match.

              Regards,
              Sheri
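
For readers without NoteTab at hand, the multi-line idea can be approximated in Python. This is a sketch under several assumptions, not a translation of the clip: Python's `re` module has neither \G, \K nor \R, so anchoring is emulated with `Pattern.match(text, pos)`, a capture group stands in for \K, and `[\r\n]` stands in for \R. The validation regex is a simplified equivalent of the ^!IfMatch test, and `years_across_lines` is a made-up name.

```python
import re

# Whole input must be four-digit groups and line breaks, nothing else.
VALID = re.compile(r"(?:\d{4}|[\r\n])*")

# Skip non-19 groups or line breaks, then capture a 19xx year.
SKIP_THEN_YEAR = re.compile(r"(?:(?!19)\d{4}|[\r\n])*(19\d{2})")


def years_across_lines(text):
    """Return every 19xx group that sits on a four-digit boundary."""
    if VALID.fullmatch(text) is None:
        raise ValueError("input isn't lines with multiples of four digits")
    found, pos = [], 0
    while True:
        m = SKIP_THEN_YEAR.match(text, pos)   # anchored at previous match end
        if m is None:
            return found
        found.append(m.group(1))
        pos = m.end()


SAMPLE = "17191902\n18061949\n20191999\n"
YEARS = years_across_lines(SAMPLE)
print(YEARS)

try:
    years_across_lines("1719\n190")           # last line has only 3 digits
    REJECTED = False
except ValueError:
    REJECTED = True
print(REJECTED)
```

Because line breaks are consumed as part of the skipped prefix, each new attempt still begins exactly where the previous match ended, which is what keeps the four-digit alignment across lines.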
            • Sheri
              Message 6 of 25 , Oct 22, 2008
                Flo wrote:
                > --- In ntb-clips@yahoogroups.com, "Sheri" <silvermoonwoman@...> wrote:
                >
                >>> using
                >>>
                >>> \G(?:(?!19)\d\d\d\d)*(19\d\d)
                >>> (...)
                >>> it cycles through the string 4 numbers at a time and captures
                >>> 1902 and 1949 only.
                >>> (...)
                >>>
                >> That's quite good! Works fine:
                >>
                >> ^!Toolbar New Document
                >> ^!SetWordWrap off
                >> 1719180618191902194920192050
                >> ^!Jump Doc_Start
                >> ^!Info Matches ^$GetDocMatchAll("\G(?:(?!19)\d{4})*(19\d{2})";1)$
                >>
                >
                > Jane, Sheri,
                >
                > Thanks — that's a great breakthrough!
                >
                > Foolishly, I've tested solutions like that before because it's quite
                > similar to what is recommended by Jeffrey Friedl. Unfortunately, I
                > placed \G in the wrong position:
                >
                > (?:(?!19)\d\d\d\d)*(\G19\d\d)
                >
                > I'm terribly sorry for this mistake! Thanks again to Jane for
                > correcting this!
                >
                > Here are two more solutions mentioned by Friedl which work with
                > Sheri's clip:
                >
                > ^!Info Matches ^$GetDocMatchAll("\G(?:[^1]\d\d\d|\d[^9]\d\d)*(19\d{2})";1)$
                >
                > ^!Info Matches ^$GetDocMatchAll("\G(?:\d\d\d\d)*?(19\d{2})";1)$
                >
                > For members not being familiar with \G it might be helpful to say: We
                > still have to watch that the cursor is in the right position at the
                > start of string. If we vary the clip like this...
                >
                > ^!Toolbar New Document
                > ^!SetWordWrap off
                > String for testing the \G assertion
                > 1719180618191902194920192050
                > ^!Jump Doc_Start
                > ^!Info Matches ^$GetDocMatchAll("\G(?:(?!19)\d{4})*(19\d{2})";1)$
                >
                > ...no year will be matched.
                >
                > So if that string occurs anywhere else in the document we have to add
                > some code which brings the cursor to the start of the line containing
                > that string (if I'm not missing another solution).
                >
                Hi Flo,

                If using the ^$GetDoc... functions it's not enough to move the cursor, we
                have to actually make a selection from the desired starting position to
                the end of the document (or other ending position). Otherwise, it
                usually acts on the whole document. I've noticed that it doesn't act on
                anything at all immediately after plain text insertion of the document
                text. If the text is inserted using ^!InsertText instead, it does work
                (i.e., on the whole document).

                ^!Toolbar New Document
                1719180618191902194920192050
                ^!Info Matches ^$GetDocMatchAll("\G(?:[^1]\d\d\d|\d[^9]\d\d)*(19\d{2})";1)$

                vs.

                ^!Toolbar New Document
                ^!InsertText 1719180618191902194920192050
                ^!Info Matches ^$GetDocMatchAll("\G(?:[^1]\d\d\d|\d[^9]\d\d)*(19\d{2})";1)$

                Regards,
                Sheri
              • Axel Berger
                Message 7 of 25 , Oct 22, 2008
                  Sheri wrote:
                  > I've noticed that it doesn't act on anything at all immediately
                  > after plain text insertion of the document text. If the text is
                  > inserted using ^!InsertText instead, it does work (i.e., on the
                  > whole document).

                  That's as it should be. ^!InsertSelect (and probably what you were
                  using) leaves the insertion selected while ^!InsertText deselects it.

                  In one of my clips I use several
                  ^!InsertSelect ^$StrReplace("old";"new";"^$GetSelection$";FALSE;FALSE)$
                  in a row and do the last one with ^!InsertText before looping to the
                  next ^!Find.

                  Axel
                • Sheri
                  Message 8 of 25 , Oct 22, 2008
                    --- In ntb-clips@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:

                    > That's as it should be.

                    The issue occurs only when the text is clip-inserted without using a
                    command to do it. In that case, you need some command, any
                    command, between that line and the one that uses a ^$GetDoc... function.

                    Fails:
                    ^!Toolbar New Document
                    1719180618191902194920192050
                    ^!Info Matches^%NL%^$GetDocMatchAll(".";0)$

                    Works:
                    ^!Toolbar New Document
                    1719180618191902194920192050
                    ^!Info
                    ^!Info Matches^%NL%^$GetDocMatchAll(".";0)$

                    Regards,
                    Sheri
                  • Flo
                    Message 9 of 25 , Oct 23, 2008
                      --- In ntb-clips@yahoogroups.com, Sheri <silvermoonwoman@...> wrote:
                      >
                      > If using the ^$GetDoc... functions it's not enough to move the
                      > cursor, we have to actually make a selection from the desired
                      > starting position to the end of the document (or other ending
                      > position)...

                      Sheri,

                      Sorry for not mentioning the selection. With regard to your latest
                      advice, the following clip works for me when run on a multiple line
                      text...

                      ^!Toolbar New Document
                      ^!SetWordWrap off
                      String for testing the \G assertion
                      1719180618191902194920192050
                      Matching years of the 20th century
                      ^!SetCursor 2:1
                      ^!Select EOL
                      ^!If ^$Calc(^$GetSelSize$ MOD 4)$ <> 0 Error Else Next
                      ^!Info Matches ^$GetDocMatchAll("\G(?:(?!19)\d{4})*(19\d{2})";1)$
                      ^!Goto End

                      :Error
                      ^!Prompt Error: Input Data isn't lines with multiples of four digits

                      I hope you will agree with my alternative way of checking 4-digit
                      multiples. It tries to comply with Don's idea in message #18573.

                      Flo
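
The select-one-line-and-check-its-length approach might look like this in Python. It is a hedged sketch: `years_on_line` is a hypothetical helper, the \G anchor is emulated with `Pattern.match(text, pos)` since the stdlib `re` module lacks it, and the length test mirrors the clip's `MOD 4` check, which only looks at the character count and not at whether the characters are digits.

```python
import re

SKIP_THEN_YEAR = re.compile(r"(?:(?!19)\d{4})*(19\d{2})")


def years_on_line(document, line_number):
    """Match 19xx years on one line (1-based), scanning from its first char."""
    line = document.split("\n")[line_number - 1]
    if len(line) % 4 != 0:                     # same idea as the MOD 4 check
        raise ValueError("line length isn't a multiple of four")
    found, pos = [], 0
    while True:
        m = SKIP_THEN_YEAR.match(line, pos)    # anchored at previous match end
        if m is None:
            return found
        found.append(m.group(1))
        pos = m.end()


DOCUMENT = "\n".join([
    r"String for testing the \G assertion",
    "1719180618191902194920192050",
    "Matching years of the 20th century",
])
LINE_2_YEARS = years_on_line(DOCUMENT, 2)
print(LINE_2_YEARS)

try:
    years_on_line("17191", 1)                  # five characters: rejected
    BAD_LENGTH_REJECTED = False
except ValueError:
    BAD_LENGTH_REJECTED = True
print(BAD_LENGTH_REJECTED)
```

A line of eight letters would pass the length check and simply yield no matches, so the length test is a cheap sanity check rather than a full validation.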
                       