Re: GetDocMatchAll

  • Sheri
    Message 1 of 19 , Dec 6, 2008
      --- In ntb-clips@yahoogroups.com, "ebbtidalflats" <ebbtidalflats@...>
      wrote:

      > the code does not work when I
      > remove the New Document command and the test text, then
      > run it on an open document.
      >
      > It still jumps to the end of the doc, and runs the same
      > from then on. And the clip removes 65 lines from the
      > document. But only 24 lines contained the target word,
      > some of which targets were NOT removed!

      You would need to present your doc and clip for help explaining that.
      Two things come to mind. First, in the pattern I used:

      (?i)(^.*\bTarget\b.*)|(^.+)

      the \b's require the target text to be a whole word, so "targets",
      "targeting" and "targeted" do not count as matches for "target".

      Second, if you change the target text, you need to make sure any
      metacharacters it contains are escaped. You can't for example use a
      variable that might contain metacharacters like "^" without escaping
      them, like "\^".
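A minimal sketch of the escaping point, using Python's re module as a stand-in for NoteTab's regex engine (the two engines differ in places, so treat this as an illustration only). The helper name line_contains is hypothetical.

```python
import re


def line_contains(target: str, line: str, escape: bool = True) -> bool:
    """Report whether `line` contains `target`, optionally escaping metacharacters first."""
    # re.escape turns "^" into "\^" so it is matched as a literal caret.
    needle = re.escape(target) if escape else target
    return re.search(needle, line, re.IGNORECASE) is not None


if __name__ == "__main__":
    # Unescaped, "^" is an anchor and "matches" at the start of any line.
    print(line_contains("^", "SELECT 1", escape=False))
    # Escaped, it only matches a real caret character.
    print(line_contains("^", "SELECT 1"))
    print(line_contains("^", "x = a ^ b"))
```

The unescaped call reports a match on a line that contains no caret at all, which is the kind of surprise an unescaped variable can cause.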

      Flo has given a very good explanation of what the pattern does.
      Regular expressions search for text that affirmatively matches a
      pattern, not one that doesn't. Even when you use a negative assertion,
      you are actually searching for something that follows from a given
      position.

      Regular expressions always work from left to right. Both of the
      subpatterns match whole lines that are not empty lines. If a line
      fails to match the first subpattern (because it doesn't contain
      "target"), it must match the second subpattern because that subpattern
      matches anything (except linebreaks). The vertical bar says "or"
      between the subpatterns. But if it matches the first subpattern, it
      doesn't try the second. In the format string, we ask only for the
      second subpattern, aka $2 in GetDocListAll or just 2 in GetDocMatchAll.
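The alternation trick described above can be sketched in Python's re module, assumed here as a rough stand-in for the NoteTab functions (it is not the NoteTab API). The function name and the sample text are made up for illustration.

```python
import re

# Same pattern as in the message; MULTILINE makes ^ match at each line start.
PATTERN = re.compile(r"(?i)(^.*\bTarget\b.*)|(^.+)", re.MULTILINE)


def lines_without_target(text: str) -> list[str]:
    """Collect only the lines captured by the second subpattern."""
    # When the first alternative wins, group 2 is None and the line is skipped.
    return [m.group(2) for m in PATTERN.finditer(text) if m.group(2) is not None]


if __name__ == "__main__":
    sample = "keep me\nthe target is here\nTargets stay too\n\nlast line"
    print(lines_without_target(sample))
```

Note that the empty line matches neither alternative, and that "Targets" is kept because of the \b word boundaries.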

      There was a long-standing bug until the latest NoteTab version that
      prevented this type of pattern from working properly in
      GetDocMatchAll. Cheers to Eric that it's fixed now. :)

      Regards,
      Sheri
    • Sheri
      Message 2 of 19 , Dec 7, 2008
        Hi Flo,

        > I think in a more conventional way we just would remove lines
        > containing "TARGET" in order to get lines NOT containing that word.
        > This still works with "good old" ^!Replace...
        >
        > ^!Replace "^.*TARGET.*(\r\n|\Z)" >> "" AWRS

        That is probably the most direct approach. If for some reason we
        wanted the result in a variable instead of the document we could use
        the new ^$GetDocReplaceall$ function instead. But I think you meant to
        use \z instead of \Z... :D

        Also my list pattern could have been improved resource-wise by not
        capturing substring #1 (since it wasn't needed), e.g.,

        "(?i)(?:^.*\bTarget\b.*)|(^.+)";"$1\r\n"

        >
        > This is my homework for today...

        A+ :D

        Regards,
        Sheri
      • ebbtidalflats
        Message 3 of 19 , Dec 8, 2008
          Hi Flo, Sheri,


          That's a lot to digest, and it will take me a while.
          I may not get a chance to try your suggestions until
          this weekend. Thanks very much for your tips.

          The document I'm testing has 525 lines. A bit much to
          append. But it's all text, whole words, actually a
          bunch of random SQL statements, which I searched for
          "SELECT" and NOT "SELECT".



          Regards,


          Eb

        • ebbtidalflats
          Message 4 of 19 , Dec 11, 2008
            Sheri, Flo,

            Thanks very much for your help and explanations.

            It turns out that there were two reasons Sheri's clip didn't work
            on MY document:

            1. I screwed up
            -- didn't replace "target" with the real keyword D=8.

            This resulted in NO lines being removed by THIS test,
            but see below

            2. The code did strip blank lines, intended to remove
            those created by the replace algorithm.

            There happened to be 25 original blank lines
            and 24 keywords in the doc. Coincidence!
            25 lines were removed, so my verifying code was faked out.

            Result: Wild Goose Chase.

            When I finally pinned this down, I moved the EoLs into the search
            pattern to handle them in the same step. The working part is now down
            to 3 lines:

            ;Sheri's fix, modified to single step
            ;long line ---
            ^!Set %keepers%="^$GetDocReplaceAll("(?i)(^.*\bSELECT\b.*\r\n)|(^.+\r\n)";"$2")$"
            ;end long line ---
            ^!Select All
            ^%keepers%
            ;clip end ---


            As long as blocks of text are organized in single lines, the algorithm
            can remove sentences (lines) containing keywords.
            Its complement removes sentences NOT containing the keyword.
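The single-step replace and its complement can be sketched in Python as follows. This assumes Python's re module rather than NoteTab's engine, and the function names and the sample text are hypothetical.

```python
import re

# Line endings are part of each alternative, so removed lines leave no blank line behind.
LINES = re.compile(r"(?i)(^.*\bSELECT\b.*\r\n)|(^.+\r\n)", re.MULTILINE)


def drop_keyword_lines(text: str) -> str:
    """Keep group 2 (lines without the keyword); keyword lines become empty."""
    return LINES.sub(lambda m: m.group(2) or "", text)


def keep_keyword_lines(text: str) -> str:
    """The complement: keep group 1 (lines with the keyword)."""
    return LINES.sub(lambda m: m.group(1) or "", text)


if __name__ == "__main__":
    doc = "SELECT a FROM t;\r\n\r\nUPDATE t SET a=1;\r\nselect b FROM u;\r\n"
    print(repr(drop_keyword_lines(doc)))
    print(repr(keep_keyword_lines(doc)))
```

In this sketch the original blank line matches neither alternative and is left in place. A final line with no trailing line break would also be left untouched.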



            Regards,


            Eb
          • Sheri
            Message 5 of 19 , Dec 11, 2008
              --- In ntb-clips@yahoogroups.com, "ebbtidalflats" <ebbtidalflats@...>
              wrote:
              >
              > Sheri, Flo,
              >
              > Thanks very much for your help and explanations.
              >
              > It turns out that there were two reasons Sheri's clip didn't work
              > on MY document:
              >
              > 1. I screwed up
              > -- didn't replace "target" with the real keyword D=8.
              >
              > This resulted in NO lines being removed by THIS test,
              > but see below
              >
              > 2. The code did strip blank lines, intended to remove
              > those created by the replace algorithm.
              >
              > There happened to be 25 original blank lines
              > and 24 keywords in the doc. Coincident!
              > 25 lines removed, verifying code faked out.
              >
              > Result: Wild Goose Chase.
              >
              > When I finally pinned this down, I moved the EoLs into the search
              > pattern to handle them in the same step. The working part is now down
              > to 3 lines:
              >
              > ;Sheri's fix, modified to single step
              > ;long line ---
              > ^!Set %keepers%="^$GetDocReplaceAll("(?i)(^.*\bSELECT\b.*\r\n)|(^.+\r\n)";"$2")$"
              > ;end long line ---
              > ^!Select All
              > ^%keepers%
              > ;clip end ---
              >
              >
              > As long as blocks of text are organized in single lines, the
              > algorithm can remove sentences (lines) containing keywords. It's
              > complement removes sentences NOT containing the keyword.

              Hi Eb,

              It's not necessary to use the alternation and substrings if using
              ^$GetDocReplaceAll$.

              You would just replace the matching lines with an empty string, just
              like in regex ^!Replace.

              Main difference is, no special switches. If text is selected it
              applies only in the selection. Otherwise it applies to the whole
              document. Other difference is the result isn't pasted into the
              document window (unless you choose to paste it).

              e.g.,

              ^!Set %keepers%="^$GetDocReplaceAll("(?i)(^.*\bSELECT\b.*\r\n)";"")$"
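As a rough Python analogue of this simpler form (assuming the re module, not NoteTab itself), matching lines are replaced with an empty string and the result is returned rather than written back. The function name is hypothetical.

```python
import re


def remove_select_lines(text: str) -> str:
    """Return a copy of `text` without lines containing SELECT as a whole word."""
    # (?im): case-insensitive, and ^ matches at the start of every line.
    return re.sub(r"(?im)^.*\bSELECT\b.*\r\n", "", text)


if __name__ == "__main__":
    doc = "SELECT a FROM t;\r\n\r\nUPDATE t SET a=1;\r\nselect b FROM u;\r\n"
    keepers = remove_select_lines(doc)
    print(repr(keepers))
    # The source text is unchanged; only the returned value differs.
    print(doc == keepers)
```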

              Regards,
              Sheri
            • ebbtidalflats
              Message 6 of 19 , Dec 11, 2008
                --- In ntb-clips@yahoogroups.com, "Sheri" <silvermoonwoman@...> wrote:
                >
                > --- In ntb-clips@yahoogroups.com, "ebbtidalflats" <ebbtidalflats@>
                > wrote:
                > >
                > It's not necessary to use the alternation and substrings if using
                > ^$GetDocReplaceAll$.
                >
                > You would just replace the matching lines with an empty string, just
                > like in regex ^!Replace.
                >
                > Main difference is, no special switches. If text is selected it
                > applies only in the selection. Otherwise it applies to the whole
                > document. Other difference is the result isn't pasted into the
                > document window (unless you choose to paste it).
                >
                > e.g.,
                >
                > ^!Set %keepers%="^$GetDocReplaceAll("(?i)(^.*\bSELECT\b.*\r\n)";"")$"
                >


                Thanks Sheri,

                I _LIKE_ shorter code.

                Eb
              • hsavage
                Message 7 of 19 , Dec 12, 2008
                  My regex education is lacking and I would appreciate assistance from our
                  regex stars if they have time.

                  I'm trying to refine a browser selection clip using
                  ^!Set %browser%=^$GetDocMatchAll("^\[.+\]")$
                  to collect the name headings within the browsers.dat file.

                  The line above works fine but requires 2 ^$StrReplace( lines to get rid
                  of the brackets. Is there a regex answer that will find the bracketed
                  titles and return only the text between them, using a modified version
                  of the line above?

                  Help is appreciated.

                  ·············································
                  ºvº SL_day# 347 - created 2008.12.12_15.04.47

                  World's Shortest Books
                  • Different Ways To Spell Bob

                  € hrs € hsavage € pobox € com
                • Flo
                  Message 8 of 19 , Dec 12, 2008
                    --- In ntb-clips@yahoogroups.com, hsavage <hsavage@...> wrote:
                    >
                    > I'm trying to refine a browser selection clip using,
                    > ^!Set %browser%=^$GetDocMatchaLL("^\[.+\]")$
                    > to collect the name headings within the browsers.dat file.
                    > The line above works fine but requires 2 ^$StrReplace( lines to get
                    > rid of the brackets. Is there a regex answer that will find the
                    > bracketed titles and return only the text between them...

                    Harvey,

                    Try...

                    ^!Set %browser%=^$GetDocMatchAll("^\[\K[^]]+")$
                    ^!Info ^%browser%

                    In my tests, the closing bracket "]" need not be escaped inside
                    the character class. If you have any problems with that, try [^\]] or
                    [^\x5D].
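For comparison, here is a hedged Python sketch of the same extraction. Python's re module has no \K, so a capture group is swapped in to return only the text after the opening bracket. The function name and the INI-like sample are assumptions, not the real browsers.dat layout.

```python
import re

# "]" placed first in a negated class is taken literally, as in Flo's pattern.
HEADING = re.compile(r"^\[([^]]+)", re.MULTILINE)


def section_names(text: str) -> list[str]:
    """Return the text between "[" and "]" for each heading line."""
    return HEADING.findall(text)


if __name__ == "__main__":
    sample = "[Firefox]\nPath=firefox.exe\n[Opera]\nPath=opera.exe\n"
    print(section_names(sample))
```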

                    Regards,
                    Flo
                     
                  • hsavage
                    Message 9 of 19 , Dec 12, 2008
                      Flo wrote:
                      > Harvey,
                      >
                      > Try...
                      >
                      > ^!Set %browser%=^$GetDocMatchAll("^\[\K[^]]+")$
                      > ^!Info ^%browser%
                      >
                      > In my tests, the closing bracket "]" needs not to be escaped inside
                      > the Character Class. If you have any problems with that, try [^\]] or
                      > [^\x5D].
                      >
                      > Regards,
                      > Flo

                      Flo,

                      Thanks very much, it appears to work exactly as I wanted and needed.

                      ·············································
                      ºvº SL_day# 347 - created 2008.12.12_17.48.10

                      World's Shortest Books
                      • Different Ways To Spell Bob

                      € hrs € hsavage € pobox € com
                    • Sheri
                      Message 10 of 19 , Dec 12, 2008
                        --- In ntb-clips@yahoogroups.com, "Flo" <flo.gehrke@...> wrote:
                        >
                        > --- In ntb-clips@yahoogroups.com, hsavage <hsavage@> wrote:
                        > >
                        > > I'm trying to refine a browser selection clip using,
                        > > ^!Set %browser%=^$GetDocMatchaLL("^\[.+\]")$
                        > > to collect the name headings within the browsers.dat file.
                        > > The line above works fine but requires 2 ^$StrReplace( lines to get
                        > > rid of the brackets. Is there a regex answer that will find the
                        > > bracketed titles and return only the text between them...
                        >
                        > Harvey,
                        >
                        > Try...
                        >
                        > ^!Set %browser%=^$GetDocMatchAll("^\[\K[^]]+")$
                        > ^!Info ^%browser%
                        >
                        > In my tests, the closing bracket "]" needs not to be escaped inside
                        > the Character Class. If you have any problems with that, try [^\]] or
                        > [^\x5D].
                        >
                        > Regards,
                        > Flo
                        >  
                        >

                        Hi Harvey, Flo,

                        You consider adding \r\n\[ into the character class e.g.:

                        "^\[\K[^\r\n\[\]]+"

                        Although you wouldn't expect to encounter a situation where a closing
                        bracket is missing and another opening bracket exists before a closing
                        bracket, as is, the pattern would match across multiple lines right up
                        to the next closing bracket.
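The cross-line behaviour described here can be reproduced with a small Python sketch. It assumes Python's re module and uses a capture group in place of \K, which re does not support. The sample text with a missing closing bracket is made up.

```python
import re

# Loose class: anything up to the next "]", including line breaks.
LOOSE = re.compile(r"^\[([^]]+)", re.MULTILINE)
# Stricter class: also stops at CR, LF and another "[".
STRICT = re.compile(r"^\[([^\r\n\[\]]+)", re.MULTILINE)

if __name__ == "__main__":
    # The first heading is missing its closing bracket.
    broken = "[Firefox\nPath=firefox.exe\n[Opera]\nPath=opera.exe\n"
    print(LOOSE.findall(broken))
    print(STRICT.findall(broken))
```

With the loose class the first match runs on through the following lines until it reaches the "]" after Opera. The stricter class keeps each match on its own line.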

                        Regards,
                        Sheri
                      • Sheri
                        Message 11 of 19 , Dec 12, 2008
                          --- In ntb-clips@yahoogroups.com, "Sheri" <silvermoonwoman@...> wrote:
                          > Hi Harvey, Flo,
                          >
                          > You consider adding \r\n\[ into the character class e.g.:

                          Oops my fingers got ahead of me, I meant to say "You might want to
                          consider ..." :)
                        • hsavage
                          Message 12 of 19 , Dec 12, 2008
                            Sheri wrote:
                            >
                            > Hi Harvey, Flo,
                            >
                            > You consider adding \r\n\[ into the character class e.g.:
                            >
                            > "^\[\K[^\r\n\[\]]+"
                            >
                            > Although you wouldn't expect to encounter a situation where a closing
                            > bracket is missing and another opening bracket exists before a closing
                            > bracket, as is, the pattern would match across multiple lines right up
                            > to the next closing bracket.
                            >
                            > Regards,
                            > Sheri

                            Sheri, Flo,

                            This works also, thanks again.

                            For those interested, this clip picks up the titles of the browsers.dat
                            entries, creates an array from them and uses any number of them to view
                            an HTML file.

                            Users can choose one or more browsers to view the file at the
                            same time.

                            I currently use 6 browsers and can view the file in all 6 by selecting
                            all when the clip is run.

                            Of course, the browsers.dat entries must work correctly for the clip to
                            work.

                            H="MultiBrowsers"
                            ; • Modified-Updated~Created_2008.12.12
                            ; • hrs ø hsavage·pobox·com_09:45:31p
                            ; • Uses Browsers.Dat File Entries
                            ; • create necessary entries in browsers.dat
                            ; • to exclude browsers.dat entries from clip list
                            ; • place semi-colon before name in browsers.dat
                            ^!ClearVariables
                            ^!SetScreenUpdate 0
                            ^!Set %di%=C:\Documents and Settings\User\desktop\emdoc.emd
                            ^!SetWizardWidth 100
                            ^!SetWizardTitle "Select ALTERNATE Browsers"
                            ^!SetWizardLabel "PICK A BROWSER,^%nL%MAY SELECT MORE THAN ONE -"
                            ^!SetListDelimiter |
                            ^!Open ^$GetAppPath$browsers.dat
                            ^!Set %browser%=^$GetDocMatchAll("^\[\K[^\r\n\[\]]+")$
                            ^!Close
                            ^!SetDocIndex ^$GetDocIndex(^%di%)$
                            ; • this first ^!Set %url% line is very long, may get wrapped in email.
                            ^!Set %url%=^?{(T=O;H=11)VIEW THIS FILE, OR, SELECT ANOTHER==C:\Documents and Settings\User\desktop\emdoc.emd}; %browser%="^?{(T=A;H=9)BROWSER TO VIEW FILE WITH==^%browser%}"
                            ;
                            ^!Set %url%=^$StrReplace("|";":";"^$FileToUrl(^%url%)$";0;0)$
                            ^!SetArray %browser%=^%browser%
                            ^!Set %count%=^%browser0%; %loop%=0
                            :LOOP
                            ^!Inc %loop%
                            ^!Url ["^%browser^%loop%%"] "^%url%"
                            ^!If ^%loop% < ^%count% LOOP

                            ·············································
                            ºvº SL_day# 347 - created 2008.12.12_23.14.23

                            World's Shortest Books
                            • Different Ways To Spell Bob

                            € hrs € hsavage € pobox € com
                          • hsavage
                            Message 13 of 19 , Dec 12, 2008
                              Sheri, Flo, All,

                              In the previous cut&paste some of the variables in the clip were
                              expanded, hopefully this will take care of that problem.

                              H="MultiBrowsers"
                              ; • Modified-Updated~Created_2008.12.12
                              ; • hrs ø hsavage·pobox·com_09:45:31p
                              ; • Uses Browsers.Dat File Entries
                              ; • create necessary entries in browsers.dat
                              ; • to exclude browsers.dat entries from clip list
                              ; • place semi-colon before name in browsers.dat
                              ^!ClearVariables
                              ^!SetScreenUpdate 0
                              ^!Set %di%=^##
                              ^!SetWizardWidth 100
                              ^!SetWizardTitle "Select ALTERNATE Browsers"
                              ^!SetWizardLabel "PICK A BROWSER,^%nL%MAY SELECT MORE THAN ONE -"
                              ^!SetListDelimiter |
                              ^!Open ^$GetAppPath$browsers.dat
                              ^!Set %browser%=^$GetDocMatchAll("^\[\K[^\r\n\[\]]+")$
                              ^!Close
                              ^!SetDocIndex ^$GetDocIndex(^%di%)$
                              ; • this first ^!Set %url% line is very long, may get wrapped in email.
                              ^!Set %url%=^?{(T=O;H=11)VIEW THIS FILE, OR, SELECT ANOTHER==^##}; %browser%="^?{(T=A;H=9)BROWSER TO VIEW FILE WITH==^%browser%}"
                              ;
                              ^!Set %url%=^$StrReplace("|";":";"^$FileToUrl(^%url%)$";0;0)$
                              ^!SetArray %browser%=^%browser%
                              ^!Set %count%=^%browser0%; %loop%=0
                              :LOOP
                              ^!Inc %loop%
                              ^!Url ["^%browser^%loop%%"] "^%url%"
                              ^!If ^%loop% < ^%count% LOOP

                              --
                              ·············································
                              ºvº SL_day# 347 - created 2008.12.12_23.26.11

                              World's Shortest Books
                              • Different Ways To Spell Bob

                              € hrs € hsavage € pobox € com