RE: [Clip] regex to delete all lines not containing

  • John Shotsky
    Message 1 of 9 , Sep 3, 2011
      There is no file size limitation, but what are you trying to do?
      If it's what I think, I'd probably just use a replace to tag the beginnings of the lines that DO contain what
      you want with a special character, then delete all the others. Then remove the tag.
      Three lines of fast running code.
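
The tag / delete / untag idea can be sketched outside NoteTab as follows. This is a minimal Python illustration of the logic only, not John's actual clip: the function name, the choice of `\x01` as the tag character, and the team-name sample data are all assumptions made for the example.

```python
import re

def keep_matching_lines(text, pattern, tag="\x01"):
    """Keep only the lines that contain `pattern`, using three replace passes."""
    t = re.escape(tag)
    # Pass 1: put the tag at the start of every line that contains the pattern.
    tagged = re.sub(rf"^(?=.*(?:{pattern}))", tag, text, flags=re.MULTILINE)
    # Pass 2: delete every line that does not start with the tag.
    kept = re.sub(rf"^(?!{t}).*(?:\r?\n|\Z)", "", tagged, flags=re.MULTILINE)
    # Pass 3: remove the tags again.
    return re.sub(rf"^{t}", "", kept, flags=re.MULTILINE)

sample = "Lions 3\nTigers 1\nBears 2\nLions 0\n"
print(keep_matching_lines(sample, "Lions|Bears"))
```

The tag must be a character that cannot occur in the data, otherwise pass 2 would keep lines it should delete.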

      Regards,
      John

      From: ntb-clips@yahoogroups.com [mailto:ntb-clips@yahoogroups.com] On Behalf Of Don
      Sent: Saturday, September 03, 2011 07:46
      To: ntb-clips@yahoogroups.com
      Subject: [Clip] regex to delete all lines not containing


      Suddenly not working ... thought it did.

      File has 1300 lines, is there a file size max on this working?

      ^!Set %DataTested%=^?{RegEx Term to Search For, Pipe Separated "or"}
      ^!Select All
      ^!SetListDelimiter ^P
      ^!Set %DataOutput%="^$GetDocMatchAll("^.*(^%DataTested%).*$")$"
      ^!InsertText ^%DataOutput%



    • Don
      Message 2 of 9 , Sep 3, 2011
        This works fine on 500 and 1000 lines, but not on 1500 lines.

        It removes all lines not containing something, which can be alternate
        words (since it's regex), so there must be a limitation.

        I used to do it with a clip that used several loops.

      • John Shotsky
        Message 3 of 9 , Sep 3, 2011
          I have a 900K regex library, and I run it on 100,000 line files with no such limitations. I suspect there is something
          else going on: computer memory, swap file, unexpected stuff in the file, etc. Since you appear to not want to provide
          an example, all we can do is guess.

          Regards,
          John

        • Axel Berger
          Message 4 of 9 , Sep 3, 2011
            John Shotsky wrote:
            > and I run it on 100,000 line files with no such limitations.

            Don saves all his lines in an array. 1300 lines at, say, 60 characters
            makes 78000 bytes. This smells like a 64 kB array limit.
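
The arithmetic behind this guess, as a small Python sketch. The 60 characters per line is the estimate from this message, and the 64 kB limit is only the suspected cause, not a documented NoteTab figure; line terminators and multi-byte characters are ignored.

```python
LINES = 1300
AVG_CHARS_PER_LINE = 60       # estimate from the message above
LIMIT_BYTES = 64 * 1024       # hypothetical 64 kB limit (65536 bytes)

total_bytes = LINES * AVG_CHARS_PER_LINE
max_lines_under_limit = LIMIT_BYTES // AVG_CHARS_PER_LINE

print(total_bytes)                # 78000
print(total_bytes > LIMIT_BYTES)  # True
print(max_lines_under_limit)      # 1092
```

Under these assumptions the limit would be reached at roughly 1,090 lines, which would fit a clip that works on 1000 lines but fails on 1500.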

            Axel
          • flo.gehrke
            Message 5 of 9 , Sep 3, 2011
              --- In ntb-clips@yahoogroups.com, Don <don@...> wrote:
              >
              > Suddenly not working ... thought it did.
              >
              > File has 1300 lines, is there a file size max on this working?
              >
              > ^!Set %DataTested%=^?{RegEx Term to Search For, Pipe Separated "or"}
              > ^!Select All
              > ^!SetListDelimiter ^P
              > ^!Set %DataOutput%="^$GetDocMatchAll("^.*(^%DataTested%).*$")$"
              > ^!InsertText ^%DataOutput%
              >

              Don,

              For me, your clip works fine. But occasionally we've experienced that ^$GetDocMatchAll$ gets into trouble with the '$' sign. In this case, you'd better replace '$' with ^%Dollar%...

              ^!Set %DataOutput%="^$GetDocMatchAll("^.*(^%DataTested%).*^%Dollar%")$"

              (cf the P.S. in message #18321 of Sep 6, 2008).

              On the other hand, '^$GetDocListAll' has proved to be more reliable...

              ^!Set %DataOutput%="^$GetDocListAll("^.*(^%DataTested%).*$";"$0\r\n")$"

              (no '^!SetListDelimiter' needed here).
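
What the pattern and the "$0\r\n" output format do together can be illustrated in Python. This is only a sketch of the matching logic, not of how NoteTab implements ^$GetDocListAll$; the function name and sample data are hypothetical.

```python
import re

def lines_containing(text, terms, line_end="\r\n"):
    """Return every whole line that contains one of `terms`, each followed by `line_end`."""
    # With MULTILINE, ^.*(terms).*$ matches each complete line containing a term.
    line_re = re.compile(rf"^.*(?:{terms}).*$", re.MULTILINE)
    # group(0) plays the role of $0; a trailing \r from CRLF input is dropped
    # before the chosen line ending is appended.
    return "".join(m.group(0).rstrip("\r") + line_end for m in line_re.finditer(text))

sample = "Lions 3\r\nTigers 1\r\nBears 2\r\n"
print(lines_containing(sample, "Lions|Bears"))
```

Because each match is emitted with its own line ending, no separate list delimiter is needed to join the results.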

              Regards,
              Flo
            • flo.gehrke
              Message 6 of 9 , Sep 4, 2011
                --- In ntb-clips@yahoogroups.com, "John Shotsky" <jshotsky@...> wrote:

                > I'd probably just use a replace to tag the beginnings of lines with
                > a special character that DO contain what you want, then delete all
                > the others. Then remove the tag.

                John,

                Complicated, isn't it? Why don't you write...

                ^!Set %Del%=^?{Remove lines not containing:}
                ^!Replace "^(?!.*(?:^%Del%)\b).*(\R|\Z)" >> "" WARS

                Watch the '\b' -- it makes sure that, for example, 'Alfred' is matched but not 'Alfredo'.
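
The effect of this pattern, including the '\b', can be tried out in Python. This is a sketch under one stated assumption: Python's `re` module has no `\R`, and its `\Z` matches only at the very end of the string, so `(?:\r?\n|\Z)` stands in for the `(\R|\Z)` of the clip. The function name is hypothetical.

```python
import re

def delete_lines_not_containing(text, terms):
    """Delete every line that does not contain one of `terms` as a word ending."""
    # (?:\r?\n|\Z) replaces the clip's (\R|\Z); see the note above.
    pattern = rf"^(?!.*(?:{terms})\b).*(?:\r?\n|\Z)"
    return re.sub(pattern, "", text, flags=re.MULTILINE)

sample = "Alfred 3\nAlfredo 1\nBert 2\nAlfred"
print(delete_lines_not_containing(sample, "Alfred"))
```

The 'Alfredo' and 'Bert' lines are removed; the last line is kept even though no line break follows it.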

                > There is no file size limitation.

                Agreed! (I tested it with 50,000 lines.)

                Regards,
                Flo
              • Don
                Message 7 of 9 , Sep 4, 2011
                  Okay, so I switched out to yesterday's suggestion of GetDocListAll.
                  Seems to work.

                  As to the sample file: it was suggested that I seemed unwilling to
                  provide one. Quite the opposite; it was just a bunch of delimited text,
                  nothing special about it, so I didn't see a need to send a sample.

                  We found solutions.

                  I will say that I typically am cleaning up team results so team names
                  are unique and so the \b in today's suggestion would not typically come
                  into play for me, but good to have there. What does the ?! at the
                  beginning do however? Negative look ahead? I don't get look aheads
                  fully just yet, when to use them, what to do with them.

                  I appreciate those that helped. Should it fail again of course I'll
                  write again and let your wise minds and willing spirits come back into play.

                • flo.gehrke
                  Message 8 of 9 , Sep 5, 2011
                    --- In ntb-clips@yahoogroups.com, Don <don@...> wrote:

                    >> ^!Set %Del%=^?{Remove lines not containing:}
                    >> ^!Replace "^(?!.*(?:^%Del%)\b).*(\R|\Z)" >> "" WARS

                    > What does the ?! at the beginning do however?
                    > Negative look ahead? I don't get look aheads
                    > fully just yet, when to use them, what to do with them.

                    It says: Find a line where, beginning at the start of the line ('^'), you do NOT see the search string (^%Del%) at any position when looking ahead. Inside the lookahead, the search string may be preceded by any character 0 or more times ('.*'); the '.*' after the lookahead then matches the whole line. If this is true, replace that line including its line break ('\R') with an empty string (i.e. delete it). With '\Z', it also matches at the end of the subject string where no line break follows.
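
A lookahead only tests what follows; it consumes no characters itself. A small Python sketch makes this visible (the variable names and the sample lines are made up for illustration):

```python
import re

# The negative lookahead alone succeeds at position 0 and consumes nothing.
only_lookahead = re.match(r"^(?!.*Lions)", "Tigers 1")
print(only_lookahead.span())      # (0, 0)

# The .* after the lookahead is what actually matches the line to be deleted.
whole_line = re.match(r"^(?!.*Lions).*", "Tigers 1")
print(whole_line.group(0))        # Tigers 1

# On a line that does contain the term, the lookahead fails and nothing matches.
no_match = re.match(r"^(?!.*Lions).*", "Lions 3")
print(no_match)                   # None
```

So a lookahead is useful whenever a condition has to be checked at a position without making the checked text part of the match.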

                    Regards,
                    Flo