Re: [NTB] RegEx multiple find over hard line breaks

  • Alec Burgess
    Message 1 of 5 , Apr 19, 2011
      Mike:
      I think what you need is the "dot matches all" option, (?s), at the
      beginning of your regex pattern:
      (?s)---- regexp pattern including dots ---
      With it, a dot (and so .+ or .* or .+?) will consume/match any
      character, including line breaks.
      See Help on Regular expressions - Internal Option Setting.
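
A minimal sketch of the effect of (?s), written in Python rather than NoteTab clip code. It assumes Python's re module as a stand-in for NoteTab's regex engine, which may differ in details; the sample text and the literal markers string1/string2 are made up for illustration.

```python
import re

# Hypothetical sample: the text to delete spans several lines.
text = "keep A\nstring1 drop this\nand this\n\nand this string2 keep B\n"

# Without (?s), "." stops at line breaks, so a span crossing lines is not found.
no_dotall = re.sub(r"string1.+?string2", "", text)

# With (?s), "." also matches line breaks; the lazy ".+?" stops at the
# first string2 it reaches.
with_dotall = re.sub(r"(?s)string1.+?string2", "", text)

print(no_dotall == text)
print(repr(with_dotall))
```

The first substitution leaves the text unchanged, while the second removes everything from string1 through string2, line breaks included.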

      On 2011-04-19 20:07, Mike Lenahan wrote:
      > O wise ones
      > I am having a bit of a problem getting my head around RegEx, not having used Perl or anything similar in the past, so please don't laugh when I ask some silly questions.
      >
      > I have some text files (60M) and have to find and delete certain portions of the text between two patterns of text, which might involve chars such as ÀÊ.
      >
      > The two patterns have an unknown number of characters (including hard returns) between them. I can find and delete the text between using something like:
      >
      > ^!Replace "string1.+\R(?=\R).+?string2" >> "" AWRS
      >
      > but if the two strings go over an unknown number of hard returns then I can't get the between text.
      >
      > I have tried playing about with the {0,} but can't seem to get it working. The last effort is
      >
      > ^!find "string1.+(.+|\r\n|\n|\r)(?1){0,}string2" CIWRS
      >
      > Which I thought would match string1 followed by multiple chars (up to a line end) OR hard return (\r\n) OR newline (\n) OR return (\r), repeated 0 or more times, then string2.
      >
      > What am I missing / doing wrong?
      >
      > Regards
      >
      > Mike

      Regards ... Alec (buralex@gmail& WinLiveMess - alec.m.burgess@skype)
    • Mike Lenahan
      Message 2 of 5 , Apr 19, 2011
        O wise ones
        I am having a bit of a problem getting my head around RegEx, not having used Perl or anything similar in the past, so please don't laugh when I ask some silly questions.

        I have some text files (60M) and have to find and delete certain portions of the text between two patterns of text, which might involve chars such as ÀÊ.

        The two patterns have an unknown number of characters (including hard returns) between them. I can find and delete the text between using something like:

        ^!Replace "string1.+\R(?=\R).+?string2" >> "" AWRS

        but if the two strings go over an unknown number of hard returns then I can't get the between text.

        I have tried playing about with the {0,} but can't seem to get it working. The last effort is

        ^!find "string1.+(.+|\r\n|\n|\r)(?1){0,}string2" CIWRS

        Which I thought would match string1 followed by multiple chars (up to a line end) OR hard return (\r\n) OR newline (\n) OR return (\r), repeated 0 or more times, then string2.

        What am I missing / doing wrong?

        Regards

        Mike
      • Mike Lenahan
        Message 3 of 5 , Apr 19, 2011
          Alec,
          That is exactly what I wanted. I have been pulling out my ever-thinning locks over this since the weekend.

          Regards

          Mike
        • John Shotsky
          Message 4 of 5 , Apr 20, 2011
            It will be easier for us to help you if you post some actual data – what do you want to find, and what does the data
            look like? It's easy to skip past line ends, but you probably already know that.
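
One common way to skip past line ends without any option flag is a character class such as [\s\S], which matches every character. The sketch below uses Python's re module as an assumed stand-in for NoteTab's engine, with made-up sample data, and also shows why a lazy quantifier matters when the markers occur more than once.

```python
import re

# Hypothetical data: two marked blocks, each crossing a line break.
text = "a string1 x\ny string2 b string1 z\nw string2 c"

# Lazy: each match ends at the nearest string2, so the blocks are removed
# separately and the text between them survives.
lazy = re.sub(r"string1[\s\S]*?string2", "", text)

# Greedy: one match runs from the first string1 to the last string2,
# which also deletes the text between the two blocks.
greedy = re.sub(r"string1[\s\S]*string2", "", text)

print(repr(lazy))
print(repr(greedy))
```

On a large file with many marked sections, the greedy form would delete far more than intended, so the lazy form is usually the safer choice.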



            Regards,

            John

            <http://recipetools.gotdns.com/> RecipeTools site

            <http://groups.yahoo.com/group/RecipeTools/> RecipeTools Yahoo group

            <http://shotsky.gotdns.com/index.htm> Beaverton Weather


