RegEx multiple find over hard line breaks

  • Mike Lenahan
    Message 1 of 5 , Apr 19, 2011
      O wise ones,
      I am having a bit of a problem getting my head around RegEx, not having used Perl or anything similar in the past, so please don't laugh when I ask some silly questions.

      I have some text files (60M) and have to find and delete certain portions of the text between two patterns of text, which might involve chars such as ÀÊ.

      The two patterns have an unknown number of characters (including hard returns) between them. I can find and delete the text between using something like:

      ^!Replace "string1.+\R(?=\R).+?string2" >> "" AWRS

      but if the two strings go over an unknown number of hard returns then I can't get the between text.

      I have tried playing about with the {0,} quantifier but can't seem to get it working. My latest effort is

      ^!find "string1.+(.+|\r\n|\n|\r)(?1){0,}string2" CIWRS

      I thought this would match string1 followed by multiple chars (up to a line end), OR a hard return (\r\n), OR a newline (\n), OR a return (\r), repeated 0 or more times, then string2.

      What am I missing / doing wrong?

      Regards

      Mike
    • Mike Lenahan
      Message 2 of 5 , Apr 19, 2011
        Alec,
        That is exactly what I wanted. I have been pulling the ever-thinning locks out since the weekend over this.

        Regards

        Mike
        --- In notetab@yahoogroups.com, Alec Burgess <buralex@...> wrote:
        >
        > Mike:
        > I think what you need is the "dot matches all" option at the beginning of
        > your regex pattern:
        > (?s)---- regexp pattern including dots ---
        > If it is included, a dot (or .+ or .* or .+?) will match all
        > characters, including line breaks.
        > See Help on Regular expressions - Internal Option Setting.
        >
        > On 2011-04-19 20:07, Mike Lenahan wrote:
        >
        > Regards ... Alec (buralex@gmail& WinLiveMess - alec.m.burgess@skype)
        >
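The effect of the (?s) option described above can be tried outside NoteTab. The following is a minimal sketch in Python, whose `re` module also accepts the inline `(?s)` flag; it is not NoteTab clip code, and the sample text is made up for illustration. "string1" and "string2" are the placeholder delimiters from the question. As in the original `^!Replace` attempt, the delimiters themselves are deleted along with the text between them.

```python
import re

# Hypothetical sample text; "string1" and "string2" are the placeholder
# delimiters from the original question, not real data.
text = "keep A\nstring1 drop this\nand this\n\nand this string2\nkeep B\n"

# Without (?s), "." stops at each line break, so a span that crosses
# lines is never matched and nothing is removed.
single_line = re.sub(r"string1.+?string2", "", text)

# With (?s) at the start, "." also matches line breaks. The lazy ".+?"
# stops at the first "string2" that follows each "string1".
dot_all = re.sub(r"(?s)string1.+?string2", "", text)

print(single_line == text)
print(repr(dot_all))
```

Assuming NoteTab's regex engine treats the inline option the same way (as the reply suggests), the equivalent clip pattern would be the original one with (?s) placed in front and a plain `.+?` between the two strings.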
      • John Shotsky
        Message 3 of 5 , Apr 20, 2011
          It will be easier for us to help you if you post some actual data – what do you want to find, and what does the data
          look like? It's easy to skip past line ends, but you probably already know that.



          Regards,

          John

          <http://recipetools.gotdns.com/> RecipeTools site

          <http://groups.yahoo.com/group/RecipeTools/> RecipeTools Yahoo group

          <http://shotsky.gotdns.com/index.htm> Beaverton Weather
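One common way to skip past line ends without setting any option is a character class such as `[\s\S]`, which matches every character, line breaks included. This is not necessarily the approach meant in the message above; it is a small Python sketch with made-up sample text, and it also shows why the lazy form matters when a file contains more than one string1 ... string2 pair.

```python
import re

# Hypothetical sample with two delimited spans, each crossing a line break.
text = "a string1 x\ny string2 b string1 z\nw string2 c"

# Lazy: each match ends at the nearest following "string2",
# so the two spans are removed separately and " b " survives.
lazy = re.sub(r"string1[\s\S]*?string2", "", text)

# Greedy: the match runs to the LAST "string2" in the text,
# so everything between the first and last delimiter is removed.
greedy = re.sub(r"string1[\s\S]*string2", "", text)

print(repr(lazy))
print(repr(greedy))
```

On a large file the greedy form can silently delete far more than intended, so the lazy quantifier is usually the safer choice for this kind of clean-up.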



          From: notetab@yahoogroups.com [mailto:notetab@yahoogroups.com] On Behalf Of Mike Lenahan
          Sent: Tuesday, April 19, 2011 22:04
          To: notetab@yahoogroups.com
          Subject: [NTB] RegEx multiple find over hard line breaks









