RE: [NTS] Can a Reg Exp handle 123 AND not a|b|c followed by x?

  • John Shotsky
    Message 1 of 13 , May 9, 2012
      Since negative character classes don't support strings, I would approach this by first inserting a special character in
      front of each of the day strings, then use a negative character class that included that special character. Afterwards,
      I'd remove the special character.

      I wish there were a 'negative string' feature in PCRE, but the substitution of special characters does work well.
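
A rough sketch of the tokenizing workaround described above. This is a Python illustration rather than NoteTab clip code; the marker character `\x01`, the function name `find_non_day_tags` and the sample text are assumptions made for the example.

```python
import re

MARK = "\x01"  # assumed marker: any character that never occurs in the text

# Matches a real day tag such as "(Fri)" so the marker can be inserted.
DAY_TAG = re.compile(r"\((MON|TUE|WED|THU|FRI|SAT|SUN)\)", re.IGNORECASE)

# After marking, a day tag reads "(" + MARK + "Fri)", so a negated character
# class that excludes MARK can no longer match three characters there.
NON_DAY = re.compile(r"\d\d/\d\d/\d{4} \([^\x01]{3}\)")


def find_non_day_tags(text):
    # Step 1: insert the marker in front of every day string.
    marked = DAY_TAG.sub(lambda m: "(" + MARK + m.group(1) + ")", text)
    # Step 2: search with the negated character class.
    # Step 3 (removing the marker) is not needed here because only a copy
    # of the text was marked; in an editor you would strip it afterwards.
    return NON_DAY.findall(marked)


sample = "05/12/1995 (Fri)\n05/15/1995 (Mox)\n05/16/1995 (Tue)\n"
print(find_non_day_tags(sample))
```

Only the `(Mox)` line should be reported, since the two real day tags are protected by the marker.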

      Regards,
      John
      RecipeTools Web Site: http://recipetools.gotdns.com/

      From: ntb-scripts@yahoogroups.com [mailto:ntb-scripts@yahoogroups.com] On Behalf Of mycroftj
      Sent: Wednesday, May 09, 2012 18:55
      To: ntb-scripts@yahoogroups.com
      Subject: [NTS] Can a Reg Exp handle 123 AND not a|b|c followed by x?


      Is there a relatively simple reg exp to find the following?

      Date in format mm/dd/yyyy OR yyyy/mm/dd followed by a space followed by an open paren followed by three characters that
      are *NOT* MON|TUE|WED|THU|FRI|SAT|SUN followed by a closing paren?

      I've gotten as far as the following that does all the above except for checking for the closing paren.

      ^!Find "(?:(\d\d/\d\d/(?:19|20)\d\d)|((?:19|20)\d\d/\d\d/\d\d))\x20\((?!(MON|TUE|WED|THU|FRI|SAT|SUN))" RSTI

      Note: (?: is for a non-capturing group

      Example:

      05/12/1995 (Fri) <-- This should not match
      05/13/1995 (Sat) <-- This should not match
      05/14/1995 (zzzday) <-- This should not match
      05/15/1995 (Mox) <-- ***This should match***
      05/16/1995 (Tue) <-- This should not match
      05/17/2005 (Tue) <-- This should not match

      I did get the clip working using the above FIND and a few extra lines, but now I'd like to know if this is (easily)
      possible to do with one FIND.
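
To see why the missing closing-paren check matters, the pattern from the question can be tried outside NoteTab. The following is a minimal Python sketch; `re.IGNORECASE` stands in for the clip's I option, and Python's `re` engine is assumed to behave like NoteTab's PCRE for this pattern.

```python
import re

# The pattern from the question: it stops right after the negative look-ahead.
ORIGINAL = re.compile(
    r"(?:(\d\d/\d\d/(?:19|20)\d\d)|((?:19|20)\d\d/\d\d/\d\d))"
    r"\x20\((?!(MON|TUE|WED|THU|FRI|SAT|SUN))",
    re.IGNORECASE,
)

results = {}
for line in ["05/14/1995 (zzzday)", "05/15/1995 (Mox)", "05/16/1995 (Tue)"]:
    m = ORIGINAL.search(line)
    results[line] = m.group(0) if m else None
    print(line, "->", results[line])
```

The `(zzzday)` line still produces a match (the text up to and including the opening paren), because nothing after the look-ahead requires exactly three characters and a closing paren.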

      Thanks!

      Joy



    • Alec Burgess
      Message 2 of 13 , May 9, 2012
        On 2012-05-09 21:55, mycroftj wrote:
        > Attempt:
        > ^!Find
        > "(?:(\d\d/\d\d/(?:19|20)\d\d)|((?:19|20)\d\d/\d\d/\d\d))\x20\((?!(MON|TUE|WED|THU|FRI|SAT|SUN))"
        > RSTI

        Test data:
        > 05/12/1995 (Fri) <-- This should not match
        > 05/13/1995 (Sat) <-- This should not match
        > 05/14/1995 (zzzday) <-- This should not match
        > 05/15/1995 (Mox) <-- ***This should match***
        > 05/16/1995 (Tue) <-- This should not match
        > 05/17/2005 (Tue) <-- This should not match
        Developed and tested in RegexBuddy and not (yet) in NoteTab.
        Added these three test lines to make sure the yyyy-first format was handled
        correctly:
        > 2005/05/16 (Mox)
        > 2005/05/16 (Mon)
        > 05/12/1995 (Mon)
        The free-format regex (it is always easier to see how the bits and pieces work
        together by enabling and disabling parts with a leading "#"):
        (?i)
        (?:\d{2}/\d{2}/\d{4}|\d{4}/\d{2}/\d{2})\x20\(
        (?!
        mon|tue|wed|thu|fri|sat|sun
        )
        .{3}
        \)

        Changed to the equivalent single-line regexp (hopefully this will not trigger
        Yahoo line wrapping):
        (?i)(?:\d{2}/\d{2}/\d{4}|\d{4}/\d{2}/\d{2})\x20\((?!mon|tue|wed|thu|fri|sat|sun).{3}\)


        Key thing I think you were missing is to do a negative look-ahead for
        day names BEFORE doing a MUST MATCH on any 3 characters followed by
        closing ")" parenthesis.
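
The single-line pattern can be checked against the nine test lines above with a short script. This is a sketch in Python, assuming its `re` module treats this pattern the same way as NoteTab's PCRE; `re.IGNORECASE` plays the role of `(?i)`.

```python
import re

# Negative look-ahead first, then exactly three characters and a closing paren.
PATTERN = re.compile(
    r"(?:\d{2}/\d{2}/\d{4}|\d{4}/\d{2}/\d{2})\x20"
    r"\((?!mon|tue|wed|thu|fri|sat|sun).{3}\)",
    re.IGNORECASE,
)

TEST_LINES = [
    "05/12/1995 (Fri)",
    "05/13/1995 (Sat)",
    "05/14/1995 (zzzday)",
    "05/15/1995 (Mox)",
    "05/16/1995 (Tue)",
    "05/17/2005 (Tue)",
    "2005/05/16 (Mox)",
    "2005/05/16 (Mon)",
    "05/12/1995 (Mon)",
]

matched = [line for line in TEST_LINES if PATTERN.search(line)]
print(matched)
```

Only the two `(Mox)` lines should be listed: `(zzzday)` fails because the fourth character is not a closing paren, and the day names are rejected by the look-ahead.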

        Later ... works in the NoteTab Find dialog, and the Replace dialog gives the
        expected count of 2.
        So here is the ^!Find statement
        ^!Find
        "(?i)(?:\d{2}/\d{2}/\d{4}|\d{4}/\d{2}/\d{2})\x20\((?!mon|tue|wed|thu|fri|sat|sun).{3}\)"
        RIS
        Note: AFAIK the T modifier has no effect when using R option.

        --
        Regards ... Alec (buralex@gmail & WinLiveMess - alec.m.burgess@skype)
      • flo.gehrke
        Message 3 of 13 , May 10, 2012
          --- In ntb-scripts@yahoogroups.com, Alec Burgess <buralex@...> wrote:
          > So here is the ^!Find statement
          > ^!Find
          > "(?i)(?:\d{2}/\d{2}/\d{4}|\d{4}/\d{2}/\d{2})\x20\((?!mon|tue|wed|thu|fri|sat|sun).{3}\)"
          > RIS
          > Note: AFAIK the T modifier has no effect when using R option.

          With both 'I' and '(?i)', the 'ignore case' option is applied twice in your pattern -- although, like 'T', it isn't needed here. The RegEx matches 'MON' or 'mon' as well.

          If there's no need to capture anything, you could make not only the date but the whole pattern non-capturing...

          (?:(?:\d{2}/\d{2}/\d{4}|\d{4}/\d{2}/\d{2})\x20\((?!mon|tue|wed|thu|fri|sat|sun).{3}\))

          or enclose the whole pattern in an Atomic Group...

          ^!Find "^(?>\d{2,4}/?){3}\x20\((?!Mon|Tue|Wed|Thu|Fri|Sat|Sun).{3}\)" RS

          Regards,
          Flo
        • flo.gehrke
          Message 4 of 13 , May 10, 2012
            --- In ntb-scripts@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
            >
            > or enclose the whole pattern in an Atomic Group...
            >
            > ^!Find "^(?>\d{2,4}/?){3}\x20\((?!Mon|Tue|Wed|Thu|Fri|Sat|Sun).{3}\)" RS

            Oops, sorry, I meant...

            ^(?>(\d{2,4}/?){3}\x20\((?!Mon|Tue|Wed|Thu|Fri|Sat|Sun).{3}\))

            of course ;-)

            Flo
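
One thing worth checking with the compact date part `(\d{2,4}/?){3}` is how much more it accepts than the explicit alternation. The sketch below is Python, not NoteTab; the atomic group is left out because Python's `re` only supports `(?>...)` from version 3.11, so results under NoteTab's PCRE engine with the atomic group in place may differ. The names `COMPACT`, `EXPLICIT` and the extra sample strings are assumptions made for the comparison.

```python
import re

DAYS = r"(?!Mon|Tue|Wed|Thu|Fri|Sat|Sun)"  # case-sensitive, as in the RS version

# Compact date part (atomic wrapper omitted, see the note above).
COMPACT = re.compile(r"^(?:\d{2,4}/?){3}\x20\(" + DAYS + r".{3}\)")
# Explicit mm/dd/yyyy or yyyy/mm/dd alternation from earlier in the thread.
EXPLICIT = re.compile(
    r"^(?:\d{2}/\d{2}/\d{4}|\d{4}/\d{2}/\d{2})\x20\(" + DAYS + r".{3}\)"
)

samples = [
    "05/15/1995 (Mox)",
    "2005/05/16 (Mox)",
    "051519 (Mox)",      # no slashes at all
    "05/16/1995 (Tue)",
    "05/16/1995 (TUE)",  # upper-case day name
]

# For each sample: (matched by COMPACT, matched by EXPLICIT)
results = {s: (bool(COMPACT.search(s)), bool(EXPLICIT.search(s))) for s in samples}
for s, flags in results.items():
    print(s, flags)
```

In this sketch the compact form also accepts `051519 (Mox)`, and without an ignore-case option both forms let `(TUE)` through, so the choice depends on how clean the input is.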
          • John Shotsky
            Message 5 of 13 , May 10, 2012
              I am not understanding something here – The criteria was:
              three characters that are *NOT* MON|TUE|WED|THU|FRI|SAT|SUN

              How is this avoiding those strings? I've wanted to match text that didn't contain a certain string on multiple
              occasions.

              Regards,
              John
              RecipeTools Web Site: http://recipetools.gotdns.com/

            • flo.gehrke
              Message 6 of 13 , May 10, 2012
                --- In ntb-scripts@yahoogroups.com, "John Shotsky" <jshotsky@...> wrote:
                >
                > I am not understanding something here – The criteria was:
                > three characters that are *NOT* MON|TUE|WED|THU|FRI|SAT|SUN
                >
                > How is this avoiding those strings? I've wanted to do this text
                > that didn't contain a certain string on multiple occasions.
                >

                John,

                The second part of that RegEx...

                \((?!Mon|Tue|Wed|Thu|Fri|Sat|Sun).{3}\)

                matches an opening and a closing literal bracket '(...)' embracing three digits '.{3}' that are NOT 'Mon', 'Tue' etc, as Joy demanded.

                The 3-digit-days are excluded with a Negative Lookahead. Since a Lookahead doesn't consume any character, any different 3-digit-string will match at the same position between the opening and the closing bracket. That's why, for example...

                'John' is matched with '(?!Mary)John'

                that is: Find 'John' at a position where you don't see 'Mary' when looking ahead.

                Regards,
                Flo
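
The "doesn't consume any character" point can be made visible by looking at what a match actually contains. A small Python sketch, assuming Python's `re` handles these look-aheads like PCRE does; the sample strings are made up for the illustration.

```python
import re

# The look-ahead is only a test at the current position; the match itself
# starts at "John" and contains nothing contributed by "(?!Mary)".
john = re.search(r"(?!Mary)John", "Mary and John")
print(john.group(0), john.span())

# A look-ahead that contradicts what follows can never succeed.
never = re.search(r"(?!John)John", "John")
print(never)

# Same idea for the bracket part: the test and ".{3}" apply at the same position.
tag = re.compile(r"\((?!Mon|Tue|Wed|Thu|Fri|Sat|Sun).{3}\)")
mox = tag.search("05/15/1995 (Mox)")
tue = tag.search("05/16/1995 (Tue)")
print(mox.group(0), tue)
```

The first search finds `John` at positions 9 to 13, the second finds nothing, and the bracket pattern returns `(Mox)` but no match for `(Tue)`.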
              • John Shotsky
                Message 7 of 13 , May 10, 2012
                  Flo,
                  Thank you. Always nice to learn something new. I will play around with this until I have it fully internalized. I have
                  needed this function quite a few times and have 'tokenized' and then used a character class instead. (And then
                  untokenized.) This is obviously a better way to do it.

                  Regards,
                  John
                  RecipeTools Web Site: http://recipetools.gotdns.com/

                • Art Kocsis
                  Message 8 of 13 , May 10, 2012
                    At 5/10/2012 07:46 AM, Flo wrote:
                    >The second part of that RegEx...
                    >
                    >\((?!Mon|Tue|Wed|Thu|Fri|Sat|Sun).{3}\)
                    >
                    >matches an opening and a closing literal bracket '(...)' embracing three
                    >digits '.{3}' that are NOT 'Mon', 'Tue' etc, as Joy demanded.

                    Don't you mean three CHARACTERS instead of three DIGITS? The "." matches
                    any character, "\d" is used to match any digit.

                    Just trying to keep things clear for future readers. ;)
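
For those future readers, the difference between the two can be shown in a few lines. A Python sketch with made-up sample strings; `fullmatch` is used so that the whole string has to fit the pattern.

```python
import re

any_three = re.compile(r".{3}")      # any three characters (except newline)
three_digits = re.compile(r"\d{3}")  # exactly three digits

# For each sample: (fits ".{3}", fits "\d{3}")
checks = {
    s: (bool(any_three.fullmatch(s)), bool(three_digits.fullmatch(s)))
    for s in ["Mox", "707", "a-1"]
}
for s, flags in checks.items():
    print(s, flags)
```

All three samples fit `.{3}`, but only `707` fits `\d{3}`.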

                    Namaste', Art
                  • flo.gehrke
                    Message 9 of 13 , May 10, 2012
                      --- In ntb-scripts@yahoogroups.com, Art Kocsis <artkns@...> wrote:
                      >
                      >> matches an opening and a closing literal bracket '(...)' embracing
                      >> three digits '.{3}'...
                      >
                      > Don't you mean three CHARACTERS instead of three DIGITS?

                      Art,

                      Thanks for correcting my bad English! Of course, '.{3}' means any character, not numbers (digits) only.

                      Please give me a helping hand: '2012' is a four-digit number, 'Peter' is a four-letter name -- correct? But what is 'Boeing-707'? A four-letter name, a four-digit string? :-(

                      Flo
                    • Art Kocsis
                      Message 10 of 13 , May 10, 2012
                        At 5/10/2012 03:05 PM, Flo wrote:
                        >--- In ntb-scripts@yahoogroups.com,
                        >Art Kocsis <artkns@...> wrote:
                        > >
                        > >> matches an opening and a closing literal bracket '(...)' embracing
                        > >> three digits '.{3}'...
                        > >
                        > > Don't you mean three CHARACTERS instead of three DIGITS?
                        >
                        >Art,
                        >
                        >Thanks for correcting my bad English! Of course, '.{3}' means any
                        >character, not numbers (digits) only.
                        >
                        >Please give me a helping hand: '2012' is a four-digit number,
                        Correct.

                        >'Peter' is a four-letter name -- correct?
                        No, it has five letters.

                        >But what is 'Boeing-707'? A four-letter name, a four-digit string? :-(
                        Expensive!!!


                        He he he.

                        Namaste', Art

                        Young at heart
                        Slightly older in other places
                        A wise ass all throughout!
                      • Computerhusky
                        Message 11 of 13 , May 10, 2012
                          Hi Flo,
                          I'd call 'Boeing-747' a 10-character string (or a large aeroplane).
                          And 'Peter' is a 5 letter name (or string) :-)
                          Kind regards
                          Thomas

                          Von iPad gesendet / sent from iPad

                        • mycroftj
                          Message 12 of 13 , May 12, 2012
                            --- In ntb-scripts@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
                            >
                            > --- In ntb-scripts@yahoogroups.com, Alec Burgess <buralex@> wrote:
                            > > So here is the ^!Find statement
                            > > ^!Find
                            > > "(?i)(?:\d{2}/\d{2}/\d{4}|\d{4}/\d{2}/\d{2})\x20\((?!mon|tue|wed|thu|fri|sat|sun).{3}\)"
                            > > RIS
                            > > Note: AFAIK the T modifier has no effect when using R option.
                            >
                            > With 'I' and '(?i)', the 'ignore case' option is applied even twice in your pattern -- although, like 'T', it isn't needed here. The RegEx matches 'MON' or 'mon' as well.
                            >
                            > If there's no need to capture anything, you could make not only the date but the whole pattern non-capturing...
                            >
                            > (?:(\d{2}/\d{2}/\d{4}|\d{4}/\d{2}/\d{2})\x20\((?!mon|tue|wed|thu|fri|sat|sun).{3}\))
                            >
                            > or enclose the whole pattern in an Atomic Group...
                            >
                            > ^!Find "^(?>\d{2,4}/?){3}\x20\((?!Mon|Tue|Wed|Thu|Fri|Sat|Sun).{3}\)" RS
                            >
                            > Regards,
                            > Flo
                            >


                            Flo,

                            Thank you so much for that. I was close! For some reason, I never realized a look-behind or look-ahead could come in the middle of a regexp. I don't recall ever seeing that in an example. But it can, and it is perfect, and I learned a very important thing.

                            Thank you always for your answers and remember how many people learn by seeing others discuss EVERYTHING here.

                            Regards

                            Joy