
Re: Compact digital range RexEx - And complement?

  • Allan Dystrup
    Message 1 of 20 , Dec 1, 2004
      Hi JE,

      Yes i can follow your line of reasoning here, step by step building up
      the "negated" Regex'es.

      It is workable (actually more 'work' than 'able' to my taste...), so
      what i was really looking for was a built-in Regex
      operator/metacharacter like [^ ] for char. classes or (?! ) for
      lookahead, a feature that would just "invert" any given Regex, so to
      speak.

      I haven't been able to uncover such a feature, though. Instead I've
      now opened the source of the parser and reversed the Perl matching
      operator (=~ to !~). This is a hack, and it doesn't solve the general
      problem of feeding the parser any RE (in a one-line textbox),
      including any complemented RE. I'm still chewing on that one.

      Thanks,
      Allan
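A minimal sketch of the operator-level negation described above. The parser source is not shown in this thread, so the sample codes and variable names here are hypothetical; only the pattern JA30[0-2] comes from the thread. Note that Perl's negated binding operator is `!~`; `!=` is numeric inequality.

```perl
use strict;
use warnings;

# Hypothetical sample data; the pattern is one alternative from the thread.
my $re    = qr/^JA30[0-2]/;
my @codes = qw(JA300 JA302 JA303 CX365);

# =~ keeps the codes that match the pattern ...
my @matching = grep { $_ =~ $re } @codes;

# ... and !~ keeps the codes that do not match it.
my @not_matching = grep { $_ !~ $re } @codes;

print "@matching\n";       # JA300 JA302
print "@not_matching\n";   # JA303 CX365
```

This only works where the matching code itself can be changed; a pattern typed into a textbox cannot carry the `!~` with it.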


      --- In perl-beginner@yahoogroups.com, "J.E. Cripps" <cycmn@n...>
      wrote:
      >
      >
      > On Wed, 1 Dec 2004, Allan Dystrup wrote:
      >
      > > I have to pass the "negated" RE as a new RE to the parsing program,
      > > so i'll have to come up with a "complement RE" for :
      > >
      > > (CX36(5|6))|(JA30[0-2])|(JA3(([2-8]\d)|(9[0-4])))|(JA5.*)|
      > > (JA6((0\d)|(1[0-3])))|(JA64[7-9])|(JA687.*)|(JA74[0-3])|(JB5.*)|
      > > (JY(((1|2)\d\d)|(3[0-3]\d)))|(JY[3-9][5-9]\d)|(JZ51(3|4)00.*)
      >
      > Hmmm... here's what I have so far:
      >
      > # Looking for selected matches on a five character string
      > # (What's the name of this string? What does it mean?)
      >
      > # 2 letters 3 digits e.g. JB523
      > # would be a good idea to show the original matches but
      > # I do not have the original post. IOW,
      >
      > # To match all strings except JB360-JB356 ... etc. etc.
      >
      > # using the x modifier to include comments and spacing
      > # see page 57 of the Camel books and perlretut
      >
      > ############# This is the start of the complement regexp #################
      >
      > / # The opening /
      >
      > (([AB]|[D-I]|[K-Z])\w)(\d{3}) | #match all but initial J, C
      > # then match the three digits
      > # the \w will match any alphanumeric,
      > # do you have to worry about anomalous data e.g. Kz301 L0500
      > #or maybe
      > # (^[CJ]\w{1})(\d{3})|
      >
      > # or maybe
      > # (^[CJ])(...)
      > # or (^[CJ])(.*)
      > # since you have a similar
      > # regexp in one of
      > # the J cases
      >
      > (C^[X])(\d{3})| # match the initial C other
      > # than those followed by X
      > # or (C^[X])(...)
      >
      > ((CX) ( ([0-2]|[4-9]) \d{1})) | # match CX followed by
      > # any digit except 3
      >
      > (CX3)([2345789]\d{1})| # match CX3 followed by
      > # any digit except 6
      >
      > (CX36([0-4][7-9]))| # match CX36 except when
      > # followed by 5 or 6
      >
      > (J^[ABYZ]))(\d{1})| # match J followed by
      > # letters other than ABYZ
      > # or (J^[ABYZ])(...)
      >
      > (JA)([012489](...)| # match JA followed by
      > # digits other than
      > # 3,4,6 or 7
      >
      > # this one is not the next in your original regexp
      > (JB^[5])(..)| # JB except if
      > # followed by a 5
      > # skipping the J cases that remain
      > # I do not have the original post and cannot reconstruct the
      > # target data from memory
      >
      > # the J cases for which I have not tried to do a complement:
      > #(JA30[0-2])|
      > #(JA3(([2-8]\d)|(9[0-4])))|
      > #(JA6((0\d)|(1[0-3])))|
      > #(JA64[7-9])|
      > #(JA687.*)|
      > #(JA74[0-3])|
      > #(JB5.*)|
      > #(JY(((1|2)\d\d)|(3[0-3]\d)))|
      > #(JY[3-9][5-9]\d)|
      > #(JZ51(3|4)00.*)
      > # (JZ51(3|4)00.*) # does this have more than three digits in it?
      > # # what are those 00s?
      >
      > # let's pretend all the complement matches are written
      > # and we'll close with the /x
      >
      > /x # The closing /x
      > ############ This is the end of the Complement Regexp #################
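A hand-built complement like the one quoted above is easy to get subtly wrong, so it helps to check it by brute force. The sketch below is not the poster's regexp: it is a hypothetical, narrowed-down variant that complements only the CX365-CX366 range, and only among codes of the form CX plus three digits. It uses the /x modifier for comments and compares the candidate against plain negation for every such code.

```perl
use strict;
use warnings;

# Candidate complement of CX36[56], restricted to "CX" + three digits.
my $complement = qr/
    ^CX
    (?: [0-24-9] \d \d    # first digit is not 3
      | 3 [0-57-9] \d     # first digit is 3, second digit is not 6
      | 36 [0-47-9]       # CX36 followed by any digit except 5 or 6
    )
    \z
/x;

# Every code must be in exactly one of: the range, or its complement.
my $mismatches = 0;
for my $n (0 .. 999) {
    my $code          = sprintf 'CX%03d', $n;
    my $in_range      = $code =~ /^CX36[56]\z/;
    my $in_complement = $code =~ $complement;
    $mismatches++ unless ($in_range xor $in_complement);
}

print "mismatches: $mismatches\n";   # mismatches: 0
```

The same loop can be widened to other prefixes, at the cost of a longer enumeration.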
    • J.E. Cripps
      Message 2 of 20 , Dec 1, 2004
        On Wed, 1 Dec 2004, Allan Dystrup wrote:

        > Yes i can follow your line of reasoning here, step by step building up
        > the "negated" Regex'es. ...
        > It is workable (actually more 'work' than 'able' to my taste...), so
        > what i was really looking for was a built-in Regex
        > operator/metacharacter like [^ ] for char. classes or (?! ) for
        > lookahead, a feature that would just "invert" any given Regex, so to
        > speak.

        Everything I've seen indicates that there isn't any such feature,
        and that complementing a regexp tends to be laborious, messy, or both.


        another error in a previous message of mine:

        > > (C^[X])(\d{3})| # match the initial C

        the ^ should be _inside_ the [ ], i.e.

        (C[^X])(\d{3})

        > > (J^[ABYZ]))(\d{1})| # match J followed by

        which should be

        (J[^ABYZ])(\d{1})
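The difference between the two spellings can be seen in a short sketch (the sample codes are hypothetical). Outside a character class, ^ is a start-of-string anchor, so a pattern with ^ after a literal C can never match; inside the brackets, ^ negates the class.

```perl
use strict;
use warnings;

my $wrong = qr/(C^[X])(\d{3})/;   # ^ is an anchor here: nothing can match
my $right = qr/(C[^X])(\d{3})/;   # ^ negates the class: C + any char but X

for my $code (qw(CA123 CX365)) {
    printf "%s wrong=%d right=%d\n",
        $code,
        ($code =~ $wrong ? 1 : 0),
        ($code =~ $right ? 1 : 0);
}
# CA123 wrong=0 right=1
# CX365 wrong=0 right=0
```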
      • Jonathan Paton
        Message 3 of 20 , Dec 1, 2004
          Dear Allan,

          I think you are looking for:

          "(?!pattern)"
          A zero-width negative look-ahead assertion. For example
          "/foo(?!bar)/" matches any occurrence of "foo" that isn't
          followed by "bar". Note however that look-ahead and look-
          behind are NOT the same thing. You cannot use this for
          look-behind.

          ...

          from perldoc perlre

          You might need to wrap the regex with ^ and $ assertions.

          Jonathan Paton
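A small sketch of both points above: the perlre example, and what anchoring changes. The sample strings and variable names are hypothetical. The second pattern uses \z as the end anchor (a stricter relative of $) so that only an exact, whole-string match of the inner pattern is rejected.

```perl
use strict;
use warnings;

# The perlre example: "foo" not followed by "bar".
my @hits = grep { /foo(?!bar)/ } qw(foobar foobaz food);
print "@hits\n";   # foobaz food

# Anchored complement: reject a string only if the inner pattern
# matches it from start to end.
my $inner      = qr/JA30[0-2]/;
my $complement = qr/^(?!(?:$inner)\z)/;

my @kept = grep { $_ =~ $complement } qw(JA300 JA3001 JA303);
print "@kept\n";   # JA3001 JA303
```

Without the end anchor inside the lookahead, JA3001 would be rejected too, because it starts with JA300. Which behaviour is wanted depends on whether the codes are matched as prefixes or as whole strings.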
        • Allan Dystrup
          Message 4 of 20 , Dec 1, 2004
            Hi Jonathan ,

            Yes indeed, I've reached the same conclusion.
            The (?!pattern) can solve the issue in as clean a way
            as is probably possible with regexes. E.g.:

            Range        RegEx                    Complement
            -----------  -----------------------  ----------------------------
            CX365-CX366  CX36(5|6)                ^(?!CX36(5|6))
            JA300-JA302  JA30[0-2]                ^(?!JA30[0-2])
            JA320-JA394  JA3(([2-8]\d)|(9[0-4]))  ^(?!JA3(([2-8]\d)|(9[0-4])))
            etc.

            Thanks a lot,
            Allan
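One caveat when combining the rows of the table: the complement of the whole alternation is a single lookahead around all the alternatives, not an alternation of the individual complements. A minimal sketch, using only the three ranges from the table (the sample codes and variable names are hypothetical):

```perl
use strict;
use warnings;

# The three range patterns from the table, with non-capturing groups.
my @ranges = (
    qr/CX36(?:5|6)/,
    qr/JA30[0-2]/,
    qr/JA3(?:[2-8]\d|9[0-4])/,
);

# One negative lookahead around the whole alternation.
my $any        = join '|', @ranges;
my $complement = qr/^(?!(?:$any))/;

my @codes   = qw(CX365 CX367 JA301 JA310 JA394 JA395);
my @outside = grep { $_ =~ $complement } @codes;
print "@outside\n";   # CX367 JA310 JA395

# OR-ing the individual complements is too permissive: CX365 passes
# because it is "not JA300-JA302", even though it is in CX365-CX366.
my $wrong_or = qr/^(?!CX36(?:5|6))|^(?!JA30[0-2])/;
print 'CX365' =~ $wrong_or ? "matched\n" : "no match\n";   # matched
```

Placing several lookaheads one after another, as in ^(?!A)(?!B), is an equivalent way to write the combined complement.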


            --- In perl-beginner@yahoogroups.com, Jonathan Paton <jepaton@g...>
            wrote:
            > Dear Allan,
            >
            > I think you are looking for:
            >
            > "(?!pattern)"
            > A zero-width negative look-ahead assertion. For example
            > "/foo(?!bar)/" matches any occurrence of "foo" that isn't
            > followed by "bar". Note however that look-ahead and look-
            > behind are NOT the same thing. You cannot use this for
            > look-behind.
            >
            > ...
            >
            > from perldoc perlre
            >
            > You might need to wrap the regex with ^ and $ assertions.
            >
            > Jonathan Paton