
Re: [PBML] Compact digital range RexEx - And complement?

  • J.E. Cripps
    Message 1 of 20 , Dec 1, 2004
      errata:

      #((CX) ( ([0-2]|[4-9]) \d{1})) | # match CX followed by
      # any digit except 3
      #should be
      ((CX) (([0-2]|[4-9]) \d{2})) | # or end with (..)


      #(J^[ABYZ]))(\d{1})| # match J followed by
      # letters other than ABYZ
      #should be
      (J^[ABYZ])(\d{3})|
    • Allan Dystrup
      Message 2 of 20 , Dec 1, 2004
        Hi JE,

        Yes i can follow your line of reasoning here, step by step building up
        the "negated" Regex'es.

        It is workable (actually more 'work' than 'able' to my taste...), so
        what i was really looking for was a built-in Regex
        operator/metacharacter like [^ ] for char. classes or (?! ) for
        lookahead, a feature that would just "invert" any given Regex, so to
        speak.

        I haven't been able to uncover such a feature, though. Instead I've
        now opened the source of the parser and reversed the Perl matching
        operator (=~ to !~). This is a hack, and doesn't solve the general
        problem of feeding the parser any RE (in a one-line textbox),
        including any complemented RE. I'm still chewing on that one.
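
A minimal sketch of the operator swap described above, assuming the parser simply tests each code against one compiled pattern. The variable names and the sample codes are made up for illustration; only the two ranges come from this thread.

```perl
use strict;
use warnings;

# Two of the ranges from the thread, anchored at the start of the code.
my $range = qr/^(?:CX36(?:5|6)|JA30[0-2])/;

# Hypothetical sample data.
my @codes = qw(CX365 CX364 JA301 JB100);

# !~ is the negated binding operator: true when the pattern does NOT match.
my @outside = grep { $_ !~ $range } @codes;

print "@outside\n";    # CX364 JB100
```

This inverts the test in the calling code, so it does not help when the only thing you can change is the pattern itself.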

        Thanks,
        Allan


        --- In perl-beginner@yahoogroups.com, "J.E. Cripps" <cycmn@n...>
        wrote:
        >
        >
        > On Wed, 1 Dec 2004, Allan Dystrup wrote:
        >
        > > I have to pass the "negated" RE as a new RE to the parsing program,
        > > so i'll have to come up with a "complement RE" for :
        > >
        > > (CX36(5|6))|(JA30[0-2])|(JA3(([2-8]\d)|(9[0-4])))|(JA5.*)|
        > > (JA6((0\d)|(1[0-3])))|(JA64[7-9])|(JA687.*)|(JA74[0-3])|(JB5.*)|
        > > (JY(((1|2)\d\d)|(3[0-3]\d))))|(JY[3-9][5-9]\d)|(JZ51(3|4)00.*)
        >
        > Hmmm... here's what I have so far:
        >
        > # Looking for selected matches on a five character string
        > # (What's the name of this string? What does it mean?)
        >
        > # 2 letters 3 digits e.g. JB523
        > # would be a good idea to show the original matches but
        > # I do not have the original post. IOW,
        >
        > # To match all strings except JB360-JB356 ... etc. etc.
        >
        > # using the x modifier to include comments and spacing
        > # see page 57 of the Camel books and perlretut
        >
        > ############# This is the start of the complement regexp #################
        >
        > / # The opening /
        >
        > (([AB]|[D-I]|[K-Z])\w)(\d{3}) | #match all but initial J, C
        > # then match the three digits
        > # the \w will match any alphanumeric,
        > # do you have to worry about anomalous data e.g. Kz301 L0500
        > #or maybe
        > # (^[CJ]\w{1})(\d{3})|
        >
        > # or maybe
        > # (^[CJ])(...)
        > # or (^[CJ])(.*)
        > # since you have a similar
        > # regexp in one of
        > # the J cases
        >
        > (C^[X])(\d{3})| # match the initial C other
        > # than those followed by X
        > # or (C^[X])(...)
        >
        > ((CX) ( ([0-2]|[4-9]) \d{1})) | # match CX followed by
        > # any digit except 3
        >
        > (CX3)([2345789]\d{1})| # match CX3 followed by
        > # any digit except 6
        >
        > (CX36([0-4][7-9]))| # match CX36 except when
        > # followed by 5 or 6
        >
        > (J^[ABYZ]))(\d{1})| # match J followed by
        > # letters other than ABYZ
        > # or (J^[ABYZ])(...)
        >
        > (JA)([012489](...)| # match JA followed by
        > # digits other than
        > # 3,4,6 or 7
        >
        > # this one is not the next in your original regexp
        > (JB^[5])(..)| # JB except if
        > # followed by a 5
        > # skipping the J cases that remain
        > # I do not have the original post and cannot reconstruct the
        > # target data from memory
        >
        > # the J cases for which I have not tried to do a complement:
        > #(JA30[0-2])|
        > #(JA3(([2-8]\d)|(9[0-4])))|
        > #(JA6((0\d)|(1[0-3])))|
        > #(JA64[7-9])|
        > #(JA687.*)|
        > #(JA74[0-3])|
        > #(JB5.*)|
        > #(JY(((1|2)\d\d)|(3[0-3]\d))))|
        > #(JY[3-9][5-9]\d)|
        > #(JZ51(3|4)00.*)
        > # (JZ51(3|4)00.*) # does this have more than three digits in it?
        > # # what are those 00s?
        >
        > # let's pretend all the complement matches are written
        > # and we'll close with the /x
        >
        > /x # The closing /x
        > ############ This is the end of the Complement Regexp #################
      • J.E. Cripps
        Message 3 of 20 , Dec 1, 2004
          On Wed, 1 Dec 2004, Allan Dystrup wrote:

          > Yes i can follow your line of reasoning here, step by step building up
          > the "negated" Regex'es. ...
          > It is workable (actually more 'work' than 'able' to my taste...), so
          > what i was really looking for was a built-in Regex
          > operator/metacharacter like [^ ] for char. classes or (?! ) for
          > lookahead, a feature that would just "invert" any given Regex, so to
          > speak.

          Everything I've seen indicates that there isn't any such feature,
          and that complementing a regexp tends to be laborious, messy, or both.


          another error in a previous message of mine:

          > > (C^[X])(\d{3})| # match the initial C

          the ^ should be _inside_ the [ ], i.e.

          (C[^X])(\d{3})

          > > (J^[ABYZ]))(\d{1})| # match J followed by

          which should be

          (J[^ABYZ])(\d{1})
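
A small sketch of why the position of the ^ matters. The anchors and the sample codes are assumptions added for illustration; `\d{3}` follows the "2 letters 3 digits" format described earlier in the thread.

```perl
use strict;
use warnings;

# Outside a character class, ^ is a start-of-string anchor, so it can
# never succeed once a literal C has already been consumed.
my $misplaced = qr/^(C^[X])(\d{3})$/;

# Inside the brackets, a leading ^ negates the class: any character but X.
my $negated = qr/^(C[^X])(\d{3})$/;

# Hypothetical sample codes.
for my $code (qw(CA123 CX123)) {
    printf "%-5s  misplaced: %-3s  negated: %s\n", $code,
        ($code =~ $misplaced ? 'yes' : 'no'),
        ($code =~ $negated   ? 'yes' : 'no');
}
```

The misplaced form rejects both codes. The negated form accepts CA123 and rejects CX123.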
        • Jonathan Paton
          Message 4 of 20 , Dec 1, 2004
            Dear Allan,

            I think you are looking for:

            "(?!pattern)"
            A zero-width negative look-ahead assertion. For example
            "/foo(?!bar)/" matches any occurrence of "foo" that isn't
            followed by "bar". Note however that look-ahead and look-
            behind are NOT the same thing. You cannot use this for
            look-behind.

            ...

            from perldoc perlre

            You might need to wrap the regex with ^ and $ assertions.

            Jonathan Paton
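
A short sketch of the look-ahead described above, and of why the ^ anchor can matter. The helper `verdict` and the test strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether $text matches $re.
sub verdict {
    my ($text, $re) = @_;
    return $text =~ $re ? 'match' : 'no match';
}

# "foo" not followed by "bar".
print 'foobar vs /foo(?!bar)/: ', verdict('foobar', qr/foo(?!bar)/), "\n";
print 'foobaz vs /foo(?!bar)/: ', verdict('foobaz', qr/foo(?!bar)/), "\n";

# Without ^, the assertion may succeed at a later position in the string
# (here at offset 1, where the remaining text is "oo").
print 'foo vs /(?!foo)/:  ', verdict('foo', qr/(?!foo)/),  "\n";

# With ^, the assertion is only tried at the start of the string.
print 'foo vs /^(?!foo)/: ', verdict('foo', qr/^(?!foo)/), "\n";
```

The first and last tests report no match. The second and third report a match, and the third is the case that anchoring guards against.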
          • Allan Dystrup
            Message 5 of 20 , Dec 1, 2004
              Hi Jonathan ,

              Yes indeed, I've reached the same conclusion.
              The (?!pattern) construct solves the issue about as cleanly
              as is probably possible with regexes. E.g.:

              Range        RegEx                    Complement
              -----------  -----------------------  -----------------------------
              CX365-CX366  CX36(5|6)                ^(?!CX36(5|6))
              JA300-JA302  JA30[0-2]                ^(?!JA30[0-2])
              JA320-JA394  JA3(([2-8]\d)|(9[0-4]))  ^(?!JA3(([2-8]\d)|(9[0-4])))
              etc.
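
One way the per-range complements in the table might be combined into a single pattern is sketched below, assuming a code should be rejected if it falls in any of the ranges. The look-aheads are chained rather than joined with |, because a code must be outside every range, not just one of them. The sample codes are made up for illustration.

```perl
use strict;
use warnings;

# The three range patterns from the table.
my @ranges = ('CX36(5|6)', 'JA30[0-2]', 'JA3(([2-8]\d)|(9[0-4]))');

# Chain one negative look-ahead per range after a single ^ anchor.
my $lookaheads = join '', map { "(?!$_)" } @ranges;
my $complement = qr/^$lookaheads/;

# Hypothetical sample codes.
my @codes = qw(CX365 CX367 JA302 JA303 JA320 JA394 JA395);
my @kept  = grep { $_ =~ $complement } @codes;

print "@kept\n";    # CX367 JA303 JA395
```

Because the result is an ordinary pattern, it could be pasted into a one-line textbox without changing the matching operator in the parser.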

              Thanks a lot,
              Allan


              --- In perl-beginner@yahoogroups.com, Jonathan Paton <jepaton@g...>
              wrote:
              > Dear Allan,
              >
              > I think you are looking for:
              >
              > "(?!pattern)"
              > A zero-width negative look-ahead assertion. For example
              > "/foo(?!bar)/" matches any occurrence of "foo" that isn't
              > followed by "bar". Note however that look-ahead and look-
              > behind are NOT the same thing. You cannot use this for
              > look-behind.
              >
              > ...
              >
              > from perldoc perlre
              >
              > You might need to wrap the regex with ^ and $ assertions.
              >
              > Jonathan Paton