
21679 Re: [NTS] Trying to perfect RegExp to match various numbers

  • flo.gehrke
    Mar 31, 2011
      --- In ntb-clips@yahoogroups.com, Alec Burgess <buralex@...> wrote:
      >
      > Following will enforce 5 digits (zero-padded) before optional
      > decimal and 4 after
      >
      > H=test B3-30 leading / trailing zeros
      > ; Alec Burgess 2011-03-30
      > ; currently enforces 5 digits before (optional) decimal and 4 after
      > ^!replace "\b(\d+)\.?(\d*)\b" >> "00000$1.$2===0000" rwais
      > ^!replace "===" >> "" rwais
      > ^!replace "\b0*(\d{5})\.(\d{4})0*\b" >> "$1.$2" rwais
      >
      > Note - I wanted the first replace to be just "00000$1.$20000" but
      > I haven't figured out how to prevent clip replace from confusing
      > $2 with $20 (i.e. a non-existent 20th or 20000th sub-pattern).
      >
      > Does anyone know how to do this? As is just make sure "===" is any
      > string which does not exist in the input.

      Alec,

      You may want to try this...

      ^!replace "\b(\d+)\.?(\d*)" >> "00000$1.$2\0000" rwais
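
The same idea can be sketched outside NoteTab. This is only an illustration in Python, whose `re` module stands in for the clip regex engine; the function names `pad` and `normalize` are made up for this example, and `\g<2>` is Python's way of ending a group reference, not NoteTab syntax. It is used here so that the zeros following the second subpattern are read as literal text.

```python
import re

# Same pattern as the clip line above: no trailing \b.
NUMBER = re.compile(r"\b(\d+)\.?(\d*)")
# Alec's third replace: keep 5 digits before and 4 after the decimal point.
TRIM = re.compile(r"\b0*(\d{5})\.(\d{4})0*\b")


def pad(text: str) -> str:
    """Prepend five zeros and append four zeros to every number."""
    # \g<2> closes the reference explicitly, so "0000" stays literal.
    return NUMBER.sub(r"00000\g<1>.\g<2>0000", text)


def normalize(text: str) -> str:
    """Pad every number, then trim the surplus zeros on both sides."""
    return TRIM.sub(r"\g<1>.\g<2>", pad(text))


if __name__ == "__main__":
    print(pad("123."))        # 00000123.0000
    print(normalize("12.5"))  # 00012.5000
    print(normalize("7"))     # 00007.0000
```

With the explicit group reference, the temporary "===" marker and the second replace are not needed in this sketch.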

      > Note - adding line ^!replace "\.0*\b" >> "" rwais to above will
      > eliminate decimal padding after unnecessary decimal point.

      You may have wondered why '123.' is replaced with '00000123.0000.'. The cause is the second word boundary \b in your original line:

      ^!replace "\b(\d+)\.?(\d*)\b" >> "00000$1.$2\0000" rwais

      I think this is what happens:

      In short, the RegEx first matches the first subpattern and the dot, but finds no word boundary after the dot, because the dot is a non-word character and no word character follows it. Since the dot is optional, the engine backtracks to the position just after '3', gives up the dot, and does find a word boundary there, between '3' and the dot. So the match is only '123', and $2 is empty. That '123' is replaced with '00000', '123', a new dot, an empty string, and '0000', while the original dot is never matched and ends up after the inserted text. So the result is '00000123.0000.'
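
The backtracking can be checked with a small experiment. This is a sketch in Python rather than a NoteTab clip, on the assumption that both engines treat `\b` and optional items the same way for this pattern; the variable names are invented for the example.

```python
import re

# Alec's original pattern, with the trailing word boundary.
with_boundary = re.compile(r"\b(\d+)\.?(\d*)\b")

match = with_boundary.search("123.")
# The overall match stops before the dot, and the second group is empty.
matched_text = match.group(0)
groups = match.groups()

# The unmatched dot is left behind, after the inserted text.
replaced = with_boundary.sub(r"00000\g<1>.\g<2>0000", "123.")

if __name__ == "__main__":
    print(matched_text)  # 123
    print(groups)        # ('123', '')
    print(replaced)      # 00000123.0000.
```

When digits follow the dot, as in '123.45', the trailing boundary is satisfied after the last digit and the whole number is matched, so the stray dot only shows up for input ending in a bare decimal point.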

      That \b can probably be omitted here, in which case the additional command line won't be necessary.

      Regards,
      Flo