Re: replace tokens on shorter line too

  • flo.gehrke
    Message 1 of 8, Nov 19, 2012
      --- In ntb-clips@yahoogroups.com, "beshtin" <beshtin@...> wrote:
      >
      > I have hundreds of lines of text such as the following:
      >
      > 5 (5) PARSONS, MARY AST 40045 19 525 320
      > 7 (9) MORRIS, ALICE FRG 2400 25
      >
      > The following clip converts the first line and all other lines with the same number of blocks of characters to a row of six cells in an HTML table:
      >
      > ^!Jump Doc_Start
      > ;long line follows
      > ^!Replace "^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) (.*)">>"<tr><td>$1</td><td>$2</td><td>$3</td><td>$4</td><td align=right>$5</td><td align=right>$6</td></tr>" RAWS
      > ;long line prior
      > ^!Jump Doc_End
      >
      > But, it simply ignores shorter lines like the second line of text
      > above...

      Your search string defines seven capturing groups, whereas your replacement uses only six backreferences.

      Maybe you could just omit the seventh capturing group and write:

      ^!Replace "(?x)^(\d+)\x20 (\(\d+\))\x20 ([A-Z,\x20]+)\x20 ([A-Z]{3})\x20 (\d+)\x20 (\d+).+$" >> "<tr><td>$1</td><td>$2</td><td>$3</td><td>$4</td><td align=right>$5</td><td align=right>$6</td></tr>" WARS

      I've used \x20 to make the spaces more visible, and Extended Mode (?x) to separate the subpatterns a bit.

      For me, the result is:

      <tr><td>5</td><td>(5)</td><td>PARSONS, MARY</td><td>AST</td><td align=right>40045</td><td align=right>19</td></tr>
      <tr><td>7</td><td>(9)</td><td>MORRIS, ALICE</td><td>FRG</td><td align=right>2400</td><td align=right>2</td></tr>

      Regards,
      Flo
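In the result above, the last cell of the second row reads 2 rather than 25. The pattern ends in `.+$`, which requires at least one character after the sixth group, so on the shorter line `(\d+)` gives back the 5. The following is a minimal sketch in Python rather than NoteTab, assuming Python's `re` module treats this pattern the same way as NoteTab's regex engine; the names `TEXT`, `ROW`, `HEAD` and `convert` are illustrative only.

```python
import re

# Sample lines from the original question.
TEXT = (
    "5 (5) PARSONS, MARY AST 40045 19 525 320\n"
    "7 (9) MORRIS, ALICE FRG 2400 25"
)

# Python writes backreferences as \1 where the clip uses $1.
ROW = (
    r"<tr><td>\1</td><td>\2</td><td>\3</td><td>\4</td>"
    r"<td align=right>\5</td><td align=right>\6</td></tr>"
)

# Flo's pattern up to the sixth group; only the tail differs below.
HEAD = r"(?x)^(\d+)\x20 (\(\d+\))\x20 ([A-Z,\x20]+)\x20 ([A-Z]{3})\x20 (\d+)\x20 (\d+)"


def convert(tail: str) -> str:
    """Apply HEAD + tail to every line of TEXT and return the HTML rows."""
    return re.sub(HEAD + tail, ROW, TEXT, flags=re.MULTILINE)


with_plus = convert(r".+$")  # tail as posted: needs one or more trailing characters
with_star = convert(r".*$")  # tail that may also match nothing

print(with_plus.splitlines()[1])  # last cell is 2
print(with_star.splitlines()[1])  # last cell is 25
```

If the same holds in NoteTab, ending the search string with `.*$` instead of `.+$` should keep the full 25 on the shorter line.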
    • beshtin
      Message 2 of 8, Nov 19, 2012
        Thank you, Don. So simple, and it works perfectly!

        --- In ntb-clips@yahoogroups.com, Don <don@...> wrote:
        >
        > On 11/19/2012 4:08 PM, beshtin wrote:
        > > 5 (5) PARSONS, MARY AST 40045 19 525 320
        > > 7 (9) MORRIS, ALICE FRG 2400 25
        >
        > ^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) ?(.*)
        >
        > Make the space optional by adding the ? character after it.
        >
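Don's one-character change can be checked outside NoteTab. This is a rough sketch in Python, assuming its `re` module behaves like NoteTab's engine for these two patterns; `ORIGINAL`, `OPTIONAL` and `LINES` are names made up for the illustration.

```python
import re

# The two sample lines from the question.
LINES = [
    "5 (5) PARSONS, MARY AST 40045 19 525 320",
    "7 (9) MORRIS, ALICE FRG 2400 25",
]

# Original search string: a space is required before the last group.
ORIGINAL = re.compile(r"^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) (.*)")

# Don's version: the ? makes that final space optional.
OPTIONAL = re.compile(r"^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) ?(.*)")

for line in LINES:
    old = ORIGINAL.match(line)
    new = OPTIONAL.match(line)
    print(line)
    print("  original matches:", old is not None)
    print("  optional matches:", new is not None, "| group 6:", new.group(6))
```

The shorter line has nothing after its sixth number, so the required space in the original pattern has nothing to match and the whole line is skipped. With the space made optional, the seventh group simply comes back empty.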
      • beshtin
        Message 3 of 8, Nov 19, 2012
          Thank you for the extra tips, Flo!

          --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
          >
          > --- In ntb-clips@yahoogroups.com, "beshtin" <beshtin@> wrote:
          > >
          > > I have hundreds of lines of text such as the following:
          > >
          > > 5 (5) PARSONS, MARY AST 40045 19 525 320
          > > 7 (9) MORRIS, ALICE FRG 2400 25
          > >
          > > The following clip converts the first line and all other lines with the same number of blocks of characters to a row of six cells in an HTML table:
          > >
          > > ^!Jump Doc_Start
          > > ;long line follows
          > > ^!Replace "^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) (.*)">>"<tr><td>$1</td><td>$2</td><td>$3</td><td>$4</td><td align=right>$5</td><td align=right>$6</td></tr>" RAWS
          > > ;long line prior
          > > ^!Jump Doc_End
          > >
          > > But, it simply ignores shorter lines like the second line of text
          > > above...
          >
          > Your search string defines seven capturing groups whereas your replace uses six back references only.
          >
          > Maybe you could just omit the seventh capturing group and write:
          >
          > ^!Replace "(?x)^(\d+)\x20 (\(\d+\))\x20 ([A-Z,\x20]+)\x20 ([A-Z]{3})\x20 (\d+)\x20 (\d+).+$" >> "<tr><td>$1</td><td>$2</td><td>$3</td><td>$4</td><td align=right>$5</td><td align=right>$6</td></tr>" WARS
          >
          > I've used \x20 to make spaces more visible; also used Extended Mode to separate the subpatterns a bit.
          >
          > For me, the result is:
          >
          > <tr><td>5</td><td>(5)</td><td>PARSONS, MARY</td><td>AST</td><td align=right>40045</td><td align=right>19</td></tr>
          > <tr><td>7</td><td>(9)</td><td>MORRIS, ALICE</td><td>FRG</td><td align=right>2400</td><td align=right>2</td></tr>
          >
          > Regards,
          > Flo
          >
        • beshtin
          Message 4 of 8, Nov 19, 2012
            Thank you, Axel.

            --- In ntb-clips@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
            >
            > beshtin wrote:
            >
            > > ^!Replace "^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) (.*)">>
            >
            > > I'm stumped. Any suggestions?
            >
            > It's the space in front of the last pair of parentheses.
            >
            > Axel
            >
          • Don
            Message 5 of 8, Nov 19, 2012
              >> I've used \x20 to make spaces more visible; also used Extended Mode to separate the subpatterns a bit.

              How did you learn all of this stuff ... geez! Extended mode is a new
              one to me. All proud to understand \K and now more ...

              Triggered with the (?x):
              In free-spacing mode, whitespace between regular expression tokens is
              ignored. Whitespace includes spaces, tabs and line breaks. Note that
              only whitespace between tokens is ignored. E.g. a b c is the same as abc
              in free-spacing mode, but \ d and \d are not the same. The former
              matches a space followed by d, while the latter matches a digit. \d is a single regex token
              composed of a backslash and a "d". Breaking up the token with a space
              gives you an escaped space (which matches a space), and a literal "d".

              From the RegexBuddy help.
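The free-spacing rules quoted above can be tried out in a few lines. This is a small sketch using Python's `re` module, which also supports the `(?x)` switch; it is an assumption that NoteTab's engine behaves the same way for these cases.

```python
import re

# Whitespace between tokens is ignored: "a b c" behaves like "abc".
spaced = re.fullmatch(r"(?x)a b c", "abc")

# \d is one token and matches a digit.
digit = re.fullmatch(r"(?x)\d", "7")

# "\ d" is two tokens: an escaped space, then a literal d.
escaped_space = re.fullmatch(r"(?x)\ d", " d")
not_a_digit = re.fullmatch(r"(?x)\ d", "7")

print(spaced is not None)         # True
print(digit is not None)          # True
print(escaped_space is not None)  # True
print(not_a_digit is not None)    # False
```

This is also why Flo's pattern writes each space as \x20: a plain space typed into the pattern would be ignored once (?x) is on.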