
replace tokens on shorter line too

  • beshtin
    Message 1 of 8, Nov 19, 2012
      I have hundreds of lines of text such as the following:

      5 (5) PARSONS, MARY AST 40045 19 525 320
      7 (9) MORRIS, ALICE FRG 2400 25

      The following clip converts the first line and all other lines with the same number of blocks of characters to a row of six cells in an HTML table:

      ^!Jump Doc_Start
      ;long line follows
      ^!Replace "^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) (.*)">>"<tr><td>$1</td><td>$2</td><td>$3</td><td>$4</td><td align=right>$5</td><td align=right>$6</td></tr>" RAWS
      ;long line prior
      ^!Jump Doc_End

      But it simply ignores shorter lines like the second line of text above. Getting the clip to process the shorter line like the longer one seems as if it should require only a simple modification, but after a few hours of trying to figure it out, I'm stumped. Any suggestions?
    • Axel Berger
      Message 2 of 8, Nov 19, 2012
        beshtin wrote:

        > ^!Replace "^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) (.*)">>

        > I'm stumped. Any suggestions?

        It's the space in front of the last pair of parentheses: the pattern requires a space after the sixth field, and the shorter line ends right there.

        Axel
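To see the effect of that trailing space outside NoteTab, here is a minimal sketch in Python. It assumes Python's re module treats this particular pattern the same way NoteTab's regex engine does; the variable names are made up for the illustration.

```python
import re

# The original search pattern from the clip, unchanged.
# Note the literal space in front of the final (.*) group.
ORIGINAL = r"^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) (.*)"

long_line = "5 (5) PARSONS, MARY AST 40045 19 525 320"
short_line = "7 (9) MORRIS, ALICE FRG 2400 25"

# The long line has text after the sixth field, so the space is found.
print(re.match(ORIGINAL, long_line) is not None)

# The short line ends after the sixth field, so the required space is missing.
print(re.match(ORIGINAL, short_line) is not None)
```

The first check prints True and the second prints False, which is the behaviour described in the question.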
      • Don
        Message 3 of 8, Nov 19, 2012
          On 11/19/2012 4:08 PM, beshtin wrote:
          > 5 (5) PARSONS, MARY AST 40045 19 525 320
          > 7 (9) MORRIS, ALICE FRG 2400 25

          ^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) ?(.*)

          Make the space optional by adding the ? character after it.
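A rough Python sketch of the same fix, for anyone who wants to try it outside NoteTab. The assumptions: Python's re module stands in for NoteTab's engine, \1 to \6 replace the clip's $1 to $6, and to_rows is a hypothetical helper name.

```python
import re

# Same pattern as above, with " ?" making the last space optional.
FIXED = r"^(\d+) ([(]\d+[)]) (.*?) ([A-Z]{3}) (\d+) (\d+) ?(.*)"

# Python writes back references as \1..\6 where the clip uses $1..$6.
ROW = (r"<tr><td>\1</td><td>\2</td><td>\3</td><td>\4</td>"
       r"<td align=right>\5</td><td align=right>\6</td></tr>")

def to_rows(text):
    # MULTILINE lets ^ match at the start of every line of the text.
    return re.sub(FIXED, ROW, text, flags=re.MULTILINE)

sample = ("5 (5) PARSONS, MARY AST 40045 19 525 320\n"
          "7 (9) MORRIS, ALICE FRG 2400 25")
print(to_rows(sample))
```

Both lines come out as table rows; on the shorter line the optional space and the seventh group simply match nothing.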
        • flo.gehrke
          Message 4 of 8, Nov 19, 2012

            Your search string defines seven capturing groups whereas your replace uses six back references only.

            Maybe you could just omit the seventh capturing group and write:

            ^!Replace "(?x)^(\d+)\x20 (\(\d+\))\x20 ([A-Z,\x20]+)\x20 ([A-Z]{3})\x20 (\d+)\x20 (\d+).*$" >> "<tr><td>$1</td><td>$2</td><td>$3</td><td>$4</td><td align=right>$5</td><td align=right>$6</td></tr>" WARS

            I've used \x20 to make spaces more visible; also used Extended Mode to separate the subpatterns a bit.

            For me, the result is:

            <tr><td>5</td><td>(5)</td><td>PARSONS, MARY</td><td>AST</td><td align=right>40045</td><td align=right>19</td></tr>
            <tr><td>7</td><td>(9)</td><td>MORRIS, ALICE</td><td>FRG</td><td align=right>2400</td><td align=right>25</td></tr>
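For comparison, the six-group idea can be laid out one subpattern per line. This is only a sketch in Python, assuming its re module handles (?x) and \x20 like NoteTab's engine for this pattern; the group descriptions in the comments and the to_row helper are my own labels. The sketch ends with .*$, which lets the sixth group keep every digit even when nothing follows it on the line.

```python
import re

# (?x) switches on extended mode, so the layout whitespace and the
# comments below are not part of the pattern; real spaces are \x20.
PATTERN = re.compile(r"""(?x)
    ^(\d+)\x20              # group 1: leading number
    (\(\d+\))\x20           # group 2: number in parentheses
    ([A-Z,\x20]+)\x20       # group 3: name (capitals, commas, spaces)
    ([A-Z]{3})\x20          # group 4: three-letter code
    (\d+)\x20               # group 5: first figure
    (\d+)                   # group 6: second figure
    .*$                     # rest of the line, matched but not captured
""")

# Python writes back references as \1..\6 where the clip uses $1..$6.
ROW = (r"<tr><td>\1</td><td>\2</td><td>\3</td><td>\4</td>"
       r"<td align=right>\5</td><td align=right>\6</td></tr>")

def to_row(line):
    # Hypothetical helper: convert one input line to one table row.
    return PATTERN.sub(ROW, line)

print(PATTERN.groups)  # number of capturing groups in the pattern
print(to_row("7 (9) MORRIS, ALICE FRG 2400 25"))
```

With six capturing groups and six back references, nothing is captured that the replacement does not use.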

            Regards,
            Flo
          • beshtin
            Message 5 of 8, Nov 19, 2012
              Thank you, Don. So simple, and it works perfectly!

            • beshtin
              Message 6 of 8, Nov 19, 2012
                Thank you for the extra tips, Flo!

              • beshtin
                Message 7 of 8, Nov 19, 2012
                  Thank you, Axel.

                • Don
                  Message 8 of 8, Nov 19, 2012
                    >> I've used \x20 to make spaces more visible; also used Extended Mode to separate the subpatterns a bit.

                    How did you learn all of this stuff ... geez! Extended mode is a new
                    one to me. I was so proud to understand \K, and now there's more ...

                    It's triggered with (?x):
                    In free-spacing mode, whitespace between regular expression tokens is
                    ignored. Whitespace includes spaces, tabs and line breaks. Note that
                    only whitespace between tokens is ignored. E.g. a b c is the same as abc
                    in free-spacing mode, but \ d and \d are not the same. The former
                    matches d, while the latter matches a digit. \d is a single regex token
                    composed of a backslash and a "d". Breaking up the token with a space
                    gives you an escaped space (which matches a space), and a literal "d".

                    From the RegexBuddy help.
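The quoted rule is easy to check with a small sketch. This one uses Python, on the assumption that its re.VERBOSE flag follows the same free-spacing rules for these two cases as the (?x) mode described above; the variable names are just for illustration.

```python
import re

# Whitespace between tokens is ignored: "a b c" behaves like "abc".
spaced = re.compile(r"a b c", re.VERBOSE)
print(spaced.fullmatch("abc") is not None)    # True
print(spaced.fullmatch("a b c") is not None)  # False

# \d is one token and matches a digit.
digit = re.compile(r"\d", re.VERBOSE)
print(digit.fullmatch("7") is not None)       # True

# "\ d" is two tokens: an escaped space followed by a literal "d".
broken = re.compile(r"\ d", re.VERBOSE)
print(broken.fullmatch(" d") is not None)     # True
print(broken.fullmatch("7") is not None)      # False
```

So a stray space after the backslash does not raise an error; it quietly turns the pattern into something that no longer matches digits.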