Re: [Clip] truncate found capture in regex?

  • Don
    Message 1 of 9 , Mar 12, 2012
      Was toying with something like this:
      ^$GetDocReplaceAll("^(\d+ +)([A-Z a-z\-]) +";"$1 ^$StrCopyLeft("$2";24)$ ")$
      Doesn't seem to be working ...

      On 3/13/2012 12:59 AM, Don wrote:
      > I want to find this:
      > "4 Lansingburgh High School "
      >
      > I want to then replace the "4 "
      > I want to truncate the "Lansingburgh High School" if it is longer than
      > 26 characters
      > I want to replace the " "
      >
      > I can find with this:
      > ^!Find "^(\d+ +)(.*?) +" CISR
      >
      > Is $2 an actual variable I can work with and truncate?
      >
      > There are numerous of these in the document with various school names,
      > some of which are longer than 26 total characters.
    • Alec Burgess
      Message 2 of 9 , Mar 13, 2012
        On 2012-03-13 00:59, Don wrote:
        > I want to find this:
        > "4 Lansingburgh High School "
        >
        > I want to then replace the "4 "
        You don't say what you want to replace it with? Assuming below - replace
        with <empty>, ditto balance of high school name more than 26 characters
        ... and that there is nothing else on the line
        > I want to truncate the "Lansingburgh High School" if it is longer than
        > 26 characters
        > I want to replace the " "
        >
        > I can find with this:
        > ^!Find "^(\d+ +)(.*?) +" CISR
        This should be pretty close (not tested). I'm using \x20 instead of
        a space because it's easier to see.
        ^!replace "^(\d+\x20+)(.{0,26})(.*)" >> "$2"
        >
        > Is $2 an actual variable I can work with and truncate?
        >
        > There are numerous of these in the document with various school names,
        > some of which are longer than 26 total characters.

        --
        Regards ... Alec (buralex@gmail & WinLiveMess - alec.m.burgess@skype)
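A minimal Python sketch of what Alec's pattern does, assuming Python's `re` module treats it the same way NoteTab's regex engine does (not checked against NoteTab). The helper name `keep_first_26` and the sample lines are made up for illustration, and the runs of spaces are assumed because the archive collapses them.

```python
import re

# Alec's pattern: a number plus spaces, then up to 26 characters,
# then whatever is left on the line.
PATTERN = re.compile(r"^(\d+\x20+)(.{0,26})(.*)", re.MULTILINE)

def keep_first_26(text: str) -> str:
    # Keep only group 2, which mirrors the "$2" replacement string.
    return PATTERN.sub(r"\2", text)

# Hypothetical input lines (spacing assumed).
long_line = "4  Lansingburgh High School Of The Arts   "
short_line = "4  Albany High"

print(keep_first_26(long_line))
print(keep_first_26(short_line))
```

Because only group 2 is kept, the leading number and everything past the 26th character are dropped, which matches Alec's stated assumption that both are replaced with nothing.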
      • Axel Berger
        Message 3 of 9 , Mar 13, 2012
          Alec Burgess wrote:
          > > I want to then replace the "4 "
          > You don't say what you want to replace it with?

          Alec, your mail program seems to eat multiple spaces. The intended
          result was a single space after the number. But your solution is great.
          I didn't see it and was playing with a complicated multi-step process.
          The multi-space sequence at the end needs to be conserved, it seems.
          Adding that in I get:

          ^!Replace "^(\d+\x20)(\x20+)(.{3,24}?)(.*?)(\x20{3,})" >> "$1$3" WRASTI

          It's untested but ought to work.

          Axel
        • Don
          Message 4 of 9 , Mar 13, 2012
            This was close to what I actually needed and very helpful to me.

            There is content after the second set of spaces that we get to in some
            cases.

            I actually do need multiple spaces kept or put back between items, as
            that is the delimiter. In my example I had multiple spaces between
            the quotes both times. I meant that I want to place those back again
            (or leave them undisturbed) before and after, Alec. "Replace" was a
            poor word choice on my part; "replaced with themselves", or simply
            "left alone", would be a better description.

            Before:
            Girls 4x200 Meter Relay
            Team Relay Finals
            Finals
            1 Saratoga High School A 1:53.4
            2768 Bianco, Ellery FR 2822 Ventra, Olivia FR
            2817 Soto, Dionna JR 2815 Shannon, Rachael JR
            2 Albany High School A 1:53.6
            3 Lansingburgh High School A 2:01.0
            4 Cohoes High School A 2:04.2
            5 Fonda-Fultonville Central Scho A 2:06.6
            6 Ichabod Crane High School A 2:07.9
            7 Averill Park High School A 2:08.2

            After:
            Girls 4x200 Meter Relay
            Team Relay Finals
            Finals
            1 Saratoga High School 1:53.4
            2768 Bianco, Ellery FR 2822 Ventra, Olivia FR
            2817 Soto, Dionna JR 2815 Shannon, Rachael JR
            2 Albany High School A 1:53.6
            3 Lansingburgh High School 2:01.0
            4 Cohoes High School A 2:04.2
            5 Fonda-Fultonville Centra 2:06.6
            6 Ichabod Crane High Schoo 2:07.9
            7 Averill Park High School 2:08.2

            See how Fonda, Ichabod and Averill for example are shortened.


            Using:
            ^!Replace "^(\d+\x20)(\x20+)([A-Z].{23}?)(.*?)(\x20{2,})(x|\d)" >> "$1 $3 $6" WRASTI

            I added this: [A-Z]
            The reason is that some lines look similar but have a digit in that position.

            I added (x|\d) at the end because I needed to include the \x20+A\x20+ in
            the "school name" that I am truncating.

            Thanks!
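A rough Python sketch of the replace above, assuming Python's `re` module behaves like NoteTab's engine for this pattern (not verified). The NoteTab option letters are not reproduced, the helper name `truncate_names` is made up, and the sample line uses two-space delimiters as an assumption, since the archive collapses runs of spaces. The single spaces in the replacement string are likewise taken from the message as archived.

```python
import re

# Don's pattern, copied from the message. The lazy "?" after {23} has no
# effect, because {23} already fixes the count at exactly 23 characters.
PATTERN = re.compile(
    r"^(\d+\x20)(\x20+)([A-Z].{23}?)(.*?)(\x20{2,})(x|\d)",
    re.MULTILINE,
)

# NoteTab's "$1 $3 $6" written with Python's backslash group syntax.
REPLACEMENT = r"\1 \3 \6"

def truncate_names(text: str) -> str:
    return PATTERN.sub(REPLACEMENT, text)

# Hypothetical line: number, name, team letter, time.
line = "5  Fonda-Fultonville Central Scho  A  2:06.6"
print(truncate_names(line))
```

In this sketch group 3 holds the first 24 characters of the name, group 4 lazily absorbs the rest of the name and the " A " column, and group 6 is the first character of the time, so the team letter disappears along with the overflow.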




            On 3/13/2012 7:18 AM, Axel Berger wrote:
            > Alec Burgess wrote:
            >>> I want to then replace the "4 "
            >> You don't say what you want to replace it with?
            >
            > Alec, your mail program seems to eat multiple spaces. The intended
            > result was a single space after the number. But your solution is great.
            > I didn't see it and was playing with a complicated multi step process.
            > The multi-space sequence at the end needs to be conserved, it seems.
            > Adding that in I get:
            >
            > ^!Replace "^(\d+\x20)(\x20+)(.{3,24}?)(.*?)(\x20{3,})" >> "$1$3" WRASTI
            >
            > It's untested but ought to work.
            >
            > Axel
            >
            >
            > ------------------------------------
            >
            > Fookes Software: http://www.fookes.com/
            > NoteTab website: http://www.notetab.com/
            > NoteTab Discussion Lists: http://www.notetab.com/groups.php
          • Axel Berger
            Message 5 of 9 , Mar 13, 2012
              Don wrote:
              > I actually do need multiple spaces kept or put back

              I was aware of that but forgot. My suggestion ought to remove all
              trailing spaces, so the required number needs to be put back in at the
              end of the replace string. Meant to do it but forgot.

              Your ([A-Z].{23}?) should break with short school names.

              Axel
            • Don
              Message 6 of 9 , Mar 13, 2012
                The thing is I can have unlimited spaces there as a delimiter, so it
                doesn't break ... it just leaves extra spaces.

                On 3/13/2012 8:45 AM, Axel Berger wrote:
                > Don wrote:
                >> I actually do need multiple spaces kept or put back
                >
                > I was aware of that but forgot. My suggestion ought to remove all
                > trailing spaces, so the required number needs to be put back in at the
                > end of the replace string. Meant to do it but forgot.
                >
                > Your ([A-Z].{23}?) should break with short school names.
                >
                > Axel
              • Axel Berger
                Message 7 of 9 , Mar 13, 2012
                  Don wrote:
                  > so it doesn't break ... it just leaves extra spaces.

                  In the case of very short names your exactly 23 characters will not only
                  include spaces, unless there are well over ten of them, but unintended
                  stuff coming after. Your extra spaces may be inserted at an inconvenient
                  place.

                  Axel
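Axel's concern can be illustrated with a small Python sketch, assuming Python's `re` module matches these patterns the way NoteTab would (not verified). The sample lines use assumed two-space delimiters, and `BOUNDED` is a hypothetical alternative that nobody in the thread proposed: it treats any character that does not start a run of two spaces as part of the name.

```python
import re

# Fixed-width capture as in the thread: one letter plus exactly 23 more characters.
FIXED = re.compile(r"^(\d+\x20+)([A-Z].{23})")

# Hypothetical alternative: keep at most 24 name characters, where a name
# character is any character not followed-into by two spaces in a row,
# then consume (and drop) whatever remains of the name.
BOUNDED = re.compile(r"^(\d+\x20+)((?:(?!\x20{2}).){1,24})(?:(?!\x20{2}).)*")

def fixed_name(line: str):
    m = FIXED.match(line)
    return m.group(2) if m else None

def bounded_truncate(line: str) -> str:
    return BOUNDED.sub(r"\1\2", line)

# Hypothetical lines (spacing assumed).
short_line = "2  Albany High School  A  1:53.6"
long_line = "5  Fonda-Fultonville Central Scho  A  2:06.6"

print(repr(fixed_name(short_line)))   # runs past the name into later columns
print(bounded_truncate(short_line))   # short name left alone
print(bounded_truncate(long_line))    # long name cut at 24 characters
```

Unlike Don's replace, this variant leaves the team letter column in place, so it is a sketch of the truncation step only, not a drop-in substitute.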
                • Don
                  Message 8 of 9 , Mar 13, 2012
                    I agree fully, but the data structure for the data I'm working on
                    doesn't seem to suffer that flaw. There are always 23 or more
                    characters before the next bit I want to retain. You guys solved a
                    puzzle I couldn't and I thank you.

                    On 3/13/2012 3:46 PM, Axel Berger wrote:
                    > Don wrote:
                    >> so it doesn't break ... it just leaves extra spaces.
                    >
                    > In the case of very short names your exactly 23 characters will not only
                    > include spaces, unless there are well over ten of them, but unintended
                    > stuff coming after. Your extra spaces may be inserted at an inconvenient
                    > place.
                    >
                    > Axel
                    >
                    >