Re: [Clip] Re: I can do part of this with reg-ex

  • Sheri
    Message 1 of 9 , Nov 30, 2007
      Don - HtmlFixIt.com wrote:
      > Sheri,
      >
      > You never cease to amaze me. I'm working to see how you are getting rid
      > of the duplicative artist names? Or put another way how your first
      > regex finds the block of one artist at a time.
      >
      > Two minor issues for me were caused by a line wrap at the end of the
      > clip which I then removed and it works as promised.
      >
      > I'm trying to understand this line:
      > ^!Find "(?s)^(.+) - .+(^\1)+(?-s).+" RS
      >
      > The (?s) finds any letters/spaces at the start of the line, correct and
      > also becomes \1, so later you go to the last instance of \1 and the rest
      > of that line? So you can actually use a reference later in the same
      > search string?
      >
      The (?s) is just an option setting, sets dotall so subsequent dots in
      the pattern will match linebreak characters. Until the (?-s) when it
      goes back to matching everything but linebreak characters.

      The part that becomes \1 is (.+) which is the artist.

      The .+(^\1) causes a lot of backtracking.

      First the .+ matches everything until the end of the file. Then it keeps
      backing up by one character until the previously identified artist name
      can follow the dots. That would be the last occurrence of that
      artist. Really the ^\1 didn't need to be in parentheses or be followed
      by the plus sign.
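
To make the backtracking concrete, here is a minimal Python sketch of the same idea. The four sample lines are made up for illustration (the real list is not shown in this thread), and Python's `re` module differs slightly from the PCRE syntax used in the clip: it only accepts a global `(?s)` at the very start of a pattern, so the dotall stretch is written as a scoped group `(?s:...)`. The parentheses and plus sign around `^\1` are left out, as noted above.

```python
import re

# Hypothetical sample data; the real artist/album list is not in the thread.
text = (
    "Abba - Arrival\n"
    "Abba - Waterloo\n"
    "Beatles - Help!\n"
    "Beatles - Revolver\n"
)

# (.+) captures the artist. (?s:.+) is the dotall part: it first runs to the
# end of the text, then backs up until a line start (^) followed by the same
# artist (\1) can match. The final .+ is not dotall, so it stops at the end
# of that line. re.MULTILINE lets ^ match at the start of every line.
pattern = re.compile(r"^(.+) - (?s:.+)^\1.+", re.MULTILINE)

match = pattern.search(text)
artist = match.group(1)
block = match.group(0)

print(repr(artist))
print(repr(block))
```

The match runs from the first Abba line through the last Abba line, because the greedy dotall part gives back characters only until the last line starting with the captured artist fits.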

      It might be quite slow. It would be nice to have a big file somewhere
      for testing. The little sample doesn't have enough different artists.

      I think I did at one point have a single replace that did it all with no
      loop, but it didn't keep the artist and album name together where there
      was only one album by the artist.

      Regards,
      Sheri
    • Sheri
      Message 2 of 9 , Nov 30, 2007
        I thought it might be faster if I made the Find ungreedy instead of
        greedy but it doesn't seem to make much difference.

        I was surprised to find that it only takes about 5 to 7 seconds for the
        clip to process 1450 lines on my slow pc either way.

        Both Find statements find the same thing. The first one is basically
        greedy and the second one is more ungreedy. The greedy one looks for the
        last line with the same artist, while the ungreedy looks forward until
        it hits a line that starts with a different artist.
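
As a rough illustration of the two approaches, the Python sketch below runs both forms on made-up sample lines. It assumes the list is sorted so that each artist's lines are adjacent; `(?s:...)` stands in for the `(?s)...(?-s)` switches, `\n` stands in for `\r\n`, and the variable names are only for this example.

```python
import re

# Hypothetical, already sorted sample data.
text = (
    "Abba - Arrival\n"
    "Abba - Waterloo\n"
    "Beatles - Help!\n"
    "Beatles - Revolver\n"
)

# Greedy form: run to the end, back up to the LAST line that starts with the
# captured "Artist - " prefix, then take the rest of that line and its break.
greedy = re.compile(r"^(.+? - )(?s:.+)^\1.+\n", re.MULTILINE)

# Ungreedy form: grow one character at a time until reaching a line start
# that is NOT followed by the captured prefix.
lazy = re.compile(r"^(.+? - )(?s:.+?)^(?!\1)", re.MULTILINE)

greedy_block = greedy.search(text).group(0)
lazy_block = lazy.search(text).group(0)

print(repr(greedy_block))
print(repr(lazy_block))
```

On sorted input both searches select the same block. If an artist's lines were not adjacent, the greedy form would span everything up to the last one, while the ungreedy form would stop at the first line with a different artist.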

        ^!SetHintInfo Updating Artist - Album List
        ^!SetScreenUpdate Off
        ^!Set %starttime%="Start Time: ^$GetDate(tt)$"
        ^!Jump Doc_Start
        :Loop
        ;^!Find "^(.+? - )(?s).+^\1(?-s).+\r\n" RS
        ^!Find "^(.+? - )(?s).+?^(?!\1)" RS
        ^!IfError Finish
        ^!Replace "\R(.+? - )" >> "\t" RAHS1
        ^!Jump Select_End
        ^!Goto Loop
        :Finish
        ;start of long line
        ^!Replace "^(?<artist>.+) - (?<album>.+)(?=\r\n\t)" >>
        "$<artist>\r\n\t$<album>" RAWS0
        ;end of long line
        ^!Set %endtime%="End Time: ^$GetDate(tt)$"
        ^!Prompt ^$GetClipName$^%NL%^%NL%^%starttime%^%NL%^%endtime%
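
For anyone who wants to check the clip's output against something independent, here is a small Python sketch of the end result as I read it from the clip: artists with several albums become a heading followed by tab-indented albums, and artists with a single album stay on one line. It is not a translation of the clip's mechanism. The function name `group_albums` and the sample lines are hypothetical, and it assumes sorted input and splits each line at the first " - ".

```python
from itertools import groupby


def group_albums(lines):
    """Collapse sorted 'Artist - Album' lines into artist headings
    followed by tab-indented album names."""
    out = []
    # partition(" - ") splits at the first " - ", like the lazy (.+? - ).
    pairs = [line.partition(" - ") for line in lines]
    for artist, group in groupby(pairs, key=lambda p: p[0]):
        albums = [album for _, _, album in group]
        if len(albums) == 1:
            # A lone album stays together with its artist on one line.
            out.append(f"{artist} - {albums[0]}")
        else:
            out.append(artist)
            out.extend("\t" + album for album in albums)
    return out


sample = ["Abba - Arrival", "Abba - Waterloo", "Cream - Disraeli Gears"]
result = group_albums(sample)
print("\n".join(result))
```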


        Regards,
        Sheri
      • Dave
        Message 3 of 9 , Nov 30, 2007
          Hi,
          It works beautifully until it finds multiple artists on the last line; then it
          leaves them alone.
          Thank you, Dave M

          ----- Original Message -----
          From: "Sheri" <silvermoonwoman@...>
          To: <ntb-clips@yahoogroups.com>
          Sent: Saturday, December 01, 2007 10:13 AM
          Subject: Re: [Clip] I can do part of this with reg-ex


          >I thought it might be faster if I made the Find ungreedy instead of
          > greedy but it doesn't seem to make much difference.
          >
          > I was surprised to find that it only takes about 5 to 7 seconds for the
          > clip to process 1450 lines on my slow pc either way.
          >
          > Both Find statements find the same thing. The first one is basically
          > greedy and the second one is more ungreedy. The greedy one looks for the
          > last line with the same artist, while the ungreedy looks forward until
          > it hits a line that starts with a different artist.
          >
          > ^!SetHintInfo Updating Artist - Album List
          > ^!SetScreenUpdate Off
          > ^!Set %starttime%="Start Time: ^$GetDate(tt)$"
          > ^!Jump Doc_Start
          > :Loop
          > ;^!Find "^(.+? - )(?s).+^\1(?-s).+\r\n" RS
          > ^!Find "^(.+? - )(?s).+?^(?!\1)" RS
          > ^!IfError Finish
          > ^!Replace "\R(.+? - )" >> "\t" RAHS1
          > ^!Jump Select_End
          > ^!Goto Loop
          > :Finish
          > ;start of long line
          > ^!Replace "^(?<artist>.+) - (?<album>.+)(?=\r\n\t)" >>
          > "$<artist>\r\n\t$<album>" RAWS0
          > ;end of long line
          > ^!Set %endtime%="End Time: ^$GetDate(tt)$"
          > ^!Prompt ^$GetClipName$^%NL%^%NL%^%starttime%^%NL%^%endtime%
          >
          >
          > Regards,
          > Sheri
          >
          >
          >
          > Fookes Software: http://www.fookes.com/
          > Fookes Software Discussion Lists: http://www.fookes.com/groups.php
          > Yahoo! Groups Links
          >
          >
          >
          >
        • Dave
          Message 4 of 9 , Nov 30, 2007
            Hi ignore what I just sent I did not second one
            THANKYOU DAVE M

          • Sheri
            Message 5 of 9 , Dec 1, 2007
              Dave wrote:
              > Hi ignore what I just sent I did not second one
              > THANKYOU DAVE M
              >
              The greedy one as written was requiring a line break at the end of every
              line. It wouldn't work on the last group if the last entry for the last
              group terminated at the end of the file instead of the end of a line.
              That pattern could be changed to accept either a line break or the end
              of the file.

              ^!Find "^(.+? - )(?s).+^\1(?-s).+(?:\R|\z)" RS
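
One detail worth checking in a pattern like this: `|` has the lowest precedence in a regex, so the "line break or end of file" choice only applies to the tail of the pattern if it is wrapped in a group. The Python sketch below shows the difference on made-up text with no final line break. Python's `re` has no `\R`, and its `\Z` plays the role of PCRE's `\z`, so `\n` and `\Z` are used as stand-ins.

```python
import re

# Hypothetical text whose last line is NOT terminated by a line break.
text = "Abba - Arrival\nAbba - Waterloo"

# Ungrouped: reads as "(whole pattern ending in a line break) OR (end of text)".
ungrouped = re.compile(r"^(.+? - )(?s:.+)^\1.+\n|\Z", re.MULTILINE)

# Grouped: only the terminator is a choice between line break and end of text.
grouped = re.compile(r"^(.+? - )(?s:.+)^\1.+(?:\n|\Z)", re.MULTILINE)

ungrouped_hit = ungrouped.search(text).group(0)
grouped_hit = grouped.search(text).group(0)

print(repr(ungrouped_hit))  # empty match at the very end of the text
print(repr(grouped_hit))    # both Abba lines
```

The ungrouped form still "succeeds", but only with an empty match at the end of the text, which is easy to miss because no error is reported.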

              Regards,
              Sheri
            • Sheri
              Message 6 of 9 , Dec 1, 2007
                Sheri wrote:
                > Dave wrote:
                >
                >> Hi ignore what I just sent I did not second one
                >> THANKYOU DAVE M
                >>
                >>
                > The greedy one as written was requiring a line break at the end of every
                > line. It wouldn't work on the last group if the last entry for the last
                > group terminated at the end of the file instead of the end of a line.
                > That pattern could be changed to accept either a line break or the end
                > of the file.
                >
                > ^!Find "^(.+? - )(?s).+^\1(?-s).+(?:\R|\z)" RS
                >
                > Regards,
                > Sheri
                >
                >
                >
                The non-greedy one seemed to have the same issue. Could be changed to

                ^!Find "^(.+? - )(?s).+?((^(?!\1))|\z)" RS
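
A quick way to see the effect of the end-of-file alternative is to collect every block the pattern finds. This Python sketch uses made-up lines with no line break after the last entry; `(?s:...)` replaces the `(?s)` switch and `\Z` stands in for `\z`.

```python
import re

# Hypothetical sample; note there is no line break after the last entry.
text = (
    "Abba - Arrival\n"
    "Abba - Waterloo\n"
    "Beatles - Help!\n"
    "Beatles - Revolver"
)

# Stop at a line start not followed by the captured prefix, or at the very
# end of the text, whichever comes first.
lazy = re.compile(r"^(.+? - )(?s:.+?)(?:^(?!\1)|\Z)", re.MULTILINE)

blocks = [m.group(0) for m in lazy.finditer(text)]
for b in blocks:
    print(repr(b))
```

Without the end-of-text alternative, the last block would not be found here, since no line with a different artist follows it.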

                Dave I'm not sure what you were saying.

                Regards,
                Sheri