I can do part of this with reg-ex

  • Dave
    Message 1 of 9 , Nov 30, 2007
      Hi,
      I can convert this list to an otl without too much trouble, but how do I
      detect multiple entries for the same artist and produce the second list? Any help?

      12inch - Eighties [CD 2]
      Aaron Jasinski - Serpentine Soiree
      Act - Absolutely Ammune 12 inch
      Act - Emotional Highlights from Snobbery And Decay
      Act - Laughter, Tears and Rage The Anthology [CD1]
      Act - Laughter, Tears and Rage The Anthology [CD2]
      Act - Laughter, Tears and Rage The Anthology [CD3]
      Amethystium - Aphelion
      Anoushka Shankar - Anoushka
      Anoushka Shankar - Rise

      12inch - Eighties [CD 2]
      Aaron Jasinski - Serpentine Soiree
      Act
      Absolutely Ammune 12 inch
      Emotional Highlights from Snobbery And Decay
      Laughter, Tears and Rage The Anthology [CD1]
      Laughter, Tears and Rage The Anthology [CD2]
      Laughter, Tears and Rage The Anthology [CD3]
      Amethystium - Aphelion
      Anoushka Shankar -
      Anoushka
      Rise

      THANKYOU DAVE M
    • Sheri
      Message 2 of 9 , Nov 30, 2007
        --- In ntb-clips@yahoogroups.com, "Dave" <dmc43959@...> wrote:
        >
        > Hi
        > I can convert this list to an otl with out to much problem but how do I
        > detect multiple artist entry and produce the second list any help ??
        >
        > 12inch - Eighties [CD 2]
        > Aaron Jasinski - Serpentine Soiree
        > Act - Absolutely Ammune 12 inch
        > Act - Emotional Highlights from Snobbery And Decay
        > Act - Laughter, Tears and Rage The Anthology [CD1]
        > Act - Laughter, Tears and Rage The Anthology [CD2]
        > Act - Laughter, Tears and Rage The Anthology [CD3]
        > Amethystium - Aphelion
        > Anoushka Shankar - Anoushka
        > Anoushka Shankar - Rise
        >
        > 12inch - Eighties [CD 2]
        > Aaron Jasinski - Serpentine Soiree
        > Act
        > Absolutely Ammune 12 inch
        > Emotional Highlights from Snobbery And Decay
        > Laughter, Tears and Rage The Anthology [CD1]
        > Laughter, Tears and Rage The Anthology [CD2]
        > Laughter, Tears and Rage The Anthology [CD3]
        > Amethystium - Aphelion
        > Anoushka Shankar -
        > Anoushka
        > Rise
        >
        > THANKYOU DAVE M
        >

        This should turn the plain presorted list into the second one:

        ^!Jump Doc_Start
        :Loop
        ^!Find "(?s)^(.+) - .+(^\1)+(?-s).+" RS
        ^!IfError Finish
        ^!Replace "(\R)^(.+) - " >> "$1\t" RAHS
        ^!Goto Loop
        :Finish
        ^!Replace "^(?<artist>.+) - (?<album>.+)(?=\r\n\t)" >> "$<artist>\r\n\t$<album>" RAWS0

        Regards,
        Sheri
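For readers without NoteTab, the same transformation can be sketched in Python. This is not Sheri's clip: it swaps the regex loop for `itertools.groupby`, and the function name `group_albums` and the shortened sample list are made up for illustration. It assumes, as the clip does, that the list is presorted so that an artist's lines are adjacent, and that every line contains " - ".

```python
from itertools import groupby


def group_albums(lines):
    """Collapse consecutive 'Artist - Album' lines that share an artist."""

    def artist_of(line):
        # Split on the first " - " only; album titles may contain dashes.
        return line.partition(" - ")[0]

    out = []
    for artist, group in groupby(lines, key=artist_of):
        entries = list(group)
        if len(entries) == 1:
            # Single album: keep artist and album together on one line.
            out.append(entries[0])
        else:
            # Several albums: artist on its own line, albums tab-indented.
            out.append(artist)
            for entry in entries:
                out.append("\t" + entry.partition(" - ")[2])
    return out


sample = [
    "12inch - Eighties [CD 2]",
    "Aaron Jasinski - Serpentine Soiree",
    "Act - Absolutely Ammune 12 inch",
    "Act - Emotional Highlights from Snobbery And Decay",
    "Amethystium - Aphelion",
    "Anoushka Shankar - Anoushka",
    "Anoushka Shankar - Rise",
]
result = group_albums(sample)
print("\n".join(result))
```

Like the clip, this leaves single-album artists on one line and only splits artists that appear more than once.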
      • Don - HtmlFixIt.com
        Message 3 of 9 , Nov 30, 2007
          Sheri,

          You never cease to amaze me. I'm working to see how you are getting rid
          of the duplicate artist names. Or, put another way, how your first
          regex finds the block of one artist at a time.

          Two minor issues for me were caused by a line wrap at the end of the
          clip which I then removed and it works as promised.

          I'm trying to understand this line:
          ^!Find "(?s)^(.+) - .+(^\1)+(?-s).+" RS

          The (?s) finds any letters/spaces at the start of the line, correct? And
          that also becomes \1, so later you go to the last instance of \1 and the
          rest of that line? So you can actually use a reference later in the same
          search string?

          (^\1)+(?-s).+ catches every matching instance that starts with the same
          artist name?

          Wow thanks for taking the time to show the opportunities to the rest of us!

          Don


          >> 12inch - Eighties [CD 2]
          >> Aaron Jasinski - Serpentine Soiree
          >> Act
          >> Absolutely Ammune 12 inch
          >> Emotional Highlights from Snobbery And Decay
          >> Laughter, Tears and Rage The Anthology [CD1]
          >> Laughter, Tears and Rage The Anthology [CD2]
          >> Laughter, Tears and Rage The Anthology [CD3]
          >> Amethystium - Aphelion
          >> Anoushka Shankar -
          >> Anoushka
          >> Rise
          >>
          >
          > This should turn the plain presorted list into the second one:
          >
          > ^!Jump Doc_Start
          > :Loop
          > ^!Find "(?s)^(.+) - .+(^\1)+(?-s).+" RS
          > ^!IfError Finish
          > ^!Replace "(\R)^(.+) - " >> "$1\t" RAHS
          > ^!Goto Loop
          > :Finish
          > ^!Replace "^(?<artist>.+) - (?<album>.+)(?=\r\n\t)" >>
          > "$<artist>\r\n\t$<album>" RAWS0
          >
        • Sheri
          Message 4 of 9 , Nov 30, 2007
            Don - HtmlFixIt.com wrote:
            > Sheri,
            >
            > You never cease to amaze me. I'm working to see how you are getting rid
            > of the duplicative artist names? Or put another way how your first
            > regex finds the block of one artist at at time.
            >
            > Two minor issues for me were caused by a line wrap at the end of the
            > clip which I then removed and it works as promised.
            >
            > I'm trying to understand this line:
            > ^!Find "(?s)^(.+) - .+(^\1)+(?-s).+" RS
            >
            > The (?s) finds any letters/spaces at the start of the line, correct and
            > also becomes \1, so later you go to the last instance of \1 and the rest
            > of that line? So you can actually use a reference later in the same
            > search string?
            >
            The (?s) is just an option setting. It sets dotall, so subsequent dots in
            the pattern will match line break characters until the (?-s), where it
            goes back to matching everything but line break characters.

            The part that becomes \1 is (.+) which is the artist.

            The .+(^\1) causes a lot of backtracking.

            First the .+ matches everything until the end of the file. Then it keeps
            backing up by one character until the previously identified artist name
            can follow the dots. That would be the last occurrence of that
            artist. Really the ^\1 didn't need to be in parentheses or be followed
            by the plus sign.

            It might be quite slow. It would be nice to have a big file somewhere
            for testing. The little sample doesn't have enough different artists.

            I think I did at one point have a single replace that did it all with no
            loop, but it didn't keep the artist and album name together where there
            was only one album by the artist.

            Regards,
            Sheri
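The three ideas in this explanation (a dotall switch, a backreference inside the same pattern, and a greedy match that backs up to the last occurrence) can be tried outside NoteTab. The following is a rough Python adaptation, not the clip's exact pattern: Python's `re` module has no mid-pattern `(?s)`/`(?-s)` toggle, so the scoped form `(?s:...)` is used instead, and " - " is added after `\1` so that a name cannot match as the prefix of a longer one. The sample text is abbreviated from the list in the thread.

```python
import re

text = (
    "Aaron Jasinski - Serpentine Soiree\n"
    "Act - Absolutely Ammune 12 inch\n"
    "Act - Emotional Highlights from Snobbery And Decay\n"
    "Act - Laughter, Tears and Rage The Anthology [CD1]\n"
    "Amethystium - Aphelion\n"
)

# (.+?)     captures the artist; it becomes \1.
# (?s:.+)   dotall only inside this group: the dot also matches line breaks,
#           so the greedy .+ first runs to the end of the text and then
#           backs up until "^\1 - " can match.
# ^\1 - .+  the last line that starts with the same artist, to end of line.
pattern = re.compile(r"^(.+?) - (?s:.+)^\1 - .+", re.MULTILINE)

m = pattern.search(text)
artist = m.group(1)
block = m.group(0)
print(repr(artist))
print(block)
```

The first line of the sample is skipped because no later line starts with "Aaron Jasinski"; the match begins at the first "Act" line and ends at the last one.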
          • Sheri
            Message 5 of 9 , Nov 30, 2007
              I thought it might be faster if I made the Find ungreedy instead of
              greedy but it doesn't seem to make much difference.

              I was surprised to find that it only takes about 5 to 7 seconds for the
              clip to process 1450 lines on my slow pc either way.

              Both Find statements find the same thing. The first one is basically
              greedy and the second one is more ungreedy. The greedy one looks for the
              last line with the same artist, while the ungreedy looks forward until
              it hits a line that starts with a different artist.

              ^!SetHintInfo Updating Artist - Album List
              ^!SetScreenUpdate Off
              ^!Set %starttime%="Start Time: ^$GetDate(tt)$"
              ^!Jump Doc_Start
              :Loop
              ;^!Find "^(.+? - )(?s).+^\1(?-s).+\r\n" RS
              ^!Find "^(.+? - )(?s).+?^(?!\1)" RS
              ^!IfError Finish
              ^!Replace "\R(.+? - )" >> "\t" RAHS1
              ^!Jump Select_End
              ^!Goto Loop
              :Finish
              ;start of long line
              ^!Replace "^(?<artist>.+) - (?<album>.+)(?=\r\n\t)" >>
              "$<artist>\r\n\t$<album>" RAWS0
              ;end of long line
              ^!Set %endtime%="End Time: ^$GetDate(tt)$"
              ^!Prompt ^$GetClipName$^%NL%^%NL%^%starttime%^%NL%^%endtime%


              Regards,
              Sheri
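The difference between the two Find statements in the clip above can be checked with a small Python sketch. The patterns are adapted rather than copied: `(?s:...)` stands in for the `(?s)` toggle and `\n` for `\r\n`, and the variable names are illustrative. One thing the sketch suggests is that the two only agree on artists with several albums. The greedy form needs a second line with the same artist, while the ungreedy form also matches a single-album artist.

```python
import re

text = (
    "Act - Absolutely Ammune 12 inch\n"
    "Act - Emotional Highlights from Snobbery And Decay\n"
    "Act - Laughter, Tears and Rage The Anthology [CD1]\n"
    "Amethystium - Aphelion\n"
    "Anoushka Shankar - Anoushka\n"
    "Anoushka Shankar - Rise\n"
)

# Greedy: run to the end, then back up to the LAST line with the same prefix.
greedy = re.compile(r"^(.+? - )(?s:.+)^\1.+\n", re.MULTILINE)
# Ungreedy: walk forward to the FIRST line that starts with a different prefix.
lazy = re.compile(r"^(.+? - )(?s:.+?)^(?!\1)", re.MULTILINE)

greedy_block = greedy.search(text).group(0)
lazy_block = lazy.search(text).group(0)
print(greedy_block == lazy_block)

# A list where the first artist has only one album.
single = "Amethystium - Aphelion\nAnoushka Shankar - Rise\n"
greedy_single = greedy.search(single)
lazy_single = lazy.search(single).group(0)
print(greedy_single, repr(lazy_single))
```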
            • Dave
              Message 6 of 9 , Nov 30, 2007
                Hi
                It works beautifully until it reaches multiple entries for one artist at
                the very end of the list; it leaves those alone.
                THANKYOU DAVE M

                ----- Original Message -----
                From: "Sheri" <silvermoonwoman@...>
                To: <ntb-clips@yahoogroups.com>
                Sent: Saturday, December 01, 2007 10:13 AM
                Subject: Re: [Clip] I can do part of this with reg-ex


                >I thought it might be faster if I made the Find ungreedy instead of
                > greedy but it doesn't seem to make much difference.
                >
                > I was surprised to find that it only takes about 5 to 7 seconds for the
                > clip to process 1450 lines on my slow pc either way.
                >
                > Both Find statements find the same thing. The first one is basically
                > greedy and the second one is more ungreedy. The greedy one looks for the
                > last line with the same artist, while the ungreedy looks forward until
                > it hits a line that starts with a different artist.
                >
                > ^!SetHintInfo Updating Artist - Album List
                > ^!SetScreenUpdate Off
                > ^!Set %starttime%="Start Time: ^$GetDate(tt)$"
                > ^!Jump Doc_Start
                > :Loop
                > ;^!Find "^(.+? - )(?s).+^\1(?-s).+\r\n" RS
                > ^!Find "^(.+? - )(?s).+?^(?!\1)" RS
                > ^!IfError Finish
                > ^!Replace "\R(.+? - )" >> "\t" RAHS1
                > ^!Jump Select_End
                > ^!Goto Loop
                > :Finish
                > ;start of long line
                > ^!Replace "^(?<artist>.+) - (?<album>.+)(?=\r\n\t)" >>
                > "$<artist>\r\n\t$<album>" RAWS0
                > ;end of long line
                > ^!Set %endtime%="End Time: ^$GetDate(tt)$"
                > ^!Prompt ^$GetClipName$^%NL%^%NL%^%starttime%^%NL%^%endtime%
                >
                >
                > Regards,
                > Sheri
                >
                >
                >
                > Fookes Software: http://www.fookes.com/
                > Fookes Software Discussion Lists: http://www.fookes.com/groups.php
                > Yahoo! Groups Links
                >
                >
                >
                >
              • Dave
                Message 7 of 9 , Nov 30, 2007
                  Hi ignore what I just sent I did not second one
                  THANKYOU DAVE M

                • Sheri
                  Message 8 of 9 , Dec 1, 2007
                    Dave wrote:
                    > Hi ignore what I just sent I did not second one
                    > THANKYOU DAVE M
                    >
                    The greedy one as written was requiring a line break at the end of every
                    line. It wouldn't work on the last group if the last entry for the last
                    group terminated at the end of the file instead of the end of a line.
                    That pattern could be changed to accept either a line break or the end
                    of the file.

                    ^!Find "^(.+? - )(?s).+^\1(?-s).+(?:\R|\z)" RS

                    Regards,
                    Sheri
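The end-of-file case described above can be reproduced in Python as a rough check. The patterns are adaptations: `\n` stands in for `\R`, and `\Z` is Python's spelling of "end of text" (`\z` in PCRE). Because `|` has the lowest precedence in a regex, the two alternatives are wrapped in a non-capturing group here; left ungrouped, the second alternative would be a bare end-of-text match on its own.

```python
import re

# The last group ends at the end of the text, with no trailing line break.
text = (
    "Amethystium - Aphelion\n"
    "Anoushka Shankar - Anoushka\n"
    "Anoushka Shankar - Rise"
)

# Requires a line break after the last line of the group.
needs_newline = re.compile(r"^(.+? - )(?s:.+)^\1.+\n", re.MULTILINE)
# Accepts either a line break or the end of the text.
newline_or_end = re.compile(r"^(.+? - )(?s:.+)^\1.+(?:\n|\Z)", re.MULTILINE)

missed = needs_newline.search(text)
found = newline_or_end.search(text).group(0)
print(missed)
print(found)
```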
                  • Sheri
                    Message 9 of 9 , Dec 1, 2007
                      Sheri wrote:
                      > Dave wrote:
                      >
                      >> Hi ignore what I just sent I did not second one
                      >> THANKYOU DAVE M
                      >>
                      >>
                      > The greedy one as written was requiring a line break at the end of every
                      > line. It wouldn't work on the last group if the last entry for the last
                      > group terminated at the end of the file instead of the end of a line.
                      > That pattern could be changed to accept either a line break or the end
                      > of the file.
                      >
                      > ^!Find "^(.+? - )(?s).+^\1(?-s).+(?:\R|\z)" RS
                      >
                      > Regards,
                      > Sheri
                      >
                      >
                      >
                      The non-greedy one seemed to have the same issue. Could be changed to

                      ^!Find "^(.+? - )(?s).+?((^(?!\1))|\z)" RS

                      Dave I'm not sure what you were saying.

                      Regards,
                      Sheri
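As a cross-check of the non-greedy pattern with the end-of-file alternative, here is a Python sketch that applies it to a whole list in one pass. It is an adaptation, not the clip: `re.sub` with a callback replaces the Find/Replace loop, `\Z` stands in for `\z`, and the names `BLOCK` and `indent_block` are made up for illustration.

```python
import re

# An "Artist - " prefix, then as little as possible up to the next line that
# does not start with that prefix, or up to the end of the text.
BLOCK = re.compile(r"^(.+? - )(?s:.+?)(?:^(?!\1)|\Z)", re.MULTILINE)


def indent_block(match):
    prefix = match.group(1)  # e.g. "Act - "
    lines = match.group(0).splitlines(keepends=True)
    if len(lines) == 1:
        # One album: leave artist and album together on one line.
        return match.group(0)
    artist = prefix[: -len(" - ")]
    # Strip the repeated prefix and tab-indent each album.
    albums = "".join("\t" + line[len(prefix):] for line in lines)
    return artist + "\n" + albums


text = (
    "Amethystium - Aphelion\n"
    "Anoushka Shankar - Anoushka\n"
    "Anoushka Shankar - Rise"  # no line break after the last entry
)
result = BLOCK.sub(indent_block, text)
print(result)
```

Because the callback sees each block separately, single-album artists can be passed through unchanged, which a plain one-shot replacement string cannot easily do.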