Loading ...
Sorry, an error occurred while loading the content.

Need help with RegExp to delete columns

Expand Messages
  • joy8388608
    Another simple clip gets harder than I thought... I can usually follow RegExp fairly well but can not always come up with what I need myself. Especially not
    Message 1 of 6 , Jul 9, 2013
    • 0 Attachment
      Another 'simple' clip gets harder than I thought... I can usually follow RegExp fairly well but can not always come up with what I need myself. Especially not as well as some regulars here.

      I wanted to create a clip that would allow me to delete a specified column.
      For now, assume comma delimited and one column at a time.
      Data can be anything and any length including zero.
      We can also assume no trailing delimiters (commas).

      Thanks!
      Joy

      If a general replace works then fine, but I was trying to keep it fairly simple with code like:
      ; Do not take my replace statements seriously!!!
      ^!Set %ColToRem1%=^?[Col=2]
      ^!If ^%ColToRem1% > 2 GT2
      ^!If ^%ColToRem1% = 2 EQ2

      :EQ1
      ^!REPLACE "^.*?,?(.*(\R|\Z))" >> "$1" WRSA
      ^!Goto END

      :EQ2
      ^!REPLACE "^(?:.*?)\K(,+)(.*?)(\R|\Z)" >> "$2$3" WRSA
      ^!Goto END

      :GT2
      ^!Set %ColToRem_minus_1%=^$Calc(^%ColToRem1% - 1)$
      ^!REPLACE "^(?:(?:.*?,){^%ColToRem_minus_1%})\K.*?\t(.*(\R|\Z))" >> "$1" WRSA
      ^!Goto END


      Test Data
      A11
      11,22,33,44,BB
      ,22,33,CC
      ,22,33,44,DD
      11,,33,44,EE
      11,,,44,FF
      11,22,,44,GG
      11,,,,HH


      If I wanted column 1 to be deleted, I would be left with
      <empty line>
      22,33,44,BB
      22,33,CC
      22,33,44,DD
      ,33,44,EE
      ,,44,FF
      22,,44,GG
      ,,,HH


      If I wanted column 2 to be deleted, I would be left with
      A11
      11,33,44,BB
      ,33,CC
      ,33,44,DD
      11,33,44,EE
      11,,44,FF
      11,,44,GG
      11,,,HH


      If I wanted column 3 to be deleted, I would be left with
      A11
      11,22,44,BB
      ,22,CC
      ,22,44,DD
      11,,44,EE
      11,,44,FF
      11,22,44,GG
      11,,,HH
    • Art Kocsis
      ... This seems to work but it may need a special case for column one. Although it seems to work for deleting col 1 as RegEx may burp sometimes when the
      Message 2 of 6 , Jul 9, 2013
      • 0 Attachment
        At 7/9/2013 01:09 PM, Joy wrote:
        >Another 'simple' clip gets harder than I thought... I can usually follow RegExp fairly well but can not always come up with what I need myself. Especially not as well as some regulars here.
        >
        >I wanted to create a clip that would allow me to delete a specified column.
        >For now, assume comma delimited and one column at a time.
        >Data can be anything and any length including zero.
        >We can also assume no trailing delimiters (commas).
        >
        >Thanks!
        >Joy
        >
        >If a general replace works then fine, but I was trying to keep it fairly simple with code like:
        >; Do not take my replace statements seriously!!!
        >^!Set %ColToRem1%=^?[Col=2]
        >^!If ^%ColToRem1% > 2 GT2
        >^!If ^%ColToRem1% = 2 EQ2
        >
        >:EQ1
        >^!REPLACE "^.*?,?(.*(\R|\Z))" >> "$1" WRSA
        >^!Goto END
        >
        >:EQ2
        >^!REPLACE "^(?:.*?)\K(,+)(.*?)(\R|\Z)" >> "$2$3" WRSA
        >^!Goto END
        >
        >:GT2
        >^!Set %ColToRem_minus_1%=^$Calc(^%ColToRem1% - 1)$
        >^!REPLACE "^(?:(?:.*?,){^%ColToRem_minus_1%})\K.*?\t(.*(\R|\Z))" >> "$1" WRSA
        >^!Goto END
        >
        >Test Data
        >A11
        >11,22,33,44,BB
        >,22,33,CC
        >,22,33,44,DD
        >11,,33,44,EE
        >11,,,44,FF
        >11,22,,44,GG
        >11,,,,HH
        >
        >If I wanted column 1 to be deleted, I would be left with
        ><empty line>
        >22,33,44,BB
        >22,33,CC
        >22,33,44,DD
        >,33,44,EE
        >,,44,FF
        >22,,44,GG
        >,,,HH
        >
        >If I wanted column 2 to be deleted, I would be left with
        >A11
        >11,33,44,BB
        >,33,CC
        >,33,44,DD
        >11,33,44,EE
        >11,,44,FF
        >11,,44,GG
        >11,,,HH
        >
        >If I wanted column 3 to be deleted, I would be left with
        >A11
        >11,22,44,BB
        >,22,CC
        >,22,44,DD
        >11,,44,EE
        >11,,44,FF
        >11,22,44,GG
        >11,,,HH

        This seems to work but it may need a special case for column one. Although it seems to work for deleting col 1 as RegEx may burp sometimes when the quantifier = 0.

        ^!Set %K%=^?{Enter col # to delete}
        ^!Dec %K%
        ^!Replace "^([^\,]*,){^%K%}\K[^,]*,(.*(\R|\Z))" >> "$2" AIRW

        HTH

        Namaste', Art
      • flo.gehrke
        ... ^!Set %R%=^?{Remove column:==First^=F|Second^=1|Third^=2} ^!IfSame ^%R% F Next Else Skip_2 ^!Replace ^([^, r n]+|,),{0,1} WARS ^!Goto End
        Message 3 of 6 , Jul 9, 2013
        • 0 Attachment
          --- In ntb-clips@yahoogroups.com, "joy8388608" <mycroftj@...> wrote:
          >
          > I wanted to create a clip that would allow me to delete a
          > specified column (...)


          ^!Set %R%=^?{Remove column:==First^=F|Second^=1|Third^=2}
          ^!IfSame "^%R%" "F" Next Else Skip_2
          ^!Replace "^([^,\r\n]+|,),{0,1}" >> "" WARS
          ^!Goto End
          ^!Replace "^([^,\r\n]{0,},){^%R%}\K\w{0,}," >> "" WARS


          Regards,
          Flo
        • joy8388608
          ... Flo and Art - Thanks very much for the help. (Sorry for the delay - I didn t want to reply until I had time to test) For others taking notes, I made a few
          Message 4 of 6 , Jul 13, 2013
          • 0 Attachment
            --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
            >
            > --- In ntb-clips@yahoogroups.com, "joy8388608" <mycroftj@> wrote:
            > >
            > > I wanted to create a clip that would allow me to delete a
            > > specified column (...)
            >
            >
            > ^!Set %R%=^?{Remove column:==First^=F|Second^=1|Third^=2}
            > ^!IfSame "^%R%" "F" Next Else Skip_2
            > ^!Replace "^([^,\r\n]+|,),{0,1}" >> "" WARS
            > ^!Goto End
            > ^!Replace "^([^,\r\n]{0,},){^%R%}\K\w{0,}," >> "" WARS
            >
            >
            > Regards,
            > Flo


            Flo and Art -

            Thanks very much for the help. (Sorry for the delay - I didn't want to reply until I had time to test)

            For others taking notes, I made a few changes since there were two small problems which I think I fixed and I also replaced the \w since there could be other characters involved.

            I really appreciate the help. I am not helpless when pointed way down the correct path, but I could never have gotten here without help. Never was very successful with recursion either...

            Problems fixed:
            Remove column 1 was turning ",,33" to "33" instead of ",33" and remove column x was not working if there were exactly x columns

            ^!Set %R%=^?{Remove column:==First^=F|Second^=1|Third^=2}
            ^!IfSame "^%R%" "F" Next Else Skip_2
            ;^!Replace "^([^,\r\n]+|,),{0,1}" >> "" WARS
            ^!Replace "^([^,\r\n]*),{0,1}" >> "" WARS
            ^!Goto End
            ;^!Replace "^([^,\r\n]{0,},){^%R%}\K\w{0,}," >> "" WARS
            ^!Replace "^([^,\r\n]{0,},){^%R%}\K[^,\r\n]{0,},{0,1}" >> "" WARS

            Joy
          • Art Kocsis
            ... Here you go. This works for all column numbers, even negative ones. No changes are made for negative numbers or for any lines with fewer columns than the
            Message 5 of 6 , Jul 14, 2013
            • 0 Attachment
              At 7/13/2013 01:39 PM, Joy wrote:
              >--- In <mailto:ntb-clips%40yahoogroups.com>ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
              >> --- In <mailto:ntb-clips%40yahoogroups.com>ntb-clips@yahoogroups.com, "joy8388608" <mycroftj@> wrote:
              >> >
              >> > I wanted to create a clip that would allow me to delete a
              >> > specified column (...)
              >>
              >> ^!Set %R%=^?{Remove column:==First^=F|Second^=1|Third^=2}
              >> ^!IfSame "^%R%" "F" Next Else Skip_2
              >> ^!Replace "^([^,\r\n]+|,),{0,1}" >> "" WARS
              >> ^!Goto End
              >> ^!Replace "^([^,\r\n]{0,},){^%R%}\K\w{0,}," >> "" WARS
              >> Regards,
              >> Flo
              >
              >Flo and Art -
              >
              >Thanks very much for the help. (Sorry for the delay - I didn't want to reply until I had time to test)
              >
              >For others taking notes, I made a few changes since there were two small problems which I think I fixed and I also replaced the \w since there could be other characters involved.
              >
              >I really appreciate the help. I am not helpless when pointed way down the correct path, but I could never have gotten here without help. Never was very successful with recursion either...
              >
              >Problems fixed:
              >Remove column 1 was turning ",,33" to "33" instead of ",33" and remove column x was not working if there were exactly x columns
              >
              >^!Set %R%=^?{Remove column:==First^=F|Second^=1|Third^=2}
              >^!IfSame "^%R%" "F" Next Else Skip_2
              >;^!Replace "^([^,\r\n]+|,),{0,1}" >> "" WARS
              >^!Replace "^([^,\r\n]*),{0,1}" >> "" WARS
              >^!Goto End
              >;^!Replace "^([^,\r\n]{0,},){^%R%}\K\w{0,}," >> "" WARS
              >^!Replace "^([^,\r\n]{0,},){^%R%}\K[^,\r\n]{0,},{0,1}" >> "" WARS
              >
              >Joy

              Here you go. This works for all column numbers, even negative ones. No changes are made for negative numbers or for any lines with fewer columns than the one to be deleted.

              My earlier suggestion worked fine for column selections one thru four but failed on your test data for higher values. Since your clip only covers columns one thru three that implies it would have been sufficient for you. Anyway, this removes any restriction on column numbers.

              Art

              ;<<<<<<<<<<<<<<<< Start of clip >>>>>>>>>>>>>>>>>>>>>>>>>
              ^!Set %K%=^?{Enter column # to delete}
              ^!Dec %K%
              ^!Replace "^([^\,\r\n]*,){^%K%}\K[^,\r\n]*(?|(,)([^\r\n]*?)|(\h*)())(?=\R|\Z)" >> "$3" AIRW

              ;Prev Rplc: ^!Replace "^([^\,]*,){^%K%}\K[^,]*,(.*(\R|\Z))" >> "$2" AIRW
              ;<<<<<<<<<<<<<<<< End of clip >>>>>>>>>>>>>>>>>>>>>>>>>
            • joy8388608
              ... And I tried so hard to be exact with my test data. I figured what would work for 1,2 or 3 columns would also work for 4 or more. This seems to work just
              Message 6 of 6 , Jul 16, 2013
              • 0 Attachment
                --- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@...> wrote:
                >
                > At 7/13/2013 01:39 PM, Joy wrote:
                > >--- In <mailto:ntb-clips%40yahoogroups.com>ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@> wrote:
                > >> --- In <mailto:ntb-clips%40yahoogroups.com>ntb-clips@yahoogroups.com, "joy8388608" <mycroftj@> wrote:
                > >> >
                > >> > I wanted to create a clip that would allow me to delete a
                > >> > specified column (...)
                > >>
                > >> ^!Set %R%=^?{Remove column:==First^=F|Second^=1|Third^=2}
                > >> ^!IfSame "^%R%" "F" Next Else Skip_2
                > >> ^!Replace "^([^,\r\n]+|,),{0,1}" >> "" WARS
                > >> ^!Goto End
                > >> ^!Replace "^([^,\r\n]{0,},){^%R%}\K\w{0,}," >> "" WARS
                > >> Regards,
                > >> Flo
                > >
                > >Flo and Art -
                > >
                > >Thanks very much for the help. (Sorry for the delay - I didn't want to reply until I had time to test)
                > >
                > >For others taking notes, I made a few changes since there were two small problems which I think I fixed and I also replaced the \w since there could be other characters involved.
                > >
                > >I really appreciate the help. I am not helpless when pointed way down the correct path, but I could never have gotten here without help. Never was very successful with recursion either...
                > >
                > >Problems fixed:
                > >Remove column 1 was turning ",,33" to "33" instead of ",33" and remove column x was not working if there were exactly x columns
                > >
                > >^!Set %R%=^?{Remove column:==First^=F|Second^=1|Third^=2}
                > >^!IfSame "^%R%" "F" Next Else Skip_2
                > >;^!Replace "^([^,\r\n]+|,),{0,1}" >> "" WARS
                > >^!Replace "^([^,\r\n]*),{0,1}" >> "" WARS
                > >^!Goto End
                > >;^!Replace "^([^,\r\n]{0,},){^%R%}\K\w{0,}," >> "" WARS
                > >^!Replace "^([^,\r\n]{0,},){^%R%}\K[^,\r\n]{0,},{0,1}" >> "" WARS
                > >
                > >Joy
                >
                > Here you go. This works for all column numbers, even negative ones. No changes are made for negative numbers or for any lines with fewer columns than the one to be deleted.
                >
                > My earlier suggestion worked fine for column selections one thru four but failed on your test data for higher values. Since your clip only covers columns one thru three that implies it would have been sufficient for you. Anyway, this removes any restriction on column numbers.
                >
                > Art
                >
                > ;<<<<<<<<<<<<<<<< Start of clip >>>>>>>>>>>>>>>>>>>>>>>>>
                > ^!Set %K%=^?{Enter column # to delete}
                > ^!Dec %K%
                > ^!Replace "^([^\,\r\n]*,){^%K%}\K[^,\r\n]*(?|(,)([^\r\n]*?)|(\h*)())(?=\R|\Z)" >> "$3" AIRW
                >
                > ;Prev Rplc: ^!Replace "^([^\,]*,){^%K%}\K[^,]*,(.*(\R|\Z))" >> "$2" AIRW
                > ;<<<<<<<<<<<<<<<< End of clip >>>>>>>>>>>>>>>>>>>>>>>>>
                >

                And I tried so hard to be exact with my test data. I figured what would work for 1,2 or 3 columns would also work for 4 or more.

                This seems to work just fine although it hurts my head a bit so thanks! This type of help is great for learning and improving one's regex (and other things) and is where I learned and continue to learn a lot of good things.

                Joy

                P.S. If anyone wants\needs a clip to delete multiple delimited columns in one shot, let me know. I now have one!
              Your message has been successfully submitted and would be delivered to recipients shortly.