
regexp to identify N duplicates parts into several lines

  • epanda
    Message 1 of 8 , Nov 1, 2007
      Hi,

      I have a CSV-formatted file whose first column holds file names.

      I would like to copy all lines that share the same file name to the
      end of the file using a regexp:

      g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/t$

      As a search, this regexp correctly highlights the first three lines,
      because sortie.cpp appears three times, and the three lines after
      them, because sortie.h appears three times. But the command does not
      copy all the highlighted lines to the end of the file.

      It copies only the first occurrence, and I would like to copy all
      occurrences.

      Why?

      sortie.cpp;5216497a433e57252e68d0b3ddcfb348;directory1/contr
      Sortie.cpp;bd4a068804eed79102686c01b8e2cd30;directory1/SRC/boitier
      Sortie.cpp;f0f4fbce772de5add53e54fea0f0e777;directory1/Rose/boitier
      Sortie.h;380567b94118cafff11dea994dbc5595;directory1/SRC/boitier
      Sortie.h;7ce3fa69a264aacaa10daba846e1bc0e;directory1/Rose/boitier
      sortie.h;7ce3fa69a264aacaa10daba846e1bc0e;directory1_equipements/contr
      SrvApp.cpp;c2ef9e82c145ec1d7ebb54811b88df1a;directory1/boitier
      srvApp.h;4ab0c12083a98435e1f9fa70220a56ac;directory1/boitier


      Thanks
      epanda


      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_use" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---
    • Ben Schmidt
      Message 2 of 8 , Nov 1, 2007
        > g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/t$
        >
        > This regexp search highlights correctly the three first lines because
        > there is three times sortie.cpp,
        > it highlights the three lines after because there is three sortie.h
        > but it does not copy all highlighted at the end of the file.

        As documented, :g only runs its command on the first line of multi-line matches.

        You will need to do something a bit more complicated.

        I would suggest:

        1. Clear some register, say "a. :let @a=""

        2. Make the :g command append each match to that register using yank:

        :g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/exe "normal \"Ay//e\<CR>"

        3. Move to the end of your buffer and put the register. :$put a

        Hope this helps!

        Ben.




      • Ben Schmidt
        Message 3 of 8 , Nov 1, 2007
          > Hi,
          >
          > I have a file formatted into CSV and the first column take the name of
          > files.
          >
          > I would like to copy all lines which have the same file name to the
          > end of the file by regexp.
          >
          > g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/t$
          >
          > This regexp search highlights correctly the three first lines because
          > there is three times sortie.cpp,
          > it highlights the three lines after because there is three sortie.h
          > but it does not copy all highlighted at the end of the file.
          >
          > It copies only the first occurence and I would copy all occurences.
          >
          > why ?

          Actually, I don't think my previous suggestion will work when there are
          more than two duplicate lines, as it will run on every line of a block
          except the last...hmmm...

          Your original command would have the same problem, though.

          You will probably want a more involved solution again, such as
          scripting the whole thing, which would be fairly easy, e.g.:

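          function! GetDuplicates()
            " Remember the last original line so appended copies are not rescanned.
            let lastline=line('$')
            let prevkey=""
            let curline=1
            " Set to 1 once the first line of the current block has been copied.
            let copiedkey=0
            while curline <= lastline
              " Key is the first field including its ';' (empty if there is none).
              let thiskey = matchstr(getline(curline),'^[^;]*;')
              " Note: == on strings follows 'ignorecase'.
              if thiskey != "" && thiskey == prevkey
                if copiedkey == 0
                  " First repeat of this key: also copy the line that started the block.
                  execute (curline-1)."t$"
                  let copiedkey = 1
                endif
                execute curline."t$"
              else
                let copiedkey = 0
              endif
              let curline = curline + 1
              let prevkey = thiskey
            endwhile
          endfunction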
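
For anyone who prefers to work on a list of lines rather than on the buffer directly, the same grouping can be sketched as a function that only returns the duplicates. This is a rough variant, not something posted in the thread: the name DuplicateLines is made up, and it assumes the comparison should always ignore case (the sample data mixes sortie.cpp and Sortie.cpp), whereas the function above follows 'ignorecase'.

```vim
" Hypothetical helper (not from the thread): return the lines whose first
" ';'-terminated field equals that of the line before or after it.
function! DuplicateLines(lines) abort
  let result = []
  let last = len(a:lines) - 1
  for i in range(len(a:lines))
    let key = matchstr(a:lines[i], '^[^;]*;')
    if key ==# ''
      " No ';' on this line, so there is no key to compare.
      continue
    endif
    let prev = i > 0 ? matchstr(a:lines[i - 1], '^[^;]*;') : ''
    let next = i < last ? matchstr(a:lines[i + 1], '^[^;]*;') : ''
    " ==? compares case-insensitively, whatever 'ignorecase' is set to.
    if key ==? prev || key ==? next
      call add(result, a:lines[i])
    endif
  endfor
  return result
endfunction
```

To append the result to the end of the current buffer, something like :call append(line('$'), DuplicateLines(getline(1, '$'))) should do. Like the function above, it only groups adjacent lines, so the file has to be sorted by name first.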

          I've barely tested it, but it shouldn't be too far wrong!

          Grins,

          Ben.




        • Andy Wokula
          Message 4 of 8 , Nov 1, 2007
            Ben Schmidt wrote:
            >> g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/t$
            >>
            >> This regexp search highlights correctly the three first lines because
            >> there is three times sortie.cpp,
            >> it highlights the three lines after because there is three sortie.h
            >> but it does not copy all highlighted at the end of the file.
            >
            > As documented, :g only runs its command on the first line of multi-line matches.
            >
            > You will need to do something a bit more complicated.
            >
            > I would suggest:
            >
            > 1. Clear some register, say "a. :let @a=""
            >
            > 2. Make the :g command append each match to that register using yank:
            >
            > :g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/exe "normal \"Ay//e\<CR>"

            :g/\%(^\1;.*\n\)\@<!\_^\([^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"

            > 3. Move to the end of your buffer and put the register. :$put a
            >
            > Hope this helps!
            >
            > Ben.

            --
            Andy

          • epanda
            Message 5 of 8 , Nov 4, 2007
              Hi Andy,

              the command which works:
              g/\%(^\1;.*\n\)\@<!\_^\([^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"

              Your command works very well, but could I have a detailed explanation?
              I have never used any registers other than the default ones.

              You have solved a big part of my problem. In fact, I would like to run
              sdiff and store the results for files with the same name (but not the
              same checksum).
              Now I have the list sorted by name, and I know, for example, that I
              have to do 2 diffs between 3 files of the same name which have several
              basenames and not the same checksum.

              Thanks
              epanda




            • Andy Wokula
              Message 6 of 8 , Nov 4, 2007
                epanda wrote:
                > Hi Andy,
                >
                > the command which works:
                > g/\%(^\1;.*\n\)\@<!\_^\([^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"
                >
                > Your command works very well, but could I have a detailed explanation?
                > I have never used any registers other than the default ones.

                On the regexp:

                Match any line, create a back reference for the file name part:
                \_^\([^;]*\);.*

                Require one or more following lines with the same file name (depends on
                'ignorecase'):
                \(\n\1;.*\)\+$

                Look back, let the above pattern only match if the file name in the
                previous line is different:
                \%(^\1;.*\n\)\@<!

                This check is needed because the global command tries every line for a
                match and ignores overlaps with a previous match. The backref is OK
                here, since a match for this part is tried last.

                Now the global command marks the first line of each block and, when
                processing that line, sets the cursor on the first column.

                Collecting the matches:

                Yank the whole matched text and add it to register a:
                "Ay//e<Enter>
                :help "A
                :h /<CR>
                :h search-offset

                There is one small problem, we need a linewise yank:
                "AyV//e<Enter>
                :h o_V

                Written as an Ex command:
                :exe "norm! \"AyV//e\r"
                or
                :exe 'norm! "AyV//e'."\<Enter>"
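
For readers who have not used named registers before, the append behaviour of an uppercase register name can be tried without any yanking at all. The following is only a small sketch using setreg() and getreg(); the two strings are placeholders, and it assumes register a holds nothing you want to keep.

```vim
" Assumption: register a is free to use; the two strings are placeholders.
" A lowercase register name overwrites the register. The trailing newline
" makes the contents linewise.
call setreg('a', "first line\n")
" An uppercase register name appends to the same register.
call setreg('A', "second line\n")
" Register a now holds both lines.
echo getreg('a')
```

In the :g command above, "A plays the same role: each block yanked with "AyV//e is added to what register a already holds, and :$put a then pastes everything at once.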

                > You have solved a big part of my problem. In fact, I would like to run
                > sdiff and store the results for files with the same name (but not the
                > same checksum).
                > Now I have the list sorted by name, and I know, for example, that I
                > have to do 2 diffs between 3 files of the same name which have several
                > basenames and not the same checksum.
                >
                > Thanks
                > epanda

                > On 1 nov, 13:50, Andy Wokula <anw...@...> wrote:
                >> Ben Schmidt wrote:
                >>> I would suggest:
                >>> 1. Clear some register, say "a. :let @a=""
                >>> 2. Make the :g command append each match to that register using yank:
                >>> :g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/exe "normal \"Ay//e\<CR>"
                >> :g/\%(^\1;.*\n\)\@<!\_^\([^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"
                >>> 3. Move to the end of your buffer and put the register. :$put a
                >>> Hope this helps!
                >>> Ben.

                Slightly shortened regexp:
                :g/\%(\1;.*\n\)\@<!\(^[^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"

                I'd also prefer :copy alias :t with a range, but it's not possible to
                use
                .,//e copy
                because "e" would be ":edit" here. Maybe someone finds another trick to
                do this ...

                --
                Andy

              • A.Politz
                Message 7 of 8 , Nov 4, 2007
                  Andy Wokula wrote:

                  >
                  >
                  >Slightly shortened regexp:
                  >:g/\%(\1;.*\n\)\@<!\(^[^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"
                  >
                  >I'd also prefer :copy alias :t with a range, but it's not possible to
                  >use
                  > .,//e copy
                  >because "e" would be ":edit" here. Maybe someone finds another trick to
                  >do this ...
                  >
                  >
                  >
                  It is possible to use a '\zs' item in the pattern without
                  disturbing :g. I also used the very-magic flag '\v' to avoid
                  leaning-toothpick syndrome.

                  g/\v%(\1;.*\n)@<!(^[^;]*);.*(\n\1;.*)+$\zs/ .,//t$
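
The effect of '\zs' on where a match is reported to start can be checked on a plain string with the built-in match functions. This is just a small sketch; the string 'abcdef' is an arbitrary sample, and it says nothing about how :g itself picks its lines.

```vim
" The sample string 'abcdef' is arbitrary.
" Without \zs the match starts at index 0; with \zs the reported start
" moves to where 'def' begins.
echo match('abcdef', 'abcdef')
echo match('abcdef', 'abc\zsdef')
" Only the text after \zs is returned as the matched string.
echo matchstr('abcdef', 'abc\zsdef')
```

The text before '\zs' still has to be present for the pattern to match at all; it is just left out of what counts as the match.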

                  -ap



                  --
                  I dreamed the war was over.


                • Andy Wokula
                  Message 8 of 8 , Nov 4, 2007
                    A.Politz wrote:
                    > Andy Wokula wrote:
                    >
                    >>
                    >> Slightly shortened regexp:
                    >> :g/\%(\1;.*\n\)\@<!\(^[^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"
                    >>
                    >> I'd also prefer :copy alias :t with a range, but it's not possible to
                    >> use
                    >> .,//e copy
                    >> because "e" would be ":edit" here. Maybe someone finds another trick to
                    >> do this ...
                    >>
                    > It is possible to use a '\zs' item in the pattern, without
                    > disturbing g. I also used the very-magic flag '\v' to avoid
                    > leaning-toothstick-sickness.
                    >
                    > g/\v%(\1;.*\n)@<!(^[^;]*);.*(\n\1;.*)+$\zs/ .,//t$
                    >
                    > -ap

                    Exactly what I was looking for.

                    BTW: what terms are advisable to distinguish

                    "start of the match"
                           v
                    /abc\zsdef
                     ^
                    from "start of the match"?

                    --
                    Andy
