Re: regexp to identify N duplicates parts into several lines

  • Ben Schmidt
    Message 1 of 8 , Nov 1 3:52 AM
      > Hi,
      >
      > I have a file formatted as CSV, and the first column holds the file
      > names.
      >
      > I would like to copy all lines which have the same file name to the
      > end of the file using a regexp.
      >
      > g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/t$
      >
      > This regexp search correctly highlights the first three lines because
      > sortie.cpp occurs three times,
      > and it highlights the next three lines because sortie.h occurs three times,
      > but it does not copy all the highlighted lines to the end of the file.
      >
      > It copies only the first occurrence, and I would like to copy all occurrences.
      >
      > Why?

      Actually, I don't think my previous suggestion will work when there are more
      than two duplicate lines, as :g will run the command on each line of a group
      except the last... hmmm...

      This would be a problem with your original command, too, though.

      You will probably want to consider a more complicated solution again, like
      scripting the whole thing, which would be fairly easy: e.g.:

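      function! GetDuplicates()
        " Assumes the buffer is sorted so that equal file names are adjacent.
        let lastline = line('$')   " fixed up front: copies appended below are not rescanned
        let prevkey = ""
        let curline = 1
        let copiedkey = 0          " 1 once the first line of the current group was copied
        while curline <= lastline
          " The key is the first CSV field, including the trailing ';'.
          let thiskey = matchstr(getline(curline), '^[^;]*;')
          " Skip lines without a key; compare case-sensitively whatever 'ignorecase' is.
          if thiskey != "" && thiskey ==# prevkey
            if copiedkey == 0
              " Second line of a group: copy the group's first line too.
              execute (curline-1)."t$"
              let copiedkey = 1
            endif
            execute curline."t$"
          else
            let copiedkey = 0
          endif
          let curline = curline + 1
          let prevkey = thiskey
        endwhile
      endfunction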

      I've barely tested it, but it shouldn't be too far wrong!
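
For comparison, here is a rough sketch of the same idea that works on a list of lines rather than on the buffer, so it does not need the input to be sorted. This is not the function above: the name CollectDuplicateRows is made up for illustration, and it assumes the key should be compared case-sensitively.

```vim
" Hypothetical helper: return every row whose first ';'-terminated field
" occurs on more than one row, in the original order.
function! CollectDuplicateRows(lines) abort
  " First pass: count how often each key (file name plus ';') occurs.
  let counts = {}
  for text in a:lines
    let key = matchstr(text, '^[^;]*;')
    if key !=# ''
      let counts[key] = get(counts, key, 0) + 1
    endif
  endfor
  " Second pass: keep the rows whose key was seen at least twice.
  let result = []
  for text in a:lines
    let key = matchstr(text, '^[^;]*;')
    if key !=# '' && counts[key] > 1
      call add(result, text)
    endif
  endfor
  return result
endfunction
```

To append the result to the current buffer, something like `:call append('$', CollectDuplicateRows(getline(1, '$')))` should do.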

      Grins,

      Ben.






      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_use" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---
    • Andy Wokula
      Message 2 of 8 , Nov 1 5:50 AM
        Ben Schmidt wrote:
        >> g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/t$
        >>
        >> This regexp search highlights correctly the three first lines because
        >> there is three times sortie.cpp,
        >> it highlights the three lines after because there is three sortie.h
        >> but it does not copy all highlighted at the end of the file.
        >
        > As documented, :g only runs its command on the first line of multi-line matches.
        >
        > You will need to do something a bit more complicated.
        >
        > I would suggest:
        >
        > 1. Clear some register, say "a. :let @a=""
        >
        > 2. Make the :g command append each match to that register using yank:
        >
        > :g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/exe "normal \"Ay//e\<CR>"

        :g/\%(^\1;.*\n\)\@<!\_^\([^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"

        > 3. Move to the end of your buffer and put the register. :$put a
        >
        > Hope this helps!
        >
        > Ben.

        --
        Andy

      • epanda
        Message 3 of 8 , Nov 4 1:39 AM
          Hi Andy,

          the command which works: g/\%(^\1;.*\n\)\@<!\_^\([^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"

          Your command works very well, but could I have a detailed explanation,
          because I have never used registers other than the default ones?

          You have solved a big part of my problem. In fact, I would like to run
          sdiff and store the results for files with the same name (but not the
          same checksum).
          Now I have the list sorted by name, and I know, for example, that I have
          to do 2 diffs between 3 files of the same name which have several
          basenames and not the same checksum.

          Thanks
          epanda




          On 1 nov, 13:50, Andy Wokula <anw...@...> wrote:
          > Ben Schmidt wrote:
          >
          >
          >
          > >> g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/t$
          >
          > >> This regexp search highlights correctly the three first lines because
          > >> there is three times sortie.cpp,
          > >> it highlights the three lines after because there is three sortie.h
          > >> but it does not copy all highlighted at the end of the file.
          >
          > > As documented, :g only runs its command on the first line of multi-line matches.
          >
          > > You will need to do something a bit more complicated.
          >
          > > I would suggest:
          >
          > > 1. Clear some register, say "a. :let @a=""
          >
          > > 2. Make the :g command append each match to that register using yank:
          >
          > > :g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/exe "normal \"Ay//e\<CR>"
          >
          > :g/\%(^\1;.*\n\)\@<!\_^\([^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"
          >
          > > 3. Move to the end of your buffer and put the register. :$put a
          >
          > > Hope this helps!
          >
          > > Ben.
          >
          > --
          > Andy


        • Andy Wokula
          Message 4 of 8 , Nov 4 5:00 AM
            epanda wrote:
            > Hi Andy,
            >
            > the command which works: g/\%(^\1;.*\n\)\@<!\_^\([^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"
            >
            > Your command works very well but could I have a detailed explanation
            > cause I have never used registers out of default registers ?

            On the regexp:

            Match any line and create a back reference for the file name part:
                \_^\([^;]*\);.*

            Require one or more following lines with the same file name (whether
            case matters depends on 'ignorecase'):
                \(\n\1;.*\)\+$

            Look back, and let the above pattern match only if the file name in
            the previous line is different:
                \%(^\1;.*\n\)\@<!

            This check is needed because the global command tries every line for a
            match and ignores overlaps with a previous match. The backref is OK
            here, because a match for this part is tried last.

            Now the global command marks the first line of each block and, when
            processing that line, sets the cursor on the first column.
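
To make "first line of each block" concrete, here is a small sketch that computes those line numbers from a list of lines without any regexp look-behind. It is only an illustration of the intent, not how :g works internally; the function name BlockStarts is invented, and unlike the pattern it always compares file names case-sensitively.

```vim
" Hypothetical helper: return the 1-based line numbers that start a run of
" two or more consecutive rows sharing the same ';'-terminated first field.
function! BlockStarts(lines) abort
  let starts = []
  let prevkey = ''
  let runlen = 0
  let idx = 0
  for text in a:lines
    let idx += 1
    let key = matchstr(text, '^[^;]*;')
    if key !=# '' && key ==# prevkey
      let runlen += 1
      " Report the block once, when its second row is seen.
      if runlen == 2
        call add(starts, idx - 1)
      endif
    else
      let runlen = 1
    endif
    let prevkey = key
  endfor
  return starts
endfunction
```

For rows named sortie.cpp, sortie.cpp, sortie.cpp, sortie.h, sortie.h, main.c this returns [1, 4], which are the lines the look-behind is meant to leave marked.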

            Collecting the matches:

            Yank the whole matched text and add it to register a:
            "Ay//e<Enter>
            :help "A
            :h /<CR>
            :h search-offset

            There is one small problem, we need a linewise yank:
            "AyV//e<Enter>
            :h o_V

            Written as an Ex command:
            :exe "norm! \"AyV//e\r"
            or
            :exe 'norm! "AyV//e'."\<Enter>"
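
Since the question was about named registers, here is a minimal sketch of how "a and "A differ. It uses :yank on fixed line numbers in a scratch buffer instead of the normal-mode yank from the command above, and the three sample lines are placeholders.

```vim
" Scratch buffer with three throwaway lines (illustrative data only).
new
setlocal buftype=nofile
call setline(1, ['one', 'two', 'three'])

" Lowercase name: the yank replaces whatever register a held before.
1yank a
" Uppercase name: the yank is appended to register a instead.
3yank A

" Register a now holds both lines, linewise; put them after the last line.
$put a
```

Afterwards `:echo getreg('a')` should show the two yanked lines, and the buffer ends with copies of "one" and "three".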

            > You have resolved a big part of my problem, in fact I would like to do
            > sdiff and store the results of files of the same name (not the same
            > checksum).
            > Now I have the list sorted by name and I know for example that I have
            > to do 2 diff between 3 files of the same name which have several
            > basename and not the same checksum.
            >
            > Thanks
            > epanda

            > On 1 nov, 13:50, Andy Wokula <anw...@...> wrote:

            As already quoted:
            >> Ben Schmidt wrote:
            >>> I would suggest:
            >>> 1. Clear some register, say "a. :let @a=""
            >>> 2. Make the :g command append each match to that register using yank:
            >>> :g/\(^[^/]\+;\)\(.*\n\)\(\(^\1\)\(.*\n\)\)\{1,\}/exe "normal \"Ay//e\<CR>"
            >> :g/\%(^\1;.*\n\)\@<!\_^\([^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"
            >>> 3. Move to the end of your buffer and put the register. :$put a
            >>> Hope this helps!
            >>> Ben.

            Slightly shortened regexp:
            :g/\%(\1;.*\n\)\@<!\(^[^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"

            I'd also prefer to use :copy (alias :t) with a range, but it's not
            possible to use
            .,//e copy
            because the "e" would be parsed as ":edit" here. Maybe someone can find
            another trick to do this ...

            --
            Andy

          • A.Politz
            Message 5 of 8 , Nov 4 6:11 AM
              Andy Wokula wrote:

              >
              >
              >Slightly shortened regexp:
              >:g/\%(\1;.*\n\)\@<!\(^[^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"
              >
              >I'd also prefer :copy alias :t with a range, but it's not possible to
              >use
              > .,//e copy
              >because "e" would be ":edit" here. Maybe someone finds another trick to
              >do this ...
              >
              >
              >
              It is possible to use a '\zs' item in the pattern without
              disturbing :g. I also used the very-magic flag '\v' to avoid
              leaning toothpick syndrome.

              g/\v%(\1;.*\n)@<!(^[^;]*);.*(\n\1;.*)+$\zs/ .,//t$
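
A small sketch of what '\v' and '\zs' do, using string functions on a single made-up row rather than the multi-line pattern above (the checksum value is invented):

```vim
" Illustrative row; the checksum value is made up.
let row = 'sortie.cpp;1a2b'

" Same pattern in default ("magic") and very-magic spelling.
let magic     = matchlist(row, '^\([^;]*\);\(.*\)$')
let verymagic = matchlist(row, '\v^([^;]*);(.*)$')

" Everything before \zs must match but is left out of the reported match.
let tail      = matchstr(row, '^[^;]*;\zs.*')
let tailstart = match(row, '^[^;]*;\zs.*')

echo magic[1] verymagic[1] tail tailstart
```

Both spellings capture the same groups, and with '\zs' the reported match starts after the semicolon even though the pattern had to match from the start of the row. That is what lets `.,//t$` use the end of the block as the second address.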

              -ap



              --
              I dreamed the war was over.


            • Andy Wokula
              Message 6 of 8 , Nov 4 6:35 AM
                A.Politz wrote:
                > Andy Wokula wrote:
                >
                >>
                >> Slightly shortened regexp:
                >> :g/\%(\1;.*\n\)\@<!\(^[^;]*\);.*\(\n\1;.*\)\+$/exe "norm! \"AyV//e\r"
                >>
                >> I'd also prefer :copy alias :t with a range, but it's not possible to
                >> use
                >> .,//e copy
                >> because "e" would be ":edit" here. Maybe someone finds another trick to
                >> do this ...
                >>
                > It is possible to use a '\zs' item in the pattern, without
                > disturbing g. I also used the very-magic flag '\v' to avoid
                > leaning-toothstick-sickness.
                >
                > g/\v%(\1;.*\n)@<!(^[^;]*);.*(\n\1;.*)+$\zs/ .,//t$
                >
                > -ap

                Exactly what I was looking for.

                BTW: what terms are advisable to distinguish

                "start of the match"
                       v
                /abc\zsdef
                 ^
                from "start of the match" ?

                --
                Andy
