Re: save only duplicate lines

  • Tim Chase
    Message 1 of 5 , Apr 8, 2013
      On 2013-04-08 08:13, Bee wrote:
      > I have two address lists, one address per line.
      >
      > One list is the combination of the two with some cleanup.
      >
      > The other is a small list of special addresses
      > that has not been cleaned for a long time.
      >
      > I would like to save only the addresses that occur in both.
      > ie, save only one copy of duplicate lines.

      Is the case normalized in both of them, or does it need to be case
      insensitive? Your two examples ("sort iu" vs. "s/^\(.*\n\)\1/.../")
      conflict in how they handle case sensitivity. In either case, I'd be
      tempted to use a "decorate/process/undecorate" pattern, performing
      some transformation on each duplicated line. If case is
      significant, you could do something like

      :set sw=4 ts=4 noet
      :%sort i " sort like lines together
      :g/^\(.*\n\)\1/> " mark duplicates by indenting them one tab
      :v/^\t/d " delete the non-indented lines
      :%< " unmark the lines of interest
      :sort iu " optionally remove duplicates if there
      " were more than 2 duplicated entries

      If you need case insensitivity, you can do something like this for
      the 3rd command above:

      :1,$-1g/^/if toupper(getline('.'))==toupper(getline(line('.')+1))|sil!> |endif

      You just need something that is unique to the lines you want to save.

      -tim
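
The "sort, then compare each line with the next one" idea can also be sketched outside the buffer, on a plain List. This is only an illustration under assumptions: the function name DuplicatedLines is hypothetical (it does not appear in the thread), it assumes a Vim recent enough that sort() accepts 'i', and it skips empty lines.

```vim
" Hypothetical helper, not from the thread: return one copy of every
" non-empty line that occurs more than once in a:lines, ignoring case.
function! DuplicatedLines(lines) abort
  " Drop empty lines, then sort a copy while ignoring case (like :sort i).
  let sorted = sort(filter(copy(a:lines), 'v:val !=# ""'), 'i')
  let result = []
  let i = 0
  while i < len(sorted) - 1
    " ==? compares strings ignoring case, whatever 'ignorecase' is set to.
    if sorted[i] ==? sorted[i + 1]
      " Keep the first line of the run, then skip the rest of the run.
      call add(result, sorted[i])
      let j = i + 1
      while j < len(sorted) && sorted[j] ==? sorted[i]
        let j += 1
      endwhile
      let i = j
    else
      let i += 1
    endif
  endwhile
  return result
endfunction

" Example call on the current buffer:
" :echo DuplicatedLines(getline(1, '$'))
```

Skipping the whole run in the inner loop plays the role of the optional final ":sort iu": each duplicated address is returned once, however many copies there were.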





      --
      --
      You received this message from the "vim_use" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php

      ---
      You received this message because you are subscribed to the Google Groups "vim_use" group.
      To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscribe@....
      For more options, visit https://groups.google.com/groups/opt_out.
    • Bee
      Message 2 of 5 , Apr 8, 2013
        On Apr 8, 12:01 pm, Tim Chase<v...@...> wrote:
        > Is the case normalized in both of them, or does it need to be case
        > insensitive?

        Yes, all files have been changed to lowercase

        " select all and lowercase
        ggVGu

        Bill

      • Bee
        Message 3 of 5 , Apr 10, 2013
          Tim, thank you for the ideas.
          I had forgotten about marking and deleting.

          This is what I have working.
          Using :%s is more than twice as fast as :g.
          There is also no need for the :sort iu at the end, because
          \1\+ collapses each run of duplicates to a single line.

          function! KeepOneDup()
            let start = reltime()
            sort i
            %s/\(^.\+\n\)\1\+/@\1/ " mark and remove all but one dup
            v/^@/d " delete un-marked
            %s/^@// " remove mark
            echo reltimestr(reltime(start))
          endfun " :call KeepOneDup()<cr>

          Bill
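
For comparison, here is a rough sketch of an alternative that avoids sorting and marking by counting lines in a Dictionary. Nobody in the thread proposed it, and the name DuplicatedLinesUnsorted is hypothetical. It ignores case by lowercasing the keys, skips empty lines (much as ".\+" does in the substitute above), and returns the kept lines in their original order. No timing claim is made for it.

```vim
" Hypothetical alternative, not from the thread: keep one copy of every
" non-empty line that occurs more than once, in original order.
function! DuplicatedLinesUnsorted(lines) abort
  " First pass: count occurrences, using the lowercased line as the key.
  let counts = {}
  for item in a:lines
    if item ==# ''
      continue
    endif
    let key = tolower(item)
    let counts[key] = get(counts, key, 0) + 1
  endfor
  " Second pass: keep the first copy of each line seen more than once.
  let result = []
  for item in a:lines
    let key = tolower(item)
    if get(counts, key, 0) > 1
      call add(result, item)
      " Zero the count so later copies of this line are skipped.
      let counts[key] = 0
    endif
  endfor
  return result
endfunction

" Possible use on the current buffer (read the lines before deleting them):
" :let kept = DuplicatedLinesUnsorted(getline(1, '$'))
" :%delete _
" :call setline(1, kept)
```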
