save only duplicate lines

  • Bee
    Message 1 of 5 , Apr 8, 2013
      I have two address lists, one address per line.

      One list is the combination of the two with some cleanup.

      The other is a small list of special addresses
      that has not been cleaned for a long time.

      I would like to save only the addresses that occur in both,
      i.e., save only one copy of each duplicated line.

      How would that be done?

      Bill

      --
      --
      You received this message from the "vim_use" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php

      ---
      You received this message because you are subscribed to the Google Groups "vim_use" group.
      To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscribe@....
      For more options, visit https://groups.google.com/groups/opt_out.
    • Bee
      Message 2 of 5 , Apr 8, 2013
        What I would like is something like ":sort iu"
        but the opposite, to keep only one copy of the duplicate lines.

        " -------1---------2---------3---------4---------5---------6----
        This seems to work, but is not elegant:

        " clear the a register
        qaq

        " search and copy all duplicate lines
        :g/\(^.\+\n\)\1/y A

        " open a new buffer, paste contents of register a
        "aP

        " sort removing duplicate lines
        :sort iu
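
The `\(^.\+\n\)\1` pattern in the `:g` step above only sees duplicates that sit on adjacent lines, so the buffer needs to be sorted before that step. As a rough sketch of the same idea applied to a List instead of a buffer (the function name `AdjacentDups` is made up for illustration, and the comparison is assumed to be case-sensitive):

```vim
" Hypothetical helper, not from the thread: return one copy of each
" non-empty line that occurs more than once in a:lines.
function! AdjacentDups(lines) abort
  " Sort a copy so that equal lines end up next to each other.
  let l:sorted = sort(copy(a:lines))
  let l:dups = []
  let l:i = 1
  while l:i < len(l:sorted)
    let l:line = l:sorted[l:i]
    if l:line !=# '' && l:line ==# l:sorted[l:i - 1]
      " Record a duplicated line only once, however often it repeats.
      if empty(l:dups) || l:dups[-1] !=# l:line
        call add(l:dups, l:line)
      endif
    endif
    let l:i += 1
  endwhile
  return l:dups
endfunction
```

Something like `:echo AdjacentDups(getline(1, '$'))` would then list the duplicated lines of the current buffer without modifying it.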

        " -------1---------2---------3---------4---------5---------6----

      • Tim Chase
        Message 3 of 5 , Apr 8, 2013
          On 2013-04-08 08:13, Bee wrote:
          > I have two address lists, one address per line.
          >
          > One list is the combination of the two with some cleanup.
          >
          > The other is a small list of special addresses
          > that has not been cleaned for a long time.
          >
          > I would like to save only the addresses that occur in both.
          > ie, save only one copy of duplicate lines.

          Is the case normalized in both of them, or does it need to be case
          insensitive? Your two examples ("sort iu" vs. "s/^\(.*\n\)\1/.../")
          conflict in how they handle case sensitivity. In either case, I'd be
          tempted to use a "decorate/process/undecorate" pattern, performing
          some transformation on each duplicated line. If case is
          significant, you could do something like

          :set sw=4 ts=4 noet
          " sort like lines together
          :%sort i
          " mark duplicates by indenting them one tab
          :g/^\(.*\n\)\1/>
          " delete the non-indented lines
          :v/^\t/d
          " unmark the lines of interest
          :%s/^\t//
          " optionally remove duplicates if there
          " were more than 2 duplicated entries
          :sort iu

          If you need case insensitivity, you can do something like this for
          the 3rd command above:

          :1,$-1g/^/if toupper(getline('.'))==toupper(getline(line('.')+1))|sil!> |endif

          You just need something that is unique to the lines you want to save.

          -tim
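
If marking lines in the buffer feels fragile, counting occurrences in a Dictionary is another option. This swaps the decorate/process/undecorate pattern for a plain lookup table and is not what the thread used. A minimal sketch, assuming case-insensitive matching is wanted; the name `DupsIgnoreCase` is hypothetical:

```vim
" Hypothetical helper, not from the thread: return the first-seen
" spelling of every non-empty line that occurs more than once in
" a:lines when case is ignored. Results are in the order in which
" each second occurrence is found.
function! DupsIgnoreCase(lines) abort
  let l:first = {}
  let l:seen = {}
  let l:result = []
  for l:line in a:lines
    if l:line ==# ''
      continue
    endif
    " Compare on an upper-cased key, as in the toupper() test above.
    let l:key = toupper(l:line)
    if !has_key(l:first, l:key)
      let l:first[l:key] = l:line
      let l:seen[l:key] = 1
    else
      let l:seen[l:key] += 1
      " Emit the line exactly once, when its second copy shows up.
      if l:seen[l:key] == 2
        call add(l:result, l:first[l:key])
      endif
    endif
  endfor
  return l:result
endfunction
```

No sorting is needed here, because the Dictionary finds repeats wherever they occur in the list.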

        • Bee
          Message 4 of 5 , Apr 8, 2013
            On Apr 8, 12:01 pm, Tim Chase<v...@...> wrote:
            > Is the case normalized in both of them, or does it need to be case
            > insensitive?

            Yes, all files have been changed to lowercase

            " select all and lowercase
            ggVGu

            Bill

          • Bee
            Message 5 of 5 , Apr 10, 2013
              Tim, thank you for the ideas.
              I had forgotten about marking and deleting.

              This is what I have working.
              Using :%s is more than twice as fast as :g.
              There is also no need for the :sort iu at the end,
              because \1\+ collapses each run of duplicates to a single line.

              " Usage: :call KeepOneDup()
              function! KeepOneDup()
                let start = reltime()
                " sort so that duplicates become adjacent
                sort i
                " mark the first line of each run of duplicates and drop the rest
                %s/\(^.\+\n\)\1\+/@\1/
                " delete un-marked lines
                v/^@/d
                " remove the mark
                %s/^@//
                echo reltimestr(reltime(start))
              endfunction

              Bill
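
One caveat with working on the merged buffer: a line that is repeated inside a single list also counts as a duplicate. If only lines present in both lists should survive, the two lists can be compared directly. A sketch under assumptions: `LinesInBoth` and the file names in the usage line are hypothetical, and matching is case-sensitive because the lists were already lowercased.

```vim
" Hypothetical helper, not from the thread: return a sorted List of
" the non-empty lines that appear in both a:combined and a:special,
" one copy each.
function! LinesInBoth(combined, special) abort
  " Index the first list for quick lookup.
  let l:in_combined = {}
  for l:line in a:combined
    if l:line !=# ''
      let l:in_combined[l:line] = 1
    endif
  endfor
  " Keep each line of the second list that is also in the first.
  " Using a Dictionary here drops repeats within the second list.
  let l:both = {}
  for l:line in a:special
    if has_key(l:in_combined, l:line)
      let l:both[l:line] = 1
    endif
  endfor
  return sort(keys(l:both))
endfunction
```

In an empty buffer, `:call setline(1, LinesInBoth(readfile('combined.txt'), readfile('special.txt')))` would write the result, with the two file names standing in for the real lists.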
