RE: sort & remove duplicate lines?

  • Michael Geddes
    Message 1 of 22, Feb 1, 2004
      For sort use the Sort function as mentioned by other people... there are
      some quick versions available.

      I've just added a tip for deleting duplicates based on what was going to
      be my reply :)

      http://vim.sourceforge.net/tips/tip.php?tip_id=648

      //.ichael G

      -----Original Message-----
      From: Tim Musson [mailto:Tim@...]
      Sent: Monday, 2 February 2004 12:58 PM
      To: vim@...
      Subject: sort & remove duplicate lines?


      vim,

      I use Vim for everything, but find myself going to TextPad for one
      or 2 things. I was wondering if there is a Vim equivalent?

      The first one and the one I need most often is "sort & remove
      duplicate lines". I know I can sort easily with ":n,n!sort", but I
      don't know how to remove dups. Can anyone point me in the right
      direction?

      The other thing I do is repetitive processing (called Macros in
      TextPad). For example, I have a list of files in my http/pix
      directory. So I list them into the file;
      img1.png
      img2.png
      etc...
      Next I make it an HTML document displaying each image and its name.
      <img src=img1.png width=99>img1.png<br>
      <img src=img2.png width=99>img2.png<br>
      (I don't do this any more, but it was a simple example)
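The transformation described above (every file name becomes an image tag followed by the name) can be sketched as follows. This is only an illustration in Python; the function name `image_lines` and the `width` parameter are made up for the example. Inside Vim, a substitute along the lines of `:%s/.*/<img src=& width=99>&<br>/` should give the same result, since `&` in the replacement stands for the whole matched line.

```python
def image_lines(filenames, width=99):
    """Turn each file name into an <img> tag followed by the name itself."""
    return [f"<img src={name} width={width}>{name}<br>" for name in filenames]


if __name__ == "__main__":
    # Hypothetical listing of the http/pix directory.
    for line in image_lines(["img1.png", "img2.png"]):
        print(line)
```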

      Thanks in advance for any pointers!

      --
      Tim Musson
      Flying with The Bat! eMail v2.01.3
      Windows 2000 5.0.2195 (Service Pack 3)
      Printed on 100% recycled electrons.
    • Luc Hermitte
      Message 2 of 22, Feb 2, 2004
        * On Mon, Feb 02, 2004 at 03:46:44PM +1100, Michael Geddes <mgeddes@...> wrote:
        >
        > I've just added a tip for deleting duplicates based on what was going
        > to be my reply :)
        > http://vim.sourceforge.net/tips/tip.php?tip_id=648

        Some time ago, there was a discussion here on this topic.
        The result is available at
        http://hermitte.free.fr/vim/ressources/vimfiles/plugin/system_utils.vim

        (Basically the same code as your tip. It also adds a command "around"
        that works on visual ranges.)

        --
        Luc Hermitte
        http://hermitte.free.fr/vim/
      • Alan G Isaac
        Message 3 of 22, Feb 2, 2004
          On Mon, 2 Feb 2004, "Antoine J. Mechelynck" apparently wrote:
          > Unix has a filter called "uniq" to remove duplicate lines, so you may do
          > (outside of Vim)
          > sort < inputfilename | uniq > outputfilename
          > If Windows has anything similar to that "uniq" filter, I'm not aware of it


          http://www.weihenstephan.de/~syring/win32/UnxUtils.html

          fwiw,
          Alan Isaac
        • Antoine J. Mechelynck
          Message 4 of 22, Feb 2, 2004
            Alan G Isaac <aisaac@...> wrote:
            > On Mon, 2 Feb 2004, "Antoine J. Mechelynck" apparently wrote:
            > > Unix has a filter called "uniq" to remove duplicate lines, so you
            > > may do (outside of Vim)
            > > sort < inputfilename | uniq > outputfilename
            > > If Windows has anything similar to that "uniq" filter, I'm not
            > > aware of it
            >
            >
            > http://www.weihenstephan.de/~syring/win32/UnxUtils.html
            >
            > fwiw,
            > Alan Isaac

            Yeah, sure. Here's what they tell me:



            Forbidden
            You don't have permission to access /~syring/win32/UnxUtils.html on this
            server.

            Apache/1.3.26 Server at www.weihenstephan.de Port 80



            Regards,
            Tony.
          • A. S. Budden
            Message 5 of 22, Feb 2, 2004
              Thus spake Antoine J. Mechelynck:
              > Alan G Isaac <aisaac@...> wrote:
              > > On Mon, 2 Feb 2004, "Antoine J. Mechelynck" apparently wrote:
              > > > Unix has a filter called "uniq" to remove duplicate lines, so you
              > > > may do (outside of Vim)
              > > > sort < inputfilename | uniq > outputfilename
              > > > If Windows has anything similar to that "uniq" filter, I'm not
              > > > aware of it
              > >
              > >
              > > http://www.weihenstephan.de/~syring/win32/UnxUtils.html
              > >
              > > fwiw,
              > > Alan Isaac
              >
              > Yeah, sure. Here's what they tell me:
              >
              >
              >
              > Forbidden
              > You don't have permission to access /~syring/win32/UnxUtils.html on this
              > server.
              >
              > Apache/1.3.26 Server at www.weihenstephan.de Port 80

              Try unxutils.sourceforge.net

              Al
            • Steve Hall
              Message 6 of 22, Feb 2, 2004
                From: "Antoine J. Mechelynck", Feb 2, 2004 10:51 AM
                > Alan G Isaac <aisaac@...> wrote:
                > > On Mon, 2 Feb 2004, "Antoine J. Mechelynck" apparently wrote:
                > > >
                > > > If Windows has anything similar to that "uniq" filter, I'm not
                > > > aware of it
                > >
                > > http://www.weihenstephan.de/~syring/win32/UnxUtils.html
                >
                > You don't have permission to access /~syring/win32/UnxUtils.html on
                > this server.

                The main site is:

                http://unxutils.sourceforge.net


                --
                Steve Hall [ digitect@... ]
              • Will Fiveash
                Message 7 of 22, Feb 2, 2004
                  On Sun, Feb 01, 2004 at 09:13:31PM -0600, Elliott Hoel wrote:
                  > On Sun, Feb 01, 2004 at 08:57:47PM -0500, Tim Musson wrote:
                  > > I use Vim for everything, but find myself going to TextPad for one
                  > > or 2 things. I was wondering if there is a Vim equivalent?
                  > >
                  > > The first one and the one I need most often is "sort & remove
                  > > duplicate lines". I know I can sort easily with ":n,n!sort", but I
                  > > don't know how to remove dups. Can anyone point me in the right
                  > > direction?
                  >
                  > In Unix (I don't know about anything else) sort has the option -u. This
                  > sorts and removes duplicate lines (man sort): :%!sort -u. Also see the
                  > alternative command uniq (man uniq), which doesn't sort but removes
                  > adjacent duplicate lines.

                  Another option, where nawk is available, is:

                  :%!nawk '!seen[$0]++'

                  which will remove duplicate lines and doesn't require sorting or
                  adjacency.
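The idea behind the awk one-liner above (keep a line only the first time it is seen) can be sketched as follows. This is an illustrative Python version, not the awk code itself; the function name `unique_in_order` is made up for the example.

```python
def unique_in_order(lines):
    """Keep the first occurrence of every line, preserving the original order."""
    seen = set()
    result = []
    for line in lines:
        if line not in seen:   # same test as awk's !seen[$0]++
            seen.add(line)
            result.append(line)
    return result


if __name__ == "__main__":
    print(unique_in_order(["b", "a", "b", "c", "a"]))
```

Unlike `sort -u`, this leaves the surviving lines in their original order, and unlike `uniq`, the duplicates do not have to be adjacent.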

                  --
                  Will Fiveash
                  Sun Microsystems Inc.
                  Austin, TX, USA (TZ=CST6CDT)
                  GPG PubKey ID:0x7D31DC39, Key server: www.keyserver.net
                • Alejandro Lopez-Valencia
                  Message 8 of 22, Feb 2, 2004
                    Steve Hall scribbled on Monday, February 02, 2004 11:04 AM:

                    > From: "Antoine J. Mechelynck", Feb 2, 2004 10:51 AM
                    >> Alan G Isaac <aisaac@...> wrote:
                    >>> On Mon, 2 Feb 2004, "Antoine J. Mechelynck" apparently wrote:
                    >>>>
                    >>>> If Windows has anything similar to that "uniq" filter, I'm not
                    >>>> aware of it
                    >>>
                    >>> http://www.weihenstephan.de/~syring/win32/UnxUtils.html
                    >>
                    >> You don't have permission to access /~syring/win32/UnxUtils.html on
                    >> this server.
                    >
                    > The main site is:
                    >
                    > http://unxutils.sourceforge.net

                    I rather like the binaries distributed at http://gnuwin32.sourceforge.net/,
                    usually compiled with the latest versions of Mingw32 gcc and the w32api
                    runtime, BTW.
                  • Antoine J. Mechelynck
                    Message 9 of 22, Feb 2, 2004
                      A. S. Budden <vim.mail@...> wrote:
                      > Thus spake Antoine J. Mechelynck:
                      [...]
                      > > Forbidden
                      > > You don't have permission to access /~syring/win32/UnxUtils.html on
                      > > this server.
                      > >
                      > > Apache/1.3.26 Server at www.weihenstephan.de Port 80
                      >
                      > Try unxutils.sourceforge.net
                      >
                      > Al

                      Thx. The above works when used directly (but note that "project unxutils" on
                      SourceForge has existed since June 2000 but has no description, no released
                      files, and 2 forums with 3 "welcome" or null messages, 6 questions, 1
                      answer.)

                      Regards,
                      Tony.
                    • Antoine J. Mechelynck
                      Message 10 of 22, Feb 2, 2004
                        Steve Hall <digitect@...> wrote:
                        > From: "Antoine J. Mechelynck", Feb 2, 2004 10:51 AM
                        > > Alan G Isaac <aisaac@...> wrote:
                        > > > On Mon, 2 Feb 2004, "Antoine J. Mechelynck" apparently wrote:
                        > > > >
                        > > > > If Windows has anything similar to that "uniq" filter, I'm not
                        > > > > aware of it
                        > > >
                        > > > http://www.weihenstephan.de/~syring/win32/UnxUtils.html
                        > >
                        > > You don't have permission to access /~syring/win32/UnxUtils.html on
                        > > this server.
                        >
                        > The main site is:
                        >
                        > http://unxutils.sourceforge.net
                        >
                        >
                        > --
                        > Steve Hall [ digitect@... ]

                        Thanks. (But see also my reply to Alan G. Isaac.)

                        Regards,
                        Tony.
                      • Alan G Isaac
                        Message 11 of 22, Feb 3, 2004
                          On Mon, 2 Feb 2004, Alejandro Lopez-Valencia apparently wrote:
                          > I rather like the binaries distributed at
                          > http://gnuwin32.sourceforge.net/, usually compiled with
                          > the latest versions of Mingw32 gcc and the w32api runtime,
                          > BTW.


                          I have had good experiences with UnxUtils, but this looks great:
                          <URL:http://gnuwin32.sourceforge.net/packages.html>
                          Thanks! It seems like there is some overlap between
                          the two projects.
                          Alan
                        • Benji Fisher
                          Message 12 of 22, Feb 5, 2004
                            On Mon, Feb 02, 2004 at 04:32:58AM +0100, Antoine J. Mechelynck wrote:
                            >
                            > Let's have a try at it (untested)
                            >
                            > command -range=% Uniq <line1>,<line2>call RemoveDuplicates()
                            > function RemoveDuplicates() range
                            >   let i = a:firstline + 1
                            >   let endl = a:lastline
                            >   while i <= endl
                            >     if getline(i) == getline(i-1)
                            >       exe i . "," . i . " delete"
                            >       let endl = endl - 1
                            >     else
                            >       let i = i+1
                            >     endif
                            >   endwhile
                            > endfunction

                            The last time this came up, I think I suggested going bottom-up
                            instead of top-down. That is, use

                            let i = a:lastline | while i > a:firstline | ... | endwhile

                            That way, you never have to change endl. Call it "microefficiency" or
                            "elegance," as you see fit. ;)
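The bottom-up idea can be sketched like this. It is an illustration in Python on a plain list, not Vim script; the function name and the 0-based `first`/`last` arguments are assumptions standing in for a:firstline and a:lastline.

```python
def remove_adjacent_duplicates(lines, first, last):
    """Delete lines equal to their predecessor within lines[first..last].

    Walking from the bottom up means a deletion never shifts the lines that
    are still to be examined, so the end of the range needs no adjusting.
    """
    i = last
    while i > first:
        if lines[i] == lines[i - 1]:
            del lines[i]
        i -= 1
    return lines


if __name__ == "__main__":
    print(remove_adjacent_duplicates(["a", "a", "a", "b", "b", "c"], 0, 5))
```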

                            --Benji Fisher
                          • David Fishburn
                            Message 13 of 22, Feb 5, 2004
                              I have this in my vimrc file from a previous discussion on this mailing
                              list.

                              " Remove duplicate lines (assuming they follow each other).
                              " Courtesy of Preben 'Peppe' Guldberg, Piet Delport
                              " Visually select a range of rows and type :Uniq
                              command! -range=% Uniq <line1>,<line2>g/^\%<<line2>l\(.*\)\n\1$/d


                              Dave

                              > -----Original Message-----
                              > From: Benji Fisher [mailto:benji@...]
                              > Sent: Thursday, February 05, 2004 1:50 PM
                              > To: vim@...
                              > Subject: Re: sort & remove duplicate lines?
                              >
                              >
                              > On Mon, Feb 02, 2004 at 04:32:58AM +0100, Antoine J. Mechelynck wrote:
                              > >
                              > > Let's have a try at it (untested)
                              > >
                              > > command -range=% Uniq <line1>,<line2>call RemoveDuplicates()
                              > > function RemoveDuplicates() range
                              > >   let i = a:firstline + 1
                              > >   let endl = a:lastline
                              > >   while i <= endl
                              > >     if getline(i) == getline(i-1)
                              > >       exe i . "," . i . " delete"
                              > >       let endl = endl - 1
                              > >     else
                              > >       let i = i+1
                              > >     endif
                              > >   endwhile
                              > > endfunction
                              >
                              > The last time this came up, I think I suggested going
                              > bottom-up instead of top-down. That is, use
                              >
                              > let i = a:lastline | while i > a:firstline | ... | endwhile
                              >
                              > That way, you never have to change endl. Call it
                              > "microefficiency" or "elegance," as you see fit. ;)
                              >
                              > --Benji Fisher
                              >
                            • Paul Brinkley
                              Message 14 of 22, Feb 10, 2004
                                [I searched all over the web for an answer to this problem,
                                including vim.org's Tips, Sourceforge, Google, and Google
                                Groups. I found one reference to it in comp.editors back in
                                2001, unsolved. The thread title is
                                "case-preserving replace in vi[m]?".]

                                Basically, I'd like to replace (e.g.) "apple" with "banana"
                                everywhere in a block of text, but if "apple" happens to be
                                capitalized, it should be replaced with "Banana". In other
                                words, I'd like a handy way of doing

                                :s/Apple/Banana/g
                                :s/apple/banana/g

                                (assuming noignorecase) with one command.

                                I could probably slog my way through a script to do this, but
                                it strikes me as a fairly useful feature - would Vim be likely
                                to have this feature added since 2001?

                                TIA...
                              • gumnos (Tim Chase)
                                Message 15 of 22, Feb 10, 2004
                                  > Basically, I'd like to replace (e.g.) "apple" with "banana"
                                  > everywhere in a block of text, but if "apple" happens to be
                                  > capitalized, it should be replaced with "Banana". In other
                                  > words, I'd like a handy way of doing
                                  >
                                  > :s/Apple/Banana/g
                                  > :s/apple/banana/g
                                  >
                                  > (assuming noignorecase) with one command.

                                  I've toyed with this idea before, even asking the list at one point.
                                  However, you hit cases (including your own example) where the search-regex
                                  and the replacement don't line up correctly. eg.

                                  :s/ApplE/FOO/g

                                  What's the replacement "FOO"? Would it be "BanaNa" or would it be
                                  "BananA" (is that "make the 5th letter uppercase", or is it "make the
                                  last letter uppercase")? It gets worse if you use camel case:

                                  :s/HelloWorld/SomeText/g
                                  :s/helloworld/sometext/g
                                  :s/helloWorld/someText/g

                                  is likely what you want, but with the below suggestion, you end up with
                                  something more like

                                  :s/HelloWorld/SometExt/g
                                  :s/helloworld/sometext/g
                                  :s/helloWorld/sometExt/g

                                  It really does take some sort of script logic to perform, where the script
                                  clarifies these peculiar cases. I think there's a simple script (that
                                  came out when I asked, or perhaps before) that handles the first N
                                  characters of the replacement text, where N is the length of the matched
                                  expression. I don't know if its author is around somewhere, or if it's
                                  been enhanced at all.

                                  It would be of the form (wrapped for email, but should be on all one
                                  line):

                                  :%s/apple\c/\=CaseSubstitute(submatch(0),
                                  "desired_replacement_text")/g

                                  where CaseSubstitute(searchString, replacementString) does something like
                                  the pseudo-code

                                  outputString = ""
                                  for i = 0 to min(strlen(searchString), strlen(replacementString)) - 1
                                      if isUpper(searchString[i])
                                          outputString = outputString . upper(replacementString[i])
                                      else
                                          outputString = outputString . lower(replacementString[i])
                                  next
                                  append any additional bit of replacementString that
                                      doesn't have a corresponding letter in the match
                                      string
                                  return outputString

                                  for the isUpper(), you could use
                                  match(searchString[i], "\\u") == 0
                                  (match() returns 0 when the character matches and -1 when it doesn't)

                                  for upper(), you could use
                                  substitute(replacementString[i], ".", "\\u&", "")

                                  for lower(), you could use
                                  substitute(replacementString[i], ".", "\\l&", "")

                                  (if OE bunged those replacements by attempting to turn them into UNC
                                  names, that's back-slash, back-slash, followed by either a "u" or
                                  lowercase "L", followed by an ampersand)

                                  You might also prefer to use [:upper:] as a character class rather than
                                  "\u", which may handle foreign character sets a little more gracefully,
                                  and the number of backslashes may need to be escaped properly--my guess
                                  was for only two, but I'm frequently unlucky on my first guess at escaping
                                  here :)
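For illustration, here is the pseudo-code above written out as a runnable sketch. It is in Python rather than Vim script, and the names `case_substitute` and `substitute_keep_case` are made up for the example; it is not the script referred to above. Each replacement character takes the case of the matched character in the same position, and any leftover tail of the replacement is appended unchanged.

```python
import re


def case_substitute(matched, replacement):
    """Copy the case of `matched`, position by position, onto `replacement`."""
    paired = min(len(matched), len(replacement))
    out = []
    for i in range(paired):
        if matched[i].isupper():
            out.append(replacement[i].upper())
        else:
            out.append(replacement[i].lower())
    # Characters of the replacement with no counterpart in the match are kept as-is.
    out.append(replacement[paired:])
    return "".join(out)


def substitute_keep_case(text, pattern, replacement):
    """Case-insensitive search; every hit is replaced via case_substitute()."""
    return re.sub(pattern,
                  lambda m: case_substitute(m.group(0), replacement),
                  text,
                  flags=re.IGNORECASE)


if __name__ == "__main__":
    print(substitute_keep_case("Apple apple", "apple", "banana"))
    print(case_substitute("HelloWorld", "sometext"))
```

The second call shows the camel-case problem described earlier: the capital lands on whichever letter happens to sit in the sixth position.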

                                  Hope this sets you out in a helpful direction,

                                  -tim
                                • Paul Brinkley
                                  Message 16 of 22, Feb 10, 2004
                                    At 01:40 PM 2/10/2004 -0600, gumnos \(Tim Chase\) wrote:
                                    >> Basically, I'd like to replace (e.g.) "apple" with "banana"
                                    >> everywhere in a block of text, but if "apple" happens to be
                                    >> capitalized, it should be replaced with "Banana". In other
                                    >> words, I'd like a handy way of doing
                                    >>
                                    >> :s/Apple/Banana/g
                                    >> :s/apple/banana/g
                                    >>
                                    >> (assuming noignorecase) with one command.
                                    >
                                    >I've toyed with this idea before, even asking the list at one point.
                                    >However, you hit cases (including your own example) where the search-regex
                                    >and the replacement don't line up correctly. eg.
                                    >
                                    > :s/ApplE/FOO/g
                                    >
                                    >What's the replacement "FOO"? would it be "BanaNa" or would it be
                                    >"BananA" (is that a "make the 5th letter uppercase", or is it a "make the
                                    >last letter uppercase"). It gets worse, if you use camel-case:
                                    >
                                    > :s/HelloWorld/SomeText/g
                                    > :s/helloworld/sometext/g
                                    > :s/helloWorld/someText/g
                                    >
                                    >is likely what you want, but with the below suggestion, you end up with
                                    >something more like
                                    >
                                    > :s/HelloWorld/SometExt/g
                                    > :s/helloworld/sometext/g
                                    > :s/helloWorld/sometExt/g

                                    Harumph - yeah, this had crossed my mind while I was typing my message,
                                    and then I forgot to actually mention it. My inclination in most of these
                                    cases is to simply punt. The feature I'm looking for should effectively
                                    be a narrow special case of :s applicable only to the following:

                                    :s/apple/banana/~
                                    :s/appleBread/bananaBread/~

                                    where ~ is the flag that turns this on (best flag I could think of;
                                    suggestions?). It works ONLY on the first character. My rationale for
                                    this is simply based on the places where I use it: in code. Specifically,
                                    Java code, though I would find this useful in C, C++, and others. Just
                                    today I had code that looked like

                                    closeButton.addActionListener(...)
                                    ...
                                    fireCloseButtonClicked();

                                    I wanted to replace "close" with "cancel" everywhere. I can't see having
                                    to worry about any occurrences of "CLOse", "ClosE", etc. I can certainly
                                    see the argument for wanting to preserve a form of elegance to the feature,
                                    however. (The last thing I want is a warty Vim.)

                                    I didn't think of this before, but I might also want to replace "CLOSE"
                                    with "CANCEL":

                                    public static final String CLOSE_BUTTON_TEXT = "Close";

                                    So that's three cases, all relatively clear-cut. I can think of no others
                                    in common use. (Can you?) Realistically, what would you do if you saw
                                    "CLOse" or "CLoSe"? What ought to be done? Confirm replace? Just skip it?

                                    >Hope this sets you out in a helpful direction,

                                    The snippets certainly do, indeed. I still say this ought to have native
                                    Vim support. :-) But at the very least, there's a lead here on a nice
                                    script to upload to the website.

                                    The spec I'm seeing is something like this:

                                    :s/fooBar/quxBar/~

                                    The ~ flag expands the search to the following (noignorecase implied):

                                    :s/fooBar/quxBar
                                    :s/FooBar/QuxBar
                                    :s/FOO_BAR/QUX_BAR

                                    Semantically, this appears clear enough to me. I'd of course be
                                    concerned with how generally useful this would be. For instance,
                                    how universal is the above usage in coding in languages other than
                                    English?
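The three-case expansion proposed above can be sketched as follows. This is a Python illustration only: the `~` flag does not exist in Vim, and the names `case_variants` and `replace_three_cases` are made up for the example. It assumes camel-case words whose constant form is made by putting an underscore before each inner capital and uppercasing everything.

```python
import re


def case_variants(word):
    """Return the three spellings covered by the proposal:
    as given (fooBar), first letter capitalized (FooBar), constant style (FOO_BAR)."""
    capitalized = word[:1].upper() + word[1:]
    constant = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", word).upper()
    return [word, capitalized, constant]


def replace_three_cases(text, old, new):
    """Apply the three case-sensitive replacements one after another."""
    for source, target in zip(case_variants(old), case_variants(new)):
        text = text.replace(source, target)
    return text


if __name__ == "__main__":
    code = "closeButton.addActionListener(); fireCloseButtonClicked(); CLOSE_BUTTON_TEXT"
    print(replace_three_cases(code, "close", "cancel"))
```

Mixed spellings such as "CLOse" match none of the three variants and are simply left alone, which corresponds to the "just skip it" option.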
                                  • David Fishburn
                                    Message 17 of 22, Feb 10, 2004
                                      Look no further (and of course it has been done) :-)

                                      Michael Geddes wrote KeepCase.vim
                                      http://www.vim.org/scripts/script.php?script_id=6

                                      It is a script (not a plugin), so open it, type
                                      :so %

                                      Then try one of his examples, works great for me.

                                      Dave

                                      > -----Original Message-----
                                      > From: Paul Brinkley [mailto:laugh@...]
                                      > Sent: Tuesday, February 10, 2004 12:44 PM
                                      > To: vim@...
                                      > Subject: Preserve case during search & replace?
                                      >
                                      >
                                      > [I searched all over the web for an answer to this problem,
                                      > including vim.org's Tips, Sourceforge, Google, and Google
                                      > Groups. I found one reference to it in comp.editors back in
                                      > 2001, unsolved. The thread title is "case-preserving replace
                                      > in vi[m]?".]
                                      >
                                      > Basically, I'd like to replace (e.g.) "apple" with "banana"
                                      > everywhere in a block of text, but if "apple" happens to be
                                      > capitalized, it should be replaced with "Banana". In other
                                      > words, I'd like a handy way of doing
                                      >
                                      > :s/Apple/Banana/g
                                      > :s/apple/banana/g
                                      >
                                      > (assuming noignorecase) with one command.
                                      >
                                      > I could probably slog my way through a script to do this, but
                                      > it strikes me as a fairly useful feature - would Vim be
                                      > likely to have this feature added since 2001?
                                      >
                                      > TIA...
                                      >
                                      >
                                    • gumnos (Tim Chase)
                                      Message 18 of 22, Feb 10, 2004
                                        > Look no further (and of course it has been done) :-)
                                        >
                                        > Michael Geddes wrote KeepCase.vim
                                        > http://www.vim.org/scripts/script.php?script_id=6
                                        >
                                        > It is a script (not a plugin), so open it, type
                                        > :so %
                                        >
                                        > Then try one of his examples, works great for me.

                                        A nice addition would be to have behavior like the KeepCaseSameLen, only
                                        tacking on any extra from the newword that extended beyond the original,
                                        so a formatting/search string of "HelloWorld" would change "exampletext"
                                        to "ExampLetext". But yes, I think this was the script/fn that was
                                        recommended to me for keeping case on a :s command. Thanks Michael!

                                        -tim