Loading ...
Sorry, an error occurred while loading the content.

Re: New (?) sort implementation for Vim, beats quick sort [long]

Expand Messages
  • Bram Moolenaar
    ... If it s included in the distribution, we might as well implement it in C. That s a lot faster. ... Those questions still exist. In this specific case it
    Message 1 of 17 , May 1, 2003
    • 0 Attachment
      Benji Fisher wrote:

      > I have not tested it myself, but if this works as well as you claim, I
      > think this should be added to the distro ov vim 6.2 as (part of) a new
      > default plugin.

      If it's included in the distribution, we might as well implement it in
      C. That's a lot faster.

      > We have discussed this before: how to make utility functions
      > available to all plugins. IIRC, the discussion got bogged down in
      > questions about whether to use autoloading, lots of functions in one
      > file or in many files, etc., and we still do not have them available.
      > It might be better if people used external scripting languages (perl,
      > python, etc.) instead, but I have not seen much of this. For
      > portability, I think we need some basic utilities in vim script, and
      > sorting is at the top of the list.

      Those questions still exist. In this specific case it would also be
      needed to have some way to specify the compare function to be used.

      Instead of constantly increasing the functionality of the script
      language (either in C or with plugins) it would be a good idea to chose
      one external script language and recommend using it. The selected
      language must be available on most platforms, which is currently the
      main problem. I would chose Python (if only because Guido is Dutch!
      :-), if we didn't have trouble with threading...

      --
      There are three kinds of persons: Those who can count and those who can't.

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\
      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      \\\ Help AIDS victims, buy at Amazon -- http://ICCF.nl/click1.html ///
    • Benji Fisher
      ... Robert Webb s sorting program *is* already included in the distribution: once in doc/eval.txt and once in plugin/explorer.vim . ... Those questions still
      Message 2 of 17 , May 1, 2003
      • 0 Attachment
        Bram Moolenaar wrote:
        > Benji Fisher wrote:
        >
        >
        >> I have not tested it myself, but if this works as well as you claim, I
        >>think this should be added to the distro ov vim 6.2 as (part of) a new
        >>default plugin.
        >
        >
        > If it's included in the distribution, we might as well implement it in
        > C. That's a lot faster.

        Robert Webb's sorting program *is* already included in the distribution:
        once in doc/eval.txt and once in plugin/explorer.vim .

        >>It might be better if people used external scripting languages (perl,
        >>python, etc.) instead, but I have not seen much of this. For
        >>portability, I think we need some basic utilities in vim script, and
        >>sorting is at the top of the list.
        >
        > Those questions still exist. In this specific case it would also be
        > needed to have some way to specify the compare function to be used.
        >
        > Instead of constantly increasing the functionality of the script
        > language (either in C or with plugins) it would be a good idea to chose
        > one external script language and recommend using it. The selected
        > language must be available on most platforms, which is currently the
        > main problem. I would chose Python (if only because Guido is Dutch!
        > :-), if we didn't have trouble with threading...

        Those questions still exist. (Is there an echo in here? ;) As for
        choosing a script language, my comment stands: it has not happened yet.

        --Benji Fisher
      • digitect@mindspring.com
        ... Wow, works for me! Although, at first, I had my doubts. I set up functions so I could call normal or reverse sort, using the current selection, like:
        Message 3 of 17 , May 1, 2003
        • 0 Attachment
          Piet Delport wrote:
          >
          > A few days ago, while trying to speed up the various quick sort
          > implementation available for Vim, i unexpectedly found another
          > algorithm that completely blows quick sort away on all fronts.

          Wow, works for me!

          Although, at first, I had my doubts. I set up functions so I could
          call normal or reverse sort, using the current selection, like:

          function! BISort2_call(mode)
          if a:mode != "v" | return | endif
          normal gv
          call BISort2(line("'<"), line("'>"))
          endfunction

          Bad idea. For some reason this was 9-10 times slower.

          In the end, I ended up creating commands:

          command! -nargs=0 -range BISort2 <line1>,<line2>call BISort2()
          command! -nargs=0 -range BISort2Reverse <line2>,<line1>call BISort2()

          and revised the function definition like so:

          function! BISort2() range

          and then revised the internal variables like so:

          a:start => a:firstline
          a:end => a:lastline

          which finally showed the improved behavior. I'd recommend if this
          becomes a keeper with the rest of the project, something similar be
          provided.

          Piet, are there any restrictions on the algorithm? May I include your
          code in my publically distributed set of scripts?

          --
          Steve Hall [ digitect@... ]
        • Bill McCarthy
          ... It doesn t appear in the archive either. Could someone repost this please (or email me too)? Bill
          Message 4 of 17 , May 1, 2003
          • 0 Attachment
            Ron Aaron wrote:

            >I can't find the original message -- can anyone send it to me,
            >please?

            It doesn't appear in the archive either. Could someone repost this
            please (or email me too)?

            Bill
          • digitect@mindspring.com
            From: Bill McCarthy, 05/01/03 05:25 PM ... I think you were both thwarted by a cross-post, the message originally occured on vim@vim.org and archived here:
            Message 5 of 17 , May 1, 2003
            • 0 Attachment
              From: Bill McCarthy, 05/01/03 05:25 PM
              >
              > Ron Aaron wrote:
              > >
              > > I can't find the original message -- can anyone send it to me,
              > > please?
              >
              > It doesn't appear in the archive either. Could someone repost this
              > please (or email me too)?

              I think you were both thwarted by a cross-post, the message originally
              occured on vim@... and archived here:

              http://groups.yahoo.com/group/vim/message/39476

              --
              Steve Hall [ digitect@... ]

              Cream... the Vim text editor in sheep's clothing!
              http://cream.sourceforge.net
            • Piet Delport
              ... It was posted to vim@ originally. Since there s interest, i ll repost it here. (With the attachment included this time, while i m at it :-) -- Piet
              Message 6 of 17 , May 2, 2003
              • 0 Attachment
                On Thu, 01 May 2003 at 17:25:04 -0400, Bill McCarthy wrote:
                >
                >>I can't find the original message -- can anyone send it to me,
                >>please?
                >
                > It doesn't appear in the archive either. Could someone repost this
                > please (or email me too)?

                It was posted to vim@ originally. Since there's interest, i'll repost
                it here. (With the attachment included this time, while i'm at it :-)

                --
                Piet Delport
                Today's subliminal thought is:
              • Piet Delport
                ... Eek; that almost definitely doesn t do what you think it does. the function assumes the lines are in order (the arguments are called start and end for
                Message 7 of 17 , May 2, 2003
                • 0 Attachment
                  On Thu, 01 May 2003 at 17:05:16 -0400, digitect@... wrote:
                  > Piet Delport wrote:
                  >>
                  >> A few days ago, while trying to speed up the various quick sort
                  >> implementation available for Vim, i unexpectedly found another
                  >> algorithm that completely blows quick sort away on all fronts.
                  >
                  > Wow, works for me!
                  >
                  > Although, at first, I had my doubts. I set up functions so I could
                  > call normal or reverse sort, using the current selection, like:
                  >
                  > function! BISort2_call(mode)
                  > if a:mode != "v" | return | endif
                  > normal gv
                  > call BISort2(line("'<"), line("'>"))
                  > endfunction

                  Eek; that almost definitely doesn't do what you think it does. the
                  function assumes the lines are in order (the arguments are called
                  "start" and "end" for a reason :-), but doesn't bail out if they're not.
                  The check exists in the other functions; i forgot to add it to BISort().
                  This line should be at the start of each function:

                  if a:start >= a:end | return | endif

                  > Bad idea. For some reason this was 9-10 times slower.

                  That's because it would have been starting from the end of your range,
                  and doing something[1] until it reaches the end of the buffer, i think.

                  [1] I'm not sure *what* it would be doing, but it very probably wouldn't be
                  sorting. If does, it would be by accident. :-)

                  > In the end, I ended up creating commands:
                  >
                  > command! -nargs=0 -range BISort2 <line1>,<line2>call BISort2()
                  > command! -nargs=0 -range BISort2Reverse <line2>,<line1>call BISort2()

                  I used commands along the lines of:

                  :command -range=% BISort2 call BISort2(<line1>, <line2>)

                  (see the sortbench.vim file in the re-post of my original mail to this
                  list)

                  Your BISort2Reverse will end up working exactly like your BISort2
                  command, and not sort in reverse order. (It just gives the function a
                  range that's the wrong way around, which Vim will automagically
                  correct.) To actually sort in reverse, you'd want to use the BISort()
                  function, and give it a comparator that compares in reverse order (or
                  any other order you can think of, for that matter; numerically, by
                  date, by length...).

                  > and revised the function definition like so:
                  >
                  > function! BISort2() range
                  >
                  > and then revised the internal variables like so:
                  >
                  > a:start => a:firstline
                  > a:end => a:lastline
                  >
                  > which finally showed the improved behavior.

                  I generally avoid ranges on sorting-type functions, as recursive calls
                  become really ugly. The range handling is a nice thing to have, but
                  since the primary UI to the function is the user command (which already
                  does range handling), it's not really a problem to omit it from the
                  function itself.

                  > I'd recommend if this becomes a keeper with the rest of the project,
                  > something similar be provided.
                  >
                  > Piet, are there any restrictions on the algorithm? May I include your
                  > code in my publically distributed set of scripts?

                  Heck, do whatever you like with it. If you're feeling particularly
                  generous, you can even credit me, but i don't really mind either way.

                  --
                  Piet Delport
                  Today's subliminal thought is:
                • Piet Delport
                  ... [snip] For values of a lot that include astronomically . :-) As i ve said before, i m very much in favor of a C implementation of sorting for Vim, but
                  Message 8 of 17 , May 2, 2003
                  • 0 Attachment
                    On Thu, 01 May 2003 at 20:28:46 +0200, Bram Moolenaar wrote:
                    >
                    > Benji Fisher wrote:
                    >
                    >> I have not tested it myself, but if this works as well as you claim, I
                    >> think this should be added to the distro ov vim 6.2 as (part of) a new
                    >> default plugin.
                    >
                    > If it's included in the distribution, we might as well implement it in
                    > C. That's a lot faster.
                    >
                    [snip]

                    For values of 'a lot' that include 'astronomically'. :-)

                    As i've said before, i'm very much in favor of a C implementation of
                    sorting for Vim, but if you were even vaguely considering implementing
                    the same algorithm that i did, please don't: it's just a trivial hack
                    of an algorithm that's usually counted among the ranks of the bubble and
                    selection sort algorithms in terms of raw speed, and only works as fast
                    as it does in Vim because buffers are more like linked lists than arrays
                    (which is what quick sort is aimed at).

                    I've been giving a bit of thought lately to what a good C implementation
                    would be like, though, and i guess i might as well voice them here. I
                    think the most important requirements of an implementation would be:

                    - The user should be able to provide a range (obviously) and comparison
                    expression. This should provide enough power... i don't think a more
                    complex interface[1] is warranted. The lines could be passed to the
                    sort expression via v:a and v:b variables, much in way that's
                    currently being done with some of the '*expr' settings. The range and
                    expression would default to the entire file in alphanumeric order,
                    respectively.

                    [1] such as a direction[2], or user-definable accessors
                    [2] it's easy enough to reverse the comparison expression

                    - I'm not familiar with how Vim works with lines/buffers under the hood,
                    but assuming they're close to linked lists, the best algorithm to
                    implement is probably merge sort. It's optimal for linked lists
                    (better than quicksort, IOW), and, as a bonus (a rather significant
                    one IMO), it's also stable (unlike quicksort).

                    The remaining big question is whether to expose it as a sort() function,
                    or as a :sort ex command. Both would make sense, IMHO, but there's a
                    slight trade-off in "naturalness"... It's completely natural to pass a
                    range to an ex command, but :exec is the only other command i can think
                    of that takes an expression string. Similarly, it's completely natural
                    to pass an expression string to a Vim function, but a range would be a
                    first, AFAIK (for a built-in function, anyway).

                    I'm currently leaning towards :sort, though, mainly because sort()
                    raises the additional question of whether the range should be passed via
                    a range to :call, or via address arguments to sort() itself (and, in the
                    latter case, whether the expression or the range comes first).

                    Of course, Someone Has To Implement It; until then, this is just talk.
                    I'd very much like to try my hand at it, but i'm really not sure when
                    i'll get time to de-rust my C skills and get familiar enough with the
                    Vim codebase in the near future. :-/

                    --
                    Piet Delport
                    Today's subliminal thought is:
                  • Benji Fisher
                    ... I think you are wrong about the importance (and difficulty) of design decisions versus implementation. I bet that Bram could implement his choice of sort
                    Message 9 of 17 , May 2, 2003
                    • 0 Attachment
                      Piet Delport wrote:
                      >
                      > I'm currently leaning towards :sort, though, mainly because sort()
                      > raises the additional question of whether the range should be passed via
                      > a range to :call, or via address arguments to sort() itself (and, in the
                      > latter case, whether the expression or the range comes first).
                      >
                      > Of course, Someone Has To Implement It; until then, this is just talk.
                      > I'd very much like to try my hand at it, but i'm really not sure when
                      > i'll get time to de-rust my C skills and get familiar enough with the
                      > Vim codebase in the near future. :-/

                      I think you are wrong about the importance (and difficulty) of design
                      decisions versus implementation. I bet that Bram could implement his choice of
                      sort (or maybe just link to one in the standard library) in an hour. Even more
                      important, an inefficient implementation can be changed in a later release, but
                      once the user interface has been chosen, it is very hard to change it.

                      I vote in favor of :sort . Consider the difference between :substitute ,
                      which acts on a range (current line by default) and substitute(), which acts on
                      strings. It would be nuch more convenient to sort lines with an ex command.

                      :.,$sort "default comparison
                      :.,.+5sort "a:1[0] < a:2[0]"

                      If lines are passed to the comparison expression as a:1 and a:2, then
                      user-defined comparison functions would have to be declared to accept two
                      "optional" arguments: an interesting idea, but maybe not the best. If lines
                      are instead passed as a:firstline and a:lastline (perhaps an abuse, but again
                      something worth considering), then we could use

                      :sort "MyComparisonFunction()"

                      Of course, the default range is the whole file.

                      --Benji Fisher
                    • Ron Aaron
                      ... I agree, but I would add that the syntax should allow a default sort (perhaps specified in sortorder or something. ... would do the default sorting of
                      Message 10 of 17 , May 2, 2003
                      • 0 Attachment
                        > I vote in favor of :sort . Consider the difference
                        > between :substitute ,
                        > which acts on a range (current line by default) and
                        > substitute(), which acts on
                        > strings. It would be nuch more convenient to sort lines with
                        > an ex command.


                        I agree, but I would add that the syntax should allow a 'default' sort (perhaps specified in 'sortorder' or something.

                        So:
                        :'a,'b sort

                        would do the default sorting of 'a,'b whereas
                        :'a,'b sort "SortSpecial()"

                        would use the 'SortSpecial()' function, as would
                        : set sortorder="SortSpecial()"
                        :'a,'b sort

                        The default sort would be a simple strcmp or stricmp. Given that the slowest part of the entire sort as envisioned here is the compare function, using a sort algorithm which minimizes compares is a good thing...
                      • Mikolaj Machowski
                        ... Couldn t it be implemented in both ways? Managing : commands in scripts isn t always straightforward. m. -- LaTeX + Vim = http://vim-latex.sourceforge.net/
                        Message 11 of 17 , May 2, 2003
                        • 0 Attachment
                          On Fri, May 02, 2003 at 10:13:46AM -0400, Benji Fisher wrote:
                          > I think you are wrong about the importance (and difficulty) of design
                          > decisions versus implementation. I bet that Bram could implement his
                          > choice of sort (or maybe just link to one in the standard library) in an
                          > hour. Even more important, an inefficient implementation can be changed in
                          > a later release, but once the user interface has been chosen, it is very
                          > hard to change it.
                          > I vote in favor of :sort . Consider the difference between
                          > :substitute , which acts on a range (current line by default) and
                          > substitute(), which acts on strings. It would be nuch more convenient to
                          > sort lines with an ex command.
                          > :.,$sort "default comparison
                          > :.,.+5sort "a:1[0] < a:2[0]"

                          Couldn't it be implemented in both ways?
                          Managing : commands in scripts isn't always straightforward.

                          m.
                          --
                          LaTeX + Vim = http://vim-latex.sourceforge.net/
                          Vim-list(s) Users Map: (last change 21 Apr)
                          http://skawina.eu.org/mikolaj/vimlist
                          Are You There?
                        • Benji Fisher
                          ... I suppose so. There is setline(), a precedent for having a function that changes the buffer. (I never use it myself. Are there any others?) I guess I
                          Message 12 of 17 , May 2, 2003
                          • 0 Attachment
                            Mikolaj Machowski wrote:
                            > On Fri, May 02, 2003 at 10:13:46AM -0400, Benji Fisher wrote:
                            >
                            >> I vote in favor of :sort . Consider the difference between
                            >> :substitute , which acts on a range (current line by default) and
                            >>substitute(), which acts on strings. It would be nuch more convenient to
                            >>sort lines with an ex command.
                            >>:.,$sort "default comparison
                            >>:.,.+5sort "a:1[0] < a:2[0]"
                            >
                            > Couldn't it be implemented in both ways?
                            > Managing : commands in scripts isn't always straightforward.

                            I suppose so. There is setline(), a precedent for having a function that
                            changes the buffer. (I never use it myself. Are there any others?) I guess I
                            am still willing to use :execute .

                            Can you give an example where not having sort() would be particularly
                            awkward? How hard would it be to write a Sort() user function as a wrapper for
                            :execute "sort" ?

                            --Benji Fisher
                          • Mikolaj Machowski
                            ... Heh. What you wrote is example why sort() would be helpful - it is one step more for dealing with lists - slows execution of code - sometimes big. Vim is
                            Message 13 of 17 , May 4, 2003
                            • 0 Attachment
                              On Fri, May 02, 2003 at 09:41:54PM -0400, Benji Fisher wrote:
                              > >Couldn't it be implemented in both ways?
                              > >Managing : commands in scripts isn't always straightforward.
                              > I suppose so. There is setline(), a precedent for having a function
                              > that changes the buffer. (I never use it myself. Are there any others?)
                              > I guess I am still willing to use :execute .
                              > Can you give an example where not having sort() would be particularly
                              > awkward? How hard would it be to write a Sort() user function as a wrapper
                              > for :execute "sort" ?

                              Heh. What you wrote is example why sort() would be helpful - it is
                              one step more for dealing with lists - slows execution of code
                              - sometimes big. Vim is really powerful but slow. Example - simple
                              regexp on big file (ca. 20M) Vim makes 5m, perl 15s.

                              m.
                              --
                              LaTeX + Vim = http://vim-latex.sourceforge.net/
                              Vim-list(s) Users Map: (last change 21 Apr)
                              http://skawina.eu.org/mikolaj/vimlist
                              Are You There?
                            • Siemerink, Ben J H (Bernardus)
                              Hello, I agree that :sort feels more natural than sort(). Without boosting any possible implementation: I would propose a few standard comparison functions or
                              Message 14 of 17 , May 5, 2003
                              • 0 Attachment
                                Hello,


                                I agree that :sort feels more natural than sort().

                                Without boosting any possible implementation: I would propose a few standard
                                comparison functions or options:
                                + ignore case
                                + reverse
                                + numerical

                                The ignore case is a must IMHO.

                                Yet again there are various ways to do so:
                                + commandline options: -i -n -r. The advantage is that you can combine them
                                (ie. -nr).
                                + standard comparison functions. I suddenly don't remember if they are
                                already included in Vim. However, these functions can take arguments that
                                make them behave as commandline options.


                                Saludos, Ben.
                                --
                                Ben Siemerink
                                Madrid, Spain.
                              • Piet Delport
                                ... (What about ic ?) ... I have a better proposal: Add an operator that works like Perl s and cmp operators[1]. Besides being very useful in its own
                                Message 15 of 17 , May 6, 2003
                                • 0 Attachment
                                  On Tue, 06 May 2003 at 08:37:07 +0200, Siemerink, Ben J H (Bernardus) wrote:
                                  >
                                  > I agree that :sort feels more natural than sort().
                                  >
                                  > Without boosting any possible implementation: I would propose a few standard
                                  > comparison functions or options:
                                  > + ignore case

                                  (What about 'ic'?)

                                  > + reverse
                                  > + numerical
                                  >
                                  > The ignore case is a must IMHO.
                                  >
                                  > Yet again there are various ways to do so:
                                  > + commandline options: -i -n -r. The advantage is that you can combine them
                                  > (ie. -nr).
                                  > + standard comparison functions. I suddenly don't remember if they are
                                  > already included in Vim. However, these functions can take arguments that
                                  > make them behave as commandline options.

                                  I have a better proposal: Add an operator that works like Perl's <=>
                                  and cmp operators[1]. Besides being very useful in its own right, such
                                  an operator will allow you to do all of the above simply as:

                                  :sort 'v:a <=> v:b' " default compare
                                  :sort 'v:b <=> v:a' " reverse compare
                                  :sort 'v:a <=># v:b' " force case match, regardless of 'ic'
                                  :sort 'v:a <=>? v:b' " force ignored case match
                                  :sort 'v:b+0 <=> v:a' " numeric, reversed

                                  ...and so on. The operator should be trivial to implement by returning
                                  the result of strcmp()/stricmp() (or whatever wrapper/replacement Vim
                                  uses under the hood).


                                  [1] For those not familiar with Perl, those operators return -1, 0, or
                                  +1 if their first argument is less than, equal to, or greater than
                                  their right argument, respectively. They're used all the time in
                                  comparison expressions such as: sort { lc($a) cmp lc($b) } @list

                                  --
                                  Piet Delport
                                  Today's subliminal thought is:
                                Your message has been successfully submitted and would be delivered to recipients shortly.