Loading ...
Sorry, an error occurred while loading the content.

New (?) sort implementation for Vim, beats quick sort [long]

Expand Messages
  • Piet Delport
    [ Note: Unless making fast sort algorithm implementations in pure Vim is your idea of fun, this message will probably bore you to tears. You have been
    Message 1 of 3 , May 1, 2003
    View Source
    • 0 Attachment
      [ Note: Unless making fast sort algorithm implementations in pure Vim
      is your idea of fun, this message will probably bore you to tears. You
      have been warned. ]

      A few days ago, while trying to speed up the various quick sort
      implementation available for Vim, i unexpectedly found another algorithm
      that completely blows quick sort away on all fronts. I thought this was
      cool enough to warrant posting here.

      1. Quick description:

      The algorithm (which you can find at the end of this mail) is a binary
      insertion sort[1] implementation, with a Vim-specific optimization:
      instead of doing a whole bunch of linear swaps to move the current line
      to the insertion point, it moves the line in one step.

      [1] ...which is like an insertion sort, except that a binary search is used
      to find the insertion point

      2. Detailed description:

      In general, BIS minimized the amount of compares need to sort a list,
      with the trade-off being the large amount of swaps needed to move
      elements into place. In Vim, however, it's practically just as fast to
      delete or insert lines as it is to swap them, so you eliminate this
      disadvantage by replacing the O(n/4) (if my math isn't off) swaps needed
      per element, on average, with one delete/insert operation, which is a
      huge improvement, especially as the input size gets larger.

      Add the fact that BIS is non-recursive, which avoids Vim's function
      calling overhead (and has the added bonus of side-stepping stack
      overflows), and you have an algorithm that not only catches up to
      pure-Vim quick sorts, but leaves them way behind in the dust.

      3. Benchmarks:

      First, a description of the various implementations i compared:

      - QSortGenUtils: Hari Krishna Dara's quick sort implementation from
      genutils.vim, unmodified. Included only for interest's sake; it's not
      really a fair comparison, as it uses abstracted accessors/mutators
      which significantly slows it down.

      - QSortWebb: Slightly edited version of Robert Webb's quick sort
      implementation from :h eval.txt (mostly just some renamed variables,
      and some minor optimizations such caching the comparison statement).
      For some reason, Webb's original script overflows the stack input as
      large as the test data i'm using, while this version doesn't. I
      haven't managed to isolate what i seem to have accidentally improved,
      yet, so i can't benchmark Webb's original implementation too,
      unfortunately.

      - QSort: Based on Hari Krishna Dara's algorithm above, but with all the
      functionality that isn't needed to sort buffer text removed, and the
      same minor optimizations as above applied. I used this as a basis for
      experimenting with different pivot selection algorithms (pseudo-random,
      median-of-3, Bentley-McKilroy 3-way partitioning) and other variations
      on the quick sort theme, such as switching to other algorithms when
      the partition size gets small, but these all turned out to be slower
      than the plain quick sort (probably because the interpretation
      overhead actually exceeds the time savings gained by the extra code).
      It was while trying out various simpler algorithms for quick sort to
      switch over to, however, that i found...

      - BISort: The Binary Insertion Sort described at the top.

      - QSort2/BISort2: These are the same as above two functions, but edited
      to directly compare lines of text instead of going through a a
      comparison function, for maximum speed. These should give a more
      accurate representation of each algorithm's relative performance.

      Test data used: the archives of a (completely randomly selected)
      mailing list, concatenated into one file, weighing in at 41,726 lines
      (1.65 MB) of text. I think this should provide good spread of different
      input data characteristics.

      Each sort was run over a copy of the file three times, and the run time
      recorded. The results are listed as <total runtime>: <individual runs>
      (both in seconds), with a nice graph at the end (everyone loves graphs):

      QSortGenUtils 522: 174 174 174 *************************************
      QSortWebb 378: 126 126 126 ***************************
      QSort 372: 124 124 124 **************************
      QSort2 267: 89 89 89 *******************
      BISort 108: 36 36 36 *******
      BISort2 70: 24 23 23 *****

      As expected, the QSortGenUtils is slowest, but its simplified version
      (QSort) manages to be slightly faster than QSortWebb (possibly due to
      the slightly different underlying algorithms?). It pales in comparison
      to BISort, though, which completes all three runs in less time than a
      single QSort run.

      There's still the question of how well BISort and and QSort scale
      against each other, so i ran single tests (since there doesn't seem to
      be significant variation between runs) on larger input sizes too:

      Input Size QSort2 BISort2 Ratio
      41,726 (from above) 89 23.3+ 3.81 : 1
      62,389 (2.32 MB) 189 38 4.97 : 1
      104,115 (3.97 MB) 536 72 7.44 : 1

      IOW, QSort2 seems to degrade very fast indeed as the input size
      increases, compared to BISort2.

      4. Reality check

      sort(1) sorts the 104,115 line file above in ~1.5 seconds, compared to
      BISort2's 72. Nasty.

      5. Code

      Anyway, enough noise, here's the algorithm. If anyone else is
      interested, i've attached a file containing the rest of the code i used
      in the benchmarks above.


      function BISort(start, end, cmp)
      let compare_ival_mid = 'let diff = '.a:cmp.'(i_val, getline(mid))'
      let i = a:start + 1
      while i <= a:end
      " find insertion point via binary search
      let i_val = getline(i)
      let lo = a:start
      let hi = i
      while lo < hi
      let mid = (lo + hi) / 2
      exec compare_ival_mid
      if diff < 0
      let hi = mid
      else
      let lo = mid + 1
      if diff == 0 | break | endif
      endif
      endwhile
      " do insert
      if lo < i
      exec i.'d_'
      call append(lo - 1, i_val)
      endif
      let i = i + 1
      endwhile
      endfunction

      function BISort2(start, end)
      let i = a:start + 1
      while i <= a:end
      " find insertion point via binary search
      let i_val = getline(i)
      let lo = a:start
      let hi = i
      while lo < hi
      let mid = (lo + hi) / 2
      let mid_val = getline(mid)
      if i_val < mid_val
      let hi = mid
      else
      let lo = mid + 1
      if i_val == mid_val | break | endif
      endif
      endwhile
      " do insert
      if lo < i
      exec i.'d_'
      call append(lo - 1, i_val)
      endif
      let i = i + 1
      endwhile
      endfunction

      --
      Piet Delport
      Today's subliminal thought is:
    • Charles E. Campbell
      ... Hello! In thinking about coding a C version, I believe this new sort of yours most strongly resembles the use of a red-black tree. The problem with coding
      Message 2 of 3 , May 1, 2003
      View Source
      • 0 Attachment
        Thus saith Piet Delport:
        > [1] ...which is like an insertion sort, except that a binary search is used
        > to find the insertion point
        ---------------------------------------------------------------------

        Hello!

        In thinking about coding a C version, I believe this new sort of yours
        most strongly resembles the use of a red-black tree. The problem with
        coding this in C/C++ is implementing the search and insertion capability
        while retaining lg(n) behavior. (for n insertions, the binary-search
        sort has o(n lg(n)) operations, like quicksort and other decent sorting
        algorithms). For example, insertion can be done quickly and easily with
        doubly-linkd lists but then the binary search can't be done.

        A red-black tree (nearly) retains a balanced tree while undergoing
        insertions (and deletions, for that matter). Thus with it both the
        search and insertion retain lg(n) behavior. In fact, with red-black
        trees one merely reads the list and dumps it into the associated
        insertion procedure.

        Vim hides the work needed to perform insertion while retaining the
        ability to do quick binary searches by doing the data structure work
        internally. Lisp, actually, would be another language which would
        benefit from this sort; again, due to the behind-the-scenes list
        management the insertion acts as a low-cost operation.

        Still, its a good sort, especially for interpreted languages which
        make the search&insertion easy.

        Regards,
        Chip Campbell

        --
        Charles E Campbell, Jr, PhD _ __ __
        Goddard Space Flight Center / /_/\_\_/ /
        cec@... /_/ \/_//_/
        PGP public key: http://www.erols.com/astronaut/pgp.html
      • Hari Krishna Dara
        Apologies for responding very late to this posting, actually this got lost in my pending mailbox and remembered only recently when there was some reference to
        Message 3 of 3 , Jun 25, 2003
        View Source
        • 0 Attachment
          Apologies for responding very late to this posting, actually this got
          lost in my pending mailbox and remembered only recently when there was
          some reference to this email.

          If Piet is still listening, I am curious how this new fast algorithm
          will respond if a more generic interface is plugged over it (the way I
          did in my original simple QSort). I was looking for your attachment with
          the rest of the code that you mentioned below, but I think you forgot to
          attach it. Is it still possible for you to send the code that you used
          as a reference (ie the improved QSort) and the setup that you used to
          benchmark different algorithms, if you haven't already lost it?

          Thank you,
          Hari

          On Thu, 1 May 2003 at 11:19am, Piet Delport wrote:

          > [ Note: Unless making fast sort algorithm implementations in pure Vim
          > is your idea of fun, this message will probably bore you to tears. You
          > have been warned. ]
          >
          > A few days ago, while trying to speed up the various quick sort
          > implementation available for Vim, i unexpectedly found another algorithm
          > that completely blows quick sort away on all fronts. I thought this was
          > cool enough to warrant posting here.
          >
          > 1. Quick description:
          >
          > The algorithm (which you can find at the end of this mail) is a binary
          > insertion sort[1] implementation, with a Vim-specific optimization:
          > instead of doing a whole bunch of linear swaps to move the current line
          > to the insertion point, it moves the line in one step.
          >
          > [1] ...which is like an insertion sort, except that a binary search is used
          > to find the insertion point
          >
          > 2. Detailed description:
          >
          > In general, BIS minimized the amount of compares need to sort a list,
          > with the trade-off being the large amount of swaps needed to move
          > elements into place. In Vim, however, it's practically just as fast to
          > delete or insert lines as it is to swap them, so you eliminate this
          > disadvantage by replacing the O(n/4) (if my math isn't off) swaps needed
          > per element, on average, with one delete/insert operation, which is a
          > huge improvement, especially as the input size gets larger.
          >
          > Add the fact that BIS is non-recursive, which avoids Vim's function
          > calling overhead (and has the added bonus of side-stepping stack
          > overflows), and you have an algorithm that not only catches up to
          > pure-Vim quick sorts, but leaves them way behind in the dust.
          >
          > 3. Benchmarks:
          >
          > First, a description of the various implementations i compared:
          >
          > - QSortGenUtils: Hari Krishna Dara's quick sort implementation from
          > genutils.vim, unmodified. Included only for interest's sake; it's not
          > really a fair comparison, as it uses abstracted accessors/mutators
          > which significantly slows it down.
          >
          > - QSortWebb: Slightly edited version of Robert Webb's quick sort
          > implementation from :h eval.txt (mostly just some renamed variables,
          > and some minor optimizations such caching the comparison statement).
          > For some reason, Webb's original script overflows the stack input as
          > large as the test data i'm using, while this version doesn't. I
          > haven't managed to isolate what i seem to have accidentally improved,
          > yet, so i can't benchmark Webb's original implementation too,
          > unfortunately.
          >
          > - QSort: Based on Hari Krishna Dara's algorithm above, but with all the
          > functionality that isn't needed to sort buffer text removed, and the
          > same minor optimizations as above applied. I used this as a basis for
          > experimenting with different pivot selection algorithms (pseudo-random,
          > median-of-3, Bentley-McKilroy 3-way partitioning) and other variations
          > on the quick sort theme, such as switching to other algorithms when
          > the partition size gets small, but these all turned out to be slower
          > than the plain quick sort (probably because the interpretation
          > overhead actually exceeds the time savings gained by the extra code).
          > It was while trying out various simpler algorithms for quick sort to
          > switch over to, however, that i found...
          >
          > - BISort: The Binary Insertion Sort described at the top.
          >
          > - QSort2/BISort2: These are the same as above two functions, but edited
          > to directly compare lines of text instead of going through a a
          > comparison function, for maximum speed. These should give a more
          > accurate representation of each algorithm's relative performance.
          >
          > Test data used: the archives of a (completely randomly selected)
          > mailing list, concatenated into one file, weighing in at 41,726 lines
          > (1.65 MB) of text. I think this should provide good spread of different
          > input data characteristics.
          >
          > Each sort was run over a copy of the file three times, and the run time
          > recorded. The results are listed as <total runtime>: <individual runs>
          > (both in seconds), with a nice graph at the end (everyone loves graphs):
          >
          > QSortGenUtils 522: 174 174 174 *************************************
          > QSortWebb 378: 126 126 126 ***************************
          > QSort 372: 124 124 124 **************************
          > QSort2 267: 89 89 89 *******************
          > BISort 108: 36 36 36 *******
          > BISort2 70: 24 23 23 *****
          >
          > As expected, the QSortGenUtils is slowest, but its simplified version
          > (QSort) manages to be slightly faster than QSortWebb (possibly due to
          > the slightly different underlying algorithms?). It pales in comparison
          > to BISort, though, which completes all three runs in less time than a
          > single QSort run.
          >
          > There's still the question of how well BISort and and QSort scale
          > against each other, so i ran single tests (since there doesn't seem to
          > be significant variation between runs) on larger input sizes too:
          >
          > Input Size QSort2 BISort2 Ratio
          > 41,726 (from above) 89 23.3+ 3.81 : 1
          > 62,389 (2.32 MB) 189 38 4.97 : 1
          > 104,115 (3.97 MB) 536 72 7.44 : 1
          >
          > IOW, QSort2 seems to degrade very fast indeed as the input size
          > increases, compared to BISort2.
          >
          > 4. Reality check
          >
          > sort(1) sorts the 104,115 line file above in ~1.5 seconds, compared to
          > BISort2's 72. Nasty.
          >
          > 5. Code
          >
          > Anyway, enough noise, here's the algorithm. If anyone else is
          > interested, i've attached a file containing the rest of the code i used
          > in the benchmarks above.
          >
          >
          > function BISort(start, end, cmp)
          > let compare_ival_mid = 'let diff = '.a:cmp.'(i_val, getline(mid))'
          > let i = a:start + 1
          > while i <= a:end
          > " find insertion point via binary search
          > let i_val = getline(i)
          > let lo = a:start
          > let hi = i
          > while lo < hi
          > let mid = (lo + hi) / 2
          > exec compare_ival_mid
          > if diff < 0
          > let hi = mid
          > else
          > let lo = mid + 1
          > if diff == 0 | break | endif
          > endif
          > endwhile
          > " do insert
          > if lo < i
          > exec i.'d_'
          > call append(lo - 1, i_val)
          > endif
          > let i = i + 1
          > endwhile
          > endfunction
          >
          > function BISort2(start, end)
          > let i = a:start + 1
          > while i <= a:end
          > " find insertion point via binary search
          > let i_val = getline(i)
          > let lo = a:start
          > let hi = i
          > while lo < hi
          > let mid = (lo + hi) / 2
          > let mid_val = getline(mid)
          > if i_val < mid_val
          > let hi = mid
          > else
          > let lo = mid + 1
          > if i_val == mid_val | break | endif
          > endif
          > endwhile
          > " do insert
          > if lo < i
          > exec i.'d_'
          > call append(lo - 1, i_val)
          > endif
          > let i = i + 1
          > endwhile
          > endfunction
          >
          >

          __________________________________
          Do you Yahoo!?
          SBC Yahoo! DSL - Now only $29.95 per month!
          http://sbc.yahoo.com
        Your message has been successfully submitted and would be delivered to recipients shortly.