Sorry, an error occurred while loading the content.

## New (?) sort implementation for Vim, beats quick sort [long]

Expand Messages
• [ Note: Unless making fast sort algorithm implementations in pure Vim is your idea of fun, this message will probably bore you to tears. You have been
Message 1 of 3 , May 1, 2003
[ Note: Unless making fast sort algorithm implementations in pure Vim
is your idea of fun, this message will probably bore you to tears. You
have been warned. ]

A few days ago, while trying to speed up the various quick sort
implementation available for Vim, i unexpectedly found another algorithm
that completely blows quick sort away on all fronts. I thought this was
cool enough to warrant posting here.

1. Quick description:

The algorithm (which you can find at the end of this mail) is a binary
insertion sort[1] implementation, with a Vim-specific optimization:
instead of doing a whole bunch of linear swaps to move the current line
to the insertion point, it moves the line in one step.

[1] ...which is like an insertion sort, except that a binary search is used
to find the insertion point

2. Detailed description:

In general, BIS minimized the amount of compares need to sort a list,
with the trade-off being the large amount of swaps needed to move
elements into place. In Vim, however, it's practically just as fast to
delete or insert lines as it is to swap them, so you eliminate this
disadvantage by replacing the O(n/4) (if my math isn't off) swaps needed
per element, on average, with one delete/insert operation, which is a
huge improvement, especially as the input size gets larger.

Add the fact that BIS is non-recursive, which avoids Vim's function
calling overhead (and has the added bonus of side-stepping stack
overflows), and you have an algorithm that not only catches up to
pure-Vim quick sorts, but leaves them way behind in the dust.

3. Benchmarks:

First, a description of the various implementations i compared:

- QSortGenUtils: Hari Krishna Dara's quick sort implementation from
genutils.vim, unmodified. Included only for interest's sake; it's not
really a fair comparison, as it uses abstracted accessors/mutators
which significantly slows it down.

- QSortWebb: Slightly edited version of Robert Webb's quick sort
implementation from :h eval.txt (mostly just some renamed variables,
and some minor optimizations such caching the comparison statement).
For some reason, Webb's original script overflows the stack input as
large as the test data i'm using, while this version doesn't. I
haven't managed to isolate what i seem to have accidentally improved,
yet, so i can't benchmark Webb's original implementation too,
unfortunately.

- QSort: Based on Hari Krishna Dara's algorithm above, but with all the
functionality that isn't needed to sort buffer text removed, and the
same minor optimizations as above applied. I used this as a basis for
experimenting with different pivot selection algorithms (pseudo-random,
median-of-3, Bentley-McKilroy 3-way partitioning) and other variations
on the quick sort theme, such as switching to other algorithms when
the partition size gets small, but these all turned out to be slower
than the plain quick sort (probably because the interpretation
overhead actually exceeds the time savings gained by the extra code).
It was while trying out various simpler algorithms for quick sort to
switch over to, however, that i found...

- BISort: The Binary Insertion Sort described at the top.

- QSort2/BISort2: These are the same as above two functions, but edited
to directly compare lines of text instead of going through a a
comparison function, for maximum speed. These should give a more
accurate representation of each algorithm's relative performance.

Test data used: the archives of a (completely randomly selected)
mailing list, concatenated into one file, weighing in at 41,726 lines
(1.65 MB) of text. I think this should provide good spread of different
input data characteristics.

Each sort was run over a copy of the file three times, and the run time
recorded. The results are listed as <total runtime>: <individual runs>
(both in seconds), with a nice graph at the end (everyone loves graphs):

QSortGenUtils 522: 174 174 174 *************************************
QSortWebb 378: 126 126 126 ***************************
QSort 372: 124 124 124 **************************
QSort2 267: 89 89 89 *******************
BISort 108: 36 36 36 *******
BISort2 70: 24 23 23 *****

As expected, the QSortGenUtils is slowest, but its simplified version
(QSort) manages to be slightly faster than QSortWebb (possibly due to
the slightly different underlying algorithms?). It pales in comparison
to BISort, though, which completes all three runs in less time than a
single QSort run.

There's still the question of how well BISort and and QSort scale
against each other, so i ran single tests (since there doesn't seem to
be significant variation between runs) on larger input sizes too:

Input Size QSort2 BISort2 Ratio
41,726 (from above) 89 23.3+ 3.81 : 1
62,389 (2.32 MB) 189 38 4.97 : 1
104,115 (3.97 MB) 536 72 7.44 : 1

IOW, QSort2 seems to degrade very fast indeed as the input size
increases, compared to BISort2.

4. Reality check

sort(1) sorts the 104,115 line file above in ~1.5 seconds, compared to
BISort2's 72. Nasty.

5. Code

Anyway, enough noise, here's the algorithm. If anyone else is
interested, i've attached a file containing the rest of the code i used
in the benchmarks above.

function BISort(start, end, cmp)
let compare_ival_mid = 'let diff = '.a:cmp.'(i_val, getline(mid))'
let i = a:start + 1
while i <= a:end
" find insertion point via binary search
let i_val = getline(i)
let lo = a:start
let hi = i
while lo < hi
let mid = (lo + hi) / 2
exec compare_ival_mid
if diff < 0
let hi = mid
else
let lo = mid + 1
if diff == 0 | break | endif
endif
endwhile
" do insert
if lo < i
exec i.'d_'
call append(lo - 1, i_val)
endif
let i = i + 1
endwhile
endfunction

function BISort2(start, end)
let i = a:start + 1
while i <= a:end
" find insertion point via binary search
let i_val = getline(i)
let lo = a:start
let hi = i
while lo < hi
let mid = (lo + hi) / 2
let mid_val = getline(mid)
if i_val < mid_val
let hi = mid
else
let lo = mid + 1
if i_val == mid_val | break | endif
endif
endwhile
" do insert
if lo < i
exec i.'d_'
call append(lo - 1, i_val)
endif
let i = i + 1
endwhile
endfunction

--
Piet Delport
Today's subliminal thought is:
• ... Hello! In thinking about coding a C version, I believe this new sort of yours most strongly resembles the use of a red-black tree. The problem with coding
Message 2 of 3 , May 1, 2003
Thus saith Piet Delport:
> [1] ...which is like an insertion sort, except that a binary search is used
> to find the insertion point
---------------------------------------------------------------------

Hello!

In thinking about coding a C version, I believe this new sort of yours
most strongly resembles the use of a red-black tree. The problem with
coding this in C/C++ is implementing the search and insertion capability
while retaining lg(n) behavior. (for n insertions, the binary-search
sort has o(n lg(n)) operations, like quicksort and other decent sorting
algorithms). For example, insertion can be done quickly and easily with
doubly-linkd lists but then the binary search can't be done.

A red-black tree (nearly) retains a balanced tree while undergoing
insertions (and deletions, for that matter). Thus with it both the
search and insertion retain lg(n) behavior. In fact, with red-black
trees one merely reads the list and dumps it into the associated
insertion procedure.

Vim hides the work needed to perform insertion while retaining the
ability to do quick binary searches by doing the data structure work
internally. Lisp, actually, would be another language which would
benefit from this sort; again, due to the behind-the-scenes list
management the insertion acts as a low-cost operation.

Still, its a good sort, especially for interpreted languages which
make the search&insertion easy.

Regards,
Chip Campbell

--
Charles E Campbell, Jr, PhD _ __ __
Goddard Space Flight Center / /_/\_\_/ /
cec@... /_/ \/_//_/
PGP public key: http://www.erols.com/astronaut/pgp.html
• Apologies for responding very late to this posting, actually this got lost in my pending mailbox and remembered only recently when there was some reference to
Message 3 of 3 , Jun 25, 2003
Apologies for responding very late to this posting, actually this got
lost in my pending mailbox and remembered only recently when there was
some reference to this email.

If Piet is still listening, I am curious how this new fast algorithm
will respond if a more generic interface is plugged over it (the way I
did in my original simple QSort). I was looking for your attachment with
the rest of the code that you mentioned below, but I think you forgot to
attach it. Is it still possible for you to send the code that you used
as a reference (ie the improved QSort) and the setup that you used to
benchmark different algorithms, if you haven't already lost it?

Thank you,
Hari

On Thu, 1 May 2003 at 11:19am, Piet Delport wrote:

> [ Note: Unless making fast sort algorithm implementations in pure Vim
> is your idea of fun, this message will probably bore you to tears. You
> have been warned. ]
>
> A few days ago, while trying to speed up the various quick sort
> implementation available for Vim, i unexpectedly found another algorithm
> that completely blows quick sort away on all fronts. I thought this was
> cool enough to warrant posting here.
>
> 1. Quick description:
>
> The algorithm (which you can find at the end of this mail) is a binary
> insertion sort[1] implementation, with a Vim-specific optimization:
> instead of doing a whole bunch of linear swaps to move the current line
> to the insertion point, it moves the line in one step.
>
> [1] ...which is like an insertion sort, except that a binary search is used
> to find the insertion point
>
> 2. Detailed description:
>
> In general, BIS minimized the amount of compares need to sort a list,
> with the trade-off being the large amount of swaps needed to move
> elements into place. In Vim, however, it's practically just as fast to
> delete or insert lines as it is to swap them, so you eliminate this
> disadvantage by replacing the O(n/4) (if my math isn't off) swaps needed
> per element, on average, with one delete/insert operation, which is a
> huge improvement, especially as the input size gets larger.
>
> Add the fact that BIS is non-recursive, which avoids Vim's function
> calling overhead (and has the added bonus of side-stepping stack
> overflows), and you have an algorithm that not only catches up to
> pure-Vim quick sorts, but leaves them way behind in the dust.
>
> 3. Benchmarks:
>
> First, a description of the various implementations i compared:
>
> - QSortGenUtils: Hari Krishna Dara's quick sort implementation from
> genutils.vim, unmodified. Included only for interest's sake; it's not
> really a fair comparison, as it uses abstracted accessors/mutators
> which significantly slows it down.
>
> - QSortWebb: Slightly edited version of Robert Webb's quick sort
> implementation from :h eval.txt (mostly just some renamed variables,
> and some minor optimizations such caching the comparison statement).
> For some reason, Webb's original script overflows the stack input as
> large as the test data i'm using, while this version doesn't. I
> haven't managed to isolate what i seem to have accidentally improved,
> yet, so i can't benchmark Webb's original implementation too,
> unfortunately.
>
> - QSort: Based on Hari Krishna Dara's algorithm above, but with all the
> functionality that isn't needed to sort buffer text removed, and the
> same minor optimizations as above applied. I used this as a basis for
> experimenting with different pivot selection algorithms (pseudo-random,
> median-of-3, Bentley-McKilroy 3-way partitioning) and other variations
> on the quick sort theme, such as switching to other algorithms when
> the partition size gets small, but these all turned out to be slower
> than the plain quick sort (probably because the interpretation
> overhead actually exceeds the time savings gained by the extra code).
> It was while trying out various simpler algorithms for quick sort to
> switch over to, however, that i found...
>
> - BISort: The Binary Insertion Sort described at the top.
>
> - QSort2/BISort2: These are the same as above two functions, but edited
> to directly compare lines of text instead of going through a a
> comparison function, for maximum speed. These should give a more
> accurate representation of each algorithm's relative performance.
>
> Test data used: the archives of a (completely randomly selected)
> mailing list, concatenated into one file, weighing in at 41,726 lines
> (1.65 MB) of text. I think this should provide good spread of different
> input data characteristics.
>
> Each sort was run over a copy of the file three times, and the run time
> recorded. The results are listed as <total runtime>: <individual runs>
> (both in seconds), with a nice graph at the end (everyone loves graphs):
>
> QSortGenUtils 522: 174 174 174 *************************************
> QSortWebb 378: 126 126 126 ***************************
> QSort 372: 124 124 124 **************************
> QSort2 267: 89 89 89 *******************
> BISort 108: 36 36 36 *******
> BISort2 70: 24 23 23 *****
>
> As expected, the QSortGenUtils is slowest, but its simplified version
> (QSort) manages to be slightly faster than QSortWebb (possibly due to
> the slightly different underlying algorithms?). It pales in comparison
> to BISort, though, which completes all three runs in less time than a
> single QSort run.
>
> There's still the question of how well BISort and and QSort scale
> against each other, so i ran single tests (since there doesn't seem to
> be significant variation between runs) on larger input sizes too:
>
> Input Size QSort2 BISort2 Ratio
> 41,726 (from above) 89 23.3+ 3.81 : 1
> 62,389 (2.32 MB) 189 38 4.97 : 1
> 104,115 (3.97 MB) 536 72 7.44 : 1
>
> IOW, QSort2 seems to degrade very fast indeed as the input size
> increases, compared to BISort2.
>
> 4. Reality check
>
> sort(1) sorts the 104,115 line file above in ~1.5 seconds, compared to
> BISort2's 72. Nasty.
>
> 5. Code
>
> Anyway, enough noise, here's the algorithm. If anyone else is
> interested, i've attached a file containing the rest of the code i used
> in the benchmarks above.
>
>
> function BISort(start, end, cmp)
> let compare_ival_mid = 'let diff = '.a:cmp.'(i_val, getline(mid))'
> let i = a:start + 1
> while i <= a:end
> " find insertion point via binary search
> let i_val = getline(i)
> let lo = a:start
> let hi = i
> while lo < hi
> let mid = (lo + hi) / 2
> exec compare_ival_mid
> if diff < 0
> let hi = mid
> else
> let lo = mid + 1
> if diff == 0 | break | endif
> endif
> endwhile
> " do insert
> if lo < i
> exec i.'d_'
> call append(lo - 1, i_val)
> endif
> let i = i + 1
> endwhile
> endfunction
>
> function BISort2(start, end)
> let i = a:start + 1
> while i <= a:end
> " find insertion point via binary search
> let i_val = getline(i)
> let lo = a:start
> let hi = i
> while lo < hi
> let mid = (lo + hi) / 2
> let mid_val = getline(mid)
> if i_val < mid_val
> let hi = mid
> else
> let lo = mid + 1
> if i_val == mid_val | break | endif
> endif
> endwhile
> " do insert
> if lo < i
> exec i.'d_'
> call append(lo - 1, i_val)
> endif
> let i = i + 1
> endwhile
> endfunction
>
>

__________________________________
Do you Yahoo!?
SBC Yahoo! DSL - Now only \$29.95 per month!
http://sbc.yahoo.com
Your message has been successfully submitted and would be delivered to recipients shortly.