- I originally sent the message below to vim@; since then, there have been

replies addressed here, which created some confusion. This time, i'm

also including the mentioned attachment, which i forgot to actually

attach in the original post.

----- Forwarded message from Piet Delport <pjd@...> -----

[ Note: Unless making fast sort algorithm implementations in pure Vim

is your idea of fun, this message will probably bore you to tears. You

have been warned. ]

A few days ago, while trying to speed up the various quick sort

implementation available for Vim, i unexpectedly found another algorithm

that completely blows quick sort away on all fronts. I thought this was

cool enough to warrant posting here.

1. Quick description:

The algorithm (which you can find at the end of this mail) is a binary

insertion sort[1] implementation, with a Vim-specific optimization:

instead of doing a whole bunch of linear swaps to move the current line

to the insertion point, it moves the line in one step.

[1] ...which is like an insertion sort, except that a binary search is used

to find the insertion point

2. Detailed description:

In general, BIS minimized the amount of compares need to sort a list,

with the trade-off being the large amount of swaps needed to move

elements into place. In Vim, however, it's practically just as fast to

delete or insert lines as it is to swap them, so you eliminate this

disadvantage by replacing the O(n/4) (if my math isn't off) swaps needed

per element, on average, with one delete/insert operation, which is a

huge improvement, especially as the input size gets larger.

Add the fact that BIS is non-recursive, which avoids Vim's function

calling overhead (and has the added bonus of side-stepping stack

overflows), and you have an algorithm that not only catches up to

pure-Vim quick sorts, but leaves them way behind in the dust.

3. Benchmarks:

First, a description of the various implementations i compared:

- QSortGenUtils: Hari Krishna Dara's quick sort implementation from

genutils.vim, unmodified. Included only for interest's sake; it's not

really a fair comparison, as it uses abstracted accessors/mutators

which significantly slows it down.

- QSortWebb: Slightly edited version of Robert Webb's quick sort

implementation from :h eval.txt (mostly just some renamed variables,

and some minor optimizations such caching the comparison statement).

For some reason, Webb's original script overflows the stack input as

large as the test data i'm using, while this version doesn't. I

haven't managed to isolate what i seem to have accidentally improved,

yet, so i can't benchmark Webb's original implementation too,

unfortunately.

- QSort: Based on Hari Krishna Dara's algorithm above, but with all the

functionality that isn't needed to sort buffer text removed, and the

same minor optimizations as above applied. I used this as a basis for

experimenting with different pivot selection algorithms (pseudo-random,

median-of-3, Bentley-McKilroy 3-way partitioning) and other variations

on the quick sort theme, such as switching to other algorithms when

the partition size gets small, but these all turned out to be slower

than the plain quick sort (probably because the interpretation

overhead actually exceeds the time savings gained by the extra code).

It was while trying out various simpler algorithms for quick sort to

switch over to, however, that i found...

- BISort: The Binary Insertion Sort described at the top.

- QSort2/BISort2: These are the same as above two functions, but edited

to directly compare lines of text instead of going through a a

comparison function, for maximum speed. These should give a more

accurate representation of each algorithm's relative performance.

Test data used: the archives of a (completely randomly selected)

mailing list, concatenated into one file, weighing in at 41,726 lines

(1.65 MB) of text. I think this should provide good spread of different

input data characteristics.

Each sort was run over a copy of the file three times, and the run time

recorded. The results are listed as <total runtime>: <individual runs>

(both in seconds), with a nice graph at the end (everyone loves graphs):

QSortGenUtils 522: 174 174 174 *************************************

QSortWebb 378: 126 126 126 ***************************

QSort 372: 124 124 124 **************************

QSort2 267: 89 89 89 *******************

BISort 108: 36 36 36 *******

BISort2 70: 24 23 23 *****

As expected, the QSortGenUtils is slowest, but its simplified version

(QSort) manages to be slightly faster than QSortWebb (possibly due to

the slightly different underlying algorithms?). It pales in comparison

to BISort, though, which completes all three runs in less time than a

single QSort run.

There's still the question of how well BISort and and QSort scale

against each other, so i ran single tests (since there doesn't seem to

be significant variation between runs) on larger input sizes too:

Input Size QSort2 BISort2 Ratio

41,726 (from above) 89 23.3+ 3.81 : 1

62,389 (2.32 MB) 189 38 4.97 : 1

104,115 (3.97 MB) 536 72 7.44 : 1

IOW, QSort2 seems to degrade very fast indeed as the input size

increases, compared to BISort2.

4. Reality check

sort(1) sorts the 104,115 line file above in ~1.5 seconds, compared to

BISort2's 72. Nasty.

5. Code

Anyway, enough noise, here's the algorithm. If anyone else is

interested, i've attached a file containing the rest of the code i used

in the benchmarks above.

function BISort(start, end, cmp)

let compare_ival_mid = 'let diff = '.a:cmp.'(i_val, getline(mid))'

let i = a:start + 1

while i <= a:end

" find insertion point via binary search

let i_val = getline(i)

let lo = a:start

let hi = i

while lo < hi

let mid = (lo + hi) / 2

exec compare_ival_mid

if diff < 0

let hi = mid

else

let lo = mid + 1

if diff == 0 | break | endif

endif

endwhile

" do insert

if lo < i

exec i.'d_'

call append(lo - 1, i_val)

endif

let i = i + 1

endwhile

endfunction

function BISort2(start, end)

let i = a:start + 1

while i <= a:end

" find insertion point via binary search

let i_val = getline(i)

let lo = a:start

let hi = i

while lo < hi

let mid = (lo + hi) / 2

let mid_val = getline(mid)

if i_val < mid_val

let hi = mid

else

let lo = mid + 1

if i_val == mid_val | break | endif

endif

endwhile

" do insert

if lo < i

exec i.'d_'

call append(lo - 1, i_val)

endif

let i = i + 1

endwhile

endfunction

--

Piet Delport

Today's subliminal thought is: - On Fri, 02 May 2003 at 12:41:35 +0200, Piet Delport wrote:
>

[...]

> This time, i'm also including the mentioned attachment, which i forgot

OK, find sortbench.vim, containing the functions used in the original

> to actually attach in the original post.

post, attached to *this* message. Someone be so kind as to pass me my

Fool's Guild hat, silly shoes and diploma, please?

--

Piet Delport

Today's subliminal thought is: