lost in my pending mailbox and remembered only recently when there was

some reference to this email.

If Piet is still listening, I am curious how this new fast algorithm

will respond if a more generic interface is plugged over it (the way I

did in my original simple QSort). I was looking for your attachment with

the rest of the code that you mentioned below, but I think you forgot to

attach it. Is it still possible for you to send the code that you used

as a reference (ie the improved QSort) and the setup that you used to

benchmark different algorithms, if you haven't already lost it?

Thank you,

Hari

On Thu, 1 May 2003 at 11:19am, Piet Delport wrote:

> [ Note: Unless making fast sort algorithm implementations in pure Vim

> is your idea of fun, this message will probably bore you to tears. You

> have been warned. ]

>

> A few days ago, while trying to speed up the various quick sort

> implementation available for Vim, i unexpectedly found another algorithm

> that completely blows quick sort away on all fronts. I thought this was

> cool enough to warrant posting here.

>

> 1. Quick description:

>

> The algorithm (which you can find at the end of this mail) is a binary

> insertion sort[1] implementation, with a Vim-specific optimization:

> instead of doing a whole bunch of linear swaps to move the current line

> to the insertion point, it moves the line in one step.

>

> [1] ...which is like an insertion sort, except that a binary search is used

> to find the insertion point

>

> 2. Detailed description:

>

> In general, BIS minimized the amount of compares need to sort a list,

> with the trade-off being the large amount of swaps needed to move

> elements into place. In Vim, however, it's practically just as fast to

> delete or insert lines as it is to swap them, so you eliminate this

> disadvantage by replacing the O(n/4) (if my math isn't off) swaps needed

> per element, on average, with one delete/insert operation, which is a

> huge improvement, especially as the input size gets larger.

>

> Add the fact that BIS is non-recursive, which avoids Vim's function

> calling overhead (and has the added bonus of side-stepping stack

> overflows), and you have an algorithm that not only catches up to

> pure-Vim quick sorts, but leaves them way behind in the dust.

>

> 3. Benchmarks:

>

> First, a description of the various implementations i compared:

>

> - QSortGenUtils: Hari Krishna Dara's quick sort implementation from

> genutils.vim, unmodified. Included only for interest's sake; it's not

> really a fair comparison, as it uses abstracted accessors/mutators

> which significantly slows it down.

>

> - QSortWebb: Slightly edited version of Robert Webb's quick sort

> implementation from :h eval.txt (mostly just some renamed variables,

> and some minor optimizations such caching the comparison statement).

> For some reason, Webb's original script overflows the stack input as

> large as the test data i'm using, while this version doesn't. I

> haven't managed to isolate what i seem to have accidentally improved,

> yet, so i can't benchmark Webb's original implementation too,

> unfortunately.

>

> - QSort: Based on Hari Krishna Dara's algorithm above, but with all the

> functionality that isn't needed to sort buffer text removed, and the

> same minor optimizations as above applied. I used this as a basis for

> experimenting with different pivot selection algorithms (pseudo-random,

> median-of-3, Bentley-McKilroy 3-way partitioning) and other variations

> on the quick sort theme, such as switching to other algorithms when

> the partition size gets small, but these all turned out to be slower

> than the plain quick sort (probably because the interpretation

> overhead actually exceeds the time savings gained by the extra code).

> It was while trying out various simpler algorithms for quick sort to

> switch over to, however, that i found...

>

> - BISort: The Binary Insertion Sort described at the top.

>

> - QSort2/BISort2: These are the same as above two functions, but edited

> to directly compare lines of text instead of going through a a

> comparison function, for maximum speed. These should give a more

> accurate representation of each algorithm's relative performance.

>

> Test data used: the archives of a (completely randomly selected)

> mailing list, concatenated into one file, weighing in at 41,726 lines

> (1.65 MB) of text. I think this should provide good spread of different

> input data characteristics.

>

> Each sort was run over a copy of the file three times, and the run time

> recorded. The results are listed as <total runtime>: <individual runs>

> (both in seconds), with a nice graph at the end (everyone loves graphs):

>

> QSortGenUtils 522: 174 174 174 *************************************

> QSortWebb 378: 126 126 126 ***************************

> QSort 372: 124 124 124 **************************

> QSort2 267: 89 89 89 *******************

> BISort 108: 36 36 36 *******

> BISort2 70: 24 23 23 *****

>

> As expected, the QSortGenUtils is slowest, but its simplified version

> (QSort) manages to be slightly faster than QSortWebb (possibly due to

> the slightly different underlying algorithms?). It pales in comparison

> to BISort, though, which completes all three runs in less time than a

> single QSort run.

>

> There's still the question of how well BISort and and QSort scale

> against each other, so i ran single tests (since there doesn't seem to

> be significant variation between runs) on larger input sizes too:

>

> Input Size QSort2 BISort2 Ratio

> 41,726 (from above) 89 23.3+ 3.81 : 1

> 62,389 (2.32 MB) 189 38 4.97 : 1

> 104,115 (3.97 MB) 536 72 7.44 : 1

>

> IOW, QSort2 seems to degrade very fast indeed as the input size

> increases, compared to BISort2.

>

> 4. Reality check

>

> sort(1) sorts the 104,115 line file above in ~1.5 seconds, compared to

> BISort2's 72. Nasty.

>

> 5. Code

>

> Anyway, enough noise, here's the algorithm. If anyone else is

> interested, i've attached a file containing the rest of the code i used

> in the benchmarks above.

>

>

> function BISort(start, end, cmp)

> let compare_ival_mid = 'let diff = '.a:cmp.'(i_val, getline(mid))'

> let i = a:start + 1

> while i <= a:end

> " find insertion point via binary search

> let i_val = getline(i)

> let lo = a:start

> let hi = i

> while lo < hi

> let mid = (lo + hi) / 2

> exec compare_ival_mid

> if diff < 0

> let hi = mid

> else

> let lo = mid + 1

> if diff == 0 | break | endif

> endif

> endwhile

> " do insert

> if lo < i

> exec i.'d_'

> call append(lo - 1, i_val)

> endif

> let i = i + 1

> endwhile

> endfunction

>

> function BISort2(start, end)

> let i = a:start + 1

> while i <= a:end

> " find insertion point via binary search

> let i_val = getline(i)

> let lo = a:start

> let hi = i

> while lo < hi

> let mid = (lo + hi) / 2

> let mid_val = getline(mid)

> if i_val < mid_val

> let hi = mid

> else

> let lo = mid + 1

> if i_val == mid_val | break | endif

> endif

> endwhile

> " do insert

> if lo < i

> exec i.'d_'

> call append(lo - 1, i_val)

> endif

> let i = i + 1

> endwhile

> endfunction

>

>

__________________________________

Do you Yahoo!?

SBC Yahoo! DSL - Now only $29.95 per month!

http://sbc.yahoo.com