Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 18 Aug 1999 16:56:48 -0700 (PDT)
From:      Christopher Seiwald <seiwald@perforce.com>
To:        freebsd-hackers@freebsd.org
Subject:   anybody love qsort.c?
Message-ID:  <199908182356.QAA09187@perforce.com>

next in thread | raw e-mail | index | archive | help
The FreeBSD qsort implementation has a rather nasty degenerate case.
If you have data that partitions exactly about the median pivot, yet
which is unsorted in either partition, you get get treated to an insertion
sort.  Example:

	1 2 3 4 5 6 7 8 9 10 19 18 17 16 14 14 13 12 11

qsort picks 10 as the median, does no swaps on its first pass, and so
decides to switch to an insertion sort.   The idea is that if no swaps
occur, the data is likely to be nearly already sorted and a good candidate
for insertion sort.  This turns out to be a (very) bad idea if you have
some unsorted data buried in the middle of a (very) large dataset.

The implementation claims to come from Bentley and McIlroy's paper,
"Engineering a Sort Function," and this is largely true, but the insertion
sort optimization(?) isn't in that paper.  The GNU qsort.c only does an
insertion sort if it is below a certain threshhold in size, and so never
attempts such a sort on the whole dataset.  (The GNU qsort does a number
of other things pooh-poohed in the Bentley paper.)

It's a pretty straightforward change to bypass the insertion sort for
large subsets of the data.  If no one has a strong love for qsort, I'll
educate myself on how to make and contribute this change.

Christopher
----
Christopher Seiwald     Perforce Software          1-510-864-7400
seiwald@perforce.com     f-f-f-fast SCM   http://www.perforce.com



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199908182356.QAA09187>