Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 27 Nov 2016 12:50:05 +0000
From:      Tristan Verniquet <tris_vern@hotmail.com>
To:        Alfred Perlstein <alfred@freebsd.org>, Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>, =?iso-8859-1?Q?Arne_Dag_Fidjest=F8l?= <adf@idi.ntnu.no>
Cc:        Hans Petter Selasky <hps@selasky.org>, freebsd hackers <freebsd-hackers@freebsd.org>
Subject:   Re: qsort switching to insertsort
Message-ID:  <ME1PR01MB054652477380257B4D1D2CAE838B0@ME1PR01MB0546.ausprd01.prod.outlook.com>
In-Reply-To: <8b0bd8e7-a77d-85d8-18e7-e1e33fed78f5@freebsd.org>
References:  <ME1PR01MB0546A92343D712D8439B299C83880@ME1PR01MB0546.ausprd01.prod.outlook.com> <0bfb49b0-5d24-2766-6982-b4e49b0d5e81@selasky.org> <ME1PR01MB0546D041DC160763C5D3689883880@ME1PR01MB0546.ausprd01.prod.outlook.com> <CAOgwaMvUE%2BK==OE3zkpfpXa5kBC1YT4JAUVhujxEt9PcHXgqTw@mail.gmail.com> <B8ECCF6B-6477-441B-87F1-0DB3B36F828E@idi.ntnu.no> <CAOgwaMtsP159jgT4Faxg2jED1sh3N-zHKZRSqQNE0Cfh1NM1eA@mail.gmail.com> <ME1PR01MB054602129230B98C6127F607838B0@ME1PR01MB0546.ausprd01.prod.outlook.com>, <8b0bd8e7-a77d-85d8-18e7-e1e33fed78f5@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
> From: Alfred Perlstein <alfred@freebsd.org>
> Sent: Sunday, 27 November 2016 4:40 PM
> To: Tristan Verniquet; Mehmet Erol Sanliturk; Arne Dag Fidjest=F8l
> Cc: Hans Petter Selasky; freebsd hackers
> Subject: Re: qsort switching to insertsort
> =A0  =20
> A couple notes:
>=20
> 1) It's interesting to me that it's based on a simple heuristic as=20
> opposed to also checking the depth of the recursion within the qsort=20
> itself.
>=20
> 2) Wondering if upon detection of this situation it might even be faster=
=20
> to perform some randomization (shuffle) of the data and then retry.
>=20
> 3) Wondering if there could be a way to return an error and indicate the=
=20
> data is unsorted if the corner case is hit allowing the caller to make a=
=20
> choice at that point.=A0 Obviously the API would need to be changed.
>=20
> I haven't thought too hard on this.


Note that the functionalilty in question is a special-case test which has b=
een bolted on to the side. Nothing much is really lost by removing it, whic=
h according to my first link is what NetBSD has done. I don't think we'll g=
et far discussing how to "fix" it as tempting as that is - probably better =
left for a very in depth studious well tested effort..

The qsort.c file is only 200 odd lines long. I highly recommend reading it =
to see what I mean (ie the case if swap_cnt =3D=3D 0).

(The special-case is to turn sorting sorted data from O(nlogn) to O(n), but=
 exposes a whole new set of average-case O(n^2) cases)

>
> One thing to keep in mind: Using qsort in a codepath has "deadlines" or=20
> other deterministic needs is not going to work out well.=A0 It's better t=
o=20
> use a sort with a known complexity with an upper bound than something=20
> that can be defeated by a pathological input data set.
>=20
> -Alfred


I guess I have the same comment/question as I do for Hans Petter Selasky's =
response.

In the cases where qsort would actually be used, don't you think the much h=
igher likelyhood of triggering the insertsort worst case especially with so=
me types of data makes a big difference?

Maybe, I see the insertsort case as similar (though obviously not as likely=
 to be hit) as the "choose the last element" pivot selection which sorts O(=
n^2) on sorted data. If users need to expect O(n^2) anyway, why not keep th=
e last element pivot selection? Instead, a pretty good effort has been made=
 to make sure bad pivots won't be chosen with normal data. That is undermin=
ed slightly with the insertsort switch imho.


Tristan

    =



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?ME1PR01MB054652477380257B4D1D2CAE838B0>