Date: Sun, 27 Nov 2016 12:50:05 +0000 From: Tristan Verniquet <tris_vern@hotmail.com> To: Alfred Perlstein <alfred@freebsd.org>, Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>, =?iso-8859-1?Q?Arne_Dag_Fidjest=F8l?= <adf@idi.ntnu.no> Cc: Hans Petter Selasky <hps@selasky.org>, freebsd hackers <freebsd-hackers@freebsd.org> Subject: Re: qsort switching to insertsort Message-ID: <ME1PR01MB054652477380257B4D1D2CAE838B0@ME1PR01MB0546.ausprd01.prod.outlook.com> In-Reply-To: <8b0bd8e7-a77d-85d8-18e7-e1e33fed78f5@freebsd.org> References: <ME1PR01MB0546A92343D712D8439B299C83880@ME1PR01MB0546.ausprd01.prod.outlook.com> <0bfb49b0-5d24-2766-6982-b4e49b0d5e81@selasky.org> <ME1PR01MB0546D041DC160763C5D3689883880@ME1PR01MB0546.ausprd01.prod.outlook.com> <CAOgwaMvUE%2BK==OE3zkpfpXa5kBC1YT4JAUVhujxEt9PcHXgqTw@mail.gmail.com> <B8ECCF6B-6477-441B-87F1-0DB3B36F828E@idi.ntnu.no> <CAOgwaMtsP159jgT4Faxg2jED1sh3N-zHKZRSqQNE0Cfh1NM1eA@mail.gmail.com> <ME1PR01MB054602129230B98C6127F607838B0@ME1PR01MB0546.ausprd01.prod.outlook.com>, <8b0bd8e7-a77d-85d8-18e7-e1e33fed78f5@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
> From: Alfred Perlstein <alfred@freebsd.org> > Sent: Sunday, 27 November 2016 4:40 PM > To: Tristan Verniquet; Mehmet Erol Sanliturk; Arne Dag Fidjest=F8l > Cc: Hans Petter Selasky; freebsd hackers > Subject: Re: qsort switching to insertsort > =A0 =20 > A couple notes: >=20 > 1) It's interesting to me that it's based on a simple heuristic as=20 > opposed to also checking the depth of the recursion within the qsort=20 > itself. >=20 > 2) Wondering if upon detection of this situation it might even be faster= =20 > to perform some randomization (shuffle) of the data and then retry. >=20 > 3) Wondering if there could be a way to return an error and indicate the= =20 > data is unsorted if the corner case is hit allowing the caller to make a= =20 > choice at that point.=A0 Obviously the API would need to be changed. >=20 > I haven't thought too hard on this. Note that the functionalilty in question is a special-case test which has b= een bolted on to the side. Nothing much is really lost by removing it, whic= h according to my first link is what NetBSD has done. I don't think we'll g= et far discussing how to "fix" it as tempting as that is - probably better = left for a very in depth studious well tested effort.. The qsort.c file is only 200 odd lines long. I highly recommend reading it = to see what I mean (ie the case if swap_cnt =3D=3D 0). (The special-case is to turn sorting sorted data from O(nlogn) to O(n), but= exposes a whole new set of average-case O(n^2) cases) > > One thing to keep in mind: Using qsort in a codepath has "deadlines" or=20 > other deterministic needs is not going to work out well.=A0 It's better t= o=20 > use a sort with a known complexity with an upper bound than something=20 > that can be defeated by a pathological input data set. >=20 > -Alfred I guess I have the same comment/question as I do for Hans Petter Selasky's = response. In the cases where qsort would actually be used, don't you think the much h= igher likelyhood of triggering the insertsort worst case especially with so= me types of data makes a big difference? Maybe, I see the insertsort case as similar (though obviously not as likely= to be hit) as the "choose the last element" pivot selection which sorts O(= n^2) on sorted data. If users need to expect O(n^2) anyway, why not keep th= e last element pivot selection? Instead, a pretty good effort has been made= to make sure bad pivots won't be chosen with normal data. That is undermin= ed slightly with the insertsort switch imho. Tristan =
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?ME1PR01MB054652477380257B4D1D2CAE838B0>