From owner-freebsd-current@freebsd.org Wed Apr 20 06:41:54 2016 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3855CB15AA5 for ; Wed, 20 Apr 2016 06:41:54 +0000 (UTC) (envelope-from hps@selasky.org) Received: from mail.turbocat.net (heidi.turbocat.net [88.198.202.214]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 01E3F1D7A for ; Wed, 20 Apr 2016 06:41:53 +0000 (UTC) (envelope-from hps@selasky.org) Received: from laptop015.home.selasky.org (unknown [62.141.129.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 884D41FE024; Wed, 20 Apr 2016 08:41:44 +0200 (CEST) Subject: Re: qsort() documentation To: Warren Block , Aleksander Alekseev , FreeBSD Current References: <5714C86A.8050204@selasky.org> <20160419113430.69e41a0b@fujitsu> From: Hans Petter Selasky Message-ID: <5717256C.5080805@selasky.org> Date: Wed, 20 Apr 2016 08:45:00 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Apr 2016 06:41:54 -0000 On 04/20/16 06:01, Warren Block wrote: > On Tue, 19 Apr 2016, Aleksander Alekseev wrote: > >>> Why Wikipedia, specifically? There are a lot of places that describe >>> quicksort. How about just >>> >>> Note: This implementation of qsort() is designed to avoid the >>> worst-case complexity of N**2 that is often seen with standard >>> versions. >> >> I would say that this statement is just false. Worst-case complexity is >> still N**2. How about something like: >> >> """ >> This implementation of qsort() has worst case complexity of N**2. >> However measures were taken that make it very unlikely that for some >> random input N**2 swaps will be made. It's still possible to generate >> such an input on purpose though. See code below for more details. >> """ > > Okay: > > The quicksort algorithm worst-case is O(N**2), but this implementation > has been designed to avoid that for most real data. Hi, There is something which I don't understand. Why is quicksort falling back to insertion sort which is an O(N**2) algorithm, when there exist a O(log(N)*log(N)*N) algorithms, which I propose as a solution to the "bad" characteristics of qsort. The replacement algorithm I propose does not allocate working memory and it does not recurse and it does not use a significant amount of stack space. Maybe some number theorists want to have a look? I cannot seem to find it anywhere public. See here: https://reviews.freebsd.org/D5241 Local test results w/D5241 running statically in userspace: > Data set size log2(N) qsort w/insert sort qsort w/hpsort mergesort data set > 10 1.28 1.12 1.34 random0 > 11 2.42 2.43 2.83 random0 > 12 5.21 5.2 6.1 random0 > > 10 1.26 1.14 1.44 random1 > 11 2.46 2.46 3.05 random1 > 12 5.28 5.29 6.56 random1 > > 10 10.01 5.1 0.2 bad0 > 11 39.12 12.11 0.33 bad0 > 12 156.54 28.42 0.58 bad0 New algorithm is in the middle column. Times are in seconds. Bad0 data set: > http://calmerthanyouare.org/2014/06/11/algorithmic-complexity-attacks-and-libc-qsort.html --HPS