From owner-freebsd-hackers@freebsd.org Sat Nov 26 10:28:00 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1485AC5574E for ; Sat, 26 Nov 2016 10:28:00 +0000 (UTC) (envelope-from tris_vern@hotmail.com) Received: from COL004-OMC3S3.hotmail.com (col004-omc3s3.hotmail.com [65.55.34.141]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (Client CN "*.outlook.com", Issuer "Microsoft IT SSL SHA2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CB7A89CB for ; Sat, 26 Nov 2016 10:27:59 +0000 (UTC) (envelope-from tris_vern@hotmail.com) Received: from AUS01-ME1-obe.outbound.protection.outlook.com ([65.55.34.137]) by COL004-OMC3S3.hotmail.com over TLS secured channel with Microsoft SMTPSVC(7.5.7601.23008); Sat, 26 Nov 2016 02:26:53 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=MLEUw1gM3IpwNNsIiXV8PljCrJc39mjatakz2a7gaAI=; b=RtRHmt9U8wNgDbaYuATc0VOJXyok6OOa3g6yrX6QEZaFVL6hqzjPh/OZoQMED2L7YsfEvf33xogKCYyf9XvVDCHeUZhwythHAKNHVjHvNFunPaSKnw9S5nwQ093UmRp8y3sHwQeix2WXCoFVDEdExZSVOcan3jszHITQYJ7NiUDRjpnBj/EtvXDUL/8tZbSZHT7Q6qocG/uGDJ+HHh0KP52xKVfBgTcrsEL4Fo3Lf7arlBCv2J5Y6fLZy/rC9Lw0aZCMgkWNutP1jAt7oV0xpuj56UNMpEjjnQ0iHdkkj1XvZ7JPxKlj6XijbCHuIkU1+lTlp7RggAZ5cAd20wZHoQ== Received: from ME1AUS01FT008.eop-AUS01.prod.protection.outlook.com (10.152.232.54) by ME1AUS01HT010.eop-AUS01.prod.protection.outlook.com (10.152.232.115) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.734.4; Sat, 26 Nov 2016 10:26:50 +0000 Received: from ME1PR01MB0546.ausprd01.prod.outlook.com (10.152.232.56) by ME1AUS01FT008.mail.protection.outlook.com (10.152.232.117) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.734.4 via Frontend Transport; Sat, 26 Nov 2016 10:26:50 +0000 Received: from ME1PR01MB0546.ausprd01.prod.outlook.com ([10.162.68.22]) by ME1PR01MB0546.ausprd01.prod.outlook.com ([10.162.68.22]) with mapi id 15.01.0734.014; Sat, 26 Nov 2016 10:26:50 +0000 From: Tristan Verniquet To: freebsd hackers Subject: qsort switching to insertsort Thread-Topic: qsort switching to insertsort Thread-Index: AQHSR8zfjghRy3NqhkaB3zSchVE3aA== Date: Sat, 26 Nov 2016 10:26:50 +0000 Message-ID: Accept-Language: en-AU, en-US Content-Language: en-AU X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: freebsd.org; dkim=none (message not signed) header.d=none;freebsd.org; dmarc=none action=none header.from=hotmail.com; x-incomingtopheadermarker: OriginalChecksum:; UpperCasedChecksum:; SizeAsReceived:7238; Count:37 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [aGg+A/MuAofWIcirsUxQNiQZrscfv7l0] x-incomingheadercount: 37 x-eopattributedmessage: 0 x-microsoft-exchange-diagnostics: 1; ME1AUS01HT010; 5:MierluyLBRA/9dHqSY9iq/qN81/2OOPHXUFFrDENRIUgKjTJpFMuaHBKRR2uNTHV3wUIoTmygpeY1lRiPTlCPx4TU/FLlb6nPeYKAZXUiMfgcutAYYfMEFjZGxGNZdxsPmwln/ZLNoR1FSDoNUDmPw==; 24:YsMEewgZp8Yc+tHJ+iXXkFslH4T7ro9PGPNBVK6WjDQSAU7EIUtIW4bQDiae4MaLEnQRm631VwlbEDVV9uoTHpNadsT5NsG/9e7/wG9rAFg=; 7:IuH4RG/rLm2P8DkroWzpcYn+GBwpjDEaABkG3YGK923JNzahX3b06LxfxDXBYIlOGw0IRy08qTGSnVQ5md0I4kV1J0myd3fAPgt9wRF4pk6QbChG2WcrCKL7BFfTs1nRKcCmcraPmpXgOm4hN0nGkLjjrvLARrd56xsLqr0eWfLQ7s2KWvJ2+eWA9qnKvp8mcPINGEmTpa2xsYO3dXol/I1tp1G0LZu9aHcfzAk7iPOkyXgmNAnuLGJx9ecGwRb6sf8l4EJErU8YaPjQvyuVXxDmdRgkeYEpbFEjtmgFdBovN0W04UitiUFw0BO7Hth+plLXJMRlFK5wFhaLfVGloUaUzfwKWw5S2hnP5/ue4E8= x-forefront-antispam-report: EFV:NLI; SFV:NSPM; SFS:(10019020)(98900003); DIR:OUT; SFP:1102; SCL:1; SRVR:ME1AUS01HT010; H:ME1PR01MB0546.ausprd01.prod.outlook.com; FPR:; SPF:None; LANG:en; x-ms-office365-filtering-correlation-id: 47f2e4df-341b-4f75-aeed-08d415e6bc00 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(22001)(1601124038)(1603103113)(1603101340)(1601125047); SRVR:ME1AUS01HT010; x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(432015012)(82015046); SRVR:ME1AUS01HT010; BCL:0; PCL:0; RULEID:; SRVR:ME1AUS01HT010; x-forefront-prvs: 0138CD935C spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-originalarrivaltime: 26 Nov 2016 10:26:50.4613 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: ME1AUS01HT010 X-OriginalArrivalTime: 26 Nov 2016 10:26:53.0518 (UTC) FILETIME=[9B391EE0:01D247CF] Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Nov 2016 10:28:00 -0000 We recently hit an issue with qsort where it was taking 2 minutes to sort 5= 00k entries. Changing to mergesort was comparatively instantaneous (which i= s what was expected). This was trusted data and shouldn't have been failing= due to pivot choice. After investigation, we discovered we'd been caught by a quirk of FreeBSD's= qsort where it will switch to insertsort (even for the whole array) if a p= ass matches a very simplistic heuristic of doing no swaps. This has previously been written up about in the following links: http://calmerthanyouare.org/2013/05/31/qsort-shootout.html and http://calmerthanyouare.org/2014/06/11/algorithmic-complexity-attacks-and-l= ibc-qsort.html The articles above focus on the security aspect of quicksort, and that all = implementations are vulnerable, however what seems to be missed in the Free= BSD case is how simple it is to hit this (really quite bad) worst case scen= ario with everyday data, as we did. This discovery is enough for us to not = want to use this version of qsort. FreeBSD does provide heapsort and mergesort as alternatives, but not merges= ort_r(). Also, as a developer, having these options gives the impression th= at choosing qsort will give me a quicksort-like implementation with the tra= de-offs that come with that, but it is not obvious that I might be given in= sertsort. The easiest way forward for us is probably to comment out the offending cod= e. But I haven't been able to find much discussion on it. I'm not sure how wel= l known the quirk is. I'm not sure of the rationality for it in the first p= lace (obviously a speedup, but whether it was considered alongside the down= falls), or what other peoples opinions are. So I thought I'd ask. I've done up a simple test case that is similar to our original usage that = shows why we were likely to always trigger an insertsort. It has 2 sort fie= lds in a 1->N relationship with the second fields sorted within the first, = but the topmost fields slightly out of order. I imagine there are lots of o= ther "almost sorted" scenarios that could end up being quite bad. http://tverniquet.com/whirlpool/test_qsort/test_qsort.c Regards, Tristan