Date: Tue, 16 Sep 2003 09:34:18 +1000
From: Tim Robbins
To: Kris Kennaway
cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: A new sort utility
Message-ID: <20030915233418.GA13536@dilbert.robbins.dropbear.id.au>
In-Reply-To: <20030915184307.GA6822@rot13.obsecurity.org>
References: <20030915105356.GA11926@dilbert.robbins.dropbear.id.au> <20030915184307.GA6822@rot13.obsecurity.org>

On Mon, Sep 15, 2003 at 11:43:07AM -0700, Kris Kennaway wrote:
> On Mon, Sep 15, 2003 at 08:53:56PM +1000, Tim Robbins wrote:
>
> > It's not quite as fast as the GNU or 4.4BSD sort implementations
>
> Why is this?

Because it spends too much time comparing lines. In particular, it seems
to be spending a lot of time extracting the specified fields from lines,
even when no -k options are specified.
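
To illustrate the kind of overhead described above, here is a minimal sketch in Python (not the code of any of the sort implementations discussed). It assumes a simplified key model in which a key is a single whitespace-separated field; real -k keys can span several fields and carry modifiers. The names make_key and extract_field are hypothetical. The point is only that when no key fields are given, the whole line can serve as the key and the field-splitting work can be skipped entirely.

```python
def extract_field(line, index, sep=None):
    """Return the 1-based field `index` of `line`, or '' if it is missing.

    Simplification (assumption): a key is one whitespace-separated field.
    """
    fields = line.split(sep)
    return fields[index - 1] if index <= len(fields) else ""


def make_key(key_fields=()):
    """Build a sort key function for the given 1-based field numbers."""
    if not key_fields:
        # Fast path: no keys requested, so the whole line is the key
        # and no per-line splitting is done.
        return lambda line: line
    # Slow path: split every line and pick out the requested fields.
    return lambda line: tuple(extract_field(line, i) for i in key_fields)


lines = ["pear 3", "apple 10", "fig 7"]
whole_line_sorted = sorted(lines, key=make_key())
by_second_field = sorted(lines, key=make_key((2,)))
```

Here the second sort compares the fields as strings, so "10" orders before "3"; numeric ordering (as with -n) would need a separate conversion step.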
My sort is also more general than the 4.4BSD implementation, which can't
sort according to the locale's LC_COLLATE settings, and has a lot of
difficulty sorting numbers (with the -n option) properly. If speed were
everything, we'd already be using that one -- it's significantly faster
than GNU.

> I often need to sort huge files, so I'd be reluctant to use an
> implementation with a significant performance penalty.

It would be great if you could compare my sort against GNU on some
real-world data and let me know how it goes.


Tim