From owner-freebsd-hackers@FreeBSD.ORG  Mon Sep 15 16:34:22 2003
Date: Tue, 16 Sep 2003 09:34:18 +1000
From: Tim Robbins <tjr@FreeBSD.ORG>
To: Kris Kennaway <kris@obsecurity.org>
Message-ID: <20030915233418.GA13536@dilbert.robbins.dropbear.id.au>
References: <20030915105356.GA11926@dilbert.robbins.dropbear.id.au>
	<20030915184307.GA6822@rot13.obsecurity.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030915184307.GA6822@rot13.obsecurity.org>
User-Agent: Mutt/1.4.1i
cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: A new sort utility

On Mon, Sep 15, 2003 at 11:43:07AM -0700, Kris Kennaway wrote:

> On Mon, Sep 15, 2003 at 08:53:56PM +1000, Tim Robbins wrote:
> 
> > It's not quite as fast as the GNU or 4.4BSD sort implementations
> 
> Why is this?

Because it spends too much time comparing lines. In particular, it seems to be
spending a lot of time extracting the specified fields from lines, even when
no -k options are specified.
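
As a rough illustration of the kind of fast path that would avoid this (a
sketch only, not code from any of the sort implementations discussed here),
a comparison routine can skip field extraction entirely when no key was
requested. The names sort_opts, field_start and compare_lines are
hypothetical, and the key handling is simplified to "compare from the start
of one blank-separated field to the end of the line".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical options: key_field is 0 when no -k option was given. */
struct sort_opts {
	size_t key_field;	/* 1-based field number, 0 = whole line */
};

/* Return a pointer to the start of the given 1-based blank-separated field. */
static const char *
field_start(const char *s, size_t field)
{
	while (field-- > 1) {
		s += strspn(s, " \t");	/* skip blanks before the field */
		s += strcspn(s, " \t");	/* skip the field itself */
	}
	return s + strspn(s, " \t");
}

/* Compare two lines, extracting fields only when a key was requested. */
static int
compare_lines(const char *a, const char *b, const struct sort_opts *o)
{
	if (o->key_field == 0)
		return strcoll(a, b);	/* fast path: no field extraction */
	return strcoll(field_start(a, o->key_field),
	    field_start(b, o->key_field));
}
```

With key_field set to 0 the cost per comparison is a single strcoll() call;
the field scan is only paid for when a key is actually in use.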

It's also more general than the 4.4BSD implementation, which can't sort
according to the locale's LC_COLLATE settings, and has a lot of difficulty
sorting numbers (with the -n option) properly. If speed were everything, we'd
already be using the 4.4BSD sort -- it's significantly faster than GNU.
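
To make the LC_COLLATE point concrete, here is a minimal sketch of the
difference between byte-order and locale-aware comparison. It is an
illustration under stated assumptions rather than code from any of these
implementations: cmp_bytes, cmp_collate and sort_lines are hypothetical
names, and the caller is assumed to have called setlocale(LC_COLLATE, "")
beforehand if it wants the user's locale to take effect.

```c
#include <stdlib.h>
#include <string.h>

/* Byte-order comparison: ignores the locale entirely. */
static int
cmp_bytes(const void *a, const void *b)
{
	return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Locale-aware comparison: honours the current LC_COLLATE setting. */
static int
cmp_collate(const void *a, const void *b)
{
	return strcoll(*(char *const *)a, *(char *const *)b);
}

/* Sort an array of line pointers with either comparison. */
static void
sort_lines(char **lines, size_t n, int use_locale)
{
	qsort(lines, n, sizeof(*lines), use_locale ? cmp_collate : cmp_bytes);
}
```

In the default "C" locale the two comparisons give the same order; they only
diverge once a locale with its own collation rules has been selected, and
strcoll() is generally the more expensive of the two.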

> I often need to sort huge files, so I'd be reluctant to use an
> implementation with a significant performance penalty.

It would be great if you could compare my sort against GNU on some real-world
data and let me know how it goes.


Tim