From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 15 14:48:08 2007
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
X-Original-To: arch@freebsd.org
Delivered-To: freebsd-arch@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 6209416A400
	for <arch@freebsd.org>; Thu, 15 Mar 2007 14:48:08 +0000 (UTC)
	(envelope-from dds@aueb.gr)
Received: from blue.servers.aueb.gr (blue.servers.aueb.gr [195.251.255.132])
	by mx1.freebsd.org (Postfix) with ESMTP id 1EB6C13C45E
	for <arch@freebsd.org>; Thu, 15 Mar 2007 14:48:07 +0000 (UTC)
	(envelope-from dds@aueb.gr)
Received: from [195.251.254.141] ([::ffff:195.251.254.141])
	by blue.servers.aueb.gr with esmtp; Thu, 15 Mar 2007 16:37:57 +0200
	id 000D1214.45F95A45.00002DAD
Message-ID: <45F95A42.7080104@aueb.gr>
Date: Thu, 15 Mar 2007 16:37:54 +0200
From: Diomidis Spinellis <dds@aueb.gr>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
	rv:1.8.0.9) Gecko/20061211 SeaMonkey/1.0.7
MIME-Version: 1.0
To: Garrett Wollman <wollman@khavrinen.csail.mit.edu>
References: <200703151334.l2FDYAfo024194@khavrinen.csail.mit.edu>
In-Reply-To: <200703151334.l2FDYAfo024194@khavrinen.csail.mit.edu>
Content-Type: text/plain; charset=ISO-8859-7; format=flowed
Content-Transfer-Encoding: 7bit
Cc: arch@freebsd.org
Subject: Re: Multithreaded qsort(3)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 15 Mar 2007 14:48:08 -0000

Garrett Wollman wrote:
> In <45F906ED.8070100@aueb.gr> you write:
> 
>> $ qsort_mt -t -h 2 -n 100000000 -l	# Integers; qsort(3)
>> 46.5 46.2 0.314
>> $ ./qsort_mt  -t -h 2 -n 100000000	# Integers; qsort_mt
>> 27.5 46.3 0.301
> 
> "fancy algorithms have large constants, and N is usually small".
> 
> Do you have any reason to believe that N is large with sufficient
> frequency to justify the overhead?

My proposed implementation (separate routine) doesn't add any overhead, 
unless you explicitly link it in, in which case I presume you know what 
you're doing.  If I was implementing a BSD version of sort(1) (I will, 
one day) or using qsort in an RDBMS engine, calling qsort_mt would 
certainly be worthwhile.

Memory and disk sizes are growing while CPU speeds are stagnant.  This 
means larger data sizes without corresponding serial horsepower to tame 
them.  Therefore, N is increasing in absolute and in relative terms.

You do have a point though, in that parallelizing the C library isn't by 
itself going to save us.  I run dtrace on some builds (a task I'd like 
to see going faster), and the time spent in libc was below 10%.

Diomidis