From owner-freebsd-current@FreeBSD.ORG Sun Feb 8 01:45:48 2004
Date: Sun, 8 Feb 2004 01:45:37 -0800
From: David Schultz <das@FreeBSD.ORG>
To: Poul-Henning Kamp
Cc: Jun Su, tjr@FreeBSD.ORG, current@FreeBSD.ORG, jhb@FreeBSD.ORG
Subject: Re: PID Allocator Performance Results (was: Re: [UPDATE] new pid alloc...)

On Sun, Feb 08, 2004, Poul-Henning Kamp wrote:
> In message <20040208080630.GA14364@VARK.homeunix.com>, David Schultz writes:
> >I spent some time today benchmarking the various proposed pid
> >allocators.  The full results, along with pretty pictures and a
> >more complete analysis, are at:
> >
> >    http://people.freebsd.org/~das/pbench/pbench.html
>
> You _do_ realize that the difference between "tjr" and "net" in the
> bottom plot is not statistically significant?
>
> Stratification is visibly present from approx 1500 pids and up, and
> ends up being responsible for 1/3rd of the difference by the time
> you get to 5000 pids.
>
> (The tell-tale sign here is that the two data sets both fall on two
> mostly straight lines in a random-looking pattern, with practically
> no measurements hitting the interval between the two lines.)
>
> If we assume the stratification is linear in the number of pids,
> which I think looks reasonable, and we read the right-hand edge as
> half a second and the left-hand edge as zero, we find:
>
>             (.5 - 0) [second]
>   -------------------------------------- = 10 [nsec] / [iteration*pid]
>   10000 [iterations] * (5000 - 0) [pids]
>
> 10 nsec per operation is getting you into the territory of effective
> TSC-timecounter resolution, RAM access time, cache miss delays,
> and all sorts of other hardware effects.

To avoid jitter and timestamping overhead, I read the time only at
the start and end of the entire sequence of 10000 operations.  I
obtained the sample variance by running the entire test three times,
i.e.:

	for (pass = 0; pass < 3; pass++) {
		for (nprocs = 100; nprocs < 5000; nprocs++) {
			set up the test, fork the sleepers;
			take starting timestamp;
			for (iter = 0; iter < 10000; iter++)
				run test;
			take ending timestamp;
		}
	}
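For concreteness, the timing pattern looks roughly like the sketch
below.  This is not the actual benchmark code: the fork()/_exit()/
waitpid() body standing in for "run test", the ITERS constant, and
the omission of the sleeper setup are all simplifications of mine.

	/*
	 * Minimal sketch of the timing harness described above.
	 * Timestamps are taken only around the whole inner loop, so
	 * per-read clock overhead and jitter are amortized over
	 * ITERS operations instead of being paid on every one.
	 */
	#include <sys/types.h>
	#include <sys/wait.h>
	#include <err.h>
	#include <stdio.h>
	#include <time.h>
	#include <unistd.h>

	#define ITERS	10000

	int
	main(void)
	{
		struct timespec t0, t1;
		double elapsed;
		pid_t pid;
		int iter;

		clock_gettime(CLOCK_MONOTONIC, &t0);	/* starting timestamp */
		for (iter = 0; iter < ITERS; iter++) {
			if ((pid = fork()) == 0)
				_exit(0);		/* child: exit at once */
			else if (pid > 0)
				waitpid(pid, NULL, 0);	/* parent: reap child */
			else
				err(1, "fork");
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);	/* ending timestamp */

		elapsed = (t1.tv_sec - t0.tv_sec) +
		    (t1.tv_nsec - t0.tv_nsec) / 1e9;
		printf("%d forks in %.9f seconds\n", ITERS, elapsed);
		return (0);
	}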
Nevertheless, you're definitely right about the stratification.  I'm
not sure how to explain it.  My best theory is that there is some
confounding factor with a constant overhead, such as the pageout
daemon waking up.  If you look at the original samples, there are
always two normal samples and one outlier.  More samples would
probably correct for this; if I were to run the benchmark again, I
would do more iterations of the outer loop in the pseudocode above
and fewer of the inner loop.

> So all in all, I would say that you have proven that "tjr" and "net"
> are better than "old", but not that there is any statistically
> significant performance difference between them.

Yes, I realize that.  I took 10 more samples of 10000 forks each,
with 5000 sleeping processes in the background, and got the
following (times in seconds):

tjr:	1.130558492	1.125901197	1.144079485	1.118981882
	1.131435699	1.123052511	1.133321135	1.121301171
	1.133015788	1.124377539

net:	1.116848091	1.119333603	1.117941526	1.121989527
	(got an outlier (2.547301682) here, so I reran this test)
	1.118023912	1.110658198	1.126021045	1.106436712
	1.116406694	1.100889638

These data show a difference at the 95% confidence level, namely,
that the NetBSD algorithm is about 1% faster on a system with 5000
processes (and only about 0.1% faster if you're looking at the total
overhead of fork() rather than vfork()).  I think that pretty much
rules out performance as the deciding factor between the two.
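For the record, I haven't said which test produced the 95% figure,
so take the following as a sketch of one plausible way to check it
(Welch's unequal-variance t-test over the ten samples of each
allocator above), not as the analysis actually used:

	/*
	 * Sketch of a Welch's two-sample t-test over the samples
	 * above.  (Which significance test was actually used is not
	 * stated; Welch's t-test is an assumption here.)
	 * Compile with -lm.
	 */
	#include <math.h>
	#include <stdio.h>

	static const double tjr[] = {
		1.130558492, 1.125901197, 1.144079485, 1.118981882,
		1.131435699, 1.123052511, 1.133321135, 1.121301171,
		1.133015788, 1.124377539
	};
	static const double net[] = {
		1.116848091, 1.119333603, 1.117941526, 1.121989527,
		1.118023912, 1.110658198, 1.126021045, 1.106436712,
		1.116406694, 1.100889638
	};
	#define	N	10

	static double
	mean(const double *x)
	{
		double s = 0;
		int i;

		for (i = 0; i < N; i++)
			s += x[i];
		return (s / N);
	}

	static double
	var(const double *x, double m)
	{
		double s = 0;
		int i;

		for (i = 0; i < N; i++)
			s += (x[i] - m) * (x[i] - m);
		return (s / (N - 1));	/* sample variance */
	}

	int
	main(void)
	{
		double m1 = mean(tjr), m2 = mean(net);
		double v1 = var(tjr, m1), v2 = var(net, m2);
		double se = sqrt(v1 / N + v2 / N);
		double t = (m1 - m2) / se;
		/* Welch-Satterthwaite degrees of freedom */
		double df = pow(v1 / N + v2 / N, 2) /
		    (pow(v1 / N, 2) / (N - 1) + pow(v2 / N, 2) / (N - 1));

		/*
		 * With roughly 18 degrees of freedom, the two-sided
		 * 95% critical value is about 2.10; |t| above that
		 * rejects "no difference between tjr and net".
		 */
		printf("tjr mean %.9f, net mean %.9f\n", m1, m2);
		printf("t = %.3f with %.1f degrees of freedom\n", t, df);
		return (0);
	}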