From: Bruno Van Den Bossche
To: Jeff Roberson
Cc: current@freebsd.org
Date: Fri, 31 Oct 2003 14:30:56 +0100
Subject: Re: More ULE bugs fixed.
Message-Id: <20031031143056.179cdef6.bruno.van.den.bossche@pandora.be>
In-Reply-To: <20031031064532.Y43805-100000@mail.chesapeake.net>
References: <20031029122358.S43805-100000@mail.chesapeake.net> <20031031064532.Y43805-100000@mail.chesapeake.net>

Jeff Roberson wrote:
> On Wed, 29 Oct 2003, Jeff Roberson wrote:
>
> > On Thu, 30 Oct 2003, Bruce Evans wrote:
> >
> > > > Test for scheduling buildworlds:
> > > >
> > > >     cd /usr/src/usr.bin
> > > >     for i in obj depend all
> > > >     do
> > > >         MAKEOBJDIRPREFIX=/somewhere/obj time make -s -j16 $i
> > > >     done >/tmp/zqz 2>&1
> > > >
> > > > (Run this with an empty /somewhere/obj.  The all stage doesn't
> > > > quite finish.)  On an ABIT BP6 system with a 400MHz and a 366MHz
> > > > CPU, with /usr (including /usr/src) nfs-mounted (with 100 Mbps
> > > > ethernet and a reasonably fast server) and /somewhere/obj
> > > > ufs1-mounted (on a fairly slow disk; no soft-updates), this
> > > > gives the following times:
> > > >
> > > > SCHED_ULE-yesterday, with not so careful setup:
> > > >      40.37 real      8.26 user      6.26 sys
> > > >     278.90 real     59.35 user     41.32 sys
> > > >     341.82 real    307.38 user     69.01 sys
> > > > SCHED_ULE-today, run immediately after booting:
> > > >      41.51 real      7.97 user      6.42 sys
> > > >     306.64 real     59.66 user     40.68 sys
> > > >     346.48 real    305.54 user     69.97 sys
> > > > SCHED_4BSD-yesterday, with not so careful setup:
> > > >     [same as today except the depend step was 10 seconds
> > > >     slower (real)]
> > > > SCHED_4BSD-today, run immediately after booting:
> > > >      18.89 real      8.01 user      6.66 sys
> > > >     128.17 real     58.33 user     43.61 sys
> > > >     291.59 real    308.48 user     72.33 sys
> > > > SCHED_4BSD-yesterday, with a UP kernel (running on the 366 MHz
> > > > CPU) with many local changes and not so careful setup:
> > > >      17.39 real      8.28 user      5.49 sys
> > > >     130.51 real     60.97 user     34.63 sys
> > > >     390.68 real    310.78 user     60.55 sys
> > > >
> > > > Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD
> > > > for the obj and depend stages.  These stages have little
> > > > parallelism.  SCHED_ULE was only 19% slower for the all stage.
> > > > ...
> > >
> > > I reran this with -current (sched_ule.c 1.68, etc.).  Result: no
> > > significant change.  However, with a UP kernel there was no
> > > significant difference between the times for SCHED_ULE and
> > > SCHED_4BSD.
> >
> > There was a significant difference on UP until last week.  I'm
> > working on SMP now.  I have some patches but they aren't quite ready
> > yet.
>
> I have committed my SMP fixes.  I would appreciate it if you could post
> updated results.  ULE now outperforms 4BSD in a single threaded kernel
> compile and performs almost identically in a 16 way make.  I still
> have a few more things that I can do to improve the situation.  I
> would expect ULE to pull further ahead in the months to come.

I recently had to complete a little piece of software in a course on
parallel computing.  I've put it online[1] (we only had to write the
pract2.cpp file).  It calculates the inverse of a Vandermonde matrix and
allows you to spawn multiple slave processes that each perform a part of
the work.  Everything happens in memory, so I've used it lately to test
the different changes you made to sched_ule.c, and these last fixes do
improve the performance on my dual P3 machine a lot.

Here are the results of my (very limited) tests:

sched4bsd
---
dimension   slaves   time
1000        1        90.925408
1000        2        58.897038
200         1         0.735962
200         2         0.676660

sched_ule 1.68
---
dimension   slaves   time
1000        1        90.951015
1000        2        70.402845
200         1         0.743551
200         2         1.900455

sched_ule 1.70
---
dimension   slaves   time
1000        1        90.782309
1000        2        57.207351
200         1         0.739998
200         2         0.383545

I'm not really sure if this is very relevant to you, but from the
end-user point of view (me :-)) this does mean something.

Thanks!

[1] It can be used by running testpract2 with two arguments: the
dimension of the matrix and the number of slaves.
For example, './testpract2 200 2' will create a matrix of dimension 200
and use 2 slaves.

--
Bruno

...And then there's the guy who bought 20,000 bras, cut them in half,
and sold 40,000 yamalchas with chin straps....
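
To make the three result tables above easier to compare, here is a small sketch that computes the speedup from 1 slave to 2 slaves for each scheduler. The timings are copied from the tables. The script itself, its names (TIMINGS, speedup, speedup_table) and the choice of Python are only illustrative assumptions and have nothing to do with pract2.cpp.

```python
# Timings copied from the tables above, keyed by (dimension, slaves).
# The unit is whatever testpract2 reports; only ratios are used here.
TIMINGS = {
    "sched4bsd": {
        (1000, 1): 90.925408, (1000, 2): 58.897038,
        (200, 1): 0.735962, (200, 2): 0.676660,
    },
    "sched_ule 1.68": {
        (1000, 1): 90.951015, (1000, 2): 70.402845,
        (200, 1): 0.743551, (200, 2): 1.900455,
    },
    "sched_ule 1.70": {
        (1000, 1): 90.782309, (1000, 2): 57.207351,
        (200, 1): 0.739998, (200, 2): 0.383545,
    },
}


def speedup(times, dimension):
    """Return time with 1 slave divided by time with 2 slaves."""
    return times[(dimension, 1)] / times[(dimension, 2)]


def speedup_table():
    """Map each scheduler name to {dimension: speedup}."""
    return {
        name: {dim: speedup(times, dim) for dim in (1000, 200)}
        for name, times in TIMINGS.items()
    }


if __name__ == "__main__":
    for name, by_dim in speedup_table().items():
        for dim, ratio in by_dim.items():
            print(f"{name:15s} dimension {dim:4d}: {ratio:.2f}x")
```

A ratio below 1 means that two slaves were slower than one. On these figures that only happens for sched_ule 1.68 at dimension 200, while sched_ule 1.70 gets close to 2x for the same case.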