Date: Fri, 31 Oct 2003 14:30:56 +0100
From: Bruno Van Den Bossche <bruno.van.den.bossche@pandora.be>
To: Jeff Roberson <jroberson@chesapeake.net>
Cc: current@freebsd.org
Subject: Re: More ULE bugs fixed.
Message-ID: <20031031143056.179cdef6.bruno.van.den.bossche@pandora.be>
In-Reply-To: <20031031064532.Y43805-100000@mail.chesapeake.net>
References: <20031029122358.S43805-100000@mail.chesapeake.net>
    <20031031064532.Y43805-100000@mail.chesapeake.net>

Jeff Roberson <jroberson@chesapeake.net> wrote:

> On Wed, 29 Oct 2003, Jeff Roberson wrote:
>
> > On Thu, 30 Oct 2003, Bruce Evans wrote:
> >
> > > > Test for scheduling buildworlds:
> > > >
> > > >     cd /usr/src/usr.bin
> > > >     for i in obj depend all
> > > >     do
> > > >         MAKEOBJDIRPREFIX=/somewhere/obj time make -s -j16 $i
> > > >     done >/tmp/zqz 2>&1
> > > >
> > > > (Run this with an empty /somewhere/obj.  The all stage doesn't
> > > > quite finish.)  On an ABIT BP6 system with a 400MHz and a 366MHz
> > > > CPU, with /usr (including /usr/src) nfs-mounted (with 100 Mbps
> > > > ethernet and a reasonably fast server) and /somewhere/obj
> > > > ufs1-mounted (on a fairly slow disk; no soft-updates), this
> > > > gives the following times:
> > > >
> > > > SCHED_ULE-yesterday, with not so careful setup:
> > > >        40.37 real         8.26 user         6.26 sys
> > > >       278.90 real        59.35 user        41.32 sys
> > > >       341.82 real       307.38 user        69.01 sys
> > > > SCHED_ULE-today, run immediately after booting:
> > > >        41.51 real         7.97 user         6.42 sys
> > > >       306.64 real        59.66 user        40.68 sys
> > > >       346.48 real       305.54 user        69.97 sys
> > > > SCHED_4BSD-yesterday, with not so careful setup:
> > > >       [same as today except the depend step was 10 seconds
> > > >       slower (real)]
> > > > SCHED_4BSD-today, run immediately after booting:
> > > >        18.89 real         8.01 user         6.66 sys
> > > >       128.17 real        58.33 user        43.61 sys
> > > >       291.59 real       308.48 user        72.33 sys
> > > > SCHED_4BSD-yesterday, with a UP kernel (running on the 366 MHz
> > > > CPU) with many local changes and not so careful setup:
> > > >        17.39 real         8.28 user         5.49 sys
> > > >       130.51 real        60.97 user        34.63 sys
> > > >       390.68 real       310.78 user        60.55 sys
> > > >
> > > > Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD for
> > > > the obj and depend stages.  These stages have little
> > > > parallelism.  SCHED_ULE was only 19% slower for the all stage.
> > > > ...
> > >
> > > I reran this with -current (sched_ule.c 1.68, etc.).  Result: no
> > > significant change.  However, with a UP kernel there was no
> > > significant difference between the times for SCHED_ULE and
> > > SCHED_4BSD.
> >
> > There was a significant difference on UP until last week.  I'm
> > working on SMP now.  I have some patches but they aren't quite ready
> > yet.
>
> I have commited my SMP fixes.  I would appreciate it if you could post
> update results.  ULE now outperforms 4BSD in a single threaded kernel
> compile and performs almost identically in a 16 way make.  I still
> have a few more things that I can do to improve the situation.  I
> would expect ULE to pull further ahead in the months to come.

I recently had to complete a little piece of software for a course on
parallel computing.  I've put it online[1] (we only had to write the
pract2.cpp file).  It calculates the inverse of a Vandermonde matrix
and lets you spawn multiple slave processes that each perform a part
of the work.  Everything happens in memory, so I've used it lately to
test the different changes you made to sched_ule.c, and these last
fixes improve the performance on my dual P3 machine a lot.

Here are the results of my (very limited) tests:

sched4bsd
---
dimension  slaves  time
1000       1       90.925408
1000       2       58.897038
200        1        0.735962
200        2        0.676660

sched_ule 1.68
---
dimension  slaves  time
1000       1       90.951015
1000       2       70.402845
200        1        0.743551
200        2        1.900455

sched_ule 1.70
---
dimension  slaves  time
1000       1       90.782309
1000       2       57.207351
200        1        0.739998
200        2        0.383545

I'm not really sure if this is very relevant to you, but from the
end-user point of view (me :-)) this does mean something.

Thanks!

[1] <http://users.pandora.be/bomberboy/mptest/final.tar.bz2>
It can be used by running testpract2 with two arguments: the dimension
of the matrix and the number of slaves.
For example, './testpract2 200 2' will create a matrix with dimension
200 and use 2 slaves.

-- 
Bruno

...
And then there's the guy who bought 20,000 bras, cut them in half, and sold 40,000 yamalchas with chin straps....