Date: Wed, 6 Jun 2007 23:24:04 -0700
From: "Kip Macy" <kip.macy@gmail.com>
To: "Bruce Evans" <brde@optusnet.com.au>
Cc: src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>,
    cvs-src@freebsd.org, cvs-all@freebsd.org, Attilio Rao <attilio@freebsd.org>,
    Kostik Belousov <kostikbel@gmail.com>, Jeff Roberson <jroberson@chesapeake.net>
Subject: Re: cvs commit: src/sys/kern kern_mutex.c
Message-ID: <b1fa29170706062324p793ac8e2ga8dc5bf8ba151a60@mail.gmail.com>
In-Reply-To: <20070607133524.S7002@besplex.bde.org>
References: <200706051420.l55EKEih018925@repoman.freebsd.org>
    <3bbf2fe10706050829o2d756a4cu22f98cf11c01f5e4@mail.gmail.com>
    <3bbf2fe10706050843x5aaafaafy284e339791bcfe42@mail.gmail.com>
    <200706051230.21242.jhb@freebsd.org>
    <20070606094354.E51708@delplex.bde.org>
    <20070605195839.I606@10.0.0.1>
    <20070606154548.F3105@besplex.bde.org>
    <20070607133524.S7002@besplex.bde.org>
Bruce -

Can you also say how many runs you do and how much variance there is
between runs?  (A rough sketch for computing run-to-run variance is
appended below the quoted text.)

Thanks.
    -Kip

On 6/6/07, Bruce Evans <brde@optusnet.com.au> wrote:
> On Wed, 6 Jun 2007, Bruce Evans wrote:
>
> > On Tue, 5 Jun 2007, Jeff Roberson wrote:
>
> >> You should try with kern.sched.pick_pri = 0. I have changed this to be the
> >> default recently. This weakens the preemption and speeds up some
> >> workloads.
> >
> > I haven't tried a new SCHED_ULE kernel yet.
>
> Tried now. In my makeworld benchmark, SCHED_ULE is now only 4% slower
> than SCHED_4BSD (after losing 2% in SCHED_4BSD) (down from about 7%
> slower). The difference is still from CPUs idling too much.
>
> Best result ever (SCHED_4BSD, June 4 kernel, no PREEMPTION):
> ---
> 827.48 real 1309.26 user 186.86 sys
> 1332122 voluntary context switches
> 1535129 involuntary context switches
> pagezero time 6 seconds
> ---
>
> After thread lock changes (SCHED_4BSD, no PREEMPTION):
> ---
> 847.70 real 1309.83 user 169.39 sys
> 2933415 voluntary context switches
> 1501808 involuntary context switches
> pagezero time 30 seconds.
>
> Unlike what I wrote before, there is a scheduling bug that affects
> pagezero directly. The bug from last month involving pagezero losing
> its priority of PRI_MAX_IDLE and running at priority PUSER is back.
> This bug seemed to be gone in the June 4 kernel, but actually only
> happens less there. This bug seems to cost 0.5-1.0% real time.
> ---
>
> After thread lock changes (SCHED_4BSD, now with PREEMPTION):
> ---
> 843.34 real 1304.00 user 168.87 sys
> 1651011 voluntary context switches
> 1630988 involuntary context switches
> pagezero time 27 seconds
>
> The problem with the extra context switches is gone (these context switch
> counts are like the ones in old kernels with PREEMPTION). This result is
> affected by pagezero getting its priority clobbered. The best result for
> an old kernel with PREEMPTION was about 840 seconds, before various
> optimizations reduced this to 827 seconds (-0+4 seconds).
> ---
>
> Old run with SCHED_ULE (Mar 18):
> 899.50 real 1311.00 user 187.47 sys
> 1566366 voluntary context switches
> 1959436 involuntary context switches
> pagezero time 19 seconds
> ---
>
> Today with SCHED_ULE:
> ---
> 883.65 real 1290.92 user 188.21 sys
> 1658109 voluntary context switches
> 1708148 involuntary context switches
> pagezero time 7 seconds.
> ---
>
> In all of these, the user + sys decomposition is very inaccurate, but the
> (user + sys + pagezero_time) total is fairly accurate. It is 1500+-2 for
> SCHED_4BSD and 1500+-17 for SCHED_ULE (old ULE larger, current ULE smaller).
>
> SCHED_ULE now shows interesting behaviour for non-parallel kernel
> builds on a 2-way SMP machine. It is now slightly faster than SCHED_4BSD
> for this, but still much slower for parallel kernel builds. This might
> be because it likes to leave 1 CPU idle to wait to find a better CPU to
> run on, and this is actually an optimization when there is >= 1 CPU to
> spare:
>
> RELENG_4 kernel build on nfs, non-parallel make.
> Best ever with SCHED_ULE (~June 4 kernel):
> 62.55 real 55.30 user 3.65 sys
> Current with SCHED_ULE:
> 62.18 real 54.91 user 3.51 sys
>
> RELENG_4 kernel build on nfs, make -j4.
> Best ever for SCHED_ULE (~June 4 kernel):
> 32.00 real 56.98 user 3.90 sys
> Current with SCHED_ULE:
> 33.11 real 56.01 user 4.12 sys
> ULE has been about 1 second slower for this since at least last November.
> It presumably reduces user+sys time by running pagezero more.
>
> The slowdown is much larger for a build on ffs:
>
> Non-parallel results not shown (little difference from above).
>
> RELENG_4 kernel build on ffs, make -j4.
> Best ever for SCHED_ULE (~June 4 kernel):
> 29.94 real 56.03 user 3.12 sys
> Current with SCHED_ULE:
> 32.63 real 55.13 user 3.53 sys
> Now 9% of the real time (= 18% of the cycles on one CPU = almost the
> sys overhead) is apparently wasted by leaving one CPU idle. This
> benchmark is of course dominated by many instances of 2 gcc hogs which
> should be scheduled to run in parallel with no idle cycles. (In all
> these kernel benchmarks, everything except disk writes is cached before
> starting. In other makeworld benchmarks, everything is cached before
> starting on the nfs server, while on the client nothing is cached.)
>
> I don't have context switch counts or pagezero times for the kernel builds.
> stathz is 100 = hz. Maybe SCHED_ULE doesn't like this. hz = 100 is
> about 1% faster than hz = 1000 for the makeworld benchmark.
>
> Bruce
>
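
If I am reading the ffs -j4 numbers right, the 9%/18% accounting works
out roughly as follows (my arithmetic, assuming the whole slowdown is
one of the two CPUs sitting idle):

    (32.63 - 29.94) / 29.94  ~= 9%   of the real time lost
    2.69 s x 2 CPUs           ~ 5.4  CPU-seconds idle
    5.4 / 29.94              ~= 18%  of one CPU's cycles over the run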
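
On the question about the number of runs and the variance between them,
here is the sketch mentioned above (mine, not something from this
thread): it reads one "real" time per run from stdin and prints the run
count, mean, and sample standard deviation.  The program and file names
are made up; it assumes the times have already been extracted, e.g. with
awk '/real/ {print $1}' makeworld.log.

/* runstats.c -- compile with: cc -o runstats runstats.c -lm */
#include <math.h>
#include <stdio.h>

int
main(void)
{
	double x, sum = 0.0, sumsq = 0.0, mean, var;
	int n = 0;

	/* One floating-point "real" time per run on stdin. */
	while (scanf("%lf", &x) == 1) {
		sum += x;
		sumsq += x * x;
		n++;
	}
	if (n < 2) {
		fprintf(stderr, "need at least two runs\n");
		return (1);
	}
	mean = sum / n;
	/* Sample variance: (sum of squares - n * mean^2) / (n - 1). */
	var = (sumsq - n * mean * mean) / (n - 1);
	printf("runs %d  mean %.2f s  stddev %.2f s\n", n, mean, sqrt(var));
	return (0);
}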