Date: Tue, 04 Mar 2008 11:03:52 +0100
From: Kris Kennaway <kris@FreeBSD.org>
To: Jeff Roberson <jroberson@chesapeake.net>
Cc: Garrett Wollman <wollman@freebsd.org>, current@freebsd.org
Subject: Re: cvs commit: src/sys/kern sched_ule.c
Message-ID: <47CD1E88.7000608@FreeBSD.org>
In-Reply-To: <20080303214214.G920@desktop>
References: <200803020821.m228L0Yw042389@repoman.freebsd.org> <20080301222513.Y920@desktop> <18380.58229.379738.408078@hergotha.csail.mit.edu> <20080303214214.G920@desktop>

Jeff Roberson wrote:
> On Tue, 4 Mar 2008, Garrett Wollman wrote:
>
>> <<On Sat, 1 Mar 2008 22:29:50 -1000 (HST), Jeff Roberson
>> <jroberson@chesapeake.net> said:
>>
>>> Kris has done some excellent benchmarking as usual. Here you can see
>>> the improvement in postgres depending on various scheduler debug
>>> settings:
>>
>>> http://people.freebsd.org/~kris/scaling/pgsql-16cpu.png
>>
>> Can you comment on the area under the knee in the 8-cpu topologies? I
>> seems surprising that 16 cores performs worse than 8 cores in this
>> regime.
>
> Depending on the flags you can see different scaling properties of
> different cpu selection algorithms. That's what the userret=x,
> tryself=y parameters are changing. Certain parameters can cause less
> concurrency which works better when workloads are heavily contended.
>
> See the light blue line, tryself=0, userret=0. This scales up more
> poorly because there is less concurrency when there is no lock
> contention but behaves better when there is contention because we're
> less likely to distribute load that would preempt a lock holder.
>
> The default settings scale the best when there is little or no
> contention. That's userret=1, tryself=1. There are other parameters
> that are important but these were the ones we were most recently
> experimenting with. This drops off harshly when there is significant
> contention because most of the threads end up blocked against the same
> lock and may be preempted then rely on priority propagation to kick in.
>
> The default settings should encourage further refinements to subsystem
> locking to yield the best performance.

I didn't run the 8-core configuration with the ULE topology patch, so
the kink at 5 threads is probably due in part to poor scheduling. This
system is very sensitive to scheduling decisions, as you can see from
the previous CVS curve.

I think there is also something else going on at high loads (>15) on
this test, so it should be viewed as a WIP. Specifically, contention
doesn't seem to be high enough to account for a 30% performance drop,
and I see similar drops on other 8-core tests where contention is
eliminated.

What you should focus on is the large difference between the green
curve, showing previous CVS performance, and the brown curve, showing
current default performance.

Kris