Date: Sat, 1 Mar 2008 22:29:50 -1000 (HST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: Jeff Roberson <jeff@FreeBSD.org>
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/kern sched_ule.c
Message-ID: <20080301222513.Y920@desktop>
In-Reply-To: <200803020821.m228L0Yw042389@repoman.freebsd.org>
References: <200803020821.m228L0Yw042389@repoman.freebsd.org>

On Sun, 2 Mar 2008, Jeff Roberson wrote:

> jeff        2008-03-02 08:20:59 UTC
>
>   FreeBSD src repository
>
>   Modified files:
>     sys/kern             sched_ule.c
>   Log:
>   Add support for the new cpu topology api:
>    - When searching for affinity search backwards in the tree from the last
>      cpu we ran on while the thread still has affinity for the group.  This
>      can take advantage of knowledge of shared L2 or L3 caches among a
>      group of cores.
>    - When searching for the least loaded cpu find the least loaded cpu via
>      the least loaded path through the tree.  This load balances system bus
>      links, individual cache levels, and hyper-threaded/SMT cores.
>    - Make the periodic balancer recursively balance the highest and lowest
>      loaded cpu across each link.
>
>   Add support for cpusets:
>    - Convert the cpuset to a simple native cpumask_t while the kernel still
>      only supports cpumask.
>    - Pass the derived cpumask down through the cpu_search functions to
>      restrict the result cpus.
>    - Make the various steal functions resilient to failure since all threads
>      can not run on all cpus any longer.
>
>   General improvements:
>    - Precisely track the lowest priority thread on every runq with
>      tdq_setlowpri().  Before it was more advisory but this ended up having
>      pathological behaviors.
>    - Remove many #ifdef SMP conditions to simplify the code.
>    - Get rid of the old cumbersome tdq_group.  This is more naturally
>      expressed via the cpu_group tree.
>

With these changes ULE is the only scheduler that supports the new cpuset
api.  The syscalls succeed on 4BSD, but the scheduler doesn't obey the masks.
I don't presently have a plan to implement it on 4BSD, as it would
potentially be very inefficient to search the runq for a compatible thread
on every context switch.  I won't object if someone else wants to implement
this; otherwise I'll make the syscalls return ENOSYS if 4BSD is compiled in.

The improved cpu topology load balancing is a mixed bag.  On some workloads
we see considerable improvements.  Right now mysql suffers when it has large
numbers of threads, but other things seem much improved.  I will be
continuing to tune this, however, and in most cases it's a win already.

Kris has done some excellent benchmarking as usual.  Here you can see the
improvement in postgres depending on various scheduler debug settings:

http://people.freebsd.org/~kris/scaling/pgsql-16cpu.png

The horrible green line is 7.0 for reference.  The blue line is the same
16-core machine with half of the cores disabled.

Thanks,
Jeff

>   Sponsored by:   Nokia
>   Testing by:     kris
>
>   Revision  Changes    Path
>   1.226     +443 -501  src/sys/kern/sched_ule.c
>
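
For readers unfamiliar with the idea, here is a rough sketch of what "find the least loaded cpu via the least loaded path through the tree", restricted by a cpu mask, could look like.  This is not the code in sched_ule.c: the names topo_node, sketch_mask_t, subtree_load, least_loaded_cpu and the example_* data are all hypothetical, and the rule used here (at each level, descend into the child group with the smallest total load among permitted cpus) is a simplifying assumption rather than the kernel's actual heuristic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpu mask: one bit per cpu. */
typedef uint32_t sketch_mask_t;

/* Hypothetical topology node; a node with nchildren == 0 is one cpu. */
struct topo_node {
	const struct topo_node *children;	/* NULL for a cpu */
	int nchildren;
	int cpu;				/* cpu id, leaves only */
	int load;				/* run queue load, leaves only */
};

/* Sum the load of the permitted cpus under n and count them. */
static int
subtree_load(const struct topo_node *n, sketch_mask_t allowed, int *eligible)
{
	int i, total = 0;

	if (n->nchildren == 0) {
		if ((allowed & ((sketch_mask_t)1 << n->cpu)) == 0)
			return (0);
		(*eligible)++;
		return (n->load);
	}
	for (i = 0; i < n->nchildren; i++)
		total += subtree_load(&n->children[i], allowed, eligible);
	return (total);
}

/*
 * Walk down from the root, always entering the least loaded child that
 * still contains a permitted cpu.  Returns -1 if the mask excludes
 * every cpu, so callers must tolerate failure.
 */
int
least_loaded_cpu(const struct topo_node *n, sketch_mask_t allowed)
{
	assert(n != NULL);
	while (n->nchildren > 0) {
		const struct topo_node *best = NULL;
		int i, best_load = INT_MAX;

		for (i = 0; i < n->nchildren; i++) {
			int eligible = 0;
			int load = subtree_load(&n->children[i], allowed,
			    &eligible);

			if (eligible > 0 && load < best_load) {
				best = &n->children[i];
				best_load = load;
			}
		}
		if (best == NULL)
			return (-1);
		n = best;
	}
	return ((allowed & ((sketch_mask_t)1 << n->cpu)) != 0 ? n->cpu : -1);
}

/* Made-up example: two cache groups of two cpus each. */
static const struct topo_node example_group0[] = {
	{ NULL, 0, 0, 5 }, { NULL, 0, 1, 0 },
};
static const struct topo_node example_group1[] = {
	{ NULL, 0, 2, 1 }, { NULL, 0, 3, 2 },
};
static const struct topo_node example_groups[] = {
	{ example_group0, 2, -1, 0 }, { example_group1, 2, -1, 0 },
};
static const struct topo_node example_root = { example_groups, 2, -1, 0 };
```

In this made-up tree cpu 1 is idle, yet with all four cpus permitted the walk picks cpu 2, because the group holding cpus 0 and 1 carries more total load than the other group.  That is the sense in which a path-based search balances shared links and caches rather than only individual cpus.  Restricting the mask to cpus 0 and 1 makes the same walk return cpu 1.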