From: Jeff Roberson
Date: Sat, 1 Mar 2008 22:29:50 -1000 (HST)
To: Jeff Roberson
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/kern sched_ule.c
Message-ID: <20080301222513.Y920@desktop>
In-Reply-To: <200803020821.m228L0Yw042389@repoman.freebsd.org>
References: <200803020821.m228L0Yw042389@repoman.freebsd.org>

On Sun, 2 Mar 2008, Jeff Roberson wrote:

> jeff 2008-03-02 08:20:59 UTC
>
>   FreeBSD src repository
>
>   Modified files:
>     sys/kern sched_ule.c
>   Log:
>   Add support for the new cpu topology api:
>    - When searching for affinity search backwards in the tree from the last
>      cpu we ran on while the thread still has affinity for the group. This
>      can take advantage of knowledge of shared L2 or L3 caches among a
>      group of cores.
>    - When searching for the least loaded cpu find the least loaded cpu via
>      the least loaded path through the tree. This load balances system bus
>      links, individual cache levels, and hyper-threaded/SMT cores.
>    - Make the periodic balancer recursively balance the highest and lowest
>      loaded cpu across each link.
>
>   Add support for cpusets:
>    - Convert the cpuset to a simple native cpumask_t while the kernel still
>      only supports cpumask.
>    - Pass the derived cpumask down through the cpu_search functions to
>      restrict the result cpus.
>    - Make the various steal functions resilient to failure since all threads
>      can not run on all cpus any longer.
>
>   General improvements:
>    - Precisely track the lowest priority thread on every runq with
>      tdq_setlowpri(). Before it was more advisory but this ended up having
>      pathological behaviors.
>    - Remove many #ifdef SMP conditions to simplify the code.
>    - Get rid of the old cumbersome tdq_group. This is more naturally
>      expressed via the cpu_group tree.
>

With these changes ULE is the only scheduler that supports the new cpuset
api. The syscalls succeed on 4BSD, but the scheduler doesn't obey the
masks. I don't presently have a plan to implement it on 4BSD, as it would
potentially be very inefficient to search the runq for a compatible thread
on every context switch. I won't object if someone else wants to implement
this; otherwise I'll make the syscalls return ENOSYS if 4BSD is compiled in.

The improved cpu topology load balancing is a mixed bag. On some workloads
we see considerable improvements. Right now mysql suffers when it has
large numbers of threads, but other things seem much improved. I will
continue to tune this, however, and in most cases it's a win already.

Kris has done some excellent benchmarking as usual. Here you can see the
improvement in postgres depending on various scheduler debug settings:

http://people.freebsd.org/~kris/scaling/pgsql-16cpu.png

The horrible green line is 7.0 for reference.
The blue line is the same 16-core machine with half of the cores disabled.

Thanks,
Jeff

> Sponsored by: Nokia
> Testing by: kris
>
> Revision Changes Path
> 1.226 +443 -501 src/sys/kern/sched_ule.c
>
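
For readers following the "least loaded path through the tree" and
"pass the derived cpumask down" items in the log above, here is a minimal
userland sketch of the idea. It is not the code in sched_ule.c: the type
cpumask_sketch_t, struct topo_group, group_load(), search_lowest() and the
example_* data are all hypothetical names made up for illustration, and
using the summed load of a group as the load of its link is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NCPU 8

/* Hypothetical stand-in for the kernel's cpumask_t: one bit per cpu. */
typedef uint32_t cpumask_sketch_t;

/* Hypothetical topology node; a leaf group has nchildren == 0. */
struct topo_group {
	cpumask_sketch_t mask;			/* cpus below this node */
	int nchildren;
	const struct topo_group *children;
};

/* Assumption: the load of a link is the summed load of the cpus below it. */
static int
group_load(const struct topo_group *g, const int load[NCPU])
{
	int total = 0;

	for (int cpu = 0; cpu < NCPU; cpu++)
		if (g->mask & ((cpumask_sketch_t)1 << cpu))
			total += load[cpu];
	return (total);
}

/*
 * Follow the least loaded child at every level, skipping subtrees that
 * contain no cpu from 'allowed'.  Returns a cpu id, or -1 when no allowed
 * cpu exists below g, so callers must be prepared for failure.
 */
static int
search_lowest(const struct topo_group *g, const int load[NCPU],
    cpumask_sketch_t allowed)
{
	const struct topo_group *best_child = NULL;
	int best = -1, best_load = 0;

	assert(g != NULL);
	if ((g->mask & allowed) == 0)
		return (-1);
	if (g->nchildren == 0) {
		/* Leaf: pick the least loaded cpu that the mask permits. */
		for (int cpu = 0; cpu < NCPU; cpu++) {
			if (((g->mask & allowed) &
			    ((cpumask_sketch_t)1 << cpu)) == 0)
				continue;
			if (best == -1 || load[cpu] < load[best])
				best = cpu;
		}
		return (best);
	}
	for (int i = 0; i < g->nchildren; i++) {
		const struct topo_group *c = &g->children[i];
		int l;

		if ((c->mask & allowed) == 0)
			continue;
		l = group_load(c, load);
		if (best_child == NULL || l < best_load) {
			best_child = c;
			best_load = l;
		}
	}
	return (best_child != NULL ?
	    search_lowest(best_child, load, allowed) : -1);
}

/* Made-up example: 2 packages, each with 2 cache groups of 2 cpus. */
const struct topo_group example_cache[4] = {
	{ 0x03, 0, NULL }, { 0x0c, 0, NULL },
	{ 0x30, 0, NULL }, { 0xc0, 0, NULL }
};
const struct topo_group example_pkg[2] = {
	{ 0x0f, 2, &example_cache[0] }, { 0xf0, 2, &example_cache[2] }
};
const struct topo_group example_root = { 0xff, 2, example_pkg };
const int example_load[NCPU] = { 0, 5, 4, 4, 1, 2, 2, 2 };
```

With the made-up loads above, search_lowest(&example_root, example_load,
0xff) returns cpu 4 rather than the idle cpu 0, because the package holding
cpu 0 carries more total load. Restricting the mask to 0x0f returns cpu 0,
and an empty mask returns -1, which is the kind of failure the steal
functions have to tolerate once threads can no longer run on every cpu.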