From owner-freebsd-current@FreeBSD.ORG Fri Jan 5 07:10:31 2007 Return-Path: X-Original-To: current@freebsd.org Delivered-To: freebsd-current@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 648C216A40F; Fri, 5 Jan 2007 07:10:31 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 3E30A13C44C; Fri, 5 Jan 2007 07:10:31 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [10.0.0.1] (216-160-98-154.tukw.qwest.net [216.160.98.154]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l057AMC8003907 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Fri, 5 Jan 2007 02:10:25 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Thu, 4 Jan 2007 23:08:58 -0800 (PST) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: David Xu In-Reply-To: <459DB871.1050109@freebsd.org> Message-ID: <20070104224925.B552@10.0.0.1> References: <20070104005625.D1508@10.0.0.1> <459CCBA1.40305@freebsd.org> <200701050749.49058.davidxu@freebsd.org> <20070104155755.Y552@10.0.0.1> <20070104163449.D552@10.0.0.1> <459DB871.1050109@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Kip Macy , current@freebsd.org Subject: Re: ULE 2.0 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Jan 2007 07:10:31 -0000 On Fri, 5 Jan 2007, David Xu wrote: > Jeff Roberson wrote: > > >> I just looked it up. The solaris algorithm prevents direct preemption if >> the priority is not in the kernel or realtime priority range. Otherwise it >> simply sets NEEDRESCHED. When assigning a thread to a cpu it checks to see >> if the last cpu it executed on is running a higher priority thread (lower >> for solaris where higher is better) and if so puts it there. Otherwise it >> finds the cpu which is executing the lowest priority thread and assigns it >> there. >> >> This basically balances the cpus by priority rather than by load. Balancing >> by priority indirectly balances by load because threads which have slept >> for a long time will eventually have a higher priority. This is pretty >> smart, although it ends up inspecting per-cpu state of remote cpus on >> almost every wakeup. I would expect this to perform badly but perhaps the >> improvement in user-space speed outweighs the kernel time. >> > The lookup does not always happen, there is a per-cpu rechoose value, it > might be some ticks, a thread slept long enough (exceed the interval > ticks) will end up looking for a cpu which has lower priority can run > it, if it can not find one, it will stay at a cpu which has lowest priority > but closest to the thread's last cpu, this is done by looking > a cpu from leaf to root in a cpu topology tree. if the resumed thread > is still within the interval, it will stay at its last cpu. > The alogrithm bets that after some ticks, a sleeping thread's cache is > trashed by other threads, so it should look for a cpu can run it > immediately, but still closest to its last cpu. if a cpu is selected, > it will look for its next cpu which is neighbour in a circle link list, > check its runqueue length at the priority, and may use the next cpu > rather than the cpu returned by above algorithm, the circle link list > might link cpus within same locality together, e.g dual-core cpus > might be a candidate. there is another path which makes a chip balance, e.g > it tries to balance two chips which are multiple-cores, and make them have > same load ratio. I am not Solaris expert, so above may not be > very accurate. :-) Thanks for the description. I'm going to look into this over the next few weeks. > >> I'll investigate this furher. One slight problem with this approach for >> ULE is that sitting on the run-queue for a long period of time improves >> your position on the run-queue but not directly your priority. I think I >> can solve this though by resetting the priority as soon as you run which >> will take into account the ticks that you were waiting. > > Another problem in linux and ULE scheduler is they are less history > friendly within past load, it only looks the thread itself's run time > and sleep time, and don't know the system load, and calculate its priority > based on this, under heavy load, it could be very inaccuracy, > maybe you still need some anti-unfairness code, for example a thread > stayed on a runqueue too long, ULE boosts its priority a bit, though I > don't know how to do it in cheapest way. This has changed significantly with the ULE 2.0 commit. I no longer have the split run queue. ULE now boosts the priority as the thread sits on the run queue via a circular queue mechanism. > >> Anyway, thanks for the pointer Xu. I may hack this up and compare it to >> ULE's current balancing. > > Yesterday, I have tested super-smack benchmark on 2-cpu machine, ULE > decreased performance about 40%, this might be a regression though. I have found significant new bugs that were introduced mostly by changes in the threading system that ULE did not keep up with. It was, for a time, slightly faster than 4BSD on supersmack on several of my machines. I know it can be improved. I'm going to look into this sun algorithm in the next few weeks and see how it works out with ULE. Thanks, Jeff > >> >> Cheers, >> Jeff >> > > Regards, > David Xu >