From: Julian Elischer <julian@freebsd.org>
Date: Mon, 06 Feb 2012 22:11:17 -0800
To: Alexander Best
Cc: freebsd-hackers@freebsd.org, Alexander Motin, Tijl Coosemans
Subject: Re: [RFT][patch] Scheduling for HTT and not only

On 2/6/12 11:10 AM, Alexander Best wrote:
> On Mon Feb 6 12, Alexander Motin wrote:
>> On 02/06/12 19:37, Tijl Coosemans wrote:
>>> On Monday 06 February 2012 17:29:14 Alexander Motin wrote:
>>>>
On 02/06/12 18:01, Alexander Best wrote:
>>>>> On Mon Feb 6 12, Alexander Motin wrote:
>>>>>> I've analyzed scheduler behavior and think I have found the problem
>>>>>> with HTT. SCHED_ULE knows about HTT and when doing load balancing
>>>>>> once a second, it does the right things. Unluckily, if some other
>>>>>> thread gets in the way, a process can easily be pushed out to
>>>>>> another CPU, where it will stay for another second because of CPU
>>>>>> affinity, possibly sharing a physical core with something else
>>>>>> without need.
>>>>>>
>>>>>> I've made a patch, reworking the SCHED_ULE affinity code, to fix that:
>>>>>> http://people.freebsd.org/~mav/sched.htt.patch
>>>>>>
>>>>>> This patch does three things:
>>>>>>  - Disables the strict affinity optimization when HTT is detected,
>>>>>>    to let more sophisticated code take the load of the other
>>>>>>    logical core(s) into account.
>>>>>>  - Adds affinity support to the sched_lowest() function to prefer
>>>>>>    the specified (last used) CPU (and the CPU groups it belongs to)
>>>>>>    in case of equal load. The previous code always selected the
>>>>>>    first valid CPU among equals, which caused needless thread
>>>>>>    migration to lower-numbered CPUs.
>>>>>>  - If the current CPU group has no CPU where the thread with its
>>>>>>    priority can run now, sequentially checks parent CPU groups
>>>>>>    before doing a global search. That should improve affinity for
>>>>>>    the next cache levels.
>>>>>>
>>>>>> Who wants to do independent testing to verify my results or do
>>>>>> some more interesting benchmarks? :)
>>>>> i don't have any benchmarks to offer, but i'm seeing a massive
>>>>> increase in responsiveness with your patch. with an unpatched
>>>>> kernel, opening xterm while unrar'ing some huge archive could take
>>>>> up to 3 minutes!!! with your patch the time it takes for xterm to
>>>>> start is never > 10 seconds!!!
>>>> Thank you for the report. I can suggest an explanation for this. The
>>>> original code does only one pass looking for a CPU where the thread
>>>> can run immediately.
That pass is limited to the first level of the CPU
>>>> topology (for HTT systems that is one physical core). If it sees no
>>>> good candidate, it just looks for the CPU with the minimal load,
>>>> ignoring thread priority. I suppose that may lead to a priority
>>>> violation: scheduling the thread to a CPU where a higher-priority
>>>> thread is running, where it may wait for a very long time, while
>>>> there is some other CPU running a minimal-priority thread. My patch
>>>> does more searches, which allows it to handle priorities better.
>>> But why would unrar have a higher priority?
>> I am not good with ULE priority calculations yet, but I think there
>> could be many factors. Both the GUI and unrar are probably running
>> with the same nice level of 0, so initially they are equal. After they
>> start, their priorities depend on spent CPU time, the calculated
>> "interactivity" and who knows what else. So possibly at some moments
>> unrar may get a priority higher than the GUI. Several kernel threads,
>> which have higher priority by definition, may also participate.
> one factor might also be that by doing i/o, a process can gather a
> higher priority compared to a task that's doing cpu intensive stuff.
> statistics have shown (at least historically) that i/o intensive tasks
> tend to trigger i/o bursts and then either quit or stay idle. so
> schedulers usually prefer those processes, since there's quite a big
> chance their request can be served in one time slice. cpu intensive
> tasks on the other hand usually require more than one time slice. this
> might cause a situation where the 'unrar' process is being handled at
> a very high priority in order to finish its i/o burst. however, since
> unrar'ing a big file isn't really an i/o burst, the cpu intensive
> process (or any other process in general) will suffer from starvation.
>
> this is just a wild guess though. it might be complete nonsense, since
> i haven't got a clue what kind of scheduling (rr, fcfs, ...) ULE is
> doing.
>
> btw: does anybody know if there are plans to commit the BFS scheduler
> to HEAD at some point? personally i think what would be even better
> than a new scheduling algorithm would be a scheduling infrastructure
> similar to GEOM. that way it would be much easier to implement new
> algorithms (maybe in XML).

a scheduler 'framework' is rather hard to do because the scheduler
interface is "complicated".. I and others did try to abstract the
scheduler a bit in about 2000. As you can see we did manage to have 2
separate schedulers.. adding a third should be easier, but the current
interface is still slightly tied to the things that those two schedulers
need.

> cheers.
> alex
>
>> --
>> Alexander Motin
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"