From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 13 20:47:29 2012
Date: Mon, 13 Feb 2012 10:23:36 -1000 (HST)
From: Jeff Roberson <jroberson@jroberson.net>
To: Alexander Motin
Cc: freebsd-hackers@FreeBSD.org, Florian Smeets, Andriy Gapon
In-Reply-To: <4F396B24.5090602@FreeBSD.org>
Subject: Re: [RFT][patch] Scheduling for HTT and not only

On Mon, 13 Feb 2012, Alexander Motin wrote:

> On 02/11/12 16:21, Alexander Motin wrote:
>> I've heavily rewritten the patch already, so at least some of the
>> ideas are already addressed. :) At this moment I am mostly satisfied
>> with the results, and after final tests today I'll probably publish a
>> new version.
>
> It took more time, but finally I think I've put the pieces together:
> http://people.freebsd.org/~mav/sched.htt23.patch

I need some time to read and digest this.  However, at first glance, a
global pickcpu lock will not be acceptable.  It is better to make an
occasionally imperfect decision than to cause contention too often.

> The patch is more complicated than the previous one, both logically
> and computationally, but with growing CPU power and complexity I think
> we can possibly spend some more time deciding how to spend time. :)

It is probably worth more cycles, but we need to evaluate this much
more complex algorithm carefully to make sure that each of these new
features provides an advantage.

> The patch formalizes several ideas of the previous code about how to
> select a CPU for running a thread, and adds some new ones.  Its main
> idea is that I've moved from comparing raw integer queue lengths to
> higher-resolution flexible values.  That additional 8-bit precision
> makes it possible to take many factors affecting performance into
> account at the same time.  Besides just choosing the best of equally
> loaded CPUs, with the new code it may even happen that, because of
> SMT, cache affinity, etc., a CPU with more threads in its queue is
> reported as less loaded, and vice versa.
>
> The new code takes the following factors into account:
> - SMT sharing penalty.
> - Cache sharing penalty.
> - Cache affinity (with separate coefficients for last-level and other
>   cache levels) to:

We already used separate affinity values for different cache levels.
Keep in mind that if something else has run on a core, the cache
affinity is lost in very short order.  Trying too hard to preserve it
beyond a few ms never seems to pan out.

>   - the other running threads of its process,

This is not really a great indicator of whether things should be
scheduled together or not.  What workload are you targeting here?

>   - the previous CPU where it was running,
>   - the current CPU (usually the one it was called from).

These two were also already used.
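As a concrete illustration of the fixed-point idea and the factors
listed above, here is a minimal, self-contained sketch of how
fractional penalties and bonuses could fold into one comparable load
value.  It is not code from the patch; every name, field, and weight in
it is hypothetical.

/*
 * Sketch only: replace raw queue-length comparison with a fixed-point
 * score.  All names and weights here are invented for illustration.
 */
#include <stdio.h>

#define LOAD_SHIFT      8               /* 8 fractional bits */
#define LOAD_SCALE      (1 << LOAD_SHIFT)

struct cpu_guess {
        int     nthreads;       /* runnable threads queued on this CPU */
        int     smt_busy;       /* a sibling hardware thread is busy */
        int     cache_shared;   /* LLC shared with a busy core */
        int     was_prev_cpu;   /* the thread last ran here */
};

/* Hypothetical weights, imagined as sysctl-tunable knobs. */
static int smt_penalty = LOAD_SCALE / 2;        /* +0.5 queue slots */
static int cache_penalty = LOAD_SCALE / 4;      /* +0.25 queue slots */
static int affinity_bonus = LOAD_SCALE / 8;     /* -0.125 queue slots */

/*
 * Effective load in fixed point.  Because the penalties and bonuses
 * are fractional, a CPU with more queued threads can still come out
 * "less loaded" than a shorter queue that would share SMT or cache.
 */
static int
effective_load(const struct cpu_guess *c)
{

        return ((c->nthreads << LOAD_SHIFT) +
            (c->smt_busy ? smt_penalty : 0) +
            (c->cache_shared ? cache_penalty : 0) -
            (c->was_prev_cpu ? affinity_bonus : 0));
}

int
main(void)
{
        /* Equal queue lengths; penalties and affinity break the tie. */
        struct cpu_guess a = { 1, 1, 1, 0 };    /* 256 + 128 + 64 = 448 */
        struct cpu_guess b = { 1, 0, 0, 1 };    /* 256 - 32 = 224 */

        printf("a=%d b=%d: pick %s\n", effective_load(&a),
            effective_load(&b),
            effective_load(&a) < effective_load(&b) ? "a" : "b");
        return (0);
}

With equal queue lengths, the SMT and cache penalties tip the choice
toward the CPU that keeps the thread alone on its core.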
Additionally, from the patch:

+        * Hide part of the current thread
+        * load, hoping it or the scheduled
+        * one complete soon.
+        * XXX: We need more stats for this.

I had something like this before.  Unfortunately, interactive tasks
are allowed fairly aggressive bursts of CPU to account for things like
Xorg and web browsers.  Also, I tried this for ithreads, but they can
be very expensive in some workloads, so other CPUs will idle as you try
to schedule behind an ithread.

> All of these factors are configurable via sysctls, but I think
> reasonable defaults should fit most cases.
>
> Also, compared to the previous patch, I've resurrected the optimized
> shortcut in CPU selection for the SMT case.  Unlike the original code,
> which had problems with this, I've added a check of the other logical
> cores' load that should make it safe and still very fast when there
> are fewer running threads than physical cores.
>
> I've tested it on Core i7 and Atom systems, but it would be more
> interesting to test it on a multi-socket system with properly detected
> topology, to check the benefits from affinity.
>
> At this moment the main issue I see is that this patch affects only
> the moment when a thread starts.  If a thread runs continuously, it
> will stay where it was, even if, due to a change in the situation,
> that is no longer very effective (it causes SMT sharing, etc.).  I
> haven't looked much at the periodic load balancer yet, but probably it
> could also be improved somehow.
>
> What is your opinion: is it too over-engineered, or is it the right
> way to go?

I think it's a little too much change all at once.  I also believe that
the changes that try very hard to preserve affinity likely help a much
smaller number of cases than they hurt.  I would prefer that you do one
piece at a time and validate each step.  There are a lot of good ideas
in here, but good ideas don't always turn into results.

Thanks,
Jeff

> --
> Alexander Motin
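For illustration, the SMT shortcut mav describes above (take a fast
path only while running threads are fewer than physical cores, after
checking the load of the other logical cores) might take roughly the
following shape.  This is a guess at the shape of the check, not code
from the patch; the topology structure and all names are invented.

#include <stdio.h>

struct topo_guess {
        int     ncores;         /* physical cores */
        int     *core_load;     /* running threads per physical core */
};

/*
 * Return an entirely idle physical core, or -1 when the caller must
 * fall back to the full (slower) search.
 */
static int
smt_shortcut(const struct topo_guess *t, int nrunning)
{
        int i;

        /* Fast path only while threads are fewer than physical cores. */
        if (nrunning >= t->ncores)
                return (-1);
        /* Check the other logical cores' load: a core with zero load
           has no busy sibling, so there is no SMT sharing at all. */
        for (i = 0; i < t->ncores; i++)
                if (t->core_load[i] == 0)
                        return (i);
        return (-1);
}

int
main(void)
{
        int load[4] = { 1, 0, 2, 0 };
        struct topo_guess t = { 4, load };

        printf("picked core %d\n", smt_shortcut(&t, 3));
        return (0);
}

With three running threads on four cores, the shortcut hands back an
idle core immediately instead of scoring every CPU.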