Date: Mon, 13 Feb 2012 10:23:36 -1000 (HST) From: Jeff Roberson <jroberson@jroberson.net> To: Alexander Motin <mav@FreeBSD.org> Cc: freebsd-hackers@FreeBSD.org, Florian Smeets <flo@FreeBSD.org>, Andriy Gapon <avg@FreeBSD.org> Subject: Re: [RFT][patch] Scheduling for HTT and not only Message-ID: <alpine.BSF.2.00.1202131012270.2020@desktop> In-Reply-To: <4F396B24.5090602@FreeBSD.org> References: <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 13 Feb 2012, Alexander Motin wrote: > On 02/11/12 16:21, Alexander Motin wrote: >> I've heavily rewritten the patch already. So at least some of the ideas >> are already addressed. :) At this moment I am mostly satisfied with >> results and after final tests today I'll probably publish new version. > > It took more time, but finally I think I've put pieces together: > http://people.freebsd.org/~mav/sched.htt23.patch I need some time to read and digest this. However, at first glance, a global pickcpu lock will not be acceptable. Better to make a rarely imperfect decision than too often cause contention. > > The patch is more complicated then previous one both logically and > computationally, but with growing CPU power and complexity I think we can > possibly spend some more time deciding how to spend time. :) > It is probably worth more cycles but we need to evaluate this much more complex algorithm carefully to make sure that each of these new features provides an advantage. > Patch formalizes several ideas of the previous code about how to select CPU > for running a thread and adds some new. It's main idea is that I've moved > from comparing raw integer queue lengths to higher-resolution flexible > values. That additional 8-bit precision allows same time take into account > many factors affecting performance. Beside just choosing best from > equally-loaded CPUs, with new code it may even happen that because of SMT, > cache affinity, etc, CPU with more threads on it's queue will be reported as > less loaded and opposite. > > New code takes into account such factors: > - SMT sharing penalty. > - Cache sharing penalty. > - Cache affinity (with separate coefficients for last-level and other level > caches) to the: We already used separate affinity values for different cache levels. Keep in mind that if something else has run on a core the cache affinity is lost in very short order. Trying too hard to preserve it beyond a few ms never seems to pan out. > - other running threads of it's process, This is not really a great indicator of whether things should be scheduled together or not. What workload are you targeting here? > - previous CPU where it was running, > - current CPU (usually where it was called from). These two were also already used. Additionally: + * Hide part of the current thread + * load, hoping it or the scheduled + * one complete soon. + * XXX: We need more stats for this. I had something like this before. Unfortunately interactive tasks are allowed fairly aggressive bursts of cpu to account for things like xorg and web browsers. Also, I tried this for ithreads but they can be very expensive in some workloads so other cpus will idle as you try to schedule behind an ithread. > All of these factors are configurable via sysctls, but I think reasonable > defaults should fit most. > > Also, comparing to previous patch, I've resurrected optimized shortcut in CPU > selection for the case of SMT. Comparing to original code having problems > with this, I've added check for other logical cores load that should make it > safe and still very fast when there are less running threads then physical > cores. > > I've tested in on Core i7 and Atom systems, but more interesting would be to > test it on multi-socket system with properly detected topology to check > benefits from affinity. > > At this moment the main issue I see is that this patch affects only time when > thread is starting. If thread runs continuously, it will stay where it was, > even if due to situation change that is not very effective (causes SMT > sharing, etc). I haven't looked much on periodic load balancer yet, but > probably it could also be somehow improved. > > What is your opinion, is it too over-engineered, or it is the right way to > go? I think it's a little too much change all at once. I also believe that the changes that try very hard to preserve affinity likely help a much smaller number of cases than they hurt. I would prefer you do one piece at a time and validate each step. There are a lot of good ideas in here but good ideas don't always turn into results. Thanks, Jeff > > -- > Alexander Motin >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?alpine.BSF.2.00.1202131012270.2020>