From owner-freebsd-hackers@FreeBSD.ORG Tue Sep 18 19:08:00 2012
Message-ID: <5058C68B.1010508@FreeBSD.org>
Date: Tue, 18 Sep 2012 22:07:55 +0300
From: Andriy Gapon <avg@FreeBSD.org>
To: attilio@FreeBSD.org
Cc: freebsd-hackers, Jeff Roberson
Subject: Re: ule+smp: small optimization for turnstile priority lending

on 18/09/2012 19:50 Attilio Rao said the following:
> On 9/18/12, Andriy Gapon wrote:
>>
>> Here is a snippet that demonstrates the issue on a supposedly fully loaded
>> 2-processor system:
>>
>> 136794 0 3670427870244462 KTRGRAPH group:"thread", id:"Xorg tid 102818",
>> state:"running", attributes: prio:122
>>
>> 136793 0 3670427870241000 KTRGRAPH group:"thread", id:"cc1plus tid 111916",
>> state:"yielding", attributes: prio:183, wmesg:"(null)", lockname:"(null)"
>>
>> 136792 1 3670427870240829 KTRGRAPH group:"thread", id:"idle: cpu1 tid 100004",
>> state:"running", attributes: prio:255
>>
>> 136791 1 3670427870239520 KTRGRAPH group:"load", id:"CPU 1 load", counter:0,
>> attributes: none
>>
>> 136790 1 3670427870239248 KTRGRAPH group:"thread", id:"firefox tid 113473",
>> state:"blocked", attributes: prio:122, wmesg:"(null)", lockname:"unp_mtx"
>>
>> 136789 1 3670427870237697 KTRGRAPH group:"load", id:"CPU 0 load", counter:2,
>> attributes: none
>>
>> 136788 1 3670427870236394 KTRGRAPH group:"thread", id:"firefox tid 113473",
>> point:"wokeup", attributes: linkedto:"Xorg tid 102818"
>>
>> 136787 1 3670427870236145 KTRGRAPH group:"thread", id:"Xorg tid 102818",
>> state:"runq add", attributes: prio:122, linkedto:"firefox tid 113473"
>>
>> 136786 1 3670427870235981 KTRGRAPH group:"load", id:"CPU 1 load", counter:1,
>> attributes: none
>>
>> 136785 1 3670427870235707 KTRGRAPH group:"thread", id:"Xorg tid 102818",
>> state:"runq rem", attributes: prio:176
>>
>> 136784 1 3670427870235423 KTRGRAPH group:"thread", id:"Xorg tid 102818",
>> point:"prio", attributes: prio:176, new prio:122, linkedto:"firefox tid 113473"
>>
>> 136783 1 3670427870202392 KTRGRAPH group:"thread", id:"firefox tid 113473",
>> state:"running", attributes: prio:104
>>
>> See how the Xorg thread was forced from CPU 1 to CPU 0, where it preempted
>> the cc1plus thread (I do have preemption enabled), only to leave CPU 1 with
>> zero load.
>
> I think that the idea is bright, but I have reservations about the
> implementation because it seems to me there are too many layering
> violations.

Just one - for a layer between turnstile and scheduler :-)
But I agree.
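[Editor's note: the migration in the trace above can be modeled with a toy lowest-load search. The function below is hypothetical and much simpler than the real ULE sched_lowest()/sched_pickcpu(); it only illustrates why counting the soon-to-block lender in its CPU's load makes the boosted thread migrate.]

```c
#include <assert.h>

/* Toy model of a lowest-load CPU search (hypothetical; not the real
 * ULE sched_lowest()).  Returns the index of the least-loaded CPU,
 * preferring the lower index on ties. */
static int pick_lowest(const int load[], int ncpu)
{
    int best = 0;
    for (int i = 1; i < ncpu; i++)
        if (load[i] < load[best])
            best = i;
    return best;
}
```

In the trace, CPU 0 runs cc1plus and CPU 1 runs firefox, the lender that is about to block on unp_mtx. Both CPUs report one runnable thread, so a raw lowest-load search ties toward CPU 0 and Xorg lands there, preempting cc1plus, while CPU 1 goes idle an instant later. Had the lender's contribution been discounted, CPU 1 would have won the search.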
> What I suggest can be summarized as follows:
> - Add a new SRQ_WILLSLEEP flag (or whatever name you prefer)
> - Add a new "flags" argument to sched_lend_prio() (both ule and 4bsd)
> and sched_thread_priority() (ule only)
> - sched_thread_priority() will pass the new flag down to sched_add(),
> which passes it down to sched_pickcpu().
>
> This way sched_pickcpu() has the correct knowledge of what is going on
> and it can make the right decision. You likely don't need to lower
> tdq_load at that point either, because sched_pickcpu() can just adjust
> it locally for its decision.
>
> What do you think?

This sounds easy, but it is not quite so, given the implementations of
sched_pickcpu() and sched_lowest(). This is probably more work than I am
able to take on now.

-- 
Andriy Gapon
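[Editor's note: the proposed plumbing can be sketched as below. The flag value, the function name, and the local load adjustment are all assumptions for illustration; the real sched_pickcpu() considers topology, affinity, and priorities and is considerably more involved.]

```c
#include <assert.h>

#define SRQ_WILLSLEEP 0x0100  /* hypothetical flag: the lending thread
                                 will block right after the hand-off */

/* Toy stand-in for sched_pickcpu(): pick the least-loaded CPU, but when
 * SRQ_WILLSLEEP is set, discount the caller's contribution to its own
 * CPU's load locally, without modifying the real per-CPU load counter. */
static int toy_pickcpu(const int load[], int ncpu, int curcpu, int flags)
{
    int best = -1, best_load = 0;

    for (int i = 0; i < ncpu; i++) {
        int l = load[i];

        if ((flags & SRQ_WILLSLEEP) && i == curcpu)
            l--;  /* the lender is about to leave this run queue */
        if (best < 0 || l < best_load) {
            best = i;
            best_load = l;
        }
    }
    return best;
}
```

With the loads from the trace (one runnable thread per CPU) and the lender running on CPU 1, the plain search ties toward CPU 0, while the flagged search keeps the boosted thread on CPU 1 - the outcome the optimization is after.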