From: Attilio Rao
To: Andriy Gapon
Cc: freebsd-hackers, Jeff Roberson
Date: Tue, 18 Sep 2012 17:50:01 +0100
Subject: Re: ule+smp: small optimization for turnstile priority lending
In-Reply-To: <50587F8D.9060102@FreeBSD.org>
Reply-To: attilio@FreeBSD.org

On 9/18/12, Andriy Gapon wrote:
>
> Here is a snippet that demonstrates the issue on a supposedly fully loaded
> 2-processor system:
>
> 136794 0 3670427870244462 KTRGRAPH group:"thread", id:"Xorg tid 102818", state:"running", attributes: prio:122
>
> 136793 0 3670427870241000 KTRGRAPH group:"thread", id:"cc1plus tid 111916", state:"yielding", attributes: prio:183, wmesg:"(null)", lockname:"(null)"
>
> 136792 1 3670427870240829 KTRGRAPH group:"thread", id:"idle: cpu1 tid 100004", state:"running", attributes: prio:255
>
> 136791 1 3670427870239520 KTRGRAPH group:"load", id:"CPU 1 load", counter:0, attributes: none
>
> 136790 1 3670427870239248 KTRGRAPH group:"thread", id:"firefox tid 113473", state:"blocked", attributes: prio:122, wmesg:"(null)", lockname:"unp_mtx"
>
> 136789 1 3670427870237697 KTRGRAPH group:"load", id:"CPU 0 load", counter:2, attributes: none
>
> 136788 1 3670427870236394 KTRGRAPH group:"thread", id:"firefox tid 113473", point:"wokeup", attributes: linkedto:"Xorg tid 102818"
>
> 136787 1 3670427870236145 KTRGRAPH group:"thread", id:"Xorg tid 102818", state:"runq add", attributes: prio:122, linkedto:"firefox tid 113473"
>
> 136786 1 3670427870235981 KTRGRAPH group:"load", id:"CPU 1 load", counter:1, attributes: none
>
> 136785 1 3670427870235707 KTRGRAPH group:"thread", id:"Xorg tid 102818", state:"runq rem", attributes: prio:176
>
> 136784 1 3670427870235423 KTRGRAPH group:"thread", id:"Xorg tid 102818", point:"prio", attributes: prio:176, new prio:122, linkedto:"firefox tid 113473"
>
> 136783 1 3670427870202392 KTRGRAPH group:"thread", id:"firefox tid 113473", state:"running", attributes: prio:104
>
> See how the Xorg thread was forced from CPU 1 to CPU 0 where it preempted
> cc1plus thread (I do have preemption enabled) only to leave CPU 1 with zero
> load.

I think the idea is good, but I have reservations about the implementation, because it seems to me that it has too many layering violations. What I suggest can be summarized like this:

- Add a new flag, SRQ_WILLSLEEP or whatever name you prefer.
- Add a new "flags" argument to sched_lend_prio() (both ULE and 4BSD) and to sched_thread_priority() (ULE only).
- sched_thread_priority() passes the new flag down to sched_add(), which passes it down to sched_pickcpu().

This way sched_pickcpu() has the correct knowledge of what is going on and can make the right decision. You likely don't need to lower tdq_load at that point either, because sched_pickcpu() can just adjust it locally for its decision.

What do you think?

Thanks,
Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
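
To make the flag-passing idea above a bit more concrete, here is a rough user-space sketch, not kernel code. Everything prefixed with toy_ is hypothetical, the value given to SRQ_WILLSLEEP is made up, and the least-loaded, lowest-index-wins rule is only an assumption standing in for the real sched_pickcpu() logic. It only shows the flag travelling down the call chain and the load being discounted locally, leaving the stored counter alone.

```c
/* Hypothetical flag values; only the name SRQ_WILLSLEEP comes from the thread. */
#define SRQ_NONE      0x0000
#define SRQ_WILLSLEEP 0x0001

#define NCPU 2

/* Toy per-CPU queue that models nothing but a load counter (cf. tdq_load). */
struct toy_tdq {
	int load;
};

/*
 * Toy stand-in for sched_pickcpu(): choose the least-loaded CPU, with ties
 * going to the lowest CPU index (an assumption, not ULE's real policy).
 * With SRQ_WILLSLEEP set, the thread running on curcpu is about to block,
 * so it is discounted from curcpu's load for this decision only.
 */
static int
toy_pickcpu(const struct toy_tdq *tdq, int ncpu, int curcpu, int flags)
{
	int best = -1, bestload = 0, cpu, load;

	for (cpu = 0; cpu < ncpu; cpu++) {
		load = tdq[cpu].load;
		if ((flags & SRQ_WILLSLEEP) != 0 && cpu == curcpu && load > 0)
			load--;		/* local adjustment, counter untouched */
		if (best == -1 || load < bestload) {
			best = cpu;
			bestload = load;
		}
	}
	return (best);
}

/* Toy stand-in for sched_add(): forward the flags to the CPU picker. */
static int
toy_add(const struct toy_tdq *tdq, int ncpu, int curcpu, int flags)
{
	return (toy_pickcpu(tdq, ncpu, curcpu, flags));
}

/* Toy stand-in for sched_lend_prio(): the caller states it will sleep. */
static int
toy_lend_prio(const struct toy_tdq *tdq, int ncpu, int curcpu, int flags)
{
	return (toy_add(tdq, ncpu, curcpu, flags));
}
```

With both CPUs at load 1 and the lender running on CPU 1, as in the quoted trace just before the "runq add", toy_lend_prio() with SRQ_NONE returns CPU 0 under the tie rule assumed here, while SRQ_WILLSLEEP makes it return CPU 1, and neither call modifies the stored load.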