From owner-freebsd-hackers@FreeBSD.ORG Wed Nov 7 06:17:45 2012 Return-Path: <owner-freebsd-hackers@FreeBSD.ORG> Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0F6606AD for <freebsd-hackers@freebsd.org>; Wed, 7 Nov 2012 06:17:45 +0000 (UTC) (envelope-from jroberson@jroberson.net) Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id C771B8FC12 for <freebsd-hackers@freebsd.org>; Wed, 7 Nov 2012 06:17:44 +0000 (UTC) Received: by mail-pa0-f54.google.com with SMTP id bi1so978861pad.13 for <freebsd-hackers@freebsd.org>; Tue, 06 Nov 2012 22:17:43 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=date:from:x-x-sender:to:cc:subject:in-reply-to:message-id :references:user-agent:mime-version:content-type:x-gm-message-state; bh=VN1hk01Xq8BVYQs6E74eurX4JSgKBFOnRMU+O7uZtsM=; b=OQMbVclRcnPsJ8ajpD5cxhj3COUftC5WRCNjizp6dRiIIipjCfiOtcsza8RUsydxCN cem/wMoe+miwkGpBOGWw1blG+ebeqYusiBBgks4XF7KV5QaQAe8zPnylkc8J2vO7gCEs 6ou1p5TSY2T/neI4XbxywjMJ16eSilSfAvxDLAhKHfwoHsBQqwszetOAqJ6kCTCz3sfn yTI4W9AKbKBMM3z8TODr4jB9RhgS9coV2+iEwcBu+CVP6iozdcf59EVehbd8EhcplCnM cXmO4ND/fACmtrMztT5Whudj14VH847tAJQUx2pYVKT8gawGCz1GbowktJMV+dXIhhV2 Vt+Q== Received: by 10.68.228.36 with SMTP id sf4mr10824677pbc.20.1352269063401; Tue, 06 Nov 2012 22:17:43 -0800 (PST) Received: from rrcs-66-91-135-210.west.biz.rr.com (rrcs-66-91-135-210.west.biz.rr.com. [66.91.135.210]) by mx.google.com with ESMTPS id ty4sm13595301pbc.57.2012.11.06.22.17.41 (version=SSLv3 cipher=OTHER); Tue, 06 Nov 2012 22:17:42 -0800 (PST) Date: Tue, 6 Nov 2012 20:17:20 -1000 (HST) From: Jeff Roberson <jroberson@jroberson.net> X-X-Sender: jroberson@desktop To: David Xu <davidxu@freebsd.org> Subject: Re: ule+smp: small optimization for turnstile priority lending In-Reply-To: <5099C5C9.4040703@freebsd.org> Message-ID: <alpine.BSF.2.00.1211062016380.4081@desktop> References: <50587F8D.9060102@FreeBSD.org> <505AD2A5.6060008@freebsd.org> <CAJ-FndBGEgkrKJ9bNdq0QrdyYZb=LXUsAG3wz5Lp-HLUBd5d9w@mail.gmail.com> <5099C5C9.4040703@freebsd.org> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Gm-Message-State: ALoCoQnwzHgWWnmALnhQZHkeRcph8EX6ZXZfoANS+nZqcdKjzGs8pTTfI0dZKEWF2j0uIsThjy3M X-Mailman-Approved-At: Wed, 07 Nov 2012 12:40:11 +0000 Cc: attilio@freebsd.org, freebsd-hackers <freebsd-hackers@freebsd.org>, Jeff Roberson <jeff@freebsd.org>, Andriy Gapon <avg@freebsd.org> X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD <freebsd-hackers.freebsd.org> List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-hackers>, <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe> List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers> List-Post: <mailto:freebsd-hackers@freebsd.org> List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help> List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe> X-List-Received-Date: Wed, 07 Nov 2012 06:17:45 -0000 On Wed, 7 Nov 2012, David Xu wrote: > On 2012/11/06 19:03, Attilio Rao wrote: >> On 9/20/12, David Xu <davidxu@freebsd.org> wrote: >>> On 2012/09/18 22:05, Andriy Gapon wrote: >>>> >>>> Here is a snippet that demonstrates the issue on a supposedly fully >>>> loaded >>>> 2-processor system: >>>> >>>> 136794 0 3670427870244462 KTRGRAPH group:"thread", id:"Xorg tid >>>> 102818", >>>> state:"running", attributes: prio:122 >>>> >>>> 136793 0 3670427870241000 KTRGRAPH group:"thread", id:"cc1plus tid >>>> 111916", >>>> state:"yielding", attributes: prio:183, wmesg:"(null)", lockname:"(null)" >>>> >>>> 136792 1 3670427870240829 KTRGRAPH group:"thread", id:"idle: cpu1 tid >>>> 100004", >>>> state:"running", attributes: prio:255 >>>> >>>> 136791 1 3670427870239520 KTRGRAPH group:"load", id:"CPU 1 load", >>>> counter:0, >>>> attributes: none >>>> >>>> 136790 1 3670427870239248 KTRGRAPH group:"thread", id:"firefox tid >>>> 113473", >>>> state:"blocked", attributes: prio:122, wmesg:"(null)", lockname:"unp_mtx" >>>> >>>> 136789 1 3670427870237697 KTRGRAPH group:"load", id:"CPU 0 load", >>>> counter:2, >>>> attributes: none >>>> >>>> 136788 1 3670427870236394 KTRGRAPH group:"thread", id:"firefox tid >>>> 113473", >>>> point:"wokeup", attributes: linkedto:"Xorg tid 102818" >>>> >>>> 136787 1 3670427870236145 KTRGRAPH group:"thread", id:"Xorg tid >>>> 102818", >>>> state:"runq add", attributes: prio:122, linkedto:"firefox tid 113473" >>>> >>>> 136786 1 3670427870235981 KTRGRAPH group:"load", id:"CPU 1 load", >>>> counter:1, >>>> attributes: none >>>> >>>> 136785 1 3670427870235707 KTRGRAPH group:"thread", id:"Xorg tid >>>> 102818", >>>> state:"runq rem", attributes: prio:176 >>>> >>>> 136784 1 3670427870235423 KTRGRAPH group:"thread", id:"Xorg tid >>>> 102818", >>>> point:"prio", attributes: prio:176, new prio:122, linkedto:"firefox tid >>>> 113473" >>>> >>>> 136783 1 3670427870202392 KTRGRAPH group:"thread", id:"firefox tid >>>> 113473", >>>> state:"running", attributes: prio:104 >>>> >>>> See how how the Xorg thread was forced from CPU 1 to CPU 0 where it >>>> preempted >>>> cc1plus thread (I do have preemption enabled) only to leave CPU 1 with >>>> zero load. >>>> >>>> Here is a proposed solution: >>>> >>>> turnstile_wait: optimize priority lending to a thread on a runqueue >>>> >>>> As the current thread is definitely going into mi_switch, it now >>>> removes >>>> its load before doing priority propagation which can potentially >>>> result >>>> in sched_add. In the SMP && ULE case the latter searches for the >>>> least loaded CPU to place a boosted thread, which is supposedly >>>> about >>>> to run. >>>> >>>> diff --git a/sys/kern/sched_ule.c b/sys/kern/sched_ule.c >>>> index 8e466cd..3299cae 100644 >>>> --- a/sys/kern/sched_ule.c >>>> +++ b/sys/kern/sched_ule.c >>>> @@ -1878,7 +1878,10 @@ sched_switch(struct thread *td, struct thread >>>> *newtd, int >>>> flags) >>>> /* This thread must be going to sleep. */ >>>> TDQ_LOCK(tdq); >>>> mtx = thread_lock_block(td); >>>> - tdq_load_rem(tdq, td); >>>> +#if defined(SMP) >>>> + if ((flags & SW_TYPE_MASK) != SWT_TURNSTILE) >>>> +#endif >>>> + tdq_load_rem(tdq, td); >>>> } >>>> /* >>>> * We enter here with the thread blocked and assigned to the >>>> @@ -2412,6 +2415,21 @@ sched_rem(struct thread *td) >>>> tdq_setlowpri(tdq, NULL); >>>> } >>>> >>>> +void >>>> +sched_load_rem(struct thread *td) >>>> +{ >>>> + struct tdq *tdq; >>>> + >>>> + KASSERT(td == curthread, >>>> + ("sched_rem_load: only curthread is supported")); >>>> + KASSERT(td->td_oncpu == td->td_sched->ts_cpu, >>>> + ("thread running on cpu different from ts_cpu")); >>>> + tdq = TDQ_CPU(td->td_sched->ts_cpu); >>>> + TDQ_LOCK_ASSERT(tdq, MA_OWNED); >>>> + MPASS(td->td_lock == TDQ_LOCKPTR(tdq)); >>>> + tdq_load_rem(tdq, td); >>>> +} >>>> + >>>> /* >>>> * Fetch cpu utilization information. Updates on demand. >>>> */ >>>> diff --git a/sys/kern/subr_turnstile.c b/sys/kern/subr_turnstile.c >>>> index 31d16fe..d1d68e9 100644 >>>> --- a/sys/kern/subr_turnstile.c >>>> +++ b/sys/kern/subr_turnstile.c >>>> @@ -731,6 +731,13 @@ turnstile_wait(struct turnstile *ts, struct thread >>>> *owner, >>>> int queue) >>>> LIST_INSERT_HEAD(&ts->ts_free, td->td_turnstile, ts_hash); >>>> } >>>> thread_lock(td); >>>> +#if defined(SCHED_ULE) && defined(SMP) >>>> + /* >>>> + * Remove load earlier so that it does not affect cpu selection >>>> + * for a thread waken up due to priority lending, if any. >>>> + */ >>>> + sched_load_rem(td); >>>> +#endif >>>> thread_lock_set(td, &ts->ts_lock); >>>> td->td_turnstile = NULL; >>>> >>>> diff --git a/sys/sys/sched.h b/sys/sys/sched.h >>>> index 4b8387c..b1ead1b 100644 >>>> --- a/sys/sys/sched.h >>>> +++ b/sys/sys/sched.h >>>> @@ -110,6 +110,9 @@ void sched_preempt(struct thread *td); >>>> void sched_add(struct thread *td, int flags); >>>> void sched_clock(struct thread *td); >>>> void sched_rem(struct thread *td); >>>> +#if defined(SCHED_ULE) && defined(SMP) >>>> +void sched_load_rem(struct thread *td); >>>> +#endif >>>> void sched_tick(int cnt); >>>> void sched_relinquish(struct thread *td); >>>> struct thread *sched_choose(void); >>>> >>> >>> I found another scenario in taskqueue, in the function >>> taskqueue_terminate, current thread tries to wake >>> another thread up and sleep immediately, the tq_mutex sometimes >>> is a spinlock. So if you remove one thread load from current cpu >>> before wakeup, the resumed thread may be put on same cpu, >>> so it will optimize the cpu scheduling too. >> >> I think that in order to fit with sched_add() modifies I have in mind >> (see other patches within this thread) wakeup() should grow a new >> argument, or maybe we can use wakeup_flags() new KPI. >> If the latter is the case, I would also propose to let wakeup_one() to >> be absorbed into wakeup_flags() with its own flag. >> > > Yes, I like the idea. >From ~2007 http://people.freebsd.org/~jeff/wakeupflags.diff This has some different optimizations that can be done when you have a hinted wakeup. Thanks, Jeff > >> Attilio >> >> >