From owner-freebsd-current@FreeBSD.ORG Tue Jan 29 15:42:33 2013 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 4116F339; Tue, 29 Jan 2013 15:42:33 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 10AD82EC; Tue, 29 Jan 2013 15:42:33 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 74F34B91E; Tue, 29 Jan 2013 10:42:32 -0500 (EST) From: John Baldwin To: Ryan Stone Subject: Re: Why don't we check for preemption when we unlend priority? Date: Tue, 29 Jan 2013 09:59:36 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201301290959.37166.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 29 Jan 2013 10:42:32 -0500 (EST) Cc: Attilio Rao , FreeBSD Current X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 29 Jan 2013 15:42:33 -0000 On Friday, January 25, 2013 3:01:14 pm Ryan Stone wrote: > I'm having a problem where userland threads are running with a loaned (via > mutex priority propagation) priority even after they have released the > mutex. This is causing the low-priority userland thread to starve > high-priority interrupt threads. One scenario is(which I am seeing on > FreeBSD 8.2, but the priority lending code does not seem to have changed in > HEAD): > > 1) I have several swi threads that are pulling packets off of interfaces > and forwarding them. Each swi thread is pinned to a CPU. > > 2) A remote user initiates an scp of a large file to this machine. > > 3) sshd (which handles the scp) takes a mutex on its socket > > 4) An interrupt or taskqueue thread belonging to the driver that is > receiving the scp traffic tries to take the same mutex. It can't and so it > lends its priority to the sshd thread. > > 5) My swi thread is woken. It is pinned to the same CPU as sshd, but it > has a lower priority than the lent priority, so it is placed on the > runqueue. > > 6) The sshd thread releases the mutex. sched_unlend_prio is called to set > its priority back to a normal user priority. However, it does not check > whether any higher-priority threads are waiting in the runqueue. > > 7) The sshd thread is allowed to run for its full quantum (100ms). This is > easily long enough for my swi thread to start and I drop packets. Well, I think there is an implicit assumption that the only higher priority things are ones that are about to be woken up later in turnstile_unpend(). The order of operations in turnstile_unpend() should handle the case where one of the threads waiting on the lock needs your inherited priority. In the absence of CPU pinning this would indeed work correctly because if the current thread's priority is lowered, it is definitely going to preempt to the thread it borrowed it from, and when that thread finishes running it will examine the run queues to find which thread to run next. CPU pinning has broken that assumption however. I can't fully evaulate your patch, though sched_unlend_prio should only be called on curthread, so I think that your sched_check_preempt() should always see tdq == TDQ_SELF()? > diff --git a/sys/kern/sched_ule.c b/sys/kern/sched_ule.c > index 01559ee..2659614 100644 > --- a/sys/kern/sched_ule.c > +++ b/sys/kern/sched_ule.c > @@ -1586,6 +1586,22 @@ sched_pctcpu_update(struct td_sched *ts) > ts->ts_ftick = ts->ts_ltick - SCHED_TICK_TARG; > } > > +static void > +sched_check_preempt(struct tdq *tdq, struct thread *td) > +{ > + > + KASSERT(TD_IS_RUNNING(td), ("thread is not running")); > + TDQ_LOCK_ASSERT(tdq, MA_OWNED); > + KASSERT(tdq == TDQ_CPU(td->td_sched->ts_cpu), ("tdq does not td")); > + > + if (tdq == TDQ_SELF()) { > + if (td->td_priority > tdq->tdq_lowpri && > + sched_shouldpreempt(tdq->tdq_lowpri, td->td_priority, > 0)) > + td->td_owepreempt = 1; > + } else > + tdq_notify(tdq, td); > +} > + > /* > * Adjust the priority of a thread. Move it to the appropriate run-queue > * if necessary. This is the back-end for several priority related > @@ -1635,8 +1651,10 @@ sched_thread_priority(struct thread *td, u_char prio) > td->td_priority = prio; > if (prio < tdq->tdq_lowpri) > tdq->tdq_lowpri = prio; > - else if (tdq->tdq_lowpri == oldpri) > + else if (tdq->tdq_lowpri == oldpri) { > tdq_setlowpri(tdq, td); > + sched_check_preempt(tdq, td); > + } > return; > } > td->td_priority = prio; > -- John Baldwin