From owner-freebsd-arch@FreeBSD.ORG Thu Sep 30 20:38:30 2004 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E03A316A4D0 for ; Thu, 30 Sep 2004 20:38:30 +0000 (GMT) Received: from relay.pair.com (relay.pair.com [209.68.1.20]) by mx1.FreeBSD.org (Postfix) with SMTP id 24EF443D1D for ; Thu, 30 Sep 2004 20:38:30 +0000 (GMT) (envelope-from pho@holm.cc) Received: (qmail 13021 invoked from network); 30 Sep 2004 20:38:27 -0000 Received: from 0x50a43fc7.hknxx1.adsl-dhcp.tele.dk (HELO peter.osted.lan) (80.164.63.199) by relay.pair.com with SMTP; 30 Sep 2004 20:38:27 -0000 X-pair-Authenticated: 80.164.63.199 Received: from peter.osted.lan (localhost.osted.lan [127.0.0.1]) by peter.osted.lan (8.12.10/8.12.10) with ESMTP id i8UKcRCs055209; Thu, 30 Sep 2004 22:38:27 +0200 (CEST) (envelope-from pho@peter.osted.lan) Received: (from pho@localhost) by peter.osted.lan (8.12.10/8.12.10/Submit) id i8UKcQti055208; Thu, 30 Sep 2004 22:38:26 +0200 (CEST) (envelope-from pho) Date: Thu, 30 Sep 2004 22:38:26 +0200 From: Peter Holm To: John Baldwin Message-ID: <20040930203826.GA55153@peter.osted.lan> References: <1095468747.31297.241.camel@palm.tree.com> <200409291652.29990.jhb@FreeBSD.org> <1096496057.3733.2163.camel@palm.tree.com> <200409301017.54350.jhb@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200409301017.54350.jhb@FreeBSD.org> User-Agent: Mutt/1.4.1i cc: Peter Holm cc: Stephan Uphoff cc: Julian Elischer cc: freebsd-arch@FreeBSD.org Subject: Re: scheduler (sched_4bsd) questions X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2004 20:38:31 -0000 On Thu, Sep 30, 2004 at 10:17:54AM -0400, John Baldwin wrote: > On Wednesday 29 September 2004 06:14 pm, Stephan Uphoff wrote: > > On Wed, 2004-09-29 at 16:52, John Baldwin wrote: > > > > > OK - here is a crude patch to fix some problems with mutex priority > > > > > inheritance. My theory is that the clock thread gets stuck waiting on > > > > > GIANT. > > > > > > > > > > During release/acquisition of a contested sleep mutex there are a few > > > > > windows where a task can be preempted when actions (waking up blocked > > > > > threads, ownership of the mutex, ..) need to be atomic as far as > > > > > scheduling is concerned. Otherwise priority inheritance may fail. The > > > > > patch uses critical_enter/critical_exit to protect these regions > > > > > against preemption. > > > > > > > > > > It would be great if could run this in addition to the other patches. > > > > > > turnstile_claim() doesn't make any threads runnable and thus can't > > > preempt. The other place is supposed to preempt, and it should be ok to > > > do so. Note that since the turnstile chain lock is held, that includes a > > > nested critical section and any preemption will be deferred until the > > > turnstile lock is released via turnstile_release which happens in the > > > middle of > > > turnstile_unpend() after it has finished building a list of all the > > > threads to be made runnable so that the turnstile object can be re-used > > > safely. I don't think this patch will make much of a difference (if > > > any). Can you provide a description of a case where you think the > > > priority inheritance can fail if turnstile_unpend() doesn't run in a > > > nested critical section? > > > > This is a bit of a mind bender. > > I hope you have some aspirins close by ;-) > > > > Thread A holds a mutex x contested by Thread B and has priority pri(A). > > Thread B holds a mutex y. > > There is a thread C with priority pri(C) with pri(C) < pri(A). > > > > Thread A is in the process of releasing x. > > It removes thread B from the turnstile and holds a pointer to B in a > > private list. > > Thread A sets the owner of the turnstile to NULL and releases all spin > > locks. ( mtx_unlock_spin(&tc->tc_lock); line 148) > > This means interrupts are now enabled. > > > > An interrupt occurs (or is already pending) and the interrupt handler > > puts the associated interrupt thread I on the run queue. > > This causes a preemption from A to I. > > The interrupt thread I tries to acquire mutex y owned by B and blocks. > > I donates its priority to B - but inheritance stops at B. > > The next thread with the best priority is C and the cpu switches to C. > > However B needs A to run to make it to the run-queue. > > > > If y is GIANT and I is the clock thread C could run forever in userspace > > without being interrupted. > > Fair enough. The right place to fix this is in turnstile_unpend() though I > think. I have had these patches that try to "clump" setrunqueue's before > preempting lying around (but not thoroughly tested yet) that might fix this > as well but in the turnstile code itself: > > --- //depot/projects/smpng/sys/kern/kern_thread.c 2004/09/22 15:31:15 > +++ //depot/user/jhb/preemption/kern/kern_thread.c 2004/09/22 16:59:47 > @@ -954,6 +954,7 @@ > p->p_suspcount++; > TD_SET_SUSPENDED(td); > TAILQ_INSERT_TAIL(&p->p_suspended, td, td_runq); > +#if 0 > /* > * Hack: If we are suspending but are on the sleep queue > * then we are in msleep or the cv equivalent. We > @@ -962,6 +963,7 @@ > */ > if (TD_ON_SLEEPQ(td)) > TD_SET_SLEEPING(td); > +#endif > } > > void > @@ -988,9 +990,11 @@ > mtx_assert(&sched_lock, MA_OWNED); > PROC_LOCK_ASSERT(p, MA_OWNED); > if (!P_SHOULDSTOP(p)) { > + critical_enter(); > while ((td = TAILQ_FIRST(&p->p_suspended))) { > thread_unsuspend_one(td); > } > + critical_exit(); > } else if ((P_SHOULDSTOP(p) == P_STOPPED_SINGLE) && > (p->p_numthreads == p->p_suspcount)) { > /* > @@ -1025,9 +1029,11 @@ > * to continue however as this is a bad place to stop. > */ > if ((p->p_numthreads != 1) && (!P_SHOULDSTOP(p))) { > - while (( td = TAILQ_FIRST(&p->p_suspended))) { > + critical_enter(); > + while ((td = TAILQ_FIRST(&p->p_suspended))) { > thread_unsuspend_one(td); > } > + critical_exit(); > } > mtx_unlock_spin(&sched_lock); > } > --- //depot/projects/smpng/sys/kern/subr_sleepqueue.c 2004/08/20 17:10:02 > +++ //depot/user/jhb/preemption/kern/subr_sleepqueue.c 2004/09/10 21:36:10 > @@ -400,9 +400,10 @@ > * just return. > */ > if (td->td_sleepqueue != NULL) { > - MPASS(!TD_ON_SLEEPQ(td)); > mtx_unlock_spin(&sc->sc_lock); > mtx_lock_spin(&sched_lock); > + MPASS(!TD_ON_SLEEPQ(td)); > + MPASS(!TD_IS_SLEEPING(td)); > return; > } > > @@ -709,11 +710,13 @@ > sleepq_release(wchan); > > /* Resume all the threads on the temporary list. */ > + critical_enter(); > while (!TAILQ_EMPTY(&list)) { > td = TAILQ_FIRST(&list); > TAILQ_REMOVE(&list, td, td_slpq); > sleepq_resume_thread(td, pri); > } > + critical_exit(); > } > > /* > --- //depot/projects/smpng/sys/kern/subr_turnstile.c 2004/09/03 14:14:21 > +++ //depot/user/jhb/preemption/kern/subr_turnstile.c 2004/09/10 21:36:10 > @@ -727,6 +726,7 @@ > * in turnstile_wait(). Set a flag to force it to try to acquire > * the lock again instead of blocking. > */ > + critical_enter(); > while (!TAILQ_EMPTY(&pending_threads)) { > td = TAILQ_FIRST(&pending_threads); > TAILQ_REMOVE(&pending_threads, td, td_lockq); > @@ -742,6 +742,7 @@ > MPASS(TD_IS_RUNNING(td) || TD_ON_RUNQ(td)); > } > } > + critical_exit(); > mtx_unlock_spin(&sched_lock); > } > > --- //depot/projects/smpng/sys/vm/vm_glue.c 2004/09/22 15:31:15 > +++ //depot/user/jhb/preemption/vm/vm_glue.c 2004/09/22 16:59:47 > @@ -753,6 +753,7 @@ > vm_thread_swapin(td); > > PROC_LOCK(p); > + critical_enter(); > mtx_lock_spin(&sched_lock); > p->p_sflag &= ~PS_SWAPPINGIN; > p->p_sflag |= PS_INMEM; > @@ -767,6 +768,7 @@ > > /* Allow other threads to swap p out now. */ > --p->p_lock; > + critical_exit(); > } > #endif /* NO_SWAPPING */ > } > > > I.e., you could just move the critical_enter() in subr_turnstile.c earlier so > it is before the mtx_unlock_spin() of the turnstile chain lock. > > -- > John Baldwin <>< http://www.FreeBSD.org/~jhb/ > "Power Users Use the Power to Serve" = http://www.FreeBSD.org This patch did not seem to make the freeze problem go away. -- Peter Holm