Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 4 Oct 2004 13:20:22 -0400
From:      John Baldwin <jhb@FreeBSD.org>
To:        Peter Holm <peter@holm.cc>
Cc:        freebsd-arch@FreeBSD.org
Subject:   Re: scheduler (sched_4bsd) questions
Message-ID:  <200410041320.22267.jhb@FreeBSD.org>
In-Reply-To: <20040930203826.GA55153@peter.osted.lan>
References:  <1095468747.31297.241.camel@palm.tree.com> <200409301017.54350.jhb@FreeBSD.org> <20040930203826.GA55153@peter.osted.lan>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thursday 30 September 2004 04:38 pm, Peter Holm wrote:
> On Thu, Sep 30, 2004 at 10:17:54AM -0400, John Baldwin wrote:
> > On Wednesday 29 September 2004 06:14 pm, Stephan Uphoff wrote:
> > > On Wed, 2004-09-29 at 16:52, John Baldwin wrote:
> > > > > > OK - here is a crude patch to fix some problems with mutex
> > > > > > priority inheritance. My theory is that the clock thread gets
> > > > > > stuck waiting on GIANT.
> > > > > >
> > > > > > During release/acquisition of a contested sleep mutex there are a
> > > > > > few windows where a task can be preempted when actions (waking up
> > > > > > blocked threads, ownership of the mutex, ..) need to be atomic as
> > > > > > far as scheduling is concerned. Otherwise priority inheritance
> > > > > > may fail. The patch uses critical_enter/critical_exit to protect
> > > > > > these regions against preemption.
> > > > > >
> > > > > > It would be great if could run this in addition to the other
> > > > > > patches.
> > > >
> > > > turnstile_claim() doesn't make any threads runnable and thus can't
> > > > preempt. The other place is supposed to preempt, and it should be ok
> > > > to do so.  Note that since the turnstile chain lock is held, that
> > > > includes a nested critical section and any preemption will be
> > > > deferred until the turnstile lock is released via turnstile_release
> > > > which happens in the middle of
> > > > turnstile_unpend() after it has finished building a list of all the
> > > > threads to be made runnable so that the turnstile object can be
> > > > re-used safely.  I don't think this patch will make much of a
> > > > difference (if any).  Can you provide a description of a case where
> > > > you think the priority inheritance can fail if turnstile_unpend()
> > > > doesn't run in a nested critical section?
> > >
> > > This is a bit of a mind bender.
> > > I hope you have some aspirins close by ;-)
> > >
> > > Thread A holds a mutex x contested by Thread B and has priority pri(A).
> > > Thread B holds a mutex y.
> > > There is a thread C with priority pri(C) with pri(C) < pri(A).
> > >
> > > Thread A is in the process of releasing x.
> > > It removes thread B from the turnstile and holds a pointer to B in a
> > > private list.
> > > Thread A sets the owner of the turnstile to NULL and releases all spin
> > > locks. ( mtx_unlock_spin(&tc->tc_lock); line 148)
> > > This means interrupts are now enabled.
> > >
> > > An interrupt occurs (or is already pending) and the interrupt handler
> > > puts the associated interrupt thread I on the run queue.
> > > This causes a preemption from A to I.
> > > The interrupt thread I tries to acquire mutex y owned by B and blocks.
> > > I donates its priority to B - but inheritance stops at B.
> > > The next thread with the best priority is C and the cpu switches to C.
> > > However B needs A to run to make it to the run-queue.
> > >
> > > If y is GIANT and I is the clock thread C could run forever in
> > > userspace without being interrupted.
> >
> > Fair enough.  The right place to fix this is in turnstile_unpend() though
> > I think.  I have had these patches that try to "clump" setrunqueue's
> > before preempting lying around (but not thoroughly tested yet) that might
> > fix this as well but in the turnstile code itself:
> >
> > --- //depot/projects/smpng/sys/kern/kern_thread.c	2004/09/22 15:31:15
> > +++ //depot/user/jhb/preemption/kern/kern_thread.c	2004/09/22 16:59:47
> > @@ -954,6 +954,7 @@
> >  	p->p_suspcount++;
> >  	TD_SET_SUSPENDED(td);
> >  	TAILQ_INSERT_TAIL(&p->p_suspended, td, td_runq);
> > +#if 0
> >  	/*
> >  	 * Hack: If we are suspending but are on the sleep queue
> >  	 * then we are in msleep or the cv equivalent. We
> > @@ -962,6 +963,7 @@
> >  	 */
> >  	if (TD_ON_SLEEPQ(td))
> >  		TD_SET_SLEEPING(td);
> > +#endif
> >  }
> >
> >  void
> > @@ -988,9 +990,11 @@
> >  	mtx_assert(&sched_lock, MA_OWNED);
> >  	PROC_LOCK_ASSERT(p, MA_OWNED);
> >  	if (!P_SHOULDSTOP(p)) {
> > +		critical_enter();
> >  		while ((td = TAILQ_FIRST(&p->p_suspended))) {
> >  			thread_unsuspend_one(td);
> >  		}
> > +		critical_exit();
> >  	} else if ((P_SHOULDSTOP(p) == P_STOPPED_SINGLE) &&
> >  	    (p->p_numthreads == p->p_suspcount)) {
> >  		/*
> > @@ -1025,9 +1029,11 @@
> >  	 * to continue however as this is a bad place to stop.
> >  	 */
> >  	if ((p->p_numthreads != 1) && (!P_SHOULDSTOP(p))) {
> > -		while (( td = TAILQ_FIRST(&p->p_suspended))) {
> > +		critical_enter();
> > +		while ((td = TAILQ_FIRST(&p->p_suspended))) {
> >  			thread_unsuspend_one(td);
> >  		}
> > +		critical_exit();
> >  	}
> >  	mtx_unlock_spin(&sched_lock);
> >  }
> > --- //depot/projects/smpng/sys/kern/subr_sleepqueue.c	2004/08/20 17:10:02
> > +++ //depot/user/jhb/preemption/kern/subr_sleepqueue.c	2004/09/10
> > 21:36:10 @@ -400,9 +400,10 @@
> >  	 * just return.
> >  	 */
> >  	if (td->td_sleepqueue != NULL) {
> > -		MPASS(!TD_ON_SLEEPQ(td));
> >  		mtx_unlock_spin(&sc->sc_lock);
> >  		mtx_lock_spin(&sched_lock);
> > +		MPASS(!TD_ON_SLEEPQ(td));
> > +		MPASS(!TD_IS_SLEEPING(td));
> >  		return;
> >  	}
> >
> > @@ -709,11 +710,13 @@
> >  	sleepq_release(wchan);
> >
> >  	/* Resume all the threads on the temporary list. */
> > +	critical_enter();
> >  	while (!TAILQ_EMPTY(&list)) {
> >  		td = TAILQ_FIRST(&list);
> >  		TAILQ_REMOVE(&list, td, td_slpq);
> >  		sleepq_resume_thread(td, pri);
> >  	}
> > +	critical_exit();
> >  }
> >
> >  /*
> > --- //depot/projects/smpng/sys/kern/subr_turnstile.c	2004/09/03 14:14:21
> > +++ //depot/user/jhb/preemption/kern/subr_turnstile.c	2004/09/10 21:36:10
> > @@ -727,6 +726,7 @@
> >  	 * in turnstile_wait().  Set a flag to force it to try to acquire
> >  	 * the lock again instead of blocking.
> >  	 */
> > +	critical_enter();
> >  	while (!TAILQ_EMPTY(&pending_threads)) {
> >  		td = TAILQ_FIRST(&pending_threads);
> >  		TAILQ_REMOVE(&pending_threads, td, td_lockq);
> > @@ -742,6 +742,7 @@
> >  			MPASS(TD_IS_RUNNING(td) || TD_ON_RUNQ(td));
> >  		}
> >  	}
> > +	critical_exit();
> >  	mtx_unlock_spin(&sched_lock);
> >  }
> >
> > --- //depot/projects/smpng/sys/vm/vm_glue.c	2004/09/22 15:31:15
> > +++ //depot/user/jhb/preemption/vm/vm_glue.c	2004/09/22 16:59:47
> > @@ -753,6 +753,7 @@
> >  			vm_thread_swapin(td);
> >
> >  		PROC_LOCK(p);
> > +		critical_enter();
> >  		mtx_lock_spin(&sched_lock);
> >  		p->p_sflag &= ~PS_SWAPPINGIN;
> >  		p->p_sflag |= PS_INMEM;
> > @@ -767,6 +768,7 @@
> >
> >  		/* Allow other threads to swap p out now. */
> >  		--p->p_lock;
> > +		critical_exit();
> >  	}
> >  #endif /* NO_SWAPPING */
> >  }
> >
> >
> > I.e., you could just move the critical_enter() in subr_turnstile.c
> > earlier so it is before the mtx_unlock_spin() of the turnstile chain
> > lock.
> >
> > --
> > John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
> > "Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
>
> This patch did not seem to make the freeze problem go away.

It requires tweaking to get Stephan's fix for turnstile_unpend().

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200410041320.22267.jhb>