From owner-freebsd-smp  Fri Jun 23 10:15: 2 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from gidora.zeta.org.au (gidora.zeta.org.au [203.26.10.25])
	by hub.freebsd.org (Postfix) with SMTP id 24F9137C3E3
	for <freebsd-smp@FreeBSD.ORG>; Fri, 23 Jun 2000 10:14:51 -0700 (PDT)
	(envelope-from bde@zeta.org.au)
Received: (qmail 6652 invoked from network); 23 Jun 2000 17:14:43 -0000
Received: from unknown (HELO bde.zeta.org.au) (203.2.228.102)
  by gidora.zeta.org.au with SMTP; 23 Jun 2000 17:14:43 -0000
Date: Sat, 24 Jun 2000 03:14:40 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
X-Sender: bde@besplex.bde.org
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: Jason Evans <jasone@canonware.com>, Greg Lehey <grog@lemis.com>,
	Warner Losh <imp@village.org>, The Hermit Hacker <scrappy@hub.org>,
	freebsd-smp@FreeBSD.ORG
Subject: Re: SP Patchset #1 up
In-Reply-To: <200006231556.IAA10465@apollo.backplane.com>
Message-ID: <Pine.BSF.4.21.0006240215530.295-100000@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Fri, 23 Jun 2000, Matthew Dillon wrote:

> :>     I don't follow.  Why do I have a deadlock?  This is SchedMutex I'm
> :>     talking about here, not GiantMutex.
> :
> :mtx_enter_sched_quick() is mtx_enter(&SchedMutex, MTX_SPIN).
> :
> :| /*
> :|  * mtx_enter()
> :|  * ...
> :|  *	WARNING: DO NOT PUT KERNEL PRINTFS IN THE SPIN CODE!, you may
> :|  *	create a recursion.
> :|  */
> :
> :This warning gives another reason why ddb can't call locking functions.
> :Putting a breakpoint at a locking functions or wandering into a locking
> :function while single stepping will cause console i/o.
> 
>     A break point is a one-time deal.  That'l work just fine.  If the

Except for a breakpoint in the debugger's breakpoint handling code or
in the debugger's i/o functions.  Setting a breakpoint in the debugger's
breakpoint handling code is user error and very difficult to render
harmless.  Setting a breakpoint in debugger's i/o functions is not
necessarily a user error, since parts of the functions are used by
the kernel generally.  E.g., setting a breakpoint at cnputc() currently
causes a recursive panic in some cases, because trap_fatal() calls
printf() before ddb can remove the breakpoint and kdb_trap() calls
db_printf() before ddb removes the breakpoint.

>     SchedMutex is already held then getting it again isn't going to hurt.
>     If it isn't held then getting it the first time isn't going to hurt.

In that case, it doesn't protect against unsupported recursion in the syscons
console i/o routines any better than spltty().

> :| static __inline void
> :| mtx_enter(struct mtx *mtp, int type)
> :| {
> :| 	u_int32_t newv;
> :| 
> :| 	newv = (u_int32_t)curproc | MTF_HELD;
> :| 	if ((mtp->mtx_lock & ~MTF_CONTESTED) == newv) {
> :
> :Here is another reason why ddb can't call locking functions.  curproc
> :may be in the middle of being switched when a debugger trap occurs, so
> :ddb can't call anything that depends on it being valid.  SchedMutex may
> :be invalid for the same reason.
> 
>     SchedMutex is never invalid.  No mutex is ever 'invalid'... they are
>     locked in an atomic cmpexg.  Either SchedMutex is held when the break
>     point occurs, or it isn't.

From your patches for exception.s:

| +	/*
| +	 * We have to protect the SchedMutex and curproc fixup from 
| +	 * interrupts on this cpu (which check for SchedMutex being
| +	 * held on the 'current' cpu).
| +	 */
| +	cli
| +
|  #ifdef SMP
|  #ifdef GRAB_LOPRIO				/* hold LOPRIO for INTs */
|  #ifdef CHEAP_TPR
| @@ -455,12 +297,22 @@
|  	movl	%edx, _curpcb
|  	movl	%ecx, _curproc			/* into next process */
|  
| -#ifdef SMP
| -	movl	_cpu_lockid, %eax
| -	orl	PCB_MPNEST(%edx), %eax		/* add next count from PROC */
| -	movl	%eax, _mp_lock			/* load the mp_lock */
| +	/*
| +	 * Restore SchedMutex, be careful not to loose MTF_CONTESTED.
| +	 * Note that MTF_HELD *MUST* be set here (XXX add assertion/panic if
| +	 * not).
| +	 *
| +	 * The curproc portion of SchedMutex will temporarily cycle through
| +	 * NULL, but since interrupts are disabled and curproc can never
| +	 * be NULL, this will not result in improper operation.
| +	 */
| +	movl	PCB_SCHEDNEST(%edx), %eax	/* add next count from PROC */
| +	MPLOCKED
| +	andl	$(MTF_HELD|MTF_CONTESTED),_SchedMutex+MTX_LOCK
| +	MPLOCKED
| +	orl	%ecx, _SchedMutex+MTX_LOCK
| +	movl	%eax, _SchedMutex+MTX_RECURSE	/* load the sched mutex */
|  	/* XXX FIXME: we should be restoring the local APIC TPR */
| -#endif /* SMP */

"cli" and other forms of disabling interrupts provide no protection against
exceptions which aren't interrupts, in particular against debugger traps.

> :Many levels of recursion are possible:
> :
> :  printf in process context
> :    --> interrupt ... printf in interrupt context
> :      --> higher priority interrupt ... printf in interrupt context
> :        ... even higher priority interrupts
> :          --> debugger trap ... debugger printf
> :	    --> fatal trap for bug in printf ... bogus printfs (should longjmp)
> :	      --> double fault for bug in fatal trap handler ... final printfs
> 
>     I don't see the issue here.  interrupts cannot nest SchedMutex - look
>     at the interrupt code.  At the moment interrupt can nest other interrupts
>     only insofar as the new interrupt occuring before the old interrupt has
>     obtained the giant mutex.

Traps are not interrupts.  Debugger traps are the main problem.  NMIs
would be a bigger problem if they occurred in normal operation (note
that our NMI handling is broken; it does dangerous things like calling
printf).

> :If switching from a high priority interrupt task to a low priority one is
> :allowed, then the first few levels don't need to be supported.
> 
>     I don't follow.  The SchedMutex recursion count is saved and restored
>     when a switch occurs.

This will only matter when (if) prioritized interrupts are supported again.

> :| 	} else {
> :| 		if (atomic_cmpexg_int(&mtp->mtx_lock, 0, newv) != 0) {
> :| 			if (type & MTX_SPIN) {
> :| 				_mtx_enter_spin(mtp, newv);
> :
> :If this code is reached, then it should spin for ever, since other tasks
> :should not run, even if there is another CPU for it to run on.
> 
>     If you place a debugger trap at a point where SchedMutex is held,
>     guess which cpu the debugger takes the trap on?  The one holding
>     SchedMutex.  Thus no problem... the debug trap will simply bump the
>     recursion count of SchedMutex temporarily.

Are you saying that this code is never reached for debugger traps?

I'm not even thinking about complications for SMP.  The problems I'm
talking about affect for UP.  The SMP case shouldn't be significantly
different for debugger traps, since the first thing the debugger trap
handler should do is stop the other CPUs.

>     The only thing we have to worry about insofar as debugger traps go
>     is to be sure to clear the debug point prior to entering the SchedMutex
>     within the debugger trap.  I think this is trivial.

This is the problem that I mentioned above.  It is not completely trivial
to fix, because the debugger is entered for all types of fatal traps.
Fatal traps aren't necessarily fatal for people who can use ddb.

Bruce


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message