Date:      Tue, 5 Oct 2004 15:03:08 +0200
From:      Peter Holm <peter@holm.cc>
To:        Julian Elischer <julian@elischer.org>
Cc:        "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject:   Re: scheduler (sched_4bsd) questions
Message-ID:  <20041005130308.GA2586@peter.osted.lan>
In-Reply-To: <4161A7BD.3040706@elischer.org>
References:  <1095468747.31297.241.camel@palm.tree.com> <1096496057.3733.2163.camel@palm.tree.com> <1096603981.21577.195.camel@palm.tree.com> <200410041131.35387.jhb@FreeBSD.org> <1096911278.44307.17.camel@palm.tree.com> <20041004184939.GA8178@peter.osted.lan> <41619D29.1000704@elischer.org> <20041004191410.GA8423@peter.osted.lan> <4161A7BD.3040706@elischer.org>

--NzB8fVQJ5HfG6fxh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, Oct 04, 2004 at 12:42:53PM -0700, Julian Elischer wrote:

OK, I got a crash dump now, after a few modifications to kern_shutdown.c
(diff attached).

There are, however, a few strange things worth noting:

1) There is no panic string:

Mounted root from ufs:/dev/ad0s1a.
pid 1146: corrected slot count (2->1)
[thread 100796]
Stopped at      sched_add+0x13: movl    0x14c(%esi),%ebx

2) The gdb stack trace gets a bit weird at:

#8  0xc07812da in calltrap () at ../../../i386/i386/exception.s:140
#9  0xc05f0018 in flock (td=0x0, uap=0x0) at ../../../kern/kern_descrip.c:2138
#10 0xc0619fd1 in setrunqueue (td=0xc2319180, flags=0x0) at kern_switch.c:521
#11 0xc061921f in sched_wakeup (td=0xc2319180) at ../../../kern/sched_4bsd.c:859

Where did flock() come from?

The full console output is at http://www.holm.cc/stress/log/cons82.html

- Peter

> ok, then  if it happens again,  from ddb, run
> show ktr
> after you've done the 'ps' and go back a couple of hundred events..
> 
> thanks.
> 
> 
> Peter Holm wrote:
> 
> >On Mon, Oct 04, 2004 at 11:57:45AM -0700, Julian Elischer wrote:
> >
> >>can you run ktrdump against the corefile and get the ktr output?
> >>(you do have it enabled right?)
> >>
> >
> >No, that's one of the problems: doadump() fails with this specific panic.
> >
> >- Peter
> >
> >>Peter Holm wrote:
> >>
> >>>On Mon, Oct 04, 2004 at 01:34:38PM -0400, Stephan Uphoff wrote:
> >>>
> >>>>On Mon, 2004-10-04 at 11:31, John Baldwin wrote:
> >>>>
> >>>>>On Friday 01 October 2004 12:13 am, Stephan Uphoff wrote:
> >>>>>
> >>>>>>On Wed, 2004-09-29 at 18:14, Stephan Uphoff wrote:
> >>>>>>
> >>>>>>>I was looking at the MUTEX_WAKE_ALL undefined case when I used the
> >>>>>>>critical section for turnstile_claim().
> >>>>>>>However there are bigger problems with MUTEX_WAKE_ALL undefined
> >>>>>>>so you are right - the critical section for turnstile_claim is pretty
> >>>>>>>useless.
> >>>>>>>
> >>>>>>Arghhh !!!
> >>>>>>
> >>>>>>MUTEX_WAKE_ALL is NOT an option in GENERIC.
> >>>>>>I recall verifying that it is defined twice. Guess I must have looked at
> >>>>>>the wrong source tree :-(
> >>>>>>This means yes - we have bigger problems!
> >>>>>>
> >>>>>>Example:
> >>>>>>
> >>>>>>Thread A holds a mutex x contested by Thread B and C and has priority
> >>>>>>pri(A).
> >>>>>>
> >>>>>>Thread C holds a mutex y and pri(B) < pri(C)
> >>>>>>
> >>>>>>Thread A releases the lock and wakes thread B, but leaves C on the
> >>>>>>turnstile wait queue.
> >>>>>>
> >>>>>>An interrupt thread I tries to lock mutex y owned by C.
> >>>>>>
> >>>>>>However priority inheritance does not work since B needs to run first to
> >>>>>>take ownership of the lock.
> >>>>>>
> >>>>>>I is blocked :-(
> >>>>>>
> >>>>>Ermm, if the interrupt happens after x is released then I's priority 
> >>>>>should propagate from I to C to B.  
> >>>>>
> >>>>There is a hole after the mutex x is released by A - but before B can
> >>>>claim the mutex. The turnstile for mutex x is unowned and interrupt
> >>>>thread I when trying to donate its priority will run into:
> >>>>
> >>>>	if (td == NULL) {
> >>>>		/*
> >>>>		 * This really isn't quite right. Really
> >>>>		 * ought to bump priority of thread that
> >>>>		 * next acquires the lock.
> >>>>		 */
> >>>>		return;
> >>>>	}
> >>>>
> >>>>So B needs to run and acquire the mutex before priority inheritance
> >>>>works again, yet B does not get a priority boost to do so.
> >>>>
> >>>>This is easy to fix and MUTEX_WAKE_ALL can be removed again at that time
> >>>>- but my time budget is limited and Peter has an interesting bug left
> >>>>that has priority.
> >>>>
> >>>I'm no closer to being able to create this panic in a controlled way.
> >>>After a whole day of different tests I finally got this panic:
> >>>http://www.holm.cc/stress/log/cons81.html. The trigger seems to be one
> >>>particular Java applet, but it is not easily reproducible.
> >>>
> >>>- Peter
> >>>
> >>>>>If the interrupt happens before x is released, 
> >>>>>then the final bit of propagate_priority() should handle it since it 
> >>>>>re-sorts the turnstile's thread queue so that C will be awakened rather 
> >>>>>than B.
> >>>>>
> >>>>Agreed.
> >>>>
> >>>>	Stephan
> >>>>
> >>>_______________________________________________
> >>>freebsd-arch@freebsd.org mailing list
> >>>http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> >>>To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
> >>>
> >
> > 
> >
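
For anyone following along, the hole Stephan describes in the quoted text
above (mutex x is unowned between A's release and B's claim) can be sketched
in user-space C. This is only a toy model and not the kernel's
propagate_priority(): the names lend_priority() and b_pri_after_i_blocks()
and the priority values 10, 100 and 120 are made up for illustration, with
lower values meaning more urgent as in the kernel.

```c
#include <assert.h>
#include <stddef.h>

struct thread;

/* Toy lock: only the owner matters for priority lending. */
struct lock {
	struct thread *owner;		/* NULL between release and next claim */
};

struct thread {
	int pri;			/* lower value = more urgent */
	struct lock *blocked_on;	/* lock this thread sleeps on, or NULL */
};

/*
 * Lend priority pri to the owner of l, then to the owner of whatever
 * lock that thread sleeps on, and so on down the chain.
 * Returns how many threads were boosted.
 */
int
lend_priority(struct lock *l, int pri)
{
	int boosted = 0;

	assert(l != NULL);
	while (l != NULL) {
		struct thread *td = l->owner;

		if (td == NULL)		/* unowned lock: the chain ends here */
			break;
		if (td->pri <= pri)	/* already at least as urgent */
			break;
		td->pri = pri;
		boosted++;
		l = td->blocked_on;
	}
	return (boosted);
}

/*
 * Scenario from the thread: I blocks on y, which C owns; C sleeps on x.
 * x_claimed says whether B has already taken ownership of x.
 * Returns B's priority after I has tried to lend its own.
 */
int
b_pri_after_i_blocks(int x_claimed)
{
	struct lock x = { NULL }, y = { NULL };
	struct thread b = { 100, NULL };	/* woken by A, wants x */
	struct thread c = { 120, &x };		/* left on x's wait queue */
	struct thread i = { 10, &y };		/* interrupt thread */

	y.owner = &c;
	if (x_claimed)
		x.owner = &b;
	lend_priority(i.blocked_on, i.pri);
	return (b.pri);
}
```

Under these assumptions b_pri_after_i_blocks(0) leaves B at 100, because the
walk stops at the unowned x, while b_pri_after_i_blocks(1) brings B down to
10 once it owns x. In the first case C is boosted but still cannot run until
B, unboosted, gets the CPU and claims x.
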

-- 
Peter Holm

--NzB8fVQJ5HfG6fxh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="kern_shutdown.diff"

Index: kern_shutdown.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_shutdown.c,v
retrieving revision 1.166
diff -u -r1.166 kern_shutdown.c
--- kern_shutdown.c	2 Sep 2004 18:59:15 -0000	1.166
+++ kern_shutdown.c	5 Oct 2004 12:23:45 -0000
@@ -230,10 +230,14 @@
 		return;
 	}
 
+	if (panicstr == NULL)
+		panicstr = "In doadump()";	/* Major hack XXX pho */
 	savectx(&dumppcb);
 	dumptid = curthread->td_tid;
 	dumping++;
 	dumpsys(&dumper);
+	if (!strcmp(panicstr, "In doadump()"))
+		panicstr = NULL;	/* Major hack XXX pho */
 }
 
 /*
@@ -519,6 +523,8 @@
 #endif
 
 #ifdef KDB
+	if (panicstr == NULL)
+		panicstr = "(NULL)";	/* XXX pho */
 	if (newpanic && trace_on_panic)
 		kdb_backtrace();
 	if (debugger_on_panic)

--NzB8fVQJ5HfG6fxh--


