Date:      Sun, 28 Sep 1997 14:09:33 -0600
From:      Steve Passe <smp@csn.net>
To:        smp@freebsd.org
Subject:   deadlock in PUSHDOWN_LEVEL_3
Message-ID:  <199709282009.OAA06017@Ilsa.StevesCafe.com>

Hi,
 
I have been fighting a deadlock in the "giant lock" PUSHDOWN_LEVEL_3 code.
I would like help from any adventurous souls out there.

I just committed new versions of ipl.s and ipl_funcs.c that have breakpoint()s
at the appropriate places.  To activate the code, edit smptests.h, changing

from:
#define PUSHDOWN_LEVEL_3_NOT
to:
#define PUSHDOWN_LEVEL_3

then rebuild and install the test SMP kernel.

It should deadlock anywhere from several minutes to several hours later.
Start a lot of processes of some sort to hasten the lockup.

The DDB screen will look like:

------------------------------- cut ---------------------------------
cil: 0x00000100Stopped at      _breakpoint+0x1:        ret
db> 
------------------------------- cut ---------------------------------

a trace shows:

------------------------------- cut ---------------------------------
db> trace
_breakpoint(f4df0f74,5f5e101,ffffffff,80000000,f0112d21) at _breakpoint+0x1
_splhigh(80000000,f0220010,f0930010,0,0) at _splhigh+0xc4
doreti_swi() at doreti_swi+0x2a
------------------------------- cut ---------------------------------

It always occurs with doreti_swi() at the bottom of the stack.  The value
of cil is USUALLY 0x00000100, but I once saw another value (a disk INT).

'cil' is the 'current interrupt level'.  It denotes the interrupt currently
active.  The value 0x00000100 specifies the rtc clock interrupt.  A non-zero
cil implies that we are currently inside an ISR, but this should be
impossible at doreti_swi.  'inside_intr' is also always 0 when this occurs.
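
A minimal sketch of that reading of cil, for anyone checking values by hand
in DDB.  It ASSUMES cil is a bitmask with bit N set while IRQ N is active,
which is consistent with 0x00000100 being IRQ 8, the rtc on a standard PC.
The function names are hypothetical and do not exist in ipl.s or
ipl_funcs.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code.
 * ASSUMPTION: cil holds one bit per IRQ (bit N set means IRQ N is active).
 * Returns the lowest IRQ number whose bit is set, or -1 if cil is zero.
 */
int cil_to_irq(uint32_t cil)
{
    for (int irq = 0; irq < 32; irq++) {
        if (cil & (UINT32_C(1) << irq))
            return irq;
    }
    return -1;
}

/*
 * The contradiction seen at the breakpoint: cil says an ISR is active
 * while inside_intr says it is not.  Returns 1 for that state, else 0.
 */
int cil_is_stale(uint32_t cil, int inside_intr)
{
    return cil != 0 && inside_intr == 0;
}
```

With the values from the DDB screen, cil_to_irq(0x00000100) gives 8 and
cil_is_stale(0x00000100, 0) reports the inconsistent state.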

I have found that you can clear cil (db> write cil 0) and then continue
without problem.  This tends to confirm that we really are no longer inside
the ISR-specific code, and that somehow we are leaving cil set when we
shouldn't.  Another possibility is that doreti_unpend in ipl.s sets cil in
preparation for the call to XresumeNN, but somehow doesn't go down that path.
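
To make the second possibility concrete, here is a toy model in C.  It is
NOT the code in ipl.s; every name in it is hypothetical, and it ASSUMES the
resume path is what clears cil again.  It only shows how setting the flag
before an early exit would leave it set, and how setting it after the
decision to dispatch would not.

```c
#include <stdint.h>

/* Toy stand-in for the kernel's cil; not the real variable. */
static uint32_t toy_cil_state;

void toy_reset(void)    { toy_cil_state = 0; }
uint32_t toy_cil(void)  { return toy_cil_state; }

/* Stand-in for XresumeNN.  ASSUMPTION: it clears the cil bit on exit. */
void toy_resume(uint32_t bit)
{
    toy_cil_state &= ~bit;
}

/* Suspected pattern: cil is set first, then the resume path is skipped. */
void toy_unpend_suspect(uint32_t bit, int can_dispatch)
{
    toy_cil_state = bit;
    if (!can_dispatch)
        return;             /* early exit leaves cil set */
    toy_resume(bit);
}

/* Safer ordering: commit to the dispatch before touching cil. */
void toy_unpend_safe(uint32_t bit, int can_dispatch)
{
    if (!can_dispatch)
        return;             /* cil untouched on the early exit */
    toy_cil_state = bit;
    toy_resume(bit);
}
```

Calling toy_unpend_suspect(0x100, 0) leaves toy_cil() at 0x100, the same
symptom as in the trace, while toy_unpend_safe(0x100, 0) leaves it at 0.
Whether the real doreti_unpend has such an exit is exactly the open question.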

Note that the giant lock still protects all the interrupt code for this test.
It isn't removed until LEVEL_4 is enabled (don't bother, it is NOT ready yet).

Hopefully a fresh set of eyes can find this!

--
Steve Passe	| powered by
smp@csn.net	|            Symmetric MultiProcessor FreeBSD




