Date:      Sun, 22 Aug 1999 20:49:27 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Alan Cox <alc@cs.rice.edu>
Cc:        Luoqi Chen <luoqi@watermarkgroup.com>, freebsd-smp@FreeBSD.ORG
Subject:   Re: Weird infinite lockup in splx() (in IFCPL_UNLOCK) w/ latest CURRENT/SMP
Message-ID:  <199908230349.UAA01534@apollo.backplane.com>
References:  <199908230023.RAA00824@apollo.backplane.com> <19990822213421.E47586@nonpc.cs.rice.edu>

:>     I had a weird lockup w/ the latest CURRENT while doing an installworld.
:> 
:>     I tracked the lockup down to an infinite loop in the current process...
:>     an infinite loop in splx()!
:> 
:>     I've never had this lockup before so I believe it to be due to
:>     some recent change.   The lockup occurred during very heavy use of
:>     the lockmgr on the same vnode lock (the uudecode binary) during a 
:>     parallel installworld.
:> 
:>     My kernel was as of today, 22Aug.  This is on a 2xPIII/450 SMP box.
:> 
:>     I believe there to be a race condition somewhere.
:> 
:> 					-Matt
:> 
:> ...
:> #9  0xc021f1e6 in scgetc (sc=0xc02a37e0, flags=2)
:>     at ../../dev/syscons/syscons.c:3782
:> #10 0xc021aef1 in sckbdevent (thiskbd=0xc02b4d20, event=0, arg=0xc02a37e0)
:>     at ../../dev/syscons/syscons.c:663
:> #11 0xc021481f in atkbd_intr (kbd=0xc02b4d20, arg=0x0)
:>     at ../../dev/kbd/atkbd.c:439
:> #12 0xc024b764 in atkbd_isa_intr (arg=0xc02b4d20) at ../../isa/atkbd_isa.c:123
:> #13 0xc0243194 in splx (ipl=3224034576) at ../../i386/isa/ipl_funcs.c:275
:> 
:> 		^^^^^^^ it was looping in splx, in IFCPL_UNLOCK.
:> 
:
:Are you sure about this?  There's no loop in IFCPL_UNLOCK or splx (proper)
:for that matter.  Only the IFCPL_LOCK at the beginning and the splz at
:the end contain loops within them.

    Hmm.  Here's a disassembly:

0xc024316c <splx>:      pushl  %ebx
0xc024316d <splx+1>:    movl   0x8(%esp,1),%ebx
0xc0243171 <splx+5>:    pushl  $0xc02c1940
0xc0243176 <splx+10>:   call   0xc0234324 <ss_lock>
0xc024317b <splx+15>:   movl   %ebx,0xc029e4b0
0xc0243181 <splx+21>:   notl   %ebx
0xc0243183 <splx+23>:   movl   0xc029e4d4,%eax
0xc0243188 <splx+28>:   andl   %eax,%ebx
0xc024318a <splx+30>:   pushl  $0xc02c1940
0xc024318f <splx+35>:   call   0xc023439c <ss_unlock>
0xc0243194 <splx+40>:   addl   $0x8,%esp	<----------
0xc0243197 <splx+43>:   testl  %ebx,%ebx
0xc0243199 <splx+45>:   je     0xc02431b1 <splx+69>
0xc024319b <splx+47>:   movl   $0xa0,%eax
0xc02431a0 <splx+52>:   addl   %fs:0x0,%eax
0xc02431a7 <splx+59>:   cmpl   $0x0,(%eax)
0xc02431aa <splx+62>:   jne    0xc02431b1 <splx+69>
0xc02431ac <splx+64>:   call   0xc0227b70 <splz>
0xc02431b1 <splx+69>:   popl   %ebx
0xc02431b2 <splx+70>:   ret    
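
    Read back into C, the listing seems to do roughly the following.  This
    is only a sketch of the control flow: every name below is a placeholder
    I made up for an address in the listing (the real symbol names are not
    visible in the disassembly), and the lock routines and splz are stubs.

```c
/*
 * Sketch of the control flow in the splx disassembly above.
 * All names are hypothetical stand-ins for raw addresses; the
 * lock routines and splz are stubs, not the kernel's versions.
 */
static unsigned word_e4b0;      /* word at 0xc029e4b0: receives the ipl argument */
static unsigned word_e4d4;      /* word at 0xc029e4d4: ANDed with ~ipl */
static unsigned percpu_word_a0; /* per-CPU word at (%fs:0x0) + 0xa0 */
static int splz_calls;          /* how often the stub splz was reached */

static void ss_lock_stub(void)   { /* stands in for ss_lock(0xc02c1940) */ }
static void ss_unlock_stub(void) { /* stands in for ss_unlock(0xc02c1940) */ }
static void splz_stub(void)      { splz_calls++; }

static void splx_sketch(unsigned ipl)
{
    unsigned unmasked;

    ss_lock_stub();                 /* splx+10 */
    word_e4b0 = ipl;                /* splx+15 */
    unmasked = ~ipl & word_e4d4;    /* splx+21 .. splx+28 */
    ss_unlock_stub();               /* splx+35 */

    /* splx+43 .. splx+64: straight-line tests, no backward branch */
    if (unmasked != 0 && percpu_word_a0 == 0)
        splz_stub();
}
```

    Both conditional jumps go forward to splx+69, so any looping has to be
    inside ss_lock, ss_unlock or splz, or in whatever keeps calling splx.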

    I'm confused.  The machine was definitely locked up, and it was
    definitely in an infinite loop: there were a lot of runnable processes
    but no running processes except the one that was stuck looping in
    supervisor mode.  ping worked, and ctl-alt-esc worked, so interrupts
    worked, but nothing else did.

    This is an unlock call.  Perhaps kmem_alloc_wait() is stuck in a tight
    loop and I just happened to catch it in the sti.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message



