Date: Sun, 22 Aug 1999 20:49:27 -0700 (PDT)
From: Matthew Dillon
To: Alan Cox
Cc: Luoqi Chen, freebsd-smp@FreeBSD.ORG
Subject: Re: Weird infinite lockup in splx() (in IFCPL_UNLOCK) w/ latest CURRENT/SMP

:>    I had a weird lockup w/ the latest CURRENT while doing an installworld.
:>
:>    I tracked the lockup down to an infinite loop in the current process...
:>    an infinite loop in splx()!
:>
:>    I've never had this lockup before so I believe it to be due to
:>    some recent change.  The lockup occurred during very heavy use of
:>    the lockmgr on the same vnode lock (the uudecode binary) during a
:>    parallel installworld.
:>
:>    My kernel was as of today, 22Aug.  This is on a 2xPIII/450 SMP box.
:>
:>    I believe there to be a race condition somewhere.
:>
:>                                        -Matt
:>
:> ...
:> #9  0xc021f1e6 in scgetc (sc=0xc02a37e0, flags=2)
:>     at ../../dev/syscons/syscons.c:3782
:> #10 0xc021aef1 in sckbdevent (thiskbd=0xc02b4d20, event=0, arg=0xc02a37e0)
:>     at ../../dev/syscons/syscons.c:663
:> #11 0xc021481f in atkbd_intr (kbd=0xc02b4d20, arg=0x0)
:>     at ../../dev/kbd/atkbd.c:439
:> #12 0xc024b764 in atkbd_isa_intr (arg=0xc02b4d20) at ../../isa/atkbd_isa.c:123
:> #13 0xc0243194 in splx (ipl=3224034576) at ../../i386/isa/ipl_funcs.c:275
:>
:>     ^^^^^^^ it was looping splx, in IFCPL_UNLOCK.
:>
:
:Are you sure about this?  There's no loop in IFCPL_UNLOCK or splx (proper)
:for that matter.  Only the IFCPL_LOCK at the beginning and the splz at
:the end contain loops within them.

    Hmm.  Here's a disassembly:

    0xc024316c:  pushl  %ebx
    0xc024316d:  movl   0x8(%esp,1),%ebx
    0xc0243171:  pushl  $0xc02c1940
    0xc0243176:  call   0xc0234324
    0xc024317b:  movl   %ebx,0xc029e4b0
    0xc0243181:  notl   %ebx
    0xc0243183:  movl   0xc029e4d4,%eax
    0xc0243188:  andl   %eax,%ebx
    0xc024318a:  pushl  $0xc02c1940
    0xc024318f:  call   0xc023439c
    0xc0243194:  addl   $0x8,%esp       <----------
    0xc0243197:  testl  %ebx,%ebx
    0xc0243199:  je     0xc02431b1
    0xc024319b:  movl   $0xa0,%eax
    0xc02431a0:  addl   %fs:0x0,%eax
    0xc02431a7:  cmpl   $0x0,(%eax)
    0xc02431aa:  jne    0xc02431b1
    0xc02431ac:  call   0xc0227b70
    0xc02431b1:  popl   %ebx
    0xc02431b2:  ret

    I'm confused.  The machine was definitely locked up; it was
    definitely in an infinite loop (there were a lot of runnable
    processes but no running processes except the one that was stuck
    looping in supervisor mode).  ping worked, and ctl-alt-esc worked,
    so interrupts worked, but nothing else.

    This is an unlock call.  Perhaps the kmem_alloc_wait() is stuck in a
    tight loop and I just happened to catch it in the sti.

                                        -Matt
                                        Matthew Dillon
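
For readers following the disassembly, here is a rough user-space C reading of its control flow. It is only a sketch under stated assumptions, not the ipl_funcs.c source. Every identifier (cpl_model, ipending_model, percpu_flag, lock_acquire, lock_release, splz_model, splx_model) is a hypothetical stand-in for an address that appears only as a raw number in the listing. Treating the final call as splz follows the quoted remark that splz sits at the end of splx, and the body of splz_model is invented purely so the sketch can run.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the absolute addresses in the listing. */
uint32_t cpl_model;        /* word written at 0xc029e4b0 */
uint32_t ipending_model;   /* word read from 0xc029e4d4 */
uint32_t percpu_flag;      /* word tested through %fs:0x0 + 0xa0 */
int      lock_held;        /* models the lock object at 0xc02c1940 */
unsigned splz_calls;       /* counts how often the final call is reached */

/* The two calls that take 0xc02c1940 as an argument (assumed lock/unlock). */
void lock_acquire(void) { lock_held = 1; }   /* call 0xc0234324 */
void lock_release(void) { lock_held = 0; }   /* call 0xc023439c */

/* Assumed to be splz (call 0xc0227b70). The body is invented: it just
 * drops the pending bits that the current mask does not block. */
void splz_model(void)
{
    splz_calls++;
    ipending_model &= cpl_model;
}

/* Straight-line reading of the listing: every branch is a forward jump
 * to the exit at 0xc02431b1, so any looping has to happen inside one of
 * the three called routines, not in this body. */
void splx_model(uint32_t ipl)
{
    lock_acquire();                            /* pushl lock; call */
    cpl_model = ipl;                           /* movl %ebx,0xc029e4b0 */
    uint32_t unmasked = ipending_model & ~ipl; /* notl, movl, andl */
    lock_release();                            /* pushl lock; call */
                                               /* 0xc0243194: addl $0x8,%esp */
    if (unmasked != 0 && percpu_flag == 0)     /* testl/je, cmpl/jne */
        splz_model();                          /* call 0xc0227b70 */
}
```

Read this way, the marked address 0xc0243194 is simply the instruction following the second call, which matches the address reported for splx in frame #13 of the backtrace.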