From owner-freebsd-current  Sun Apr 28  6:53:20 2002
Delivered-To: freebsd-current@freebsd.org
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by hub.freebsd.org (Postfix) with ESMTP id E0DDB37B41F
	for <current@FreeBSD.org>; Sun, 28 Apr 2002 06:53:12 -0700 (PDT)
Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3])
	by fledge.watson.org (8.11.6/8.11.6) with SMTP id g3SDqvw74829
	for <current@FreeBSD.org>; Sun, 28 Apr 2002 09:52:58 -0400 (EDT)
	(envelope-from robert@fledge.watson.org)
Date: Sun, 28 Apr 2002 09:52:57 -0400 (EDT)
From: Robert Watson <rwatson@FreeBSD.org>
X-Sender: robert@fledge.watson.org
To: current@FreeBSD.org
Subject: page fault in _mtx_lock_flags
Message-ID: <Pine.NEB.3.96L.1020428094421.64976J-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-current.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-current>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-current>
X-Loop: FreeBSD.ORG


As usual, GENERIC -CURRENT head from last night, from the main tree.
Dual-proc SMP box netbooted using PXE.  System usually boots, does a
buildkernel -j 8 over NFS, then reboots and repeats.  This time it
panicked instead.

I actually have two boxes doing this, which does seem to double the rate
of panics I get.

APIC_IO: Testing 8254 interrupt delivery
APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0 intpin 2
APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
ad0: 19458MB <ST320420A> [39535/16/63] at ata0-master UDMA33
acd0: CDROM <MATSHITA CR-176> at ata1-master PIO4
doSuMnPt:i nAgP  rCoPoUt  #f1r oLma unnfcsh:etsray irq 10
NFS ROOT: 192.168.50.1:/cboss/devel/nfsroot/crash1.cboss.tislabs.com


Fatal trap 12: page fault while in kernel mode
cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x7974748b
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc02449b6
stack pointer           = 0x10:0xc93dea14
frame pointer           = 0x10:0xc93dea20
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 41 (sh)
kernel: type 12 trap, code=0
Stopped at      _mtx_lock_flags+0x42:   lock cmpxchgl   %ecx,0x18(%ebx)
db> trace
_mtx_lock_flags(79747473,0,c03cb862,e3) at _mtx_lock_flags+0x42
lockmgr(c93a8228,1000001,0,c8f27100) at lockmgr+0x42
vfs_busy(c93a8200,0,0,c8f27100) at vfs_busy+0x58
lookup(c93dec28,1a4,c8f03034,c93ded20,c8f27100) at lookup+0x3a2
namei(c93dec28,1a4,c8f03034,c93ded20,0) at namei+0x1c8
vn_open_cred(c93dec28,c93debf4,1a4,c3f80c80,c93dece8) at vn_open_cred+0x67
vn_open(c93dec28,c93debf4,1a4,c8f271dc,c8f27000) at vn_open+0x18
open(c8f27100,c93ded20,8125005,0,0) at open+0x158
syscall(2f,2f,2f,0,0) at syscall+0x223
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (5, FreeBSD ELF, open), eip = 0x808969b, esp = 0xbfbff8f0, ebp = 0xbfbff91c ---
db> 

(kgdb) l *_mtx_lock_flags+0x42
0xc02449b6 is in _mtx_lock_flags (machine/atomic.h:139).
134     static __inline int
135     atomic_cmpset_int(volatile u_int *dst, u_int exp, u_int src)
136     {
137             int res = exp;
138
139             __asm __volatile (
140             "       " __XSTRING(MPLOCKED) " "
141             "       cmpxchgl %1,%2 ;        "
142             "       setz    %%al ;          "
143             "       movzbl  %%al,%0 ;       "
(gdb) l *lockmgr+0x42
0xc0242376 is in lockmgr (../../../kern/kern_lock.c:228).
223                     pid = LK_KERNPROC;
224             else
225                     pid = td->td_proc->p_pid;
226
227             mtx_lock(lkp->lk_interlock);
228             if (flags & LK_INTERLOCK) {
229                     mtx_assert(interlkp, MA_OWNED | MA_NOTRECURSED);
230                     mtx_unlock(interlkp);
231             }
232
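
A quick consistency check on the first trap, as a minimal sketch: the
fault virtual address (0x7974748b) equals the first argument shown for
_mtx_lock_flags in the trace (0x79747473) plus the 0x18 displacement in
the faulting cmpxchgl instruction. The bytes of that pointer are also all
printable ASCII, which may hint that string data overwrote the pointer
handed to mtx_lock(); that reading is only an assumption and is not
confirmed by anything in this report. The helper names fault_matches and
bytes_as_ascii are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if base + disp (with 32-bit wraparound) equals the reported
 * fault address, 0 otherwise.
 */
static int
fault_matches(uint32_t base, uint32_t disp, uint32_t fault_va)
{
	return ((uint32_t)(base + disp) == fault_va);
}

/*
 * Store the four bytes of a 32-bit value in out[], in little-endian
 * (i386 memory) order, as characters.  Bytes outside the printable
 * ASCII range are replaced with '.'.  out must hold at least 5 chars.
 */
static void
bytes_as_ascii(uint32_t value, char out[5])
{
	size_t i;

	assert(out != NULL);
	for (i = 0; i < 4; i++) {
		unsigned char b = (unsigned char)(value >> (8 * i));

		out[i] = (b >= 0x20 && b <= 0x7e) ? (char)b : '.';
	}
	out[4] = '\0';
}
```

Calling fault_matches(0x79747473, 0x18, 0x7974748b) checks the address
arithmetic, and bytes_as_ascii(0x79747473, buf) fills buf with the
characters those four bytes would represent if they were text.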

Attempts to get into serial gdb failed:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; lapic.id = 01000000
fault virtual address   = 0x6aa
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc93debf4
stack pointer           = 0x10:0xc93debd4
frame pointer           = 0x10:0xc93dec28
tokdke nselg trnatp             1=2 waith  0ixn0terlruptts  0dxisfablfed
cpan ic: bblo   ck      a=b leP Lsle,epp rlosc k1 ,(sdleefep2  m1ut egx)a
pro
ssroclescsor  e../a.g./ .=. /ii38e6/iu386 /etnraapl.cd:,7 11e
pcmeu, I O=P L0 ;=  l0
ccu.rrde =t 0p00o0000s0
"Deb1u g(gsehr)(
$T0b08:f4eb3dc9;05:28ec3dc9;04:d4eb3dc9;#01~
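
The last line above looks like a GDB remote-protocol stop reply, which is
still readable even though the console text before it is interleaved. As
a minimal sketch, assuming the usual i386 GDB register numbering (4 = esp,
5 = ebp, 8 = eip) and that register values are sent as hex in target
(little-endian) byte order, the three fields can be decoded and compared
with the stack pointer, frame pointer and instruction pointer printed by
the second trap. The helper names hex_digit and le_hex_to_u32 are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or return -1 if it is not one. */
static int
hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return (c - '0');
	if (c >= 'a' && c <= 'f')
		return (c - 'a' + 10);
	if (c >= 'A' && c <= 'F')
		return (c - 'A' + 10);
	return (-1);
}

/*
 * Decode 8 hex characters that hold a 32-bit value in little-endian
 * byte order (first byte pair is the least significant byte).
 * Returns 0 and stores the value in *out on success, or -1 if the
 * input is shorter than 8 characters or contains a non-hex character.
 */
static int
le_hex_to_u32(const char *hex, uint32_t *out)
{
	uint32_t v = 0;
	int i, hi, lo;

	assert(hex != NULL && out != NULL);
	for (i = 0; i < 4; i++) {
		hi = hex_digit(hex[2 * i]);
		if (hi < 0)
			return (-1);	/* also stops at an early NUL */
		lo = hex_digit(hex[2 * i + 1]);
		if (lo < 0)
			return (-1);
		v |= (uint32_t)((hi << 4) | lo) << (8 * i);
	}
	*out = v;
	return (0);
}
```

Feeding it the fields "f4eb3dc9", "28ec3dc9" and "d4eb3dc9" from the
packet gives values that can be compared directly against the trap
printout and against the address kgdb reports after attaching.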

I'm guessing that I'm dealing with an SMP/locking issue there, but
unfortunately I didn't get much further:

(kgdb) target remote /dev/cuaa0
Remote debugging using /dev/cuaa0
0xc93debf4 in ?? ()
(kgdb) bt
#0  0xc93debf4 in ?? ()
#1  0x0 in ?? ()

Normally getting into serial gdb works OK; perhaps there's an interaction
with the mutex code.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services
