From owner-freebsd-hackers Wed Mar 10 14:39:41 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from psv.oss.uswest.net (psv.oss.uswest.net [204.147.85.6])
        by hub.freebsd.org (Postfix) with ESMTP id 4B8471516A
        for ; Wed, 10 Mar 1999 14:39:23 -0800 (PST)
        (envelope-from greg@psv.oss.uswest.net)
Received: (from greg@localhost)
        by psv.oss.uswest.net (8.9.2/8.9.2) id QAA55379;
        Wed, 10 Mar 1999 16:38:28 -0600 (CST)
        (envelope-from greg)
Message-ID:
X-Mailer: XFMail 1.3 [p0] on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <199903101906.LAA05432@implode.root.com>
Date: Wed, 10 Mar 1999 16:38:28 -0600 (CST)
Reply-To: greg@uswest.net
Organization: US WEST !NTERACT
From: Greg Rowe
To: David Greenman
Subject: Re: SMP Woes
Cc: freebsd-hackers@FreeBSD.ORG
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Hi David,

I built a kernel with debugging and had one of the guys here take a look
through the crash dump. First is the new "Fatal Trap" DDB output, followed
by his comments on what he saw. Anything else we should try?

Greg

On 10-Mar-99 David Greenman wrote:
> There are at least two things that are strange in the following. First,
> there is no call to bzero() from zalloci() (or in zlock(), _zalloc(), and
> zunlock(), which are inlined). Second, the parameters to generic_bzero()
> indicate that 0 bytes are to be zeroed. It's also strange that the address
> of the first arg is the same as in the zalloci call, which might indicate
> that the first structure element of vm_zone, which is the simplelock, is
> being zeroed. It might be interesting to see if the addresses of the
> generic_bzero() and simple_lock() functions are similar (such as different
> by one bit or something).

Fatal trap 12: page fault while in kernel mode
mp_lock = 03000002; cpuid = 3; lapic.id = 02000000
fault virtual address   = 0x0
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xf020ec9f
stack pointer           = 0x10:0xfe5d2c34
frame pointer           = 0x10:0xfe5d2c58
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 243 (cpio)
interrupt mask          = net tty bio cam  <- SMP: XXX
kernel: type 12 trap, code=0
Stopped at      generic_bzero+0xf:      repe stosl      %es:(%edi)
db> trace
generic_bzero(f3283f80,0,f47f7000,fe5d2c90,fe5d2c98) at generic_bzero+0xf
zalloci(f3283f80,f4880100,f47f7000,6f7f9,fe540ec0) at zalloci+0x29
getnewvnode(1,f33f0400,f3266200,fe5d2cfc,100) at getnewvnode+0x2f8
ffs_vget(f33f0400,6f7f9,fe5d2d7c,ff77d700,fe5d2edc) at ffs_vget+0xa5
ufs_lookup(fe5d2dd4,fe5d2de8,f016f6d4,fe5d2dd4,fe553021) at ufs_lookup+0x936
ufs_vnoperate(fe5d2dd4,fe553021,ff77d700,fe5d2edc,0) at ufs_vnoperate+0x15
vfs_cache_lookup(fe5d2e30,fe5d2e40,f0171ae9,fe5d2e30,fe53b640) at vfs_cache_lookup+0x248
ufs_vnoperate(fe5d2e30,fe53b640,fe5d2edc,fe5d2eb8,0) at ufs_vnoperate+0x15
lookup(fe5d2eb8,fe540ec0,f0250848,fe540ec0,1) at lookup+0x2c1
namei(fe5d2eb8,fe540ec0,f0250848,0,8057000) at namei+0x133
lstat(fe540ec0,fe5d2f94,8057000,ffffffff,3) at lstat+0x44
syscall(2f,efbf002f,3,ffffffff,efbfdc70) at syscall+0x187
Xint0x80_syscall() at Xint0x80_syscall+0x4c
db>

********************************************************************************

Kgdb gives us a bit more info:

(kgdb) bt
....snip....
#10 0xf020ec9f in generic_bzero ()
#11 0xf01f1869 in zalloci (z=0xf3283f80) at ../../vm/vm_zone.h:....snip....
(ki5 ....snip....gdb) frame 11
#11 0xf01f1869 in zalloci (z=0xf3283f80) at ../../vm/vm_zone.h:85
85              return _zget(z);
(kgdb) print _zget
$6 = {void *(struct vm_zone *)} 0xf01f18d4 <_zget>
(kgdb) print generic_bzero
$7 = {} 0xf020ec90

So, it doesn't look like a 1-bit off error....
Because zget/zalloc is an inline function, I can't seem to get gdb to print
simple_lock or simple_unlock.

-Chris

Greg Rowe
US WEST - Internet Service Operations

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message