Date: Wed, 10 Mar 1999 23:32:43 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Greg Rowe <greg@uswest.net>
Cc: David Greenman <dg@root.com>, freebsd-hackers@FreeBSD.ORG
Subject: Re: SMP Woes
Message-ID: <199903110732.XAA61853@apollo.backplane.com>
References: <XFMail.990310163828.greg@uswest.net>

:Hi David,
:
: I built a kernel with debugging and had one of the guys here take a look
:through the crash dump. First, is the new "Fatal Trap" DDB output and then his
:comments on what he saw. Anything else we should try ??
:
:Greg
:
:On 10-Mar-99 David Greenman wrote:
:> There are at least two things that are strange in the following. First,
:> there is no call to bzero() from zalloci() (or in zlock(), _zalloc(), and
:> zunlock(), which are inlined). Second, the parameters to generic_bzero()
:> indicate that 0 bytes are to be zeroed. It's also strange that the address

Well, zalloci() can call _zget(), which can call bzero(). Maybe the
leading underscore in _zget() is preventing DDB from listing it.

The call offset in zalloci() in the trace below is zalloci+0x29. If
you disassemble zalloci(), you will note that this is the call-return
point for _zget():

0xf020b59f <zalloci+23>: pushl %ebx
0xf020b5a0 <zalloci+24>: call 0xf020b5f8 <_zget>
0xf020b5a5 <zalloci+29>: movl %eax,%ebx
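
The symbol+offset arithmetic here is easy to cross-check by hand. The sketch below is only an illustration, not part of the original analysis: the addresses are copied from the disassembly above and from the DDB/kgdb output quoted further down, while the helper names `return_address` and `symbol_base` are made up for this example. It assumes the call is the usual 5-byte i386 near call (opcode plus a 32-bit displacement), which is what the 5-byte gap in the disassembly suggests.

```python
# Addresses copied from the disassembly and the DDB/kgdb output in this thread.
CALL_ADDR = 0xf020b5a0     # call   0xf020b5f8 <_zget>
NEXT_INSN = 0xf020b5a5     # movl   %eax,%ebx  (instruction after the call)
FAULT_IP = 0xf020ec9f      # "instruction pointer" reported by the trap
BZERO_OFFSET = 0xf         # DDB: "Stopped at generic_bzero+0xf"
BZERO_ADDR = 0xf020ec90    # kgdb: "print generic_bzero"

# Assumption: a near call with a 32-bit displacement is 5 bytes long.
CALL_REL32_LEN = 5


def return_address(call_addr, insn_len=CALL_REL32_LEN):
    """Address pushed by a call: the instruction right after it."""
    return call_addr + insn_len


def symbol_base(addr, offset):
    """Start of a function, given an address inside it and its offset."""
    return addr - offset


if __name__ == "__main__":
    # The return point of the _zget call is the movl that follows it.
    print(hex(return_address(CALL_ADDR)), hex(NEXT_INSN))
    # The faulting instruction really is inside generic_bzero.
    print(hex(symbol_base(FAULT_IP, BZERO_OFFSET)), hex(BZERO_ADDR))
```

Both pairs agree: the address after the call matches the `movl` line, and the fault address minus 0xf lands on the generic_bzero address that kgdb prints.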

The generic_bzero() call arguments are either bogus, or the length
argument on the stack has been modified by generic_bzero() itself.

The fault virtual address is 0, but the result of vm_page_alloc()
seems to be properly tested for m == NULL, so this should not be
possible.

It would be useful to print out the contents of *m from the _zget()
frame, and also the *z structure.
--
If this machine has a large amount of memory, it may have overrun its
KVA (kernel virtual address space) allocation. This can also happen
if you have a large 'maxusers'
in the kernel config. If so, try reducing maxusers to 128 or less.
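
For reference, maxusers is a single line in the kernel configuration file. A minimal sketch of the suggested change follows; the comment is only an annotation, and the file is whichever config your kernel is built from:

```
# kernel configuration file (name is site-specific)
maxusers	128
```

The kernel has to be reconfigured and rebuilt for the new value to take effect.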
-Matt
Matthew Dillon
<dillon@backplane.com>
:Fatal trap 12: page fault while in kernel mode
:mp_lock = 03000002; cpuid = 3; lapic.id = 02000000
:fault virtual address = 0x0
:fault code = supervisor write, page not present
:instruction pointer = 0x8:0xf020ec9f
:stack pointer = 0x10:0xfe5d2c34
:frame pointer = 0x10:0xfe5d2c58
:code segment = base 0x0, limit 0xfffff, type 0x1b
: = DPL 0, pres 1, def32 1, gran 1
:processor eflags = interrupt enabled, resume, IOPL = 0
:current process = 243 (cpio)
:interrupt mask = net tty bio cam <- SMP: XXX
:kernel: type 12 trap, code=0
:Stopped at generic_bzero+0xf: repe stosl %es:(%edi)
:db> trace
:generic_bzero(f3283f80,0,f47f7000,fe5d2c90,fe5d2c98) at generic_bzero+0xf
:zalloci(f3283f80,f4880100,f47f7000,6f7f9,fe540ec0) at zalloci+0x29
:getnewvnode(1,f33f0400,f3266200,fe5d2cfc,100) at getnewvnode+0x2f8
:...
:....snip....
:(kgdb) frame 11
:#11 0xf01f1869 in zalloci (z=0xf3283f80) at ../../vm/vm_zone.h:85
:85 return _zget(z);
:(kgdb) print _zget
:$6 = {void *(struct vm_zone *)} 0xf01f18d4 <_zget>
:(kgdb) print generic_bzero
:$7 = {<text variable, no debug info>} 0xf020ec90 <generic_bzero>
:
:So, it doesn't look like a 1-bit off error....
:
:Because zget/zalloc is an inline function, I can't seem to get gdb to
:print simple_lock or simple_unlock.
:
:-Chris
:
:
:Greg Rowe <greg@uswest.net> US WEST - Internet Service Operations
:
