Date:      Wed, 10 Mar 1999 23:32:43 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Greg Rowe <greg@uswest.net>
Cc:        David Greenman <dg@root.com>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: SMP Woes
Message-ID:  <199903110732.XAA61853@apollo.backplane.com>
References:   <XFMail.990310163828.greg@uswest.net>


:Hi David,
:
: I built a kernel with debugging and had one of the guys here take a look
:through the crash dump. First, is the new "Fatal Trap" DDB output and then his
:comments on what he saw. Anything else we should try ??
:
:Greg
:
:On 10-Mar-99 David Greenman wrote:
:>    There are at least two things that are strange in the following. First,
:> there is no call to bzero() from zalloci() (or in zlock(), _zalloc(), and
:> zunlock(), which are inlined). Second, the parameters to generic_bzero()
:> indicate that 0 bytes are to be zeroed. It's also strange that the address

    Well, zalloci() can call _zget(), which can call bzero().  Maybe the
    leading underscore in _zget() is preventing DDB from listing it.

    The call offset in zalloci() in the trace below is zalloci+0x29.  If
    you disassemble zalloci, you will note that this is the call-return
    point for _zget:

0xf020b59f <zalloci+23>:        pushl  %ebx
0xf020b5a0 <zalloci+24>:        call   0xf020b5f8 <_zget>
0xf020b5a5 <zalloci+29>:        movl   %eax,%ebx
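
    As a quick arithmetic check on the listing above (a sketch only; the
    5-byte call length assumes the usual E8 rel32 encoding of a near call,
    which the address of the following instruction is consistent with):

```python
# Addresses copied from the disassembly above.
CALL_ADDR = 0xf020b5a0    # call _zget
NEXT_ADDR = 0xf020b5a5    # movl %eax,%ebx
ZGET_ADDR = 0xf020b5f8    # call target, _zget

# Assumption: near relative call, opcode E8 followed by a 32-bit displacement.
CALL_LEN = 5

# The return address pushed by the call is the address of the next instruction.
return_addr = CALL_ADDR + CALL_LEN

# The displacement encoded in the call is relative to that return address.
rel32 = ZGET_ADDR - return_addr

print(hex(return_addr), hex(rel32))
```

    The computed return address is the movl line, which is why a stack
    trace reports that address for the zalloci frame rather than the
    address of the call itself.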

    The generic_bzero() call arguments are either bogus, or the length
    argument on the stack has been modified by generic_bzero().
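
    For orientation, the faulting instruction pointer in the quoted DDB
    output can be tied back to generic_bzero using the symbol address from
    the quoted kgdb session.  A minimal sketch, using only values that
    appear in the quoted text below:

```python
FAULT_EIP = 0xf020ec9f        # DDB: "instruction pointer = 0x8:0xf020ec9f"
GENERIC_BZERO = 0xf020ec90    # kgdb: "print generic_bzero"

# Offset of the faulting instruction from the start of the function.
offset = FAULT_EIP - GENERIC_BZERO
print(f"generic_bzero+{offset:#x}")
```

    The result agrees with the "Stopped at generic_bzero+0xf" line, so the
    fault is on the repe stosl itself.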

    The fault virtual address is 0, but vm_page_alloc() seems to properly
    test for m == NULL so this should not be possible.

    It would be useful to print out the contents of *m from the _zget
    frame, and also the *z structure.

    --

    If this machine has a large amount of memory, it may have overrun its
    KVA allocation.  This can also happen if you have a large 'maxusers'
    in the kernel config.  If so, try reducing maxusers to 128 or less.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

:Fatal trap 12: page fault while in kernel mode
:mp_lock = 03000002; cpuid = 3; lapic.id = 02000000
:fault virtual address   = 0x0
:fault code              = supervisor write, page not present
:instruction pointer     = 0x8:0xf020ec9f
:stack pointer           = 0x10:0xfe5d2c34
:frame pointer           = 0x10:0xfe5d2c58
:code segment            = base 0x0, limit 0xfffff, type 0x1b
:                        = DPL 0, pres 1, def32 1, gran 1
:processor eflags        = interrupt enabled, resume, IOPL = 0
:current process         = 243 (cpio)
:interrupt mask          = net tty bio cam  <- SMP: XXX
:kernel: type 12 trap, code=0
:Stopped at      generic_bzero+0xf:      repe stosl      %es:(%edi)
:db> trace
:generic_bzero(f3283f80,0,f47f7000,fe5d2c90,fe5d2c98) at generic_bzero+0xf
:zalloci(f3283f80,f4880100,f47f7000,6f7f9,fe540ec0) at zalloci+0x29
:getnewvnode(1,f33f0400,f3266200,fe5d2cfc,100) at getnewvnode+0x2f8
:...

:....snip....
:(kgdb) frame 11
:#11 0xf01f1869 in zalloci (z=0xf3283f80) at ../../vm/vm_zone.h:85
:85                      return _zget(z);
:(kgdb) print _zget
:$6 = {void *(struct vm_zone *)} 0xf01f18d4 <_zget>
:(kgdb) print generic_bzero
:$7 = {<text variable, no debug info>} 0xf020ec90 <generic_bzero>
:
:So, it doesn't look like a 1-bit off error....
:
:Because zget/zalloc is an inline function, I can't seem to get gdb to
:print simple_lock or simple_unlock.
:
:-Chris
:
:
:Greg Rowe <greg@uswest.net>   US WEST -  Internet Service Operations
:



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



