From owner-freebsd-hackers  Wed Mar 10 14:39:41 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from psv.oss.uswest.net (psv.oss.uswest.net [204.147.85.6])
	by hub.freebsd.org (Postfix) with ESMTP id 4B8471516A
	for <freebsd-hackers@FreeBSD.ORG>; Wed, 10 Mar 1999 14:39:23 -0800 (PST)
	(envelope-from greg@psv.oss.uswest.net)
Received: (from greg@localhost)
	by psv.oss.uswest.net (8.9.2/8.9.2) id QAA55379;
	Wed, 10 Mar 1999 16:38:28 -0600 (CST)
	(envelope-from greg)
Message-ID: <XFMail.990310163828.greg@uswest.net>
X-Mailer: XFMail 1.3 [p0] on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <199903101906.LAA05432@implode.root.com>
Date: Wed, 10 Mar 1999 16:38:28 -0600 (CST)
Reply-To: greg@uswest.net
Organization: US WEST !NTERACT
From: Greg Rowe <greg@uswest.net>
To: David Greenman <dg@root.com>
Subject: Re: SMP Woes
Cc: freebsd-hackers@FreeBSD.ORG
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Hi David,

 I built a kernel with debugging enabled and had one of the guys here look
through the crash dump. Below is the new "Fatal trap" DDB output, followed by
his comments on what he saw. Is there anything else we should try?

Greg

On 10-Mar-99 David Greenman wrote:
>    There are at least two things that are strange in the following. First,
> there is no call to bzero() from zalloci() (or in zlock(), _zalloc(), and
> zunlock(), which are inlined). Second, the parameters to generic_bzero()
> indicate that 0 bytes are to be zeroed. It's also strange that the address
> of the first arg is the same as in the zalloci call, which might indicate
> that the first structure element of vm_zone, which is the simplelock, is
> being zeroed. It might be interesting to see if the addresses of the
> generic_bzero() and simple_lock() functions are similar (such as different
> by one bit or something).

Fatal trap 12: page fault while in kernel mode
mp_lock = 03000002; cpuid = 3; lapic.id = 02000000
fault virtual address   = 0x0
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xf020ec9f
stack pointer           = 0x10:0xfe5d2c34
frame pointer           = 0x10:0xfe5d2c58
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 243 (cpio)
interrupt mask          = net tty bio cam  <- SMP: XXX
kernel: type 12 trap, code=0
Stopped at      generic_bzero+0xf:      repe stosl      %es:(%edi)
db> trace
generic_bzero(f3283f80,0,f47f7000,fe5d2c90,fe5d2c98) at generic_bzero+0xf
zalloci(f3283f80,f4880100,f47f7000,6f7f9,fe540ec0) at zalloci+0x29
getnewvnode(1,f33f0400,f3266200,fe5d2cfc,100) at getnewvnode+0x2f8
ffs_vget(f33f0400,6f7f9,fe5d2d7c,ff77d700,fe5d2edc) at ffs_vget+0xa5
ufs_lookup(fe5d2dd4,fe5d2de8,f016f6d4,fe5d2dd4,fe553021) at ufs_lookup+0x936
ufs_vnoperate(fe5d2dd4,fe553021,ff77d700,fe5d2edc,0) at ufs_vnoperate+0x15
vfs_cache_lookup(fe5d2e30,fe5d2e40,f0171ae9,fe5d2e30,fe53b640) at vfs_cache_lookup+0x248
ufs_vnoperate(fe5d2e30,fe53b640,fe5d2edc,fe5d2eb8,0) at ufs_vnoperate+0x15
lookup(fe5d2eb8,fe540ec0,f0250848,fe540ec0,1) at lookup+0x2c1
namei(fe5d2eb8,fe540ec0,f0250848,0,8057000) at namei+0x133
lstat(fe540ec0,fe5d2f94,8057000,ffffffff,3) at lstat+0x44
syscall(2f,efbf002f,3,ffffffff,efbfdc70) at syscall+0x187
Xint0x80_syscall() at Xint0x80_syscall+0x4c
db> 
********************************************************************************
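
As a quick sanity check on the trap output above, the faulting instruction
pointer can be compared with the address kgdb reports for generic_bzero
further down in this message. The following is only a sketch: the constants
are copied from the DDB and kgdb output here, and the helper name
symbol_offset is made up for illustration.

```python
# Values copied from the DDB and kgdb output in this message.
INSTRUCTION_POINTER = 0xF020EC9F  # "instruction pointer = 0x8:0xf020ec9f"
GENERIC_BZERO = 0xF020EC90        # kgdb: "print generic_bzero"
DDB_OFFSET = 0xF                  # DDB: "Stopped at generic_bzero+0xf"


def symbol_offset(addr, symbol_base):
    """Return the offset of addr from symbol_base (hypothetical helper)."""
    return addr - symbol_base


offset = symbol_offset(INSTRUCTION_POINTER, GENERIC_BZERO)

# True when the DDB symbolization agrees with the kgdb symbol address.
ddb_and_kgdb_agree = offset == DDB_OFFSET
print(hex(offset), ddb_and_kgdb_agree)
```

If the two tools agree, the fault really is inside generic_bzero at the
"repe stosl" instruction, and the symbol lookup itself is not what is off.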

Kgdb gives us a bit more info:

(kgdb) bt
....snip....
#10 0xf020ec9f in generic_bzero ()
#11 0xf01f1869 in zalloci (z=0xf3283f80) at ../../vm/vm_zone.h:85
....snip....
(kgdb) frame 11
#11 0xf01f1869 in zalloci (z=0xf3283f80) at ../../vm/vm_zone.h:85
85                      return _zget(z);
(kgdb) print _zget
$6 = {void *(struct vm_zone *)} 0xf01f18d4 <_zget>
(kgdb) print generic_bzero
$7 = {<text variable, no debug info>} 0xf020ec90 <generic_bzero>

So, it doesn't look like a 1-bit off error....
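
For reference, the "off by one bit" idea can be checked mechanically by
XOR-ing two addresses and counting the set bits. This is a minimal sketch
using the two addresses kgdb printed above; the function names are
hypothetical, and the same check could be repeated for simple_lock if its
address can be found some other way.

```python
def differing_bits(a, b):
    """Return the bit positions at which addresses a and b differ."""
    x = a ^ b
    return [i for i in range(x.bit_length()) if (x >> i) & 1]


def is_single_bit_flip(a, b):
    """True when a and b differ in exactly one bit."""
    return len(differing_bits(a, b)) == 1


# Addresses copied from the kgdb session in this message.
ZGET = 0xF01F18D4           # print _zget
GENERIC_BZERO = 0xF020EC90  # print generic_bzero

bits = differing_bits(ZGET, GENERIC_BZERO)
print(hex(ZGET ^ GENERIC_BZERO), len(bits), is_single_bit_flip(ZGET, GENERIC_BZERO))
```

A single flipped bit in a call target would give exactly one differing
position; these two addresses differ in many more than that.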

Because zget/zalloc are inline functions, I can't seem to get gdb to
print simple_lock or simple_unlock.

-Chris


Greg Rowe <greg@uswest.net>   US WEST -  Internet Service Operations


