Message-Id: <200304200816.h3K8GGXB020865@gw.catspoiler.org>
Date: Sun, 20 Apr 2003 01:16:16 -0700 (PDT)
From: Don Lewis
To: mitya@cavia.pp.ru
cc: hackers@FreeBSD.org
In-Reply-To: <20030420073047.GA64397@fling-wing.demos.su>
Subject: Re: Repeated similar panics on -STABLE

On 20 Apr, Dmitry Sivachenko wrote:
> On Sat, Apr 19, 2003 at 03:15:20PM -0700, Don Lewis wrote:
>> Dmitry,
>>
>> If you still have the core and kernel files, I'd appreciate it if you
>> could point gdb at them and print the following stuff from the malloc()
>> stack frame.
>>
>>	indx
>>	&bucket[0]
>>	kbp
>>	*kpb
>>	allocsize
>>	npg
>>	cp
>>	freep
>>	*freep
>>	va
>>
>
> In backtrace I posted malloc was called twice.  Since I am not sure
> which one you are interested in, I am sending these values for both.

The second one, which is where the first trap occurred.
> (kgdb) up 6
> #6  0xc015daff in malloc (size=128, type=0xc02aca60, flags=2)
>     at /mnt/se3/releng_4/src/sys/kern/kern_malloc.c:243
> 243             va = kbp->kb_next;
> (kgdb) p indx
> $1 = 7
> (kgdb) p &bucket[0]
> $2 = (struct kmembuckets *) 0xc02bcee0
> (kgdb) p kbp
> $3 = (struct kmembuckets *) 0x5cdd8000
> (kgdb) p *kpb
> No symbol "kpb" in current context.
> (kgdb) p *kbp
> Cannot access memory at address 0x5cdd8000.
> (kgdb) p allocsize
> $4 = -730301488
> (kgdb) p npg
> $5 = 0
> (kgdb) p cp
> $6 = 0x0
> (kgdb) p freep
> $7 = (struct freelist *) 0x0
> (kgdb) p va
> $8 = 0x5cdd8000
> (kgdb)
>
>
> (kgdb) up 16
> #22 0xc015daff in malloc (size=72, type=0xc029fee0, flags=0)
>     at /mnt/se3/releng_4/src/sys/kern/kern_malloc.c:243
> 243             va = kbp->kb_next;
> (kgdb) p indx
> $9 = 7
> (kgdb) p &bucket[0]
> $10 = (struct kmembuckets *) 0xc02bcee0
> (kgdb) p kbp
> $11 = (struct kmembuckets *) 0x5cdd8000

That's not good ... if indx is 7, and sizeof(struct kmembuckets) is 36, I
believe that
	kbp = &bucket[indx];
should assign 0xc02bcfdc to kbp.  It looks like kbp might be getting
stomped on somehow.

> (kgdb) p *kbp
> Cannot access memory at address 0x5cdd8000.

Yeah, kbp is pointing at a non-existent memory location.

> (kgdb) p allocsize
> $12 = -349729108
> (kgdb) p npg
> $13 = 0

Since allocsize is garbage that doesn't show any relationship to size,
and since npg isn't consistent with either size or allocsize, it looks
to me like kbp->kb_next is initially non-NULL, so we're not executing
the "if" block.

	if (kbp->kb_next == NULL) {
		kbp->kb_last = NULL;
		if (size > MAXALLOCSAVE)
			allocsize = roundup(size, PAGE_SIZE);
		else
			allocsize = 1 << indx;
		npg = btoc(allocsize);
		va = (caddr_t) kmem_malloc(kmem_map, (vm_size_t)ctob(npg), flags);

The other possibility is that these are also getting smashed somehow.

> (kgdb) p cp
> $14 = 0x0
> (kgdb) p freep
> $15 = (struct freelist *) 0x0

If we don't get into the "if" block, these will contain leftover garbage
from the stack.

> (kgdb) p va
> $16 = 0x5cdd8000
Interesting ... how does va get the same value as kbp?  Is it coming
from the
	va = (caddr_t) kmem_malloc()
assignment in the "if" block that we don't think we're executing, or the
	va = kbp->kb_next;
assignment that is causing the panic?  If kbp is pointing to a
non-existent page, why does Terry's patch seem to fix the problem for
you?

I wonder if things are getting further munged after the trap occurs?
That would make it more difficult to track down the problem from the
core file.

Something else of interest to print is
	bucket[7]
bucket[7].kb_next and bucket[7].kb_last might shed some light.

Something interesting is that the fault address and the values of va
and kbp are the same in both stack frames ...

One other question ... is your kernel compiled with INVARIANTS?  That
changes the definition of struct freelist.