Date:      Sun, 20 Apr 2003 13:36:28 +0400
From:      Dmitry Sivachenko <demon@FreeBSD.org>
To:        Don Lewis <truckman@FreeBSD.org>
Cc:        hackers@FreeBSD.org
Subject:   Re: Repeated similar panics on -STABLE
Message-ID:  <20030420093628.GA76333@fling-wing.demos.su>
In-Reply-To: <200304200816.h3K8GGXB020865@gw.catspoiler.org>
References:  <20030420073047.GA64397@fling-wing.demos.su> <200304200816.h3K8GGXB020865@gw.catspoiler.org>

On Sun, Apr 20, 2003 at 01:16:16AM -0700, Don Lewis wrote:
> On 20 Apr, Dmitry Sivachenko wrote:
> > On Sat, Apr 19, 2003 at 03:15:20PM -0700, Don Lewis wrote:
<snip>

> If we don't get into the "if" block, these will contain leftover
> garbage from the stack.
> 
> > (kgdb) p va
> > $16 = 0x5cdd8000 <Address 0x5cdd8000 out of bounds>
> 
> Interesting ... how does va get the same value as kbp?  Is it coming
> from the
> 	va = (caddr_t) kmem_malloc()
> assignment in the "if" block that we don't think we're executing, or the
> 	va = kbp->kb_next;
> assignment that is causing the panic?
> 
> If kbp is pointing to a non-existent page, why does Terry's patch seem
> to fix the problem for you?

Well, there is probably a misunderstanding here.
We did NOT apply Terry's patch.  Let me quote a bit from my e-mail to Terry:

TL> Did my patch fix your problem?
TL>
TL> Or did you tune your kernel, as I suggested, to fix your problem?
TL>
TL> Or is it still a problem?

DS>We changed maxusers from 512 to 0 and decreased the number of
DS>NMBCLUSTERS.  Now everything is working fine, but since these panics occurred
DS>about once a week I can't say for sure they are completely gone.
DS>Let's wait at least one more week...

What I meant to say is that we only tuned maxusers and NMBCLUSTERS.  We
run an unmodified -STABLE kernel without any patches.  Probably my English
leaves much to be desired ;-((


> 
> I wonder if things are getting further munged after the trap occurs?
> That would make it more difficult to track down the problem from the
> core file.
> 
> Something else of interest to print is
> 	bucket[7]
> bucket[7].kb_next and bucket[7].kb_last might shed some light.
> 

(kgdb) up 22
#22 0xc015daff in malloc (size=72, type=0xc029fee0, flags=0)
    at /mnt/se3/releng_4/src/sys/kern/kern_malloc.c:243
243             va = kbp->kb_next;
(kgdb) p bucket[7]
$1 = {kb_next = 0x5cdd8000 <Address 0x5cdd8000 out of bounds>,
  kb_last = 0xc8fcb000 "", kb_calls = 2127276, kb_total = 4256,
  kb_elmpercl = 32, kb_totalfree = 1264, kb_highwat = 160, kb_couldfree = 5497}
(kgdb) p bucket[7].kb_next
$2 = 0x5cdd8000 <Address 0x5cdd8000 out of bounds>
(kgdb) p bucket[7].kb_last
$3 = 0xc8fcb000 ""
(kgdb)

> 
> Something interesting is that the fault address, and the values of va
> and kbp are the same in both stack frames ...
> 
> 
> One other question ... is your kernel compiled with INVARIANTS?  That
> changes the definition of struct freelist.

Without.


PS: If you need additional information, feel free to ask.
I am more than willing to help.


