Date:      Sat, 19 Apr 2003 16:50:28 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Don Lewis <truckman@FreeBSD.org>
Cc:        hackers@FreeBSD.org
Subject:   Re: Repeated similar panics on -STABLE
Message-ID:  <3EA1E0C4.F097ADBA@mindspring.com>
References:  <200304192156.h3JLuDXB019980@gw.catspoiler.org>

Don Lewis wrote:
> > Take an interrupt somewhere around here, and have the available
> > entries removed from the freelist by an interrupt level driver.
> >
> > Or take a page fault, and have the same thing happen with
> > page-related metadata coming from the freelist in question.
> 
> How can an interrupt or another process touch the freelist while we're
> protected by splmem()?  If that were possible, the block could be stolen
> out from under us in the code below between the assignment to va and the
> update of kbb->kb_next, allocating the same block of memory to two
> different consumers.

Personally, I think it's a page fault.

In any case, the stack traces were posted about 02 Apr 2003, and
the patch fixes the problem empirically, so we can argue about why,
or we can fix the problem for everyone.


> >                         if (cp <= va)
> >                                 break;
> >                         cp -= allocsize;
> >
> > ?  The "<=" saves you.
> 
> It only works because allocsize evenly divides npg*PAGE_SIZE.

Yes.

> If there was a heavy consumer of 129 byte blocks, someone might get
> the bright idea to allocate a special bucket for them because a lot
> more 129 byte blocks fit in a page than 256 byte blocks.  As the
> "for" loop iterated, we'd get to the point where va < cp < va +
> allocsize.  The "<=" test would pass, we'd decrement cp, causing it
> to be less than va, do the
>         freep->next = cp;
> assignment, return to the top of the "for" loop, do the
>         freep = (struct freelist *)cp;
> assignment, so that freep now points outside the block of memory
> allocated by kmem_malloc().  Now the "<=" test will kick us out of the
> loop and we'll do the
>         freep->next = savedlist;
> assignment, stomping on someone else's memory.
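
The overrun described in the quoted text can be reproduced in user space
with plain offsets instead of pointers. This is only a minimal sketch, not
the kern_malloc.c code: the function name last_block_offset, the use of
offsets relative to va, and the 4096-byte region are all assumptions made
for illustration.

```c
#include <assert.h>

/*
 * Walk a region of `total` bytes from the top down in steps of
 * `allocsize`, using the "cp <= va" exit test from the quoted loop.
 * Offsets are relative to va, so va itself is offset 0.
 *
 * Returns the offset of the last freelist header the loop would write.
 * A negative value means the header landed below the start of the region.
 */
static long last_block_offset(long total, long allocsize)
{
    long cp;

    assert(allocsize > 0 && total >= allocsize);
    cp = total - allocsize;

    for (;;) {
        /* kernel: freep = (struct freelist *)cp; */
        if (cp <= 0)            /* kernel: if (cp <= va) break; */
            break;
        cp -= allocsize;
        /* kernel: freep->next = cp; */
    }
    return cp;                  /* kernel: freep->next = savedlist; lands here */
}
```

With these assumptions, last_block_offset(4096, 256) returns 0, since 256
divides 4096 evenly, while last_block_offset(4096, 129) returns -32: the
walk reaches offset 97, passes the "<=" test, and steps 32 bytes below va.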

Yes.  8-).  I did exactly this, in fact, at Clickarray, though I
rounded to an 8 byte alignment boundary.  The way I did it was to
figure out the minimal number of pages to allocate at one time that
resulted in a whole number of structures, with nothing left over.

This actually saved a hell of a lot of RAM.  This is the method I
described in my first response to you.  8-).
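
One way to compute such a page count is with a greatest common divisor.
The sketch below is an assumption about how this could be done, not the
code used at Clickarray: round_up8, gcd, min_pages, and the 4096-byte
PAGE_SIZE are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u         /* assumed page size */

/* Round n up to the next multiple of 8 bytes. */
static size_t round_up8(size_t n)
{
    return (n + 7u) & ~(size_t)7u;
}

/* Euclid's algorithm. */
static size_t gcd(size_t a, size_t b)
{
    while (b != 0) {
        size_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/*
 * Smallest npg such that npg * PAGE_SIZE is an exact multiple of
 * allocsize, so the "cp <= va" walk ends exactly at va.
 */
static size_t min_pages(size_t allocsize)
{
    assert(allocsize > 0);
    return allocsize / gcd(allocsize, PAGE_SIZE);
}
```

Under these assumptions a 129-byte object rounds up to 136 bytes, and
min_pages(136) is 17: 17 pages hold exactly 512 objects of 136 bytes,
where the same 17 pages would hold 272 objects from a 256-byte bucket.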

> A safer, but slightly
> more expensive test would be
>         cp < va + allocsize
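
The effect of the proposed test can be checked with the same kind of
offset walk. Again this is a hedged sketch rather than kernel code: the
name last_block_offset_safe and the 4096-byte region are assumptions.

```c
#include <assert.h>

/*
 * Same top-down walk over a region of `total` bytes, but exiting on
 * "cp < va + allocsize" instead of "cp <= va". Offsets are relative
 * to va, so va is offset 0.
 *
 * Returns the offset of the last freelist header the loop would write.
 */
static long last_block_offset_safe(long total, long allocsize)
{
    long cp;

    assert(allocsize > 0 && total >= allocsize);
    cp = total - allocsize;

    for (;;) {
        if (cp < allocsize)     /* kernel form: cp < va + allocsize */
            break;
        cp -= allocsize;
    }
    return cp;
}
```

In this sketch last_block_offset_safe(4096, 129) returns 97, so the last
header stays inside the region and the 97 bytes below it go unused, and
last_block_offset_safe(4096, 256) still returns 0.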

This is really painful, actually.  You can lose a full object per
page doing this.  It also makes the coalesce logic almost
incomprehensible (at least in the implementation I tried).

-- Terry


