Date: Sat, 19 Apr 2003 14:56:13 -0700 (PDT) From: Don Lewis <truckman@FreeBSD.org> To: tlambert2@mindspring.com Cc: hackers@FreeBSD.org Subject: Re: Repeated similar panics on -STABLE Message-ID: <200304192156.h3JLuDXB019980@gw.catspoiler.org> In-Reply-To: <3EA19E3C.661C9A28@mindspring.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On 19 Apr, Terry Lambert wrote: > Don Lewis wrote: >> Ok, here's a Reader's Digest version of the code in question: >> >> if (kbp->kb_next == NULL) { >> kbp->kb_last = NULL; >> if (size > MAXALLOCSAVE) >> allocsize = roundup(size, PAGE_SIZE); >> else >> allocsize = 1 << indx; >> npg = btoc(allocsize); >> va = (caddr_t) kmem_malloc(kmem_map, (vm_size_t)ctob(npg), flags >> ); >> /* >> * Just in case we blocked while allocating memory, >> * and someone else also allocated memory for this >> * bucket, don't assume the list is still empty. >> */ >> savedlist = kbp->kb_next; >> kbp->kb_next = cp = va + (npg * PAGE_SIZE) - allocsize; >> for (;;) { >> freep = (struct freelist *)cp; >> if (cp <= va) >> break; >> cp -= allocsize; >> freep->next = cp; >> } >> freep->next = savedlist; > > Take an interrupt somewhere around here, and have the available > entries removed from the freelist by an interrupt level driver. > > Or take a page fault, and have the same thing happen with > page-related metadata coming from the freelist in question. How can an interrupt or another process touch the freelist while we're protected by splmem()? If that were possible, the block could be stolen out from under us in the code below between the assignment to va and the update of kbb->kb_next, allocating the same block of memory to two different consumers. >> if (kbp->kb_last == NULL) >> kbp->kb_last = (caddr_t)freep; >> } >> va = kbp->kb_next; >> kbp->kb_next = ((struct freelist *)va)->next; > > [ ... ] > >> This code bothers me, though, because if allocsize ever happened to not >> evenly divide (npg * PAGE_SIZE), the loop termination condition would be >> wrong and the end of the previous page (or more) would end up on the >> free list. If va were small enough and allocsize were big enough, it >> would even be possible to wrap around. > > That actually can't happen. Basically, the allocators "waste" > the ends of pages. See the: > > if (cp <= va) > break; > cp -= allocsize; > > ? The "<= saves you. It only works because allocsize evenly divides npg*PAGE_SIZE. If there was a heavy consumer of 129 byte blocks, someone might get the bright idea to allocate a special bucket for them because a lot more 129 byte blocks fit in a page than 256 byte blocks. As the "for" loop iterated, we'd get to the point where va < cp < va + allocsize. The "<=" test would pass, we'd decrement cp, causing it to be less than va, do the freep->next = cp; assignment, return to the top of the "for" loop, do the freep = (struct freelist *)cp; assignment, so that freep now points outside the block of memory allocated by kmem_malloc(). Now the "<=" test will kick us out of the loop and we'll do the freep->next = savedlist; assignment and stomping on someone else's memory. A safer, but slightly more expensive test would be cp < va + allocsize
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200304192156.h3JLuDXB019980>