Message-ID: <3EA1E0C4.F097ADBA@mindspring.com>
Date: Sat, 19 Apr 2003 16:50:28 -0700
From: Terry Lambert
To: Don Lewis
Cc: hackers@FreeBSD.org
Subject: Re: Repeated similar panics on -STABLE

Don Lewis wrote:
> > Take an interrupt somewhere around here, and have the available
> > entries removed from the freelist by an interrupt level driver.
> >
> > Or take a page fault, and have the same thing happen with
> > page-related metadata coming from the freelist in question.
>
> How can an interrupt or another process touch the freelist while we're
> protected by splmem()?  If that were possible, the block could be stolen
> out from under us in the code below between the assignment to va and the
> update of kbb->kb_next, allocating the same block of memory to two
> different consumers.

Personally, I think it's a page fault.

In any case, the stack traces were posted about 02 Apr 2003, and
the patch fixes the problem empirically, so we can argue about
why, or we can fix the problem for everyone.

> > 		if (cp <= va)
> > 			break;
> > 		cp -= allocsize;
> >
> > ?  The "<=" saves you.
>
> It only works because allocsize evenly divides npg*PAGE_SIZE.

Yes.

> If there was a heavy consumer of 129 byte blocks, someone might get
> the bright idea to allocate a special bucket for them because a lot
> more 129 byte blocks fit in a page than 256 byte blocks.  As the
> "for" loop iterated, we'd get to the point where va < cp < va +
> allocsize.  The "<=" test would pass, we'd decrement cp, causing it
> to be less than va, do the
> 	freep->next = cp;
> assignment, return to the top of the "for" loop, do the
> 	freep = (struct freelist *)cp;
> assignment, so that freep now points outside the block of memory
> allocated by kmem_malloc().  Now the "<=" test will kick us out of the
> loop and we'll do the
> 	freep->next = savedlist;
> assignment and stomping on someone else's memory.

Yes.  8-).

I did exactly this, in fact, at Clickarray, though I rounded to
an 8 byte alignment boundary.  The way I did it was to figure
out the minimal number of pages to allocate at one time that
held a whole number of structures, with nothing left over.  This
actually saved a hell of a lot of RAM.

This is the method I described in my first response to you.  8-).

> A safer, but slightly
> more expensive test would be
> 	cp < va + allocsize

This is really painful, actually.  You can lose a full object
per page doing this.  It also makes the coalesce logic almost
incomprehensible (at least in the implementation I tried).

-- Terry
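
A rough sketch of the two points argued above, for illustration only:
how the downward freelist walk ends up below va when allocsize does not
evenly divide the allocation, and how a minimal page count can be picked
so that it does.  This is not the kern_malloc.c code or the Clickarray
code.  The helper names (round_up8, min_pages, lowest_entry_offset) are
hypothetical, addresses are modeled as byte offsets from va, and a
4096-byte page is assumed in the example numbers.

```c
#include <assert.h>

/* Round an object size up to an 8 byte alignment boundary. */
static unsigned long
round_up8(unsigned long size)
{
	return (size + 7UL) & ~7UL;
}

/* Greatest common divisor (Euclid). */
static unsigned long
gcd_ul(unsigned long a, unsigned long b)
{
	while (b != 0) {
		unsigned long t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/*
 * Smallest page count npg such that npg * page_size is an exact
 * multiple of objsize, i.e. lcm(objsize, page_size) / page_size.
 */
static unsigned long
min_pages(unsigned long objsize, unsigned long page_size)
{
	assert(objsize > 0 && page_size > 0);
	return objsize / gcd_ul(objsize, page_size);
}

/*
 * Model of the quoted loop using offsets relative to va (va == 0).
 * Returns the offset of the last freelist entry the loop would write.
 * A negative result means the entry lies below va, outside the block.
 */
static long
lowest_entry_offset(long total, long allocsize)
{
	long cp = total - allocsize;

	for (;;) {
		/* The real loop sets freep = (struct freelist *)cp here. */
		if (cp <= 0)
			break;
		cp -= allocsize;
	}
	return cp;
}
```

With these assumptions, lowest_entry_offset(4096, 129) comes out to -32,
which is the overrun described in the quoted text, while a 256 byte
bucket ends exactly at 0.  Rounding 129 up gives 136, min_pages(136,
4096) gives 17 pages holding exactly 512 objects, and the walk over
17 * 4096 bytes again ends at 0.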