Date: Sun, 26 Dec 2004 19:55:17 +0100
From: Peter Holm
To: Bosko Milekic
cc: current@freebsd.org
Subject: Re: panic: uma_zone_slab is looping
Message-ID: <20041226185517.GB76499@peter.osted.lan>
In-Reply-To: <20041226181738.GA21533@technokratis.com>
List-Id: Discussions about the use of FreeBSD-current

On Sun, Dec 26, 2004 at 01:17:38PM -0500, Bosko Milekic wrote:
>
> On Sun, Dec 26, 2004 at 05:11:53PM +0100, Peter Holm wrote:
> >
> > Yes, I think that I have verified your excellent analysis of the
> > problem: http://www.holm.cc/stress/log/freeze04.html
> >
> > So, do you have any fix suggestions? :-)
>
> Not yet, because the problem is non-obvious from the trace.
>
> I need to know exactly when the UMA RCntSlabs zone recurses _first_,
> and I need to confirm that it is an actual recursion. I've looked at
> the VM code and I don't see how/why recursion on the RCntSlabs zone
> would happen.
>
> Please modify the printf code to look exactly like this:
>
> if (keg->uk_flags & UMA_ZFLAG_INTERNAL && keg->uk_recurse != 0) {
>         if ((zone == slabzone) || (zone == slabrefzone))
>                 panic("Zone %s forced to fail due to recurse non-null: %d\n",
>                     zone->uz_name, keg->uk_recurse);
>         return (NULL);
> }
>

OK, I'll apply your latest patch and hope for a fast panic (I've been
running for "1+02:20:48" without any problems).

- Peter

> (You don't need to check any global counter -- the counter is imperfect
> anyway -- because even a single recursion on slabzone or slabrefzone
> should be illegal).
>
> I'd like to see the trace from the above panic, if possible.
>
> Also, from your current crash dump, see if you can print the value of
> keg->uk_recurse (from frame 11, pid 74804).
>

How do I do that?

> It appears that the other KASSERT being triggered from
> propagate_priority() is due to some weird side-effect of process
> 74804 looping with the UMA RCntSlabs zone lock held (without it
> ever being dropped). We'll have to see.
>
> The point is: the trace is useless unless it shows where/when the
> recursion on slabrefzone _begins_ to happen (not that it has already
> happened, that part is obvious now).
>
> Happy holidays,
> --
> Bosko Milekic
> bmilekic@technokratis.com
> bmilekic@FreeBSD.org

-- 
Peter Holm
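
For anyone following the thread, the decision made by the quoted check can
be modelled in user space. This is only a rough sketch, not the kernel code:
the struct layouts, the value of UMA_ZFLAG_INTERNAL, the uz_keg field, the
enum, and the function name recurse_guard() are all assumptions made up for
illustration. Only the field names uk_flags, uk_recurse and uz_name and the
zone names slabzone and slabrefzone come from the patch above. Where the real
patch calls panic(), the sketch just prints the message and returns a code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed value, for illustration only; not taken from the kernel headers. */
#define UMA_ZFLAG_INTERNAL 0x1

/* Hypothetical stand-ins that model only the fields named in the patch. */
struct keg  { unsigned uk_flags; int uk_recurse; };
struct zone { const char *uz_name; struct keg *uz_keg; };

/* The two zones for which even a single recursion is treated as illegal. */
static struct zone *slabzone, *slabrefzone;

enum guard_result { GUARD_PROCEED, GUARD_FAIL, GUARD_PANIC };

/*
 * Mirrors the quoted condition: an internal keg that is already being
 * recursed on fails the allocation; if the zone is slabzone or slabrefzone
 * the kernel patch would panic instead, so the first recursion is caught.
 */
static enum guard_result
recurse_guard(const struct zone *zone)
{
	const struct keg *keg = zone->uz_keg;

	if ((keg->uk_flags & UMA_ZFLAG_INTERNAL) && keg->uk_recurse != 0) {
		if (zone == slabzone || zone == slabrefzone) {
			fprintf(stderr,
			    "Zone %s forced to fail due to recurse non-null: %d\n",
			    zone->uz_name, keg->uk_recurse);
			return (GUARD_PANIC);	/* panic() in the real patch */
		}
		return (GUARD_FAIL);		/* return (NULL) in the patch */
	}
	return (GUARD_PROCEED);
}
```

The point of stopping at the first non-zero uk_recurse on those two zones is
that the resulting trace shows where the recursion begins, rather than the
state after it has already looped.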