From: John Baldwin
To: Peter Holm
Cc: freebsd-current@FreeBSD.org, bmilekic@FreeBSD.org, jeffr@FreeBSD.org
Date: Fri, 19 Nov 2004 17:10:19 -0500
Subject: Re: Freeze
Message-Id: <200411191710.19215.jhb@FreeBSD.org>
In-Reply-To: <20041119075924.GA22320@peter.osted.lan>

On Friday 19 November 2004 02:59 am, Peter Holm wrote:
> On Mon, Nov 15, 2004 at 03:46:15PM -0500, John Baldwin wrote:
> > On Friday 12 November 2004 07:33 am, Peter Holm wrote:
> > > GENERIC HEAD from Nov 11 08:05 UTC
> > >
> > > The following stack traces etc. was done before my first
> > > cup of coffee, so it's not so informative as it could have been :-(
> > >
> > > The test box appeared to have been frozen for more than 6 hours,
> > > but was pingable.
> > >
> > > http://www.holm.cc/stress/log/cons86.html
> >
> > A weak guess is that you have the system in some sort of livelock due to
> > fork()?  Have you tried running with 'debug.mpsafevm=1' set from the
> > loader?
> >
> > --
> > John Baldwin <>< http://www.FreeBSD.org/~jhb/
> > "Power Users Use the Power to Serve" = http://www.FreeBSD.org
>
> OK, I've got some more info:
>
> http://www.holm.cc/stress/log/cons88.html
>
> Looks like a spin in uma_zone_slab() when slab_zalloc() fails?

Yes, I think if you specify M_WAITOK, then that might happen.  slab_zalloc()
can fail if any of the init functions fail for example, in which case it
would loop forever.  You can try this hack (though it may very well be wrong)
to return failure if that is what is triggering:

Index: uma_core.c
===================================================================
RCS file: /usr/cvs/src/sys/vm/uma_core.c,v
retrieving revision 1.110
diff -u -r1.110 uma_core.c
--- uma_core.c	6 Nov 2004 11:43:30 -0000	1.110
+++ uma_core.c	19 Nov 2004 22:08:26 -0000
@@ -1998,6 +1998,10 @@
 		 */
 		if (flags & M_NOWAIT)
 			flags |= M_NOVM;
+
+		/* XXXHACK */
+		if (flags & M_WAITOK)
+			break;
 	}
 	return (slab);
 }

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
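
For readers following the thread, the suspected spin can be illustrated with a
small user-space sketch in C.  This is not the code from uma_core.c: the flag
values, fake_slab_zalloc(), zone_slab_sketch() and SPIN_LIMIT are all
hypothetical stand-ins, and the allocator is assumed to fail on every call (as
it might if an init function keeps failing).  The only parts taken from the
message are the shape of the retry loop visible in the diff context and the
XXXHACK exit.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values only; not copied from the kernel headers. */
#define M_NOWAIT   0x0001
#define M_WAITOK   0x0002
#define M_NOVM     0x0200

/* Guard so the sketch terminates where the real loop would spin. */
#define SPIN_LIMIT 1000

/* Hypothetical stand-in for slab_zalloc(): assumed to fail every time. */
static void *
fake_slab_zalloc(int flags)
{
	(void)flags;
	return (NULL);
}

/*
 * Simplified model of the retry loop.  Returns the number of loop
 * iterations executed and stores the resulting slab pointer in *out.
 * with_hack != 0 enables the XXXHACK exit from the patch.
 */
static int
zone_slab_sketch(int flags, int with_hack, void **out)
{
	void *slab = NULL;
	int iters = 0;

	assert(out != NULL);
	for (;;) {
		if (iters >= SPIN_LIMIT)	/* would spin forever here */
			break;
		iters++;

		if (flags & M_NOVM)		/* told not to ask again */
			break;

		slab = fake_slab_zalloc(flags);
		if (slab != NULL)
			break;

		/* Non-sleeping callers give up on the next pass. */
		if (flags & M_NOWAIT)
			flags |= M_NOVM;

		/* XXXHACK from the patch: let M_WAITOK callers fail too. */
		if (with_hack && (flags & M_WAITOK))
			break;
	}
	*out = slab;
	return (iters);
}
```

Under these assumptions, a call with M_NOWAIT leaves the loop on its second
pass, a call with M_WAITOK and no hack runs until the SPIN_LIMIT guard stops
it, and a call with M_WAITOK and the hack enabled returns NULL after a single
pass.  Whether callers that pass M_WAITOK can cope with a NULL return is a
separate question that the sketch does not address.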