From: John Baldwin
To: "Brad L. Chisholm"
Cc: freebsd-hackers@freebsd.org
Date: Wed, 10 Jan 2007 22:11:38 -0500
Subject: Re: Kernel hang on 6.x
Message-Id: <200701102211.39412.jhb@freebsd.org>
In-Reply-To: <20070111001534.GA319@bsdone.bsdwins.com>

On Wednesday 10 January 2007 19:15, Brad L. Chisholm wrote:
> On Wed, Jan 10, 2007 at 05:53:24PM -0500, John Baldwin wrote:
> > On Wednesday 10 January 2007 16:52, Brad L. Chisholm wrote:
> > >
> > > I work with Brian, and have been helping him analyze this problem. We have
> > > been able to generate kernel dumps, and have also done some additional
> > > analysis under ddb. Here is a summary of our analysis so far. Suggestions
> > > as to how to proceed from here are most welcome.
> >
> > How much swap do you have? You might have run out of buckets in the
> > swap_zone before you ran out of swap space, in which case the kernel
> > deadlocks rather than killing the hog like it does when it runs out of
> > swap space. I added a printf to catch this on HEAD recently that will
> > be MFC'd soonish. You can try bumping up kern.maxswzone (loader tunable).
> >
>
> It has a 32GB swap partition. We have also run it configured with an
> additional 32GB swap file, for a total of 64GB. Changing the amount of
> swap did not seem to affect the hang. However, as I mentioned in my
> previous post, the hang appears to always occur when ~14GB of swap have
> been consumed, regardless of the amount of swap or physmem configured.
> This does make it sound like a limit (such as swap_zone buckets) has
> been reached.
>
> I notice the following in the vm.zone output captured just prior to
> a hang. Does this value correspond to the swap_zone you were referring
> to? This looks like a limit may have been reached.
>
> SWAPMETA: 288, 116519, 116519, 0, 116543

Yep, that's exactly the issue you are hitting.

> I don't seem to be able to query kern.maxswzone on our 6.2-BETA2 image:
>
> # sysctl kern.maxswzone
> sysctl: unknown oid 'kern.maxswzone'
>
> Is it available in 6.x, or is it something newer?

It's only a tunable, not available as a sysctl.
You can figure out the current size from the vm.zone output above, then
do some math to figure out a good guess to use based on how much swap it
had in use when it locked up. For example, right now you have 116519
objects of size 288, so 33557472 bytes allocated. You said you die when
14 GB out of 64 total is used, so you should probably try taking that
value and multiplying it by 64 / 14. That gives a result of 153405586.
However, you really want to round this up to a multiple of 288 (because
the kernel rounds it down to a multiple of 288), so I'd use a value of at
least 153405792. And yes, that means you are setting aside a little over
146 MB of wired, physical RAM just to hold metadata for your swap. :)

-- 
John Baldwin
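
The sizing arithmetic in the message can be sketched as a few lines of
Python. This is only an illustration of the calculation described above,
not anything taken from the kernel; the function and parameter names are
made up for the example, and the inputs are the SWAPMETA numbers and swap
figures quoted in the thread.

```python
def suggest_maxswzone(zone_objects, object_size, swap_total_gb, swap_used_at_hang_gb):
    """Estimate a kern.maxswzone value (in bytes) from one observed hang.

    Hypothetical helper: scales the number of swap metadata objects in use
    at the hang by (total swap / swap consumed at the hang), then rounds up
    to a whole number of objects so the result is a multiple of object_size.
    """
    # Ceiling division in integers, so no floating-point rounding is involved.
    scaled = zone_objects * swap_total_gb
    objects_needed = -(-scaled // swap_used_at_hang_gb)
    return objects_needed * object_size


if __name__ == "__main__":
    # 116519 objects of 288 bytes were in use when ~14 GB of 64 GB was consumed.
    current_bytes = 116519 * 288
    suggested = suggest_maxswzone(116519, 288, 64, 14)
    print("currently allocated:", current_bytes)
    print("suggested kern.maxswzone:", suggested)  # 153405792
```

Since kern.maxswzone is a loader tunable, the resulting number would go in
/boot/loader.conf (for example kern.maxswzone="153405792") and take effect
on the next boot. Treat it as a starting guess to refine, as the message
says.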