Date: Sun, 20 Apr 2008 10:43:20 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: Chris Pratt <eagletree@hughes.net>
Cc: gnn@freebsd.org, d@delphij.net, net@freebsd.org
Subject: Re: zonelimit issues...
Message-ID: <20080420103258.D67663@fledge.watson.org>
In-Reply-To: <382258DB-13B8-4108-B8F4-157F247A7E4B@hughes.net>
References: <m2hcdztsx2.wl%gnn@neville-neil.com> <48087C98.8060600@delphij.net> <382258DB-13B8-4108-B8F4-157F247A7E4B@hughes.net>
On Fri, 18 Apr 2008, Chris Pratt wrote:

> Doesn't 7.0 fix this? I'd like to see an official definitive answer and all
> I've been going on is that the problem description is no longer in the
> errata.

Unfortunately, bugs of this sort don't really "work" that way -- specific
bugs are a property of a problem in code (or a problem in design), but what
we have right now is a report of a symptom that might reflect zero or more
specific bugs. It's unclear that the problem described in the errata is the
problem you've been experiencing, or that the (at least one) fixed bug with
the same symptoms is the one you've been experiencing. For better or worse,
the only way to really tell if a generic class of hang or wedging is fixed
is to try out the new version and see.

In most cases, "zonelimit" wedging reflects one of two things:

(1) Inadequate resource allocation to the network stack or some other
    component; try tuning up the memory tunable for clusters (for example).

(2) A memory leak in a network device driver or other network part, which
    needs to be debugged and fixed.

On at least one prior occasion, there has been a bug in UMA itself that led
to getting stuck in zonelimit, and it's not impossible there's a scheduler
sleep/wakeup bug that would lead to a similar symptom but for a different
reason.

In FreeBSD 7-STABLE, you can now use procstat -k to print kernel stack
traces of user threads blocked in kernel, which may make diagnosing the
general class of problem a bit easier without using a kernel debugger.
"zonelimit" is the generic wait channel across all memory types and
allocation paths, so it doesn't reveal a lot about *which* limit is being
hit. Using a kernel stack trace, we can see which specific memory type and
allocation context is involved.

Robert N M Watson
Computer Laboratory
University of Cambridge
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20080420103258.D67663>