From: gnn@FreeBSD.org
Date: Mon, 21 Apr 2008 16:46:00 +0900
To: Robert Watson
Cc: net@FreeBSD.org
Subject: Re: zonelimit issues...
In-Reply-To: <20080420102827.U67663@fledge.watson.org>
References: <20080420102827.U67663@fledge.watson.org>

At Sun, 20 Apr 2008 10:32:25 +0100 (BST),
rwatson wrote:
>
>
> On Fri, 18 Apr 2008, gnn@freebsd.org wrote:
>
> > I am wondering why this patch was never committed?
> >
> > http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
> >
> > It does seem to address an issue I'm seeing where processes get into the
> > zonelimit state through the use of mbufs (a high speed UDP packet receiver)
> > but even after network pressure is reduced/removed the process never gets
> > out of that state again.  Applying the patch fixed the issue, but I'd like
> > to have some discussion as to the general merits of the approach.
> >
> > Unfortunately the test that currently causes this is tied very tightly to
> > code at work that I can't share, but I will hopefully be improving mctest to
> > try to exhibit this behavior.
>
> When you take all load off the system, do mbufs and clusters get properly
> freed back to UMA (as visible in netstat -m)?  If not, continuing to bump up
> against the zonelimit would suggest an mbuf/cluster leak, in which case we
> need to track that bug.
>

This is unclear as the process that creates the issue opens 50 UDP
multicast sockets with very large socket buffers.  I am investigating
this aspect some more.

> You might consider adding a debugging-only zonelimit waiter count to
> the UMA zone, and checks/assertions that a wakeup is being generated
> properly.

Yes.  Do you have an example I can easily steal?

> That is, to confirm that the wakeup is generated when memory is
> freed up if there are threads waiting.  There is at least one as-yet
> MFC'd fix to the sleep/wakeup code, I believe, that might be
> relevant here.  Is the problem you're reporting on 7.x, or on 8.x?
> If 8.x, that's probably not it, but if 7.x, it could be.  (This same
> sleep/wakeup bug occasionally leads to wedging of dump(8), I
> believe).

I have seen this on 7.0 RELEASE, and STABLE and on CURRENT (8).  I am
currently working on it on CURRENT because if I have a fix it's going
to have to go there first.

Best,
George
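
As a starting point for the debugging-only waiter count and wakeup check
suggested above, here is a minimal userspace sketch in C.  It is not UMA
code: the dbg_zone structure and every dbg_zone_* function are
hypothetical names made up for illustration, and the kernel's sleep and
wakeup are replaced by plain counters so that only the accounting is
modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a limited zone; NOT the real UMA zone structure. */
struct dbg_zone {
	size_t   limit;    /* maximum number of items that may be in use */
	size_t   in_use;   /* items currently allocated */
	unsigned waiters;  /* debug only: callers blocked on the limit */
	unsigned wakeups;  /* debug only: wakeups issued by the free path */
};

static void
dbg_zone_init(struct dbg_zone *z, size_t limit)
{
	z->limit = limit;
	z->in_use = 0;
	z->waiters = 0;
	z->wakeups = 0;
}

/*
 * Try to take one item.  On failure the caller is counted as a waiter;
 * this stands in for the point where a kernel thread would go to sleep.
 */
static bool
dbg_zone_alloc(struct dbg_zone *z)
{
	if (z->in_use >= z->limit) {
		z->waiters++;
		return (false);
	}
	z->in_use++;
	return (true);
}

/* Stand-in for a wakeup: release one waiter and record that we did so. */
static void
dbg_zone_wakeup_one(struct dbg_zone *z)
{
	assert(z->waiters > 0);
	z->waiters--;
	z->wakeups++;
}

/* Return one item and check that any waiter was actually woken. */
static void
dbg_zone_free(struct dbg_zone *z)
{
	unsigned waiters_before = z->waiters;
	unsigned wakeups_before = z->wakeups;

	assert(z->in_use > 0);
	z->in_use--;
	if (z->waiters > 0)
		dbg_zone_wakeup_one(z);

	/* Debug check: a free with waiters present must generate a wakeup. */
	assert(waiters_before == 0 || z->wakeups == wakeups_before + 1);
}
```

With a limit of 2, a third dbg_zone_alloc() fails and leaves waiters at
1; the next dbg_zone_free() has to bump wakeups or the assertion fires.
A waiter count that stays above zero while in_use is below the limit
would point at a lost wakeup rather than a leak.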