From: "George V. Neville-Neil"
Date: Sat, 19 Apr 2008 11:00:10 +0900
To: Chris Pratt
Cc: d@delphij.net, net@freebsd.org
Subject: Re: zonelimit issues...
In-Reply-To: <382258DB-13B8-4108-B8F4-157F247A7E4B@hughes.net>
References: <48087C98.8060600@delphij.net> <382258DB-13B8-4108-B8F4-157F247A7E4B@hughes.net>

At Fri, 18 Apr 2008 06:40:26 -0700,
Chris Pratt wrote:
>
> I am very interested in this topic as I've been waiting
> since moving from FreeBSD 5 in 2006. The workaround
> in the errata had no effect and the only notice I
> could see of something changing was the errata did
> not include the problem as of FreeBSD 7.0.
>
> I stayed with production releases and source
> upgraded hoping a fix would be coming but stopped
> that at 6.2 when I saw no related changes (re:
> your messages in 12/2006 - 02/2007). I was
> planning to move to 7.0 based on the lack of the
> error description in the 7.0 errata. If no patch has
> been made, I'd prefer to keep an otherwise stable
> system status quo.
>
> I guess now I'd really like to know if this has been
> fixed or not. I've been tied to my monitor for near
> two years now because of one system that seems
> to exhibit the problem regardless of what hardware
> we put in the role. Without a dump I've never been
> able to say that THIS problem is MY problem but the
> earmarks are there and I've just been waiting.
>
> Doesn't 7.0 fix this? I'd like to see an official
> definitive answer and all I've been going on is that
> the problem description is no longer in the errata.
>

It happens less often than usual but there are still situations where
it is possible. The problem is that if the system is overloaded there
may never be a process able to free an mbuf to make progress. The
most important thing to do is to size the system correctly. The bug
most often crops up when systems have huge numbers of packets
outstanding and the system is overloaded. I believe that a better
solution is possible, but it will take more careful study.

One option is to start adding drain routines to UDP that cause the
protocol to drop packets under load, which is when we see the
problem. In our tests the server process cannot read data fast
enough to return enough mbufs/clusters to the system, and it gets
stuck in a write() call.

Best,
George
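
For illustration only, the drain idea described above can be sketched
as a small user-space model. This is not FreeBSD kernel code: the
names pool, udp_queue, udp_drain, udp_enqueue and udp_dequeue, and the
constant QUEUE_CAP, are all hypothetical, and the real mbuf and zone
allocators are far more involved. The sketch assumes one buffer per
queued datagram and a drop-oldest policy, which is only one of several
possible choices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define QUEUE_CAP 8              /* capacity of the modelled receive queue */

struct pool {
    int in_use;                  /* buffers currently allocated */
    int limit;                   /* hard cap, standing in for a zone limit */
};

struct udp_queue {
    int bufs[QUEUE_CAP];         /* ring of queued datagram ids */
    int head;                    /* index of the oldest datagram */
    int len;                     /* number of queued datagrams */
    unsigned long dropped;       /* datagrams discarded by the drain */
};

/* Take one buffer from the pool; fails instead of sleeping at the limit. */
static int pool_alloc(struct pool *p)
{
    if (p->in_use >= p->limit)
        return -1;
    p->in_use++;
    return 0;
}

static void pool_free(struct pool *p)
{
    assert(p->in_use > 0);
    p->in_use--;
}

/* Drain: discard up to `want` of the oldest datagrams, freeing buffers. */
static int udp_drain(struct pool *p, struct udp_queue *q, int want)
{
    int freed = 0;

    while (freed < want && q->len > 0) {
        q->head = (q->head + 1) % QUEUE_CAP;
        q->len--;
        q->dropped++;
        pool_free(p);
        freed++;
    }
    return freed;
}

/* Queue an incoming datagram; under pressure, shed load rather than block. */
static int udp_enqueue(struct pool *p, struct udp_queue *q, int id)
{
    if (q->len == QUEUE_CAP || p->in_use >= p->limit)
        udp_drain(p, q, 1);
    if (q->len == QUEUE_CAP || pool_alloc(p) != 0)
        return -1;               /* nothing reclaimable: drop the new one */
    q->bufs[(q->head + q->len) % QUEUE_CAP] = id;
    q->len++;
    return 0;
}

/* Reader side: hand the oldest datagram to the caller and free its buffer. */
static int udp_dequeue(struct pool *p, struct udp_queue *q, int *id)
{
    if (q->len == 0)
        return -1;
    *id = q->bufs[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->len--;
    pool_free(p);
    return 0;
}
```

In this model, a pool limited to 4 buffers that receives 10 datagrams
with no reader keeps the newest 4 and counts 6 as dropped, so the
allocation path never has to wait for a reader to free a buffer.
Whether dropping the oldest or the newest data is the right policy for
real UDP traffic is a separate question that the sketch does not
settle.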