From: Robert Watson
Date: Sun, 20 Apr 2008 10:32:25 +0100 (BST)
To: gnn@freebsd.org
Cc: net@freebsd.org
Subject: Re: zonelimit issues...
Message-ID: <20080420102827.U67663@fledge.watson.org>
List-Id: Networking and TCP/IP with FreeBSD

On Fri, 18 Apr 2008, gnn@freebsd.org wrote:

> I am wondering why this patch was never committed?
>
> http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
>
> It does seem to address an issue I'm seeing where processes get into the
> zonelimit state through the use of mbufs (a high speed UDP packet receiver)
> but even after network pressure is reduced/removed the process never gets
> out of that state again.  Applying the patch fixed the issue, but I'd like
> to have some discussion as to the general merits of the approach.
>
> Unfortunately the test that currently causes this is tied very tightly to
> code at work that I can't share, but I will hopefully be improving mctest to
> try to exhibit this behavior.
When you take all load off the system, do mbufs and clusters get properly
freed back to UMA (as visible in netstat -m)?  If not, continuing to bump up
against the zonelimit would suggest an mbuf/cluster leak, in which case we
need to track down that bug.

You might consider adding a debugging-only zonelimit waiter count to the UMA
zone, along with checks/assertions that a wakeup is being generated properly.
That is, confirm that a wakeup is generated whenever memory is freed while
threads are waiting.

There is at least one as-yet un-MFC'd fix to the sleep/wakeup code, I
believe, that might be relevant here.  Is the problem you're reporting on
7.x, or on 8.x?  If 8.x, that's probably not it, but if 7.x, it could be.
(This same sleep/wakeup bug occasionally leads to wedging of dump(8), I
believe.)

Robert N M Watson
Computer Laboratory
University of Cambridge
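
For illustration, the waiter-count idea suggested above can be modelled in
userspace.  This is only a rough sketch, not UMA code: struct dbg_zone and
the dbg_zone_* functions are hypothetical names, and a pthread mutex and
condition variable stand in for the kernel's sleep/wakeup.  The zone counts
threads asleep at the limit, counts wakeups issued on free, and asserts in
the sleeper that space never appears without a recorded wakeup.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace model of a limited zone; not the UMA API. */
struct dbg_zone {
	pthread_mutex_t	lock;
	pthread_cond_t	cv;
	int		limit;		/* maximum items outstanding */
	int		in_use;		/* items currently allocated */
	int		waiters;	/* debug: threads asleep at the limit */
	unsigned long	wakeups;	/* debug: wakeups issued on free */
};

static void
dbg_zone_init(struct dbg_zone *z, int limit)
{
	pthread_mutex_init(&z->lock, NULL);
	pthread_cond_init(&z->cv, NULL);
	z->limit = limit;
	z->in_use = 0;
	z->waiters = 0;
	z->wakeups = 0;
}

/* Allocate one item, sleeping while the zone is at its limit. */
static void
dbg_zone_alloc(struct dbg_zone *z)
{
	pthread_mutex_lock(&z->lock);
	while (z->in_use >= z->limit) {
		unsigned long seen = z->wakeups;

		z->waiters++;
		pthread_cond_wait(&z->cv, &z->lock);
		z->waiters--;
		/*
		 * If space appeared while we slept, some free must have
		 * seen waiters > 0 and recorded a wakeup.
		 */
		if (z->in_use < z->limit)
			assert(z->wakeups != seen);
	}
	z->in_use++;
	pthread_mutex_unlock(&z->lock);
}

/* Free one item; wake all sleepers if any thread is waiting. */
static void
dbg_zone_free(struct dbg_zone *z)
{
	pthread_mutex_lock(&z->lock);
	assert(z->in_use > 0);
	z->in_use--;
	if (z->waiters > 0) {
		pthread_cond_broadcast(&z->cv);
		z->wakeups++;
	}
	pthread_mutex_unlock(&z->lock);
}

/* Thread entry point: perform one (possibly blocking) allocation. */
static void *
dbg_zone_alloc_thread(void *arg)
{
	dbg_zone_alloc(arg);
	return (NULL);
}
```

With a limit of 1, one allocation followed by a second from another thread
leaves waiters at 1 until dbg_zone_free() runs.  A nonzero waiter count that
persists while in_use is below the limit is the symptom to look for; in the
kernel the equivalent counters would presumably be exported for inspection
rather than read directly.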