From owner-freebsd-stable@FreeBSD.ORG  Mon Jun  7 14:06:14 2010
Date: Mon, 7 Jun 2010 16:06:11 +0200
From: Olaf Seibert <O.Seibert@cs.ru.nl>
To: freebsd-stable@freebsd.org
Message-ID: <20100607140611.GX883@twoquid.cs.ru.nl>
References: <20100527131310.GS883@twoquid.cs.ru.nl>
	<20100527174211.GC1211@michelle.cdnetworks.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20100527174211.GC1211@michelle.cdnetworks.com>
User-Agent: Mutt/1.5.19 (2009-01-05)
X-Spam-Score: -1.799 () ALL_TRUSTED,BAYES_50
X-Scanned-By: MIMEDefang 2.63 on 131.174.30.61
Cc: Pyun YongHyeon <pyunyh@gmail.com>, freebsd-stable@freebsd.org,
	Olaf Seibert <O.Seibert@cs.ru.nl>
Subject: Re: nfe0 loses network connectivity (8.0-RELEASE-p2)

On Thu 27 May 2010 at 10:42:11 -0700, Pyun YongHyeon wrote:
> On Thu, May 27, 2010 at 03:13:10PM +0200, Olaf Seibert wrote:
> > Here is the output of netstat -m while the problem was going on:
> > 
> > 25751/1774/27525 mbufs in use (current/cache/total)
> > 24985/615/25600/25600 mbuf clusters in use (current/cache/total/max)
>   ^^^^^^^^^^^^^^^^^^^^^
> As Jeremy said, it seems you're hitting mbuf shortage situation. I
> think nfe(4) is dropping received frames in that case. See how many
> packets were dropped due to mbuf shortage from the output of
> "netstat -ndI nfe0". You can also use "sysctl dev.nfe.0.stats" to
> see MAC statistics maintained in nfe(4) if your MCP controller
> supports hardware MAC counters.

The sysctl command gives me (among other figures):

    dev.nfe.0.stats.rx.drops: 338180

so indeed frames seem to be dropped.

Jeremy Chadwick mentioned that one can tune kern.ipc.nmbclusters in
/boot/loader.conf, but apparently it can also be changed at runtime with
sysctl.

Since the problem recurred today, I increased the value from 25600 to
32768, the maximum recommended value in the Handbook. (I can probably go
higher if needed; the box has 8 GB of RAM, although up to half of it is
eaten by ZFS.)
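
To put the new limit in perspective against the 8 GB of RAM, here is a
rough sketch of how much memory the cluster pool could occupy at most. It
assumes the standard cluster size of 2048 bytes (MCLBYTES); the function
name cluster_pool_mib is made up for illustration, and the figure ignores
the mbufs themselves and any jumbo clusters.

```python
# Assumption: standard mbuf clusters are 2048 bytes (MCLBYTES).
MCLBYTES = 2048


def cluster_pool_mib(nmbclusters, cluster_bytes=MCLBYTES):
    """Upper bound, in MiB, on memory used by the cluster pool."""
    return nmbclusters * cluster_bytes / (1024 * 1024)


# Old and new values of kern.ipc.nmbclusters from this message.
for limit in (25600, 32768):
    print(f"kern.ipc.nmbclusters={limit}: {cluster_pool_mib(limit):.0f} MiB")
```

Under that assumption the old limit corresponds to 50 MiB and the new one
to 64 MiB, so there should be plenty of headroom to go higher.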

I do get the impression that there is an mbuf leak somewhere. On a much
older file server (FreeBSD 6.1, serves a bit of NFS but has no ZFS) the
mbuf cluster usage is much lower, despite a longer uptime:

    256/634/890/25600 mbuf clusters in use (current/cache/total/max)

Also, the older server shows signs that measures are taken in case of an
mbuf shortage:

    2259806/466391/598621 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
    1016 calls to protocol drain routines

whereas the FreeBSD 8.0 machine has zero or very low numbers:

    0/3956/1959 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
    0 calls to protocol drain routines

and usage keeps growing:

    26122/1782/27904/32768 mbuf clusters in use (current/cache/total/max)
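
To follow the growth over time, a small script along these lines could
parse the "mbuf clusters in use" line of netstat -m and report how close
the pool is to its limit. This is only a sketch: parse_clusters and
utilization are hypothetical helper names, and the line format is assumed
to be exactly the one shown in this message.

```python
import re

# Matches e.g. "26122/1782/27904/32768 mbuf clusters in use (...)".
CLUSTER_RE = re.compile(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use")


def parse_clusters(line):
    """Extract current/cache/total/max from a netstat -m cluster line."""
    match = CLUSTER_RE.search(line)
    if match is None:
        raise ValueError("not an 'mbuf clusters in use' line: %r" % line)
    current, cache, total, maximum = (int(g) for g in match.groups())
    return {"current": current, "cache": cache, "total": total, "max": maximum}


def utilization(sample):
    """Percentage of the configured maximum that is allocated (total/max)."""
    return 100.0 * sample["total"] / sample["max"]


# The two samples from the FreeBSD 8.0 machine quoted in this thread.
before = parse_clusters(
    "24985/615/25600/25600 mbuf clusters in use (current/cache/total/max)")
after = parse_clusters(
    "26122/1782/27904/32768 mbuf clusters in use (current/cache/total/max)")

print("before: %.1f%% of max" % utilization(before))
print("after:  %.1f%% of max" % utilization(after))
print("growth in current clusters: %d" % (after["current"] - before["current"]))
```

Run periodically (from cron, say) on the live netstat -m output, this
would show whether "current" keeps climbing toward the new maximum.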

-Olaf.
--