Date: Thu, 27 May 2010 15:13:10 +0200
From: Olaf Seibert <O.Seibert@cs.ru.nl>
To: freebsd-stable@freebsd.org
Subject: nfe0 loses network connectivity (8.0-RELEASE-p2)
Message-ID: <20100527131310.GS883@twoquid.cs.ru.nl>

I have a machine running FreeBSD 8.0-RELEASE-p2 which has a big ZFS file
system and serves as a file server (NFS, using newnfs).

From time to time, however, it seems to lose all network connectivity. The
machine itself isn't down; from the console (an IPMI console) it works fine.
I have tried things like bringing nfe0 down and up again and turning off
things like checksum offload, and none of them really seem to work (although
occasionally, apparently by accident, something I try seems to help, only for
connectivity to be lost again a short time later). Carrier status and the
like all seem normal:

nfe0: flags=8843 metric 0 mtu 1500
        options=19b
        ether 00:30:48:xx:xx:xx
        inet 131.174.xx.xx netmask 0xffffff00 broadcast 131.174.xx.xxx
        media: Ethernet autoselect (1000baseT)
        status: active

One time, when I was doing an "ifconfig nfe0 up", I got the message
"initialization failed: no memory for rx buffers", so I am currently thinking
in the direction of mbuf starvation (with something requiring too many mbufs
to make any progress; I've seen such a thing with inodes once).
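In case it matters, the commands I have been trying when the problem hits are
roughly the following. This is a sketch from memory; the exact offload
options that nfe(4) accepts on this box may differ slightly:

    # cycle the interface
    ifconfig nfe0 down
    ifconfig nfe0 up

    # turn off hardware offload features while testing
    ifconfig nfe0 -rxcsum -txcsum
    ifconfig nfe0 -tso

As said above, at best these appear to help briefly before connectivity
drops again.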
Here is the output of netstat -m while the problem was going on:

25751/1774/27525 mbufs in use (current/cache/total)
24985/615/25600/25600 mbuf clusters in use (current/cache/total/max)
23254/532 mbuf+clusters out of packet secondary zone in use (current/cache)
0/95/95/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
56407K/2053K/58461K bytes allocated to network (current/cache/total)
0/2084/1031 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
10 requests for I/O initiated by sendfile
0 calls to protocol drain routines

while here are the figures a short time after a reboot (a reboot always
"fixes" the problem):

2133/2352/4485 mbufs in use (current/cache/total)
1353/2205/3558/25600 mbuf clusters in use (current/cache/total/max)
409/871 mbuf+clusters out of packet secondary zone in use (current/cache)
0/35/35/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
3239K/5138K/8377K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Is there a way to increase the maximum number of mbufs or, better yet, to
limit their use by whatever is using too many of them?

Thanks in advance,
-Olaf.
--
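P.S. I am guessing that the limit of 25600 mbuf clusters shown above
corresponds to the kern.ipc.nmbclusters tunable; that is an assumption on my
part, and the values below are only examples, not recommendations:

    # show and (if it is writable at runtime on 8.0) raise the cluster limit
    sysctl kern.ipc.nmbclusters
    sysctl kern.ipc.nmbclusters=51200

    # otherwise, set it at boot time in /boot/loader.conf
    kern.ipc.nmbclusters="51200"

And to see which UMA zones are actually holding the mbufs, I suppose
something like this would show it:

    vmstat -z | grep -i mbuf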