From: Pyun YongHyeon
Date: Thu, 27 May 2010 10:42:11 -0700
To: Olaf Seibert
Message-ID: <20100527174211.GC1211@michelle.cdnetworks.com>
References:
 <20100527131310.GS883@twoquid.cs.ru.nl>
In-Reply-To: <20100527131310.GS883@twoquid.cs.ru.nl>
Cc: freebsd-stable@freebsd.org
Subject: Re: nfe0 loses network connectivity (8.0-RELEASE-p2)
Reply-To: pyunyh@gmail.com
List-Id: Production branch of FreeBSD source code

On Thu, May 27, 2010 at 03:13:10PM +0200, Olaf Seibert wrote:
> I have a machine with FreeBSD 8.0-RELEASE-p2 which has a big ZFS file
> system and serves as file server (NFS (newnfs)).
>
> From time to time however it seems to lose all network connectivity. The
> machine isn't down; from the console (an IPMI console) it works fine.
>
> I have tried things like bringing nfe0 down and up again, turning off
> things like checksum offload, and none of them really seem to work
> (although apparently sometimes by accident, a thing I try seems to help,
> but a short time later connectivity is lost again).
>
> Carrier status and things like that seem all normal:
>
> nfe0: flags=8843 metric 0 mtu 1500
>         options=19b
>         ether 00:30:48:xx:xx:xx
>         inet 131.174.xx.xx netmask 0xffffff00 broadcast 131.174.xx.xxx
>         media: Ethernet autoselect (1000baseT )
>         status: active
>
> One time when I was doing an "ifconfig nfe0 up" I got the message
> "initialization failed: no memory for rx buffers", so I am currently
> thinking in the direction of mbuf starvation (with something requiring
> too many mbufs to make any progress; I've seen such a thing with inodes
> once).
>
> Here is the output of netstat -m while the problem was going on:
>
> 25751/1774/27525 mbufs in use (current/cache/total)
> 24985/615/25600/25600 mbuf clusters in use (current/cache/total/max)
  ^^^^^^^^^^^^^^^^^^^^^
As Jeremy said, it seems you're hitting an mbuf shortage. I think
nfe(4) is dropping received frames in that case. To see how many
packets were dropped due to the mbuf shortage, check the output of
"netstat -ndI nfe0". You can also use "sysctl dev.nfe.0.stats" to see
the MAC statistics maintained in nfe(4), if your MCP controller
supports hardware MAC counters.

> 23254/532 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/95/95/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 56407K/2053K/58461K bytes allocated to network (current/cache/total)
> 0/2084/1031 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 10 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
> while here are the figures a short time after a reboot (a reboot always
> "fixes" the problem):
>
> 2133/2352/4485 mbufs in use (current/cache/total)
> 1353/2205/3558/25600 mbuf clusters in use (current/cache/total/max)
> 409/871 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/35/35/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 3239K/5138K/8377K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
> Is there a way to increase the maximum number of mbufs, or better yet,
> limit the use by whatever is using them too much?
>

You have already hit the mbuf limit, so nfe(4) might have started to
drop incoming frames.
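
For anyone who wants to watch for this condition automatically, here is
a rough Python sketch that reads "netstat -m" text and reports whether
the mbuf cluster pool has reached its maximum. It is only an
illustration: the function names (parse_netstat_m, clusters_exhausted)
are made up for this example, the two samples are abbreviated from the
figures quoted above, and the line format is assumed to match what
8.0-RELEASE printed in this thread.

```python
import re

# Patterns for the two netstat -m lines of interest (format taken from this thread).
CLUSTERS = re.compile(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use")
DENIED = re.compile(r"(\d+)/(\d+)/(\d+) requests for mbufs denied")


def parse_netstat_m(text):
    """Hypothetical helper: pull cluster usage and denied counters out of netstat -m text."""
    stats = {}
    for raw in text.splitlines():
        # Tolerate mail-style "> " quoting in front of each line.
        line = raw.lstrip("> \t")
        m = CLUSTERS.match(line)
        if m:
            keys = ("current", "cache", "total", "max")
            stats["clusters"] = dict(zip(keys, map(int, m.groups())))
            continue
        m = DENIED.match(line)
        if m:
            keys = ("mbufs", "clusters", "mbuf+clusters")
            stats["denied"] = dict(zip(keys, map(int, m.groups())))
    return stats


def clusters_exhausted(stats):
    """True when the total number of allocated clusters has reached the maximum."""
    c = stats["clusters"]
    return c["total"] >= c["max"]


# Abbreviated samples copied from the figures quoted in this thread.
DURING = """\
25751/1774/27525 mbufs in use (current/cache/total)
24985/615/25600/25600 mbuf clusters in use (current/cache/total/max)
0/2084/1031 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
"""

AFTER = """\
2133/2352/4485 mbufs in use (current/cache/total)
1353/2205/3558/25600 mbuf clusters in use (current/cache/total/max)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
"""

if __name__ == "__main__":
    for label, sample in (("during outage", DURING), ("after reboot", AFTER)):
        s = parse_netstat_m(sample)
        print(label, s["clusters"], s["denied"],
              "exhausted:", clusters_exhausted(s))
```

The check compares total against max rather than current against max,
because cached clusters still count toward the limit; that is why
25600/25600 above means the pool cannot grow even though only 24985
clusters are in active use.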