From: Melissa Jenkins <melissa-freebsd@littlebluecar.co.uk>
Date: Thu, 2 Sep 2010 09:13:46 +0100
To: freebsd-net@freebsd.org
Message-Id: <5C261F16-6530-47EE-B1C1-BA38CD6D8B01@littlebluecar.co.uk>
Subject: NFE adapter 'hangs'
List-Id: Networking and TCP/IP with FreeBSD
Hiya,

I've been having trouble with two different machines (FreeBSD 8.0-p3 and FreeBSD 7.0-p5) using the nfe network adapter. The machines are, respectively, a Sun X2200 (AMD64) and a Sun X2100 M2 (AMD64), and both are running the amd64 kernel.

Basically, what appears to happen is that traffic stops flowing through the interface and 'No buffer space available' errors are produced when trying to send ICMP packets. All established connections appear to hang.

The machines are running as packet routers, with nfe0 acting as the 'LAN' side. PF is being used for filtering, NAT, BINAT and RDR. The same PF configuration works correctly on two other servers using different network adapters; one of those is configured with pfsync & CARP, the other isn't.

The problem occurs under a fairly light session load (< 100 active states in PF), though the more states there are, the sooner it happens. It may be related to packet rates, as putting high-bandwidth clients behind the box reproduces the problem very quickly (several minutes). This is reinforced by the fact that the problem first manifested when we upgraded one of the leased lines.

Executing ifconfig nfe0 down && ifconfig nfe0 up will restart traffic flow.

Neither box is heavily loaded, generally around ~1.5 Mb/s, and the problem doesn't appear to be related to the amount of traffic: re-routing 95% of traffic around the server brought no improvement. The traffic profile is fairly random, a mix of TCP and UDP, mostly flowing OUT of nfe0. It is all L3, and there are fewer than 5 hosts on the segment attached to the nfe interface.

The boxes are in different locations and are connected to different types of Cisco switches. Both appear to autonegotiate correctly, and the switch ports show no status changes.
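As a stop-gap I've been thinking about automating that down/up cycle from cron. Just a sketch, not a fix: the target address here is simply the LAN host from the ping test further down, and PING/IFCONFIG are overridable so the logic can be exercised without root or a real nfe interface.

```shell
#!/bin/sh
# Hypothetical watchdog sketch for the hang described above: if a known
# host on the nfe0 segment stops answering a single ping, cycle the
# interface. -t is the FreeBSD ping reply timeout in seconds.
check_and_cycle() {
    # $1 = interface, $2 = host expected to answer pings
    if ! "${PING:-ping}" -c 1 -t 2 "$2" >/dev/null 2>&1; then
        echo "watchdog: no reply from $2, cycling $1"
        "${IFCONFIG:-ifconfig}" "$1" down && "${IFCONFIG:-ifconfig}" "$1" up
    fi
}

# From cron, something like:  * * * * *  /root/nfe-watchdog.sh
# with the script ending in:  check_and_cycle nfe0 172.31.3.129
```

Obviously this only masks the problem (and briefly interrupts established flows while the interface cycles), but it would keep the link usable until the real cause turns up.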
pfsync, CARP and a GRE tunnel all work correctly over the nfe interface for long periods of time (weeks or more); it appears to be the addition of other traffic to the mix that results in the interface 'hanging'.

If I move the traffic from nfe to the other bge interface (the one shared with the LOM), everything is stable and works correctly. I have not been able to reproduce this using test loads, and the interface worked correctly under iperf testing prior to deployment. For legal reasons I unfortunately can't provide a traffic trace up to the time it occurs, though everything in it looks normal to me.

The FreeBSD 7 X2100 lists the following from pciconf:

nfe0@pci0:0:8:0: class=0x068000 card=0x534c108e chip=0x037310de rev=0xa3 hdr=0x00
    vendor = 'Nvidia Corp'
    device = 'MCP55 Ethernet'
    class  = bridge
nfe1@pci0:0:9:0: class=0x068000 card=0x534c108e chip=0x037310de rev=0xa3 hdr=0x00
    vendor = 'Nvidia Corp'
    device = 'MCP55 Ethernet'
    class  = bridge

The FreeBSD 8 X2200 lists essentially the same thing (only the card ID differs):

nfe0@pci0:0:8:0: class=0x068000 card=0x534b108e chip=0x037310de rev=0xa3 hdr=0x00
    vendor = 'Nvidia Corp'
    device = 'MCP55 Ethernet'
    class  = bridge
nfe1@pci0:0:9:0: class=0x068000 card=0x534b108e chip=0x037310de rev=0xa3 hdr=0x00
    vendor = 'Nvidia Corp'
    device = 'MCP55 Ethernet'
    class  = bridge

Here are the two obvious tests (both from the FreeBSD 7 box); the ICMP response and the mbuf stats are very much the same on both boxes.
ping 172.31.3.129
PING 172.31.3.129 (172.31.3.129): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available
^C
--- 172.31.3.129 ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss

netstat -m
852/678/1530 mbufs in use (current/cache/total)
818/448/1266/25600 mbuf clusters in use (current/cache/total/max)
817/317 mbuf+clusters out of packet secondary zone in use (current/cache)
0/362/362/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
1879K/2513K/4392K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

From the other machine, after the problem has occurred and an ifconfig down/up cycle has been done (i.e. when the interface is working):

vmstat -z
mbuf_packet:      256,      0,  1033,  1783,  330792410,  0
mbuf:             256,      0,     5,  1664,  395145472,  0
mbuf_cluster:    2048,  25600,  2818,  1690,   13234653,  0
mbuf_jumbo_page: 4096,  12800,     0,   336,     297749,  0
mbuf_jumbo_9k:   9216,   6400,     0,     0,          0,  0
mbuf_jumbo_16k: 16384,   3200,     0,     0,          0,  0
mbuf_ext_refcnt:    4,      0,     0,     0,          0,  0

Although I failed to keep a copy, I don't believe there is a kmem problem.

I'm at a complete loss as to what to try next :(

All suggestions very gratefully received! The 7.0 box is live so can't really be played with, but I can occasionally run tests on the other box.

Thank you :)
Mel
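P.S. In case I'm misreading the mbuf numbers, this is the quick check I'm applying to the netstat -m output above. It's just a sketch comparing in-use clusters against the max; with the figures above it reports 818/25600 in use, so the ENOBUFS doesn't look like simple cluster exhaustion.

```shell
#!/bin/sh
# Sketch: flag mbuf cluster pressure from `netstat -m` output.
# Usage: netstat -m | check_clusters
check_clusters() {
    awk '/mbuf clusters in use/ {
        split($1, f, "/")               # current/cache/total/max
        printf "clusters: %d/%d in use\n", f[1], f[4]
        if (f[1] > 0.9 * f[4]) print "WARNING: near mbuf cluster limit"
    }'
}
```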