From: Rick Macklem
To: Markus Gebert
Cc: Johan Kooijman, freebsd-net@freebsd.org, Jack Vogel, John Baldwin
Date: Thu, 27 Feb 2014 18:13:01 -0500 (EST)
Subject: Re: Network loss

Markus Gebert wrote:
>
> On 27.02.2014, at 02:00, Rick Macklem
> wrote:
>
> > John Baldwin wrote:
> >> On Tuesday, February 25, 2014 2:19:01 am Johan Kooijman wrote:
> >>> Hi all,
> >>>
> >>> I have a weird situation here that I can't get my head around.
> >>>
> >>> One FreeBSD 9.2-STABLE ZFS/NFS box, multiple Linux clients. Once
> >>> in a while the Linux clients lose their NFS connection:
> >>>
> >>> Feb 25 06:24:09 hv3 kernel: nfs: server 10.0.24.1 not responding, timed out
> >>>
> >>> Not all boxes, just one out of the cluster. The weird part is that
> >>> when I try to ping a Linux client from the FreeBSD box, I have
> >>> between 10 and 30% packet loss - all day long, no specific
> >>> timeframe. If I ping the Linux clients - no loss. If I ping back
> >>> from the Linux clients to FBSD box - no loss.
> >>>
> >>> The error I get when pinging a Linux client is this one:
> >>> ping: sendto: File too large
>
> We were facing similar problems when upgrading to 9.2 and have stayed
> with 9.1 on affected systems for now. We’ve seen this on HP G8
> blades with 82599EB controllers:
>
> ix0@pci0:4:0:0: class=0x020000 card=0x18d0103c chip=0x10f88086 rev=0x01 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = '82599EB 10 Gigabit Dual Port Backplane Connection'
>     class      = network
>     subclass   = ethernet
>
> We didn’t find a way to trigger the problem reliably. But when it
> occurs, it usually affects only one interface. Symptoms include:
>
> - socket functions return the 'File too large' error mentioned by
>   Johan
> - socket functions return 'No buffer space available'
> - heavy to full packet loss on the affected interface
> - “stuck” TCP connections, i.e.
>   ESTABLISHED TCP connections that
>   should have timed out stick around forever (socket on the other side
>   could have been closed hours ago)
> - userland programs using the corresponding sockets usually got stuck
>   too (can’t find kernel traces right now, but always in network
>   related syscalls)
>
> Network is only lightly loaded on the affected systems (usually 5-20
> mbit, capped at 200 mbit, per server), and netstat never showed any
> indication of resource shortage (like mbufs).
>
> What made the problem go away temporarily was to ifconfig down/up
> the affected interface.
>
> We tested a 9.2 kernel with the 9.1 ixgbe driver, which was not
> really stable. Also, we tested a few revisions between 9.1 and 9.2
> to find out when the problem started. Unfortunately, the ixgbe
> driver turned out to be mostly unstable on our systems between these
> releases, worse than on 9.2. The instability was introduced shortly
> after 9.1 and fixed only very shortly before the 9.2 release. So no
> luck there. We ended up using 9.1 with backports of 9.2 features we
> really need.
>
> What we can’t tell is whether it’s the 9.2 kernel or the 9.2 ixgbe
> driver or a combination of both that causes these problems.
> Unfortunately we ran out of time (and ideas).
>
>
> >> EFBIG is sometimes used by drivers when a packet takes too many
> >> scatter/gather entries. Since you mentioned NFS, one thing you can
> >> try is to disable TSO on the interface you are using for NFS to see
> >> if that "fixes" it.
> >>
> > And please email if you try it and let us know if it helps.
> >
> > I think I've figured out how 64K NFS read replies can do this,
> > but I'll admit "ping" is a mystery? (Doesn't it just send a single
> > packet that would be in a single mbuf?)
> >
> > I think the EFBIG is returned by bus_dmamap_load_mbuf_sg(), but I
> > don't know if it can happen for an mbuf chain with < 32 entries?
>
> We don’t use the nfs server on our systems, but they’re
> (new)nfsclients. So I don’t think our problem is nfs related, unless
> the default rsize/wsize for client mounts is not 8K, which I thought
> it was. Can you confirm this, Rick?
>
Well, if you don't specify any mount options, it will be
min(64K, what-the-server-specifies).
"nfsstat -m" should show you what it actually is using, for 9.2 or later.

8K would be used if you specified "udp".

For the client, it would be write requests that could be 64K.
You could try "wsize=32768,rsize=32768" (it is actually the
wsize that matters for this case, but you might as well set
rsize at the same time). With these options specified, you
know what the maximum value is (it will still be reduced for
udp or if the server wants it smaller).

rick

> IIRC, disabling TSO did not make any difference in our case.
>
>
> Markus
>
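
Rick's description of how the client settles on its rsize/wsize can be sketched as a small model. This is only an illustration under stated assumptions, not the NFS client's actual code: the function names are made up, the 2 KiB cluster size is the usual FreeBSD MCLBYTES default, and the 32-entry figure is taken from the "< 32 entries" remark quoted above rather than from any confirmed driver limit.

```python
KIB = 1024
DEFAULT_MAX = 64 * KIB  # cap described in the reply when no mount options are given
UDP_MAX = 8 * KIB       # size the reply says is used for "udp" mounts
MCLBYTES = 2 * KIB      # ASSUMPTION: usual FreeBSD mbuf cluster size
SEG_LIMIT = 32          # ASSUMPTION: taken from the "< 32 entries" remark


def effective_io_size(server_max, requested=None, udp=False):
    """Hypothetical model of the rsize/wsize a client mount ends up using."""
    cap = UDP_MAX if udp else DEFAULT_MAX
    # A requested size only sets the maximum; it is still reduced for udp.
    wanted = cap if requested is None else min(requested, cap)
    # The server may ask for something smaller still.
    return min(wanted, server_max)


def data_clusters(io_size, cluster=MCLBYTES):
    """Clusters needed for the payload alone (protocol headers add more)."""
    return -(-io_size // cluster)  # ceiling division


if __name__ == "__main__":
    for requested in (None, 32 * KIB):
        size = effective_io_size(server_max=64 * KIB, requested=requested)
        n = data_clusters(size)
        print(f"wsize={size}: {n} data clusters (assumed limit {SEG_LIMIT})")
```

With a server that allows 64K, the default mount yields 32 data clusters for a single write in this model, while wsize=32768 yields 16, which is presumably why the smaller wsize is worth trying. Protocol headers would add further mbufs on top of the payload count.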