From: Rick Macklem
To: John Baldwin
Cc: Johan Kooijman, freebsd-net@freebsd.org, Jack Vogel
Date: Wed, 26 Feb 2014 20:00:31 -0500 (EST)
Subject: Re: Network loss

John Baldwin wrote:
> On Tuesday, February 25, 2014 2:19:01 am Johan Kooijman wrote:
> > Hi all,
> >
> > I have a weird situation here that I can't get my head around.
> >
> > One FreeBSD 9.2-STABLE ZFS/NFS box, multiple Linux clients. Once in
> > a while the Linux clients lose their NFS connection:
> >
> > Feb 25 06:24:09 hv3 kernel: nfs: server 10.0.24.1 not responding, timed out
> >
> > Not all boxes, just one out of the cluster. The weird part is that
> > when I try to ping a Linux client from the FreeBSD box, I have
> > between 10 and 30% packet loss - all day long, no specific
> > timeframe. If I ping the Linux clients - no loss. If I ping back
> > from the Linux clients to FBSD box - no loss.
> >
> > The error I get when pinging a Linux client is this one:
> > ping: sendto: File too large
>
> EFBIG is sometimes used by drivers when a packet takes too many
> scatter/gather entries. Since you mentioned NFS, one thing you can
> try is to disable TSO on the interface you are using for NFS to see
> if that "fixes" it.
>
And please email if you try it and let us know if it helps.

I think I've figured out how 64K NFS read replies can do this, but
I'll admit "ping" is a mystery. (Doesn't it just send a single
packet that would be in a single mbuf?)

I think the EFBIG is returned by bus_dmamap_load_mbuf_sg(), but I
don't know if it can happen for an mbuf chain with < 32 entries?
I've cc'd Jack in case he knows, rick

ps: The other thing to try is setting rsize=32768,wsize=32768
    options on the Linux client mounts. Either change should "fix"
    the long mbuf chain problem, but if one of these fixes the
    problem and the other one doesn't...??
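
To make the "long mbuf chain" arithmetic above concrete, here is a rough back-of-the-envelope sketch. It is not taken from any driver source. It assumes the reply data sits in standard 2K mbuf clusters (MCLBYTES = 2048), that the headers take at least one more mbuf, and that the driver allows at most 32 scatter/gather entries per packet (the "32 entries" figure mentioned above; the real limit depends on the NIC and driver). The function names are made up for illustration.

```python
import math

MCLBYTES = 2048      # standard FreeBSD mbuf cluster size
MAX_SEGMENTS = 32    # ASSUMPTION: driver scatter/gather limit ("32 entries" in the thread)
HEADER_MBUFS = 1     # ASSUMPTION: RPC/TCP/IP headers occupy at least one extra mbuf


def mbufs_in_chain(payload_bytes, cluster_bytes=MCLBYTES, header_mbufs=HEADER_MBUFS):
    """Estimate the mbuf chain length for one NFS read reply of payload_bytes."""
    return math.ceil(payload_bytes / cluster_bytes) + header_mbufs


def fits(payload_bytes, max_segments=MAX_SEGMENTS):
    """True if the estimated chain stays within the assumed segment limit."""
    return mbufs_in_chain(payload_bytes) <= max_segments


if __name__ == "__main__":
    for rsize in (65536, 32768):
        print(rsize, mbufs_in_chain(rsize), fits(rsize))
```

Under these assumptions a 64K reply needs 33 mbufs, one over the limit, while a 32K reply needs only 17. That would be consistent with either workaround helping: with TSO enabled the whole chain can be handed to the driver as a single large packet, whereas with TSO off (on FreeBSD, `ifconfig <interface> -tso`) TCP splits the data into MSS-sized packets before it reaches the driver. It does not explain the ping case.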