Subject: Re: 9.2 ixgbe tx queue hang
From: Markus Gebert
To: Rick Macklem
Cc: FreeBSD Net, Garrett Wollman, Jack Vogel, Christopher Forgeron
Date: Tue, 25 Mar 2014 13:16:14 +0100
Message-Id: <906D7DF8-DD6E-4501-B3ED-42EF728241F4@hostpoint.ch>
In-Reply-To: <1973302314.2516695.1395710288222.JavaMail.root@uoguelph.ca>
References: <1973302314.2516695.1395710288222.JavaMail.root@uoguelph.ca>
List-Id: Networking and TCP/IP with FreeBSD

On 25.03.2014, at 02:18, Rick Macklem wrote:

> Christopher Forgeron wrote:
>>
>> This is regarding the TSO patch that Rick suggested earlier. (With
>> many thanks for his time and suggestion)
>>
>> As I mentioned earlier, it did not fix the issue on a 10.0 system.
>> It did make it less of a problem on 9.2, but either way, I think it's
>> not needed, and shouldn't be considered as a patch for testing/etc.
>>
>> Patching TSO to anything other than a max value (and by default the
>> code gives it IP_MAXPACKET) is confusing the matter, as the packet
>> length ultimately needs to be adjusted for many things on the fly
>> like TCP Options, etc. Using static header sizes won't be a good
>> idea.
>>
> If you look at tcp_output(), you'll notice that it doesn't do TSO if
> there are any options. That way it knows that the TCP/IP header is
> just hdrlen.
>
> If you don't limit the TSO packet (including TCP/IP and ethernet headers)
> to 64K, then the "ix" driver can't send them, which is the problem
> you guys are seeing.
>
> There are other ways to fix this problem, but they all may introduce
> issues that reducing if_hw_tsomax by a small amount does not.
> For example, m_defrag() could be modified to use 4K pagesize clusters,
> but this might introduce memory fragmentation problems. (I observed
> what I think are memory fragmentation problems when I switched NFS
> to use 4K pagesize clusters for large I/O messages.)
>
> If setting IP_MAXPACKET to 65518 fixes the problem (no more EFBIG
> error replies), then that is the size that if_hw_tsomax can be set
> to (just can't change IP_MAXPACKET, but that is defined for other
> things). (It just happens that IP_MAXPACKET is what if_hw_tsomax
> defaults to. It has no other effect w.r.t. TSO.)
>
>>
>> Additionally, it seems that setting nic TSO will/may be ignored by
>> code like this in sys/netinet/tcp_output.c:
>>

Is this confirmed or still a ‘it seems’? Have you actually seen a
tp->t_tsomax value in tcp_output() bigger than if_hw_tsomax or was this
just speculation because the values are stored in different places?
(Sorry, if you already stated this in another email, it’s currently
hard to keep track of all the information.)

Anyway, this dtrace one-liner should be a good test of whether other
values appear in tp->t_tsomax:

# dtrace -n 'fbt::tcp_output:entry / args[0]->t_tsomax != 0 && args[0]->t_tsomax != 65518 / { printf("unexpected tp->t_tsomax: %i\n", args[0]->t_tsomax); stack(); }'

Remember to adjust the value in the condition to whatever you’re
currently expecting. The value seems to be 0 for new connections,
probably when tcp_mss() has not been called yet. That seems normal, so
I have excluded that case too. The one-liner also prints a kernel stack
trace whenever it sees an unexpected value.

> Yes, but I don't know why.
> The only conjecture I can come up with is that another net driver is
> stacked above "ix" and the setting for if_hw_tsomax doesn't propagate
> up. (If you look at the commit log message for r251296, the intent
> of adding if_hw_tsomax was to allow device drivers to set a smaller
> tsomax than IP_MAXPACKET.)
>
> Are you using any of the "stacked" network device drivers like
> lagg? I don't even know what the others all are?
> Maybe someone else can list them?

I guess the most obvious are lagg and vlan (and probably carp on FreeBSD
9.x or older).

On request from Jack, we’ve eliminated lagg and vlan from the picture,
which gives us plain ixgbe interfaces with no stacked interfaces on top
of them. And we can still reproduce the problem.


Markus

>
> rick
>>
>> 10.0 Code:
>>
>> 780 if (len > tp->t_tsomax - hdrlen) { !!
>> 781         len = tp->t_tsomax - hdrlen; !!
>> 782         sendalot = 1;
>> 783 }
>>
>> I've put debugging here, set the nic's max TSO as per Rick's patch (
>> set to say 32k), and have seen that tp->t_tsomax == IP_MAXPACKET.
>> It's being set someplace else, and thus our attempts to set TSO on
>> the nic may be in vain.
>>
>> It may have mattered more in 9.2, as I see the code doesn't use
>> tp->t_tsomax in some locations, and may actually default to what the
>> nic is set to.
>>
>> The NIC may still win, I didn't walk through the code to confirm, it
>> was enough to suggest to me that setting TSO wouldn't fix this
>> issue.
>>
>> However, this is still a TSO related issue, it's just not one related
>> to the setting of TSO's max size.
>>
>> A 10.0-STABLE system with tso disabled on ix0 doesn't have a single
>> packet over IP_MAXPACKET in 1 hour of runtime. I'll let it go a bit
>> longer to increase confidence in this assertion, but I don't want to
>> waste time on this when I could be logging problem packets on a
>> system with TSO enabled.
>>
>> Comments are very welcome..
>>
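
For reference, the tcp_output() check quoted above and Rick's 64K
argument can be combined in a small user-space C sketch. This is not
kernel code and not the actual patch: the function names, the
65535-byte frame ceiling and the 14-byte untagged Ethernet header are
assumptions made for illustration, and hdrlen is passed in as a plain
number rather than derived from a real connection.

```c
#include <stdbool.h>

/* Assumed limits for this sketch (not taken from the ixgbe driver source). */
enum {
    SKETCH_MAX_FRAME = 65535,    /* assumed 64K ceiling, Ethernet header included */
    SKETCH_ETHER_HDR_LEN = 14    /* assumed untagged Ethernet header */
};

/*
 * Models the quoted tcp_output() check: limit the payload so that
 * payload + hdrlen does not exceed tsomax. Sets *sendalot when it clamps.
 */
static long
clamp_tso_len(long len, long tsomax, long hdrlen, int *sendalot)
{
    if (len > tsomax - hdrlen) {
        len = tsomax - hdrlen;
        *sendalot = 1;
    }
    return len;
}

/* Size handed to the driver: payload + TCP/IP header + Ethernet header. */
static long
frame_len(long payload, long hdrlen)
{
    return payload + hdrlen + SKETCH_ETHER_HDR_LEN;
}

/* True if the whole frame stays within the assumed 64K ceiling. */
static bool
frame_fits(long payload, long hdrlen)
{
    return frame_len(payload, hdrlen) <= SKETCH_MAX_FRAME;
}
```

Under these assumptions, with hdrlen = 40 (no TCP options) and
tsomax = 65535, a 100000-byte request is clamped to 65495 bytes and
frame_len() gives 65549, which is over the ceiling. With
tsomax = 65518 the payload becomes 65478 and the frame 65532, which
fits. That only matters if tp->t_tsomax really carries the reduced
value, which is what the dtrace one-liner is meant to check.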