From: Markus Gebert
Date: Mon, 24 Mar 2014 17:36:03 +0100
Subject: Re: 9.2 ixgbe tx queue hang
To: Christopher Forgeron
Cc: FreeBSD Net, Rick Macklem, Garrett Wollman, Jack Vogel
List-Id: Networking and TCP/IP with FreeBSD

On 24.03.2014, at 17:23, Christopher Forgeron wrote:

> I think making hw_tsomax a sysctl would be a good patch to commit - It
> could enable easy debugging/performance testing for the masses.
>
> I'm curious to hear how your environment is working with a tso turned off
> on your nics.

This will take some more time.
Only one of the affected systems is running the test kernel with printfs,
the additional sysctl, and Rick's patch. I want to be able to reproduce the
problem with that patch before changing another variable (like turning TSO
off), but that can take days on one server. I'll probably be able to equip
some more servers with that kernel soon, and might run a subgroup without
TSO. But first I have to make sure the new kernel doesn't add any new
problems; we can't afford them on production servers.

> My testbed just hit the 2 hour mark. With TSO off, I don't get a single
> packet over IP_MAXPACKET. That puts my confidence at around 95% in the
> statement 'turning off tso negates this issue for me'.
>
> I'm now rebooting into a +tso env to see if I can capture the bad packets.
>
> I am also sure that the netstat -m mbuf denied is a completely separate
> issue. I'm going around the lab and powering up different boxes with
> 10.0-RELEASE, and they all have mbuf/mbuf clusters denied on boot, and that
> number increases with network traffic. It's probably not helping the
> IP_MAXPACKET issue.

While we have most symptoms in common, I've still not seen any allocation
error in netstat -m. So I tend to agree that this is most probably a
different problem.


Markus


> I'll create a separate thread for that one shortly.
>
>
> On Mon, Mar 24, 2014 at 1:14 PM, Markus Gebert wrote:
>
>>
>> On 24.03.2014, at 16:21, Christopher Forgeron wrote:
>>
>>> This is regarding the TSO patch that Rick suggested earlier. (With many
>>> thanks for his time and suggestion)
>>>
>>> As I mentioned earlier, it did not fix the issue on a 10.0 system. It did
>>> make it less of a problem on 9.2, but either way, I think it's not needed,
>>> and shouldn't be considered as a patch for testing/etc.
>>>
>>> Patching TSO to anything other than a max value (and by default the code
>>> gives it IP_MAXPACKET) is confusing the matter, as the packet length
>>> ultimately needs to be adjusted for many things on the fly like TCP
>>> Options, etc. Using static header sizes won't be a good idea.
>>>
>>> Additionally, it seems that setting nic TSO will/may be ignored by code
>>> like this in sys/netinet/tcp_output.c:
>>>
>>> 10.0 Code:
>>>
>>>         if (len > tp->t_tsomax - hdrlen) {
>>>                 len = tp->t_tsomax - hdrlen;
>>>                 sendalot = 1;
>>>         }
>>>
>>>
>>> I've put debugging here, set the nic's max TSO as per Rick's patch (set to
>>> say 32k), and have seen that tp->t_tsomax == IP_MAXPACKET. It's being set
>>> someplace else, and thus our attempts to set TSO on the nic may be in vain.
>>>
>>> It may have mattered more in 9.2, as I see the code doesn't use
>>> tp->t_tsomax in some locations, and may actually default to what the nic is
>>> set to.
>>>
>>> The NIC may still win, I didn't walk through the code to confirm, it was
>>> enough to suggest to me that setting TSO wouldn't fix this issue.
>>
>>
>> I just applied Rick's ixgbe TSO patch and additionally wanted to be able
>> to easily change the value of hw_tsomax, so I made a sysctl out of it.
>>
>> While doing that, I asked myself the same question. Where and how will
>> this value actually be used and how comes that tcp_output() uses that other
>> value in struct tcpcb.
>>
>> The only place tcpcb->t_tsomax gets set, that I have found so far, is in
>> tcp_input.c's tcp_mss() function.
>> Some subfunctions get called:
>>
>> tcp_mss() -> tcp_mss_update() -> tcp_maxmtu()
>>
>> Then tcp_maxmtu() indeed uses the interface's hw_tsomax value:
>>
>>         cap->tsomax = ifp->if_hw_tsomax;
>>
>> It get's passed back to tcp_mss() where it is set on the connection level
>> which will be used in tcp_output() later on.
>>
>> tcp_mss() gets called from multiple places, I'll look into that later. I
>> will let you know if I find out more.
>>
>>
>> Markus
>>
>>
>>> However, this is still a TSO related issue, it's just not one related to
>>> the setting of TSO's max size.
>>>
>>> A 10.0-STABLE system with tso disabled on ix0 doesn't have a single packet
>>> over IP_MAXPACKET in 1 hour of runtime. I'll let it go a bit longer to
>>> increase confidence in this assertion, but I don't want to waste time on
>>> this when I could be logging problem packets on a system with TSO enabled.
>>>
>>> Comments are very welcome..
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>>
>>
>>
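
The path discussed in the quoted thread (the interface limit is copied into a
per-connection value, which tcp_output() then uses to clamp the send length)
can be sketched as a small userland model. This is only an illustration under
stated assumptions, not kernel code: the *_model structs and model_*
functions are hypothetical stand-ins that keep just the fields named in the
thread, the real tcp_mss()/tcp_maxmtu() do considerably more, and the
assert() guard is an addition of this sketch.

```c
#include <assert.h>
#include <stdio.h>

#define IP_MAXPACKET 65535      /* maximum IPv4 packet size */

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct ifnet_model { unsigned int if_hw_tsomax; };  /* per-interface limit  */
struct cap_model   { unsigned int tsomax; };        /* capabilities scratch */
struct tcpcb_model { unsigned int t_tsomax; };      /* per-connection limit */

/* Models the assignment quoted from tcp_maxmtu(). */
void
model_maxmtu(const struct ifnet_model *ifp, struct cap_model *cap)
{
        cap->tsomax = ifp->if_hw_tsomax;
}

/*
 * Models tcp_mss() storing the interface value on the connection.
 * Simplification: the value is copied unconditionally here.
 */
void
model_mss(struct tcpcb_model *tp, const struct ifnet_model *ifp)
{
        struct cap_model cap = { 0 };

        model_maxmtu(ifp, &cap);
        tp->t_tsomax = cap.tsomax;
}

/*
 * Models the clamp quoted from tcp_output(): a send larger than
 * t_tsomax - hdrlen is cut down and sendalot is raised so that the
 * caller loops to send the remainder.
 */
unsigned int
model_clamp(const struct tcpcb_model *tp, unsigned int len,
    unsigned int hdrlen, int *sendalot)
{
        /* Guard added for this sketch: avoid unsigned wrap-around. */
        assert(hdrlen <= tp->t_tsomax);

        if (len > tp->t_tsomax - hdrlen) {
                len = tp->t_tsomax - hdrlen;
                *sendalot = 1;
        }
        return (len);
}
```

In this model, with t_tsomax left at IP_MAXPACKET and a hypothetical hdrlen
of 52, a 100000-byte request is clamped to 65483 bytes. Once a 32768-byte
interface limit has been propagated through model_mss(), the same request is
clamped to 32716 bytes. That is the difference one would look for when
checking whether a NIC-level hw_tsomax setting actually reaches the
connection.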