Date:      Tue, 25 Mar 2014 17:46:00 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Markus Gebert <markus.gebert@hostpoint.ch>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, Garrett Wollman <wollman@freebsd.org>, Jack Vogel <jfvogel@gmail.com>, Christopher Forgeron <csforgeron@gmail.com>
Subject:   Re: 9.2 ixgbe tx queue hang
Message-ID:  <2042344654.506796.1395783960030.JavaMail.root@uoguelph.ca>
In-Reply-To: <906D7DF8-DD6E-4501-B3ED-42EF728241F4@hostpoint.ch>

Markus Gebert wrote:
>
> On 25.03.2014, at 02:18, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> > Christopher Forgeron wrote:
> >>
> >> This is regarding the TSO patch that Rick suggested earlier. (With
> >> many thanks for his time and suggestion)
> >>
> >> As I mentioned earlier, it did not fix the issue on a 10.0 system.
> >> It did make it less of a problem on 9.2, but either way, I think
> >> it's not needed, and shouldn't be considered as a patch for
> >> testing/etc.
> >>
> >> Patching TSO to anything other than a max value (and by default
> >> the code gives it IP_MAXPACKET) is confusing the matter, as the
> >> packet length ultimately needs to be adjusted for many things on
> >> the fly like TCP Options, etc. Using static header sizes won't be
> >> a good idea.
> >>
> > If you look at tcp_output(), you'll notice that it doesn't do TSO
> > if there are any options. That way it knows that the TCP/IP header
> > is just hdrlen.
> >
> > If you don't limit the TSO packet (including TCP/IP and ethernet
> > headers) to 64K, then the "ix" driver can't send them, which is the
> > problem you guys are seeing.
> >
> > There are other ways to fix this problem, but they all may
> > introduce issues that reducing if_hw_tsomax by a small amount does
> > not. For example, m_defrag() could be modified to use 4K pagesize
> > clusters, but this might introduce memory fragmentation problems.
> > (I observed what I think are memory fragmentation problems when I
> > switched NFS to use 4K pagesize clusters for large I/O messages.)
> >
> > If setting IP_MAXPACKET to 65518 fixes the problem (no more EFBIG
> > error replies), then that is the size that if_hw_tsomax can be set
> > to. (IP_MAXPACKET itself can't be changed, since it is defined for
> > other things. It just happens that IP_MAXPACKET is what
> > if_hw_tsomax defaults to; it has no other effect w.r.t. TSO.)
> >
> >>
> >> Additionally, it seems that setting nic TSO will/may be ignored by
> >> code like this in sys/netinet/tcp_output.c:
> >>
>
> Is this confirmed or still an ‘it seems’? Have you actually seen a
> tp->t_tsomax value in tcp_output() bigger than if_hw_tsomax or was
> this just speculation because the values are stored in different
> places? (Sorry if you already stated this in another email, it’s
> currently hard to keep track of all the information.)
>
> Anyway, this dtrace one-liner should be a good test if other values
> appear in tp->t_tsomax:
>
> # dtrace -n 'fbt::tcp_output:entry / args[0]->t_tsomax != 0 &&
> args[0]->t_tsomax != 65518 / { printf("unexpected tp->t_tsomax:
> %i\n", args[0]->t_tsomax); stack(); }'
>
> Remember to adjust the value in the condition to whatever you’re
> currently expecting. The value seems to be 0 for new connections,
> probably when tcp_mss() has not been called yet. So that seems
> normal and I have excluded that case too. This will also print a
> kernel stack trace in case it sees an unexpected value.
>
> > Yes, but I don't know why.
> > The only conjecture I can come up with is that another net driver
> > is stacked above "ix" and the setting for if_hw_tsomax doesn't
> > propagate up. (If you look at the commit log message for r251296,
> > the intent of adding if_hw_tsomax was to allow device drivers to
> > set a smaller tsomax than IP_MAXPACKET.)
> >
> > Are you using any of the "stacked" network device drivers like
> > lagg? I don't even know what the others all are.
> > Maybe someone else can list them?
>
> I guess the most obvious are lagg and vlan (and probably carp on
> FreeBSD 9.x or older).
>
> On request from Jack, we’ve eliminated lagg and vlan from the
> picture, which gives us plain ixgbe interfaces with no stacked
> interfaces on top of them. And we can still reproduce the problem.
>
This was related to the "did if_hw_tsomax set tp->t_tsomax to the
same value?" question. Since you reported that my patch that set
if_hw_tsomax in the driver didn't fix the problem, that suggests
that tp->t_tsomax isn't being set to if_hw_tsomax from the driver,
but we don't know why.
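
For reference, a minimal sketch of the arithmetic and the clamp discussed
in this thread. It is not taken from the kernel or the driver. It assumes
that the 64K limit means 65536 bytes and that the ethernet header carries
a VLAN tag (14 + 4 bytes), which is one way to arrive at the 65518 figure
mentioned above. All SKETCH_* and sketch_* names are hypothetical; only
the shape of the clamp follows the tcp_output() lines quoted below.

```c
#include <assert.h>

/* Assumed sizes: plain ethernet header plus one VLAN encapsulation. */
#define SKETCH_ETHER_HDR_LEN		14
#define SKETCH_ETHER_VLAN_ENCAP_LEN	4
/* Assumed transmit limit: 64K, ethernet header included. */
#define SKETCH_TX_LIMIT			65536

/*
 * Largest TSO size (TCP/IP headers plus payload) that still fits under
 * the assumed limit once the ethernet header is prepended.
 */
static long
sketch_tsomax(void)
{
	return (SKETCH_TX_LIMIT -
	    (SKETCH_ETHER_HDR_LEN + SKETCH_ETHER_VLAN_ENCAP_LEN));
}

/*
 * Same shape as the quoted tcp_output() clamp: cap the payload length so
 * that payload plus TCP/IP header (hdrlen) does not exceed tsomax, and
 * flag that more data remains to be sent.
 */
static long
sketch_clamp_len(long len, long tsomax, long hdrlen, int *sendalot)
{
	assert(hdrlen >= 0 && hdrlen <= tsomax);
	if (len > tsomax - hdrlen) {
		len = tsomax - hdrlen;
		*sendalot = 1;
	}
	return (len);
}
```

With these assumptions, a 70000 byte send with a 40 byte TCP/IP header is
cut to sketch_tsomax() - 40 bytes of payload and sendalot is set. The
clamp only helps if the tsomax it is given is the reduced value, which is
the open question here.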

rick

>
> Markus
>
> >
> > rick
> >>
> >> 10.0 Code:
> >>
> >> if (len > tp->t_tsomax - hdrlen) {        /* !! */
> >>         len = tp->t_tsomax - hdrlen;      /* !! */
> >>         sendalot = 1;
> >> }
> >>
> >> I've put debugging here, set the nic's max TSO as per Rick's patch
> >> (set to say 32k), and have seen that tp->t_tsomax == IP_MAXPACKET.
> >> It's being set someplace else, and thus our attempts to set TSO on
> >> the nic may be in vain.
> >>
> >> It may have mattered more in 9.2, as I see the code doesn't use
> >> tp->t_tsomax in some locations, and may actually default to what
> >> the nic is set to.
> >>
> >> The NIC may still win; I didn't walk through the code to confirm,
> >> but it was enough to suggest to me that setting TSO wouldn't fix
> >> this issue.
> >>
> >> However, this is still a TSO related issue; it's just not one
> >> related to the setting of TSO's max size.
> >>
> >> A 10.0-STABLE system with tso disabled on ix0 doesn't have a
> >> single packet over IP_MAXPACKET in 1 hour of runtime. I'll let it
> >> go a bit longer to increase confidence in this assertion, but I
> >> don't want to waste time on this when I could be logging problem
> >> packets on a system with TSO enabled.
> >>
> >> Comments are very welcome.
> >>


