Date:      Wed, 4 Sep 2013 15:03:12 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        FreeBSD Net <net@freebsd.org>, David Wolfskill <david@catwhisker.org>
Subject:   Re: TSO and FreeBSD vs Linux
Message-ID:  <CAJ-Vmomctb9WFYkt89-WjX=_DzktNgty96ANaOAVzWZfvYXK-Q@mail.gmail.com>
In-Reply-To: <979862494.17918795.1378299005617.JavaMail.root@uoguelph.ca>
References:  <20130903192734.GA19406@albert.catwhisker.org> <979862494.17918795.1378299005617.JavaMail.root@uoguelph.ca>

Hiya,

David - can you put together a minimal test case that others can reproduce?
I have a bunch of GigE Intel NICs that I can try this with when I'm back in
the office.

Thanks,



-adrian


On 4 September 2013 05:50, Rick Macklem <rmacklem@uoguelph.ca> wrote:

> David Wolfskill wrote:
> > On Wed, Aug 21, 2013 at 07:12:38PM +0200, Andre Oppermann wrote:
> > > On 13.08.2013 19:29, Julian Elischer wrote:
> > > > I have been tracking down a performance embarrassment on AMAZON
> > > > EC2 and have found it I think.
> > > > Our OS cousins over at Linux land have implemented some
> > > > interesting behaviour when TSO is in use.
> > >
> > > There used to be a different problem with EC2 and FreeBSD TSO.  The
> > > Xen hypervisor doesn't like the large 64K TSO bursts we generate, the
> > > driver drops the whole TSO chain, TCP gets upset and turns off TSO
> > > altogether, leaving the connection going at one packet at a time as
> > > in the old days.
> > > ...
> >
> > My apologies for jumping in so late -- I'm not subscribed to -net@.
> >
> > At work, I received a new desktop machine a few months ago; here's a
> > recent history of what it has been running:
> >
> > FreeBSD 9.2-PRERELEASE #4  r254801M/254827:902501: Sun Aug 25 05:15:29 PDT 2013     root@dwolf-fbsd:/usr/obj/usr/src/sys/DWOLF  amd64
> > FreeBSD 9.2-PRERELEASE #5  r255066M/255091:902503: Sat Aug 31 11:58:53 PDT 2013     root@dwolf-fbsd:/usr/obj/usr/src/sys/DWOLF  amd64
> > FreeBSD 9.2-PRERELEASE #5  r255104M/255115:902503: Sun Sep  1 05:02:12 PDT 2013     root@dwolf-fbsd:/usr/obj/usr/src/sys/DWOLF  amd64
> >
> > Now, I like to have a "private playground" for doing things with
> > machines, so I make use of both em(4) NICs on the machine: em0 connects
> > to the rest of the work network; em1 is connected to a switch I brought
> > in from home, and to which I connect "other things" (such as my laptop).
> > And because I'm fairly comfortable with them, I use IPFW & natd.  For
> > some folks here, none of that should come as a surprise. :-})
> >
> > For reference, the em(4) devices in question are:
> >
> > em0@pci0:0:25:0:        class=0x020000 card=0x060d15d9 chip=0x10ef8086 rev=0x06 hdr=0x00
> >     vendor     = 'Intel Corporation'
> >     device     = '82578DM Gigabit Network Connection'
> >
> > and
> >
> > em1@pci0:3:0:0: class=0x020000 card=0x060d15d9 chip=0x10d38086 rev=0x00 hdr=0x00
> >     vendor     = 'Intel Corporation'
> >     device     = '82574L Gigabit Network Connection'
> >
> >
> >
> > I noticed that when I tried to write files to NFS, I could write small
> > files OK, but larger ones seemed to ... hang.
> >
> > Note: We don't use jumbo frames.  (Work IT is convinced that they
> > don't help.  I'm trying to better understand their reasoning.)
> >
> > Further poking around showed that (under the above conditions):
> > * natd CPU% was climbing as more of the file was copied, up to 2^21
> >   bytes.  (At that point, nothing further was saved on NFS.)
> > * dhcpd CPU% was also climbing.  I tried killing that, but doing so
> >   didn't affect the other results.  (Killing natd made connectivity
> >   cease, given the IPFW rules in effect.)
> > * Performing a tcpdump while trying to copy a file of length 117709618
> >   bytes showed lots of TCP retransmissions.  In fact, I'd hazard that
> >   every TCP packet was getting retransmitted.
> > * "ifconfig -v em0" showed flags TSO4 & VLAN_HWTSO turned on.
> > * "sysctl net.inet.tcp.tso" showed "1" -- enabled.
> >
> > As soon as I issued "sudo sysctl net.inet.tcp.tso=0" ... the copy worked
> > without a hitch or a whine.  And I was able to copy all 117709618 bytes,
> > not just 2097152 (2^21).
> >
> > Is the above expected?  It came rather as a surprise to me.
> >
> Not surprising to me, I'm afraid. When there are serious NFS problems
> like this, they are often caused by a network fabric issue, and broken
> TSO is at the top of the list of likely causes.
>
> rick
>
> > Peace,
> > david
> > --
> > David H. Wolfskill                            david@catwhisker.org
> > Taliban: Evil cowards with guns afraid of truth from a 14-year old
> > girl.
> >
> > See http://www.catwhisker.org/~david/publickey.gpg for my public key.
> >
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>


