Date: Thu, 15 Aug 2013 00:05:00 +0800
From: Julian Elischer <julian@freebsd.org>
To: Lawrence Stewart <lstewart@freebsd.org>
Cc: FreeBSD Net <net@freebsd.org>
Subject: Re: TSO and FreeBSD vs Linux
Message-ID: <520BAAAC.8070707@freebsd.org>
In-Reply-To: <520B24A0.4000706@freebsd.org>
References: <520A6D07.5080106@freebsd.org> <520AFBE8.1090109@freebsd.org> <520B24A0.4000706@freebsd.org>
On 8/14/13 2:33 PM, Julian Elischer wrote:
> On 8/14/13 11:39 AM, Lawrence Stewart wrote:
>> There's a thing controlled by ethtool called GRO (generic receive
>> offload) which appears to be enabled by default on at least Ubuntu,
>> and I guess other Linuxes too. It's responsible for aggregating ACKs
>> and data to batch them up the stack if the driver doesn't provide a
>> hardware offload implementation. Try rerunning your experiments with
>> the ACK batching disabled on the Linux host to get an additional
>> comparison point.
> I will try that as soon as I get back to the machines in question.

Turning GRO on and off seems to make no difference, either to the
overall throughput or at the low-level, packet-by-packet view
(according to tcptrace).

>>> for two examples look at:
>>>
>>> http://www.freebsd.org/~julian/LvsF-tcp-start.tiff
>>> and
>>> http://www.freebsd.org/~julian/LvsF-tcp.tiff
>>>
>>> in each case, we can see FreeBSD on the left and Linux on the right.
>>>
>>> The first case shows the sessions as they start, and the second
>>> shows some distance later (when the sequence numbers wrap around..
>>> no particular reason to use that, it was just fun to see).
>>> In both cases you can see that each Linux packet (white) (once they
>>> have got going) is responding to multiple bumps in the send window
>>> sequence number (green and yellow lines, representing the arrival
>>> of several ACKs), while FreeBSD produces a whole bunch of smaller
>>> packets, slavishly following exactly the size of each incoming ACK.
>>> This gives us quite a performance debt.
>> Again, please s/performance/what-you-really-mean/ here.
> ok, in my tests this makes FreeBSD data transfers much slower, by as
> much as 60%.
>>
>>> Notice that this behaviour in Linux seems to be modal.. it seems to
>>> 'switch on' a little bit into the 'starting' trace.
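[For anyone repeating the comparison: the GRO toggle Lawrence refers to
is flipped with ethtool on the Linux receiver. A sketch; the interface
name eth0 is an assumption, substitute your own:]

```shell
# Check the current offload settings on the Linux receiver
# (eth0 is an assumed interface name; adjust to taste).
ethtool -k eth0 | grep -E 'generic-receive-offload|large-receive-offload'

# Disable GRO for the comparison run, then re-enable it afterwards.
ethtool -K eth0 gro off
# ... rerun the transfer and capture with tcpdump/tcptrace here ...
ethtool -K eth0 gro on
```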
>>>
>>> In addition, you can see also that Linux gets going faster even in
>>> the beginning, where TSO isn't in play, by sending a lot more
>>> packets up-front. (of course the wisdom of this can be argued).
>> They switched to using an initial window of 10 segments some time
>> ago. FreeBSD starts with 3 or, more recently, 10 if you're running
>> recent 9-STABLE or 10-CURRENT.
> I tried setting initial values as shown:
> net.inet.tcp.local_slowstart_flightsize: 10
> net.inet.tcp.slowstart_flightsize: 10
> it didn't seem to make too much difference but I will redo the test.
>
>>
>>> Has anyone done any work on aggregating ACKs, or delaying
>>> responding to them?
>> As noted by Navdeep, we already have the code to aggregate ACKs in
>> our software LRO implementation. The bigger problem is that
>> appropriate byte counting places a default 2*MSS limit on the amount
>> of ACKed data the window can grow by, i.e. if an ACK for 64k of data
>> comes up the stack, we'll grow the window by 2 segments' worth of
>> data in response. That needs to be addressed - we could send the ACK
>> count up with the aggregated single ACK, or just ignore abc_l_var
>> when LRO is in use for a connection.
> so, does "software LRO" mean that LRO on the NIC should be ON or OFF
> to see this?
>
>> Cheers,
>> Lawrence

_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
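[Lawrence's appropriate-byte-counting point can be illustrated with a
little arithmetic. A sketch, assuming the RFC 3465 cap of
abc_l_var * MSS growth per ACK during slow start and FreeBSD's default
net.inet.tcp.abc_l_var=2; the byte counts below are illustrative:]

```shell
# However many bytes a single (possibly LRO-aggregated) ACK covers,
# ABC lets cwnd grow by at most abc_l_var * MSS per ACK in slow start.
mss=1460          # typical Ethernet MSS
abc_l_var=2       # FreeBSD default (net.inet.tcp.abc_l_var)
acked=65536       # one aggregated ACK covering ~64k of data

limit=$((abc_l_var * mss))
growth=$(( acked < limit ? acked : limit ))
echo "ACKed $acked bytes, but cwnd grows by only $growth bytes"
```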