From owner-freebsd-net@FreeBSD.ORG Wed Aug 14 16:05:12 2013
Date: Thu, 15 Aug 2013 00:05:00 +0800
From: Julian Elischer <julian@freebsd.org>
To: Lawrence Stewart
Cc: FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: TSO and FreeBSD vs Linux
Message-ID: <520BAAAC.8070707@freebsd.org>
In-Reply-To: <520B24A0.4000706@freebsd.org>

On 8/14/13 2:33 PM, Julian Elischer wrote:
> On 8/14/13 11:39 AM, Lawrence Stewart wrote:
>> There's a thing controlled by ethtool called GRO (generic receive
>> offload) which appears to be enabled by default on at least Ubuntu,
>> and I guess other Linuxes too. It's responsible for aggregating ACKs
>> and data to batch them up the stack if the driver doesn't provide a
>> hardware offload implementation. Try rerunning your experiments with
>> the ACK batching disabled on the Linux host to get an additional
>> comparison point.
> I will try that as soon as I get back to the machines in question.

Turning GRO on and off seems to make no difference, either in overall
throughput or at the low-level, packet-by-packet view (according to
tcptrace).

>>> For two examples look at:
>>>
>>> http://www.freebsd.org/~julian/LvsF-tcp-start.tiff
>>> and
>>> http://www.freebsd.org/~julian/LvsF-tcp.tiff
>>>
>>> In each case, FreeBSD is on the left and Linux is on the right.
>>>
>>> The first trace shows the sessions as they start, and the second
>>> shows some distance later (when the sequence numbers wrap around..
>>> no particular reason to use that point, it was just fun to see).
>>> In both cases you can see that each Linux packet (white), once they
>>> have got going, is responding to multiple bumps in the send window
>>> sequence number (green and yellow lines, representing the arrival
>>> of several ACKs), while FreeBSD produces a whole bunch of smaller
>>> packets, slavishly following exactly the size of each incoming ACK.
>>> This gives us quite a performance debt.
>> Again, please s/performance/what-you-really-mean/ here.
> OK, in my tests this makes FreeBSD data transfers much slower, by as
> much as 60%.
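To make the ACK-batching idea concrete, here is a minimal sketch of what
GRO- or software-LRO-style coalescing of pure ACKs amounts to: several
back-to-back ACKs that only advance the cumulative ACK point are folded
into one, together with a count of how many were merged. This is purely
illustrative; the structures and names are hypothetical and are not taken
from the Linux GRO or FreeBSD LRO code, and real GRO also merges data
segments, which is ignored here.

/*
 * Minimal sketch of GRO/LRO-style coalescing of pure ACKs.
 * Hypothetical structures and names; not the Linux GRO or
 * FreeBSD LRO code.
 */
#include <stdint.h>
#include <stdio.h>

struct ack {
	uint32_t th_ack;	/* cumulative ACK number */
	uint16_t th_win;	/* advertised receive window */
};

struct coalesced_ack {
	struct ack ack;		/* newest ACK folded into the batch */
	int nmerged;		/* how many ACKs the batch represents */
};

/*
 * Fold a newly arrived pure ACK into the pending batch.  Only ACKs
 * that advance (or repeat) the cumulative ACK point are merged; a real
 * implementation would flush the batch for anything else (data, SACK,
 * window updates, timestamps that must be preserved, ...).
 */
static int
coalesce_ack(struct coalesced_ack *pend, const struct ack *in)
{
	if (pend->nmerged == 0 ||
	    (int32_t)(in->th_ack - pend->ack.th_ack) >= 0) {
		pend->ack = *in;	/* keep the newest cumulative ACK */
		pend->nmerged++;
		return (1);		/* merged; nothing delivered yet */
	}
	return (0);			/* would be flushed separately */
}

int
main(void)
{
	/* Four back-to-back ACKs, each covering one 1448-byte segment. */
	struct ack acks[] = {
		{ 1449, 512 }, { 2897, 512 }, { 4345, 512 }, { 5793, 512 }
	};
	struct coalesced_ack pend = { { 0, 0 }, 0 };
	size_t i;

	for (i = 0; i < sizeof(acks) / sizeof(acks[0]); i++)
		coalesce_ack(&pend, &acks[i]);

	/* One ACK goes up the stack instead of four. */
	printf("deliver ack=%u (merged %d ACKs)\n",
	    (unsigned int)pend.ack.th_ack, pend.nmerged);
	return (0);
}

Whether delivering that single aggregated ACK actually helps then depends
on how the TCP stack credits it, which is the abc_l_var issue discussed
further down.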
>>> Notice that this behaviour in Linux seems to be modal.. it seems
>>> to 'switch on' a little bit into the 'starting' trace.
>>>
>>> In addition, you can also see that Linux gets going faster even at
>>> the beginning, where TSO isn't yet in play, by sending a lot more
>>> packets up-front (of course the wisdom of this can be argued).
>> They switched to using an initial window of 10 segments some time
>> ago. FreeBSD starts with 3, or more recently 10 if you're running a
>> recent 9-STABLE or 10-CURRENT.
> I tried setting the initial values as shown:
> net.inet.tcp.local_slowstart_flightsize: 10
> net.inet.tcp.slowstart_flightsize: 10
> It didn't seem to make much difference, but I will redo the test.
>
>>> Has anyone done any work on aggregating ACKs, or delaying
>>> responding to them?
>> As noted by Navdeep, we already have the code to aggregate ACKs in
>> our software LRO implementation. The bigger problem is that
>> appropriate byte counting places a default 2*MSS limit on the amount
>> of ACKed data the window can grow by, i.e. if an ACK for 64k of data
>> comes up the stack, we'll grow the window by 2 segments' worth of
>> data in response. That needs to be addressed - we could send the ACK
>> count up with the aggregated single ACK, or just ignore abc_l_var
>> when LRO is in use for a connection.
> So, does "Software LRO" mean that LRO on the NIC should be ON or OFF
> to see this?
>
>> Cheers,
>> Lawrence
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
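The abc_l_var interaction Lawrence describes is easy to quantify. The toy
program below compares how much the congestion window is credited in slow
start when 64 kB of data is acknowledged by one aggregated ACK versus
segment by segment, assuming an MSS of 1448 bytes and the default
abc_l_var of 2. It is purely illustrative; the function name and constants
are mine, not the actual FreeBSD tcp_input() code.

/*
 * Toy calculation of the Appropriate Byte Counting (ABC) cap discussed
 * above.  Purely illustrative; not the FreeBSD tcp_input() code.
 * MSS of 1448 is an assumption for the example.
 */
#include <stdio.h>

#define MSS		1448	/* segment size used in the example */
#define ABC_L_VAR	2	/* default net.inet.tcp.abc_l_var */

/* Slow-start cwnd increase credited for one incoming ACK under ABC. */
static unsigned int
ss_cwnd_incr(unsigned int bytes_acked)
{
	unsigned int cap = ABC_L_VAR * MSS;

	return (bytes_acked < cap ? bytes_acked : cap);
}

int
main(void)
{
	unsigned int acked = 64 * 1024;		/* one aggregated ACK */
	unsigned int remaining = acked;
	unsigned int incr_individual = 0;

	/* The same 64 kB acknowledged one segment at a time. */
	while (remaining > 0) {
		unsigned int chunk = remaining < MSS ? remaining : MSS;

		incr_individual += ss_cwnd_incr(chunk);
		remaining -= chunk;
	}

	printf("one aggregated ACK:  cwnd grows by %u bytes\n",
	    ss_cwnd_incr(acked));
	printf("segment-by-segment:  cwnd grows by %u bytes\n",
	    incr_individual);
	return (0);
}

With those assumptions the aggregated ACK is credited 2 * 1448 = 2896
bytes, while the same data ACKed per segment is credited the full 64 kB,
which is the gap the proposed ACK-count or abc_l_var bypass would close.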