Date: Tue, 20 Sep 2016 11:11:35 +0300
From: Slawa Olhovchenkov <slw@zxy.spb.ru>
To: Lyndon Nerenberg
Cc: FreeBSD Stable <freebsd-stable@freebsd.org>
Subject: Re: LAGG and Jumbo Frames
Message-ID: <20160920081135.GH2960@zxy.spb.ru>
In-Reply-To: <42A03EA9-7F8E-446E-B430-7431AB9CE2E6@orthanc.ca>

On Mon, Sep 19, 2016 at 03:59:20PM -0700, Lyndon Nerenberg wrote:

> > On Sep 19, 2016, at 3:08 PM, Slawa Olhovchenkov wrote:
> >
> > This is because RTT of this link for jumbo frames higher 1500 bytes
> > frame for store-and-forward switch chain.
>
> For TCP, RTT isn't really a factor (in this scenario),

I don't see that scenario in the first message.
For my scenario this is the limiting factor.

> as the windowing and congestion avoidance algorithms will adapt to the
> actual bandwidth-delay product of the link, and the delays in each
> direction will be symmetrical.
>
> Now the ack for a single 9000 octet packet will take longer than
> that for a 1500 octet one, but that's because you're sending six
> times as many octets before the ACK can be generated.  The time to
> send six 1500 octet packets and receive the ACK from the sixth packet
> is going to be comparable to that of receiving the ack from a single
> 9000 octet packet.  It's simple arithmetic to calculate the extra
> protocol header overhead for 6x1500 vs 1x9000.

The time to send six 1500 octet packets is significantly less than the
time to send one 9000 octet packet across multiple switches:

H1-[S1]-[S2]-[S3]-H2

Sending a single 1500 octet packet from H1 to S1 over a 1 Gbit link:
(1500+14+4+12+8)*8/10^9 = 12 us
The switch adds a 3 us forwarding delay.
The same holds for S1-S2, S2-S3 and S3-H2.
The 2nd packet trails the 1st by 12 us; packets 3..6 do the same.
Sending all six packets (5 trailing packets, 4 hops):
(12+3)*4 + 12*5 = 120 us.

Sending a single 9000 octet packet from H1 to S1 over a 1 Gbit link:
(9000+14+4+12+8)*8/10^9 = 72 us
The switch adds a 3 us forwarding delay.
Sending a single 9000 octet packet over 4 hops: (72+3)*4 = 300 us.

300/120 = 2.5 times slower.
(A small script at the end of this message reproduces this arithmetic.)

> If there *is* a significant difference (beyond the extra protocol
> header overhead), it's time to take a very close look at the NICs you
> are using in the end hosts.
> A statistically significant difference would hint at poor interrupt
> handling performance on the part of one or more of the NICs and their
> associated device drivers.
>
> The intermediate switch overhead will be a constant (unless the switch
> backplane becomes saturated from unrelated traffic).

You are overlooking the serialisation time.
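
For anyone who wants to play with the numbers, here is a small Python
sketch of the arithmetic above. It is purely illustrative and not part
of the original exchange; the 1 Gbit/s links, the 3 us per-hop
forwarding delay and the 14+4+12+8 bytes of per-frame overhead are the
same assumptions used in the figures above.

    #!/usr/bin/env python
    # Illustrative sketch only (not from the thread): store-and-forward
    # transit time for N back-to-back frames over a chain of 1 Gbit/s
    # links.  The 3 us per-hop delay and the per-frame overhead are the
    # assumptions used in the figures above.

    LINK_BPS = 10**9              # link speed, bits per second
    HOP_DELAY = 3e-6              # assumed per-hop forwarding delay, seconds
    OVERHEAD = 14 + 4 + 12 + 8    # header + FCS + inter-frame gap + preamble, bytes

    def wire_time(payload):
        # Serialisation time of one frame on a single link.
        return (payload + OVERHEAD) * 8.0 / LINK_BPS

    def transit(payload, nframes, hops):
        # The first frame is stored and forwarded at every hop; the
        # remaining frames pipeline behind it.
        return (wire_time(payload) + HOP_DELAY) * hops \
               + wire_time(payload) * (nframes - 1)

    t_std   = transit(1500, 6, 4)   # six standard frames, H1-S1-S2-S3-H2
    t_jumbo = transit(9000, 1, 4)   # one jumbo frame over the same path
    print("6 x 1500: %.0f us" % (t_std * 1e6))      # ~120 us
    print("1 x 9000: %.0f us" % (t_jumbo * 1e6))    # ~300 us
    print("ratio:    %.1f"    % (t_jumbo / t_std))  # ~2.5

The point is the same as in the calculation: with store-and-forward
switches the jumbo frame pays its full serialisation time again at
every hop, which is where the ~2.5x comes from.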