From: "Youssef GHORBAL"
To: "sthaug@nethelp.no"
CC: "matt.joras@gmail.com", "freebsd-net@freebsd.org", "nparhar@gmail.com"
Subject: Re: Sporadic TCP/RST sent to client
Date: Tue, 27 Jun 2017 12:05:06 +0000
List-Id: Networking and TCP/IP with FreeBSD

> On 27 Jun 2017, at 12:54, sthaug@nethelp.no wrote:
>
>> Imagine this setup:
>>
>> freebsd host port0 <-> switch 1 <-> linux host port0
>> freebsd host port1 <-> switch 2 <-> linux host port1
>>
>> On the Linux box, ports 0 and 1 are enslaved in a bond with the RR (round-robin) algorithm.
>> On the FreeBSD box, ports 0 and 1 are enslaved in a lagg.
>>
>> Switches 1 and 2 are configured for MLAG.
>>
>> The Linux box dispatches packets on both NICs (since the RR algorithm dictates that); packets are dispatched in order.
>> Packets outgoing on port0 get handled by switch1 and hit the FreeBSD box on port 0.
>> Packets outgoing on port1 get handled by switch2 and hit the FreeBSD box on port 1.
>>
>> As I stated earlier, from the tcpdump traces I've done on the FreeBSD box (both on the lagg interface and the actual ports), packets do arrive ordered but on different NICs, and it works great until the elapsed times start to be around a microsecond.
>>
>> I don't really have control over the Linux box to make it use another hash algorithm (but I'm still trying).
>
> If the Linux box is using round robin you shouldn't expect to be able
> to "fix" the problem at the FreeBSD end.

There is nothing in 802.3ad that mandates stickiness of flows per NIC; the only explicit requirement is that the hash algorithm needs to maintain packet order. In this case, strictly speaking, packets do leave in "order" and do arrive in "order".

> On routers and switches (which is what I normally work with) the hash
> algorithm used for LAG connections ensures that one "flow" always uses
> the same path, thus no reordering. A typical hash algorithm uses a
> 5-tuple with (src ip, src port, dst ip, dst port, protocol) as input.
>
> So the advice in this case is simple - don't use round robin! Yes, I
> understand you don't control the Linux box.

Sure, I was just wondering whether or not the FreeBSD network stack was designed with the assumption that each flow needs to arrive on the same NIC.

I reported it here thinking that maybe it's a subtle buggy corner case that the community would be interested to know about and maybe fix:

- If the stack is working as expected and was built with the assumption that each incoming flow needs to stick to a NIC during its lifetime, maybe the documentation needs to be more explicit regarding this situation. In that case I'll file a documentation enhancement bug report.
- If the stack is misbehaving, maybe I can help the community identify the root cause and fix it.

Youssef
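
For readers following the thread, the difference between round-robin distribution and the 5-tuple flow hash mentioned in the quoted reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the CRC32 hash and the example addresses are hypothetical and are not what the Linux bonding driver, FreeBSD lagg or any particular switch actually implements.

```python
import zlib
from itertools import count


def hash_port(flow, n_ports):
    """Pick an egress port from the flow 5-tuple.

    CRC32 is an assumption made for this sketch; real devices use
    their own hash functions. The point is only that the result
    depends on the flow, not on how many packets were sent before.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % n_ports


def make_round_robin(n_ports):
    """Return a picker that ignores the flow and cycles over the ports."""
    counter = count()

    def pick(_flow):
        return next(counter) % n_ports

    return pick


# Hypothetical flow: (src ip, src port, dst ip, dst port, protocol)
flow = ("192.0.2.10", 40000, "192.0.2.20", 443, "tcp")

rr = make_round_robin(2)
rr_ports = [rr(flow) for _ in range(6)]               # alternates between ports
hash_ports = [hash_port(flow, 2) for _ in range(6)]   # same port every time

print("round robin:", rr_ports)
print("flow hash:  ", hash_ports)
```

With round robin, consecutive packets of the same TCP connection leave on alternating ports, so the receiver sees one flow spread over both NICs. With the flow hash, every packet of that connection uses a single port, which is the behaviour the quoted reply describes for routers and switches.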