From owner-svn-src-all@freebsd.org  Thu Jan  4 23:37:02 2018
Return-Path: <owner-svn-src-all@freebsd.org>
Delivered-To: svn-src-all@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id EE8AEEC05E1
 for <svn-src-all@mailman.ysv.freebsd.org>;
 Thu,  4 Jan 2018 23:37:02 +0000 (UTC)
 (envelope-from steven@multiplay.co.uk)
Received: from mail-wm0-x230.google.com (mail-wm0-x230.google.com
 [IPv6:2a00:1450:400c:c09::230])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 9D7E167586
 for <svn-src-all@freebsd.org>; Thu,  4 Jan 2018 23:37:02 +0000 (UTC)
 (envelope-from steven@multiplay.co.uk)
Received: by mail-wm0-x230.google.com with SMTP id a79so6174150wma.0
 for <svn-src-all@freebsd.org>; Thu, 04 Jan 2018 15:37:02 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=multiplay-co-uk.20150623.gappssmtp.com; s=20150623;
 h=subject:to:cc:references:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-language;
 bh=3Y9cX4fbwMD5MAot46jll7a1shZUSTNQe+jFE3rqEA0=;
 b=RhczLdZVK6c0dJVOZ2cw6LrGIYNvyDDXIbm8lmrMmD9Qq1sOn6csRzV9UieqahmbF1
 FbdoW8T2PaNogcIP6x03SZu84SmIfvnFFkn8Zgx+KCJZxmt8Rjt1Y/bdESYzjPL32LVK
 uA9qRMVMBRIqdZAB/8k8kYO49dba3VaLy/w61mnoecqkepBE6BdD8JkZpkHur8G6hRcV
 qlt48CS+8Qzaycc0AjSJIgHnSaUjKX/eDYWCHdRHD+jisjlDpgK3ZtsJmDXjx/4RA223
 dyFLUHXZMkKSy2MDR1ObLYURvPbU/i+cl3R8oQTx6Lvpqcotw5A6DEMLGEWdNfh98+Mo
 oJmA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:cc:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-language;
 bh=3Y9cX4fbwMD5MAot46jll7a1shZUSTNQe+jFE3rqEA0=;
 b=QqKA7UxWFaeOOp8mf3TJ4Dz6B64DC2n50wd6mRWSZbehtdDbowGe4T32CLIQdcuEd1
 QPqUwuA5V6FsEN49QgPFZ/cIhD0UrSGp4trHsHbbb7ukrHyhJ8Xy1lVUOl5iA7cIgSlG
 LuBEsIdmR6oSFKCD6WH1Bx3887pNV9uJDtSyruiGOd3uG/3fk3F2IKg9DG8BNAmj4Eof
 aklmOj3+ONVoqxg2IL+pjKLGaIZL78GHTUAVda+1yaAcE7r22o5WJB7H0AF6yNDxPvXU
 k2SGh02H3ei4XjVFNwcNLNoKkD5X9stAB/WD9yJ974bf6WYXpGpC7xTHYDpW3BIa/GfC
 6N/Q==
X-Gm-Message-State: AKGB3mJgDc86GqvTD63B0uU83dNa+S/8SobynBxJ1aR8PrgG5GIHv6WW
 moJvet1Vr7zvIX9yUPc9aOiMgw==
X-Google-Smtp-Source: ACJfBouhpcyFG51BbPeplzGFlq4p8Y6DRtSo0KjEa2vOmdPLZCd1LNrkZ7vgBo95npIDp73NoWiOew==
X-Received: by 10.28.15.201 with SMTP id 192mr843627wmp.88.1515109020832;
 Thu, 04 Jan 2018 15:37:00 -0800 (PST)
Received: from [10.10.1.111] ([185.97.61.1])
 by smtp.gmail.com with ESMTPSA id 67sm5947674wmq.38.2018.01.04.15.36.59
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 04 Jan 2018 15:36:59 -0800 (PST)
Subject: Re: svn commit: r327559 - in head: . sys/net
To: hiren panchasara <hiren@strugglingcoder.info>
Cc: Eugene Grosbein <eugen@grosbein.net>, src-committers@freebsd.org,
 svn-src-all@freebsd.org, svn-src-head@freebsd.org
References: <201801042005.w04K5liB049411@repo.freebsd.org>
 <5A4E9397.9000308@grosbein.net>
 <f133b587-1f7e-4594-31d1-974775ad55be@freebsd.org>
 <20180104224214.GD18879@strugglingcoder.info>
From: Steven Hartland <steven@multiplay.co.uk>
Message-ID: <63c3c450-aeaf-bdd5-5e16-414146c9bb3a@multiplay.co.uk>
Date: Thu, 4 Jan 2018 23:37:00 +0000
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.5.2
MIME-Version: 1.0
In-Reply-To: <20180104224214.GD18879@strugglingcoder.info>
Content-Language: en-US
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.25
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for &quot;
 user&quot; and &quot; projects&quot; \)" <svn-src-all.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-all/>
List-Post: <mailto:svn-src-all@freebsd.org>
List-Help: <mailto:svn-src-all-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 04 Jan 2018 23:37:03 -0000



On 04/01/2018 22:42, hiren panchasara wrote:
> On 01/04/18 at 09:52P, Steven Hartland wrote:
>> On 04/01/2018 20:50, Eugene Grosbein wrote:
>>> 05.01.2018 3:05, Steven Hartland wrote:
>>>
>>>> Author: smh
>>>> Date: Thu Jan  4 20:05:47 2018
>>>> New Revision: 327559
>>>> URL: https://svnweb.freebsd.org/changeset/base/327559
>>>>
>>>> Log:
>>>>     Disabled the use of flowid for lagg by default
>>>>     
>>>>     Disabled the use of RSS hash from the network card aka flowid for
>>>>     lagg(4) interfaces by default as it's currently incompatible with
>>>>     the lacp and loadbalance protocols.
>>>>     
>>>>     The incompatibility is due to the fact that the flowid isn't know
>>>>     for the first packet of a new outbound stream which can result in
>>>>     the hash calculation method changing and hence a stream being
>>>>     incorrectly split across multiple interfaces during normal
>>>>     operation.
>>>>     
>>>>     This can be re-enabled by setting the following in loader.conf:
>>>>     net.link.lagg.default_use_flowid="1"
>>>>     
>>>>     Discussed with: kmacy
>>>>     Sponsored by:	Multiplay
>>> RSS by definition has meaning to received stream. What is "outbound" stream
>>> in this context, why can the hash calculatiom method change and what exactly
>>> does it mean "a stream being incorrectly split"?
>> Yes RSS is indeed a received stream but that is used by lagg for lacp
>> and loadbalance protocols to decide which port of the lagg to "send" the
>> packet out of. As the flowid is not known when a new "output" stream is
>> instigated the current code falls back to manual hash calculation to
>> determine which port to send the initial packet from. Once a response is
>> received a tx then uses the flowid. This change of hash calculation
>> method can result in the initial packet being sent from a different port
>> than the rest of the stream; this is what I meant by "incorrectly split".
> For my understanding, is this just an issue for the first packet when we
> originate the flow? Once we have a response and if flowid is there, we'd
> use it, right? OR am I missing something?
Initially yes, but that can cause a whole cascading set of problems. If 
the source machine sends from two different ports then the flow can 
traverse the network using different paths and hence arrive at the 
destination on different ports too, causing the corresponding issue on 
the other side.
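
As a rough illustration of how the split happens, here is a simplified 
sketch. It is not the actual lagg(4) code: struct pkt, sw_hash() and 
pick_port() are made-up names, and the software hash is only a stand-in 
for whatever the configured hash would compute.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified packet; NOT the kernel's struct mbuf. */
struct pkt {
	int      has_flowid;	/* 1 once the NIC has supplied an RSS hash */
	uint32_t flowid;	/* hardware hash, only valid if has_flowid */
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

/* Stand-in software hash (FNV-1a style mixing over the flow tuple). */
static uint32_t
sw_hash(const struct pkt *p)
{
	uint32_t words[3] = {
		p->src_ip,
		p->dst_ip,
		((uint32_t)p->src_port << 16) | p->dst_port
	};
	uint32_t h = 2166136261u;

	for (int i = 0; i < 3; i++) {
		h ^= words[i];
		h *= 16777619u;
	}
	return (h);
}

/*
 * Choose the outbound port. With use_flowid set, the hash source
 * changes as soon as a flowid becomes available, so the first packet
 * of a locally originated flow and the later packets can map to
 * different ports.
 */
static unsigned
pick_port(const struct pkt *p, unsigned nports, int use_flowid)
{
	uint32_t h;

	assert(nports > 0);
	h = (use_flowid && p->has_flowid) ? p->flowid : sw_hash(p);
	return (h % nports);
}
```

With use_flowid set to 0, which is the new default behaviour in spirit, 
every packet of the flow goes through sw_hash() and therefore always 
picks the same port.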
> And with this change, we'd always go and do manual calculation even when
> we have a valid flowid (i.e. we didn't initiate a connection)?
Correct, but there's potentially no easy way to correctly determine what 
the flowid and hence hash should be in this case, and it is likely 
impossible if the lagg consists of different interface types.

In addition, if the hardware hash doesn't match the requested one as per 
laggproto, further issues could also be triggered.

Our TCP stack seems fragile during setup to the out-of-order packets 
that this multipath behavior causes. We've seen this on our load 
balancers, which is what triggered the investigation. The concrete 
result is many aborted TCP connections: over 300k (~2%) on the machine 
I'm looking at.

I hope there are some improvements that can be made. For example, if we 
can determine the stream was instigated remotely then the flowid would 
always be valid, so we could use it, provided it matches the requested 
spec or we can make it clear to the user that laggproto is not the one 
they requested. I'm open to ideas.

     Regards
     Steve