From: Steven Hartland
Date: Thu, 4 Jan 2018 23:37:00 +0000
Subject: Re: svn commit: r327559 - in head: . sys/net
To: hiren panchasara
Cc: Eugene Grosbein, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Message-ID: <63c3c450-aeaf-bdd5-5e16-414146c9bb3a@multiplay.co.uk>
In-Reply-To: <20180104224214.GD18879@strugglingcoder.info>
References: <201801042005.w04K5liB049411@repo.freebsd.org> <5A4E9397.9000308@grosbein.net> <20180104224214.GD18879@strugglingcoder.info>

On 04/01/2018 22:42, hiren panchasara wrote:
> On 01/04/18 at 09:52P, Steven Hartland wrote:
>> On 04/01/2018 20:50, Eugene Grosbein wrote:
>>> 05.01.2018 3:05, Steven Hartland wrote:
>>>
>>>> Author: smh
>>>> Date: Thu Jan 4 20:05:47 2018
>>>> New Revision: 327559
>>>> URL: https://svnweb.freebsd.org/changeset/base/327559
>>>>
>>>> Log:
>>>> Disabled the use of flowid for lagg by default
>>>>
>>>> Disabled the use of RSS hash from the network card aka flowid for
>>>> lagg(4) interfaces by default as it's currently incompatible with
>>>> the lacp and loadbalance protocols.
>>>>
>>>> The incompatibility is due to the fact that the flowid isn't know
>>>> for the first packet of a new outbound stream which can result in
>>>> the hash calculation method changing and hence a stream being
>>>> incorrectly split across multiple interfaces during normal
>>>> operation.
>>>>
>>>> This can be re-enabled by setting the following in loader.conf:
>>>> net.link.lagg.default_use_flowid="1"
>>>>
>>>> Discussed with: kmacy
>>>> Sponsored by: Multiplay
>>> RSS by definition has meaning to received stream. What is "outbound" stream
>>> in this context, why can the hash calculatiom method change and what exactly
>>> does it mean "a stream being incorrectly split"?
>> Yes RSS is indeed a received stream but that is used by lagg for lacp
>> and loadbalance protocols to decide which port of the lagg to "send" the
>> packet out of. As the flowid is not known when a new "output" stream is
>> instigated the current code falls back to manual hash calculation to
>> determine which port to send the initial packet from. Once a response is
>> received a tx then uses the flowid. This change of hash calculation
>> method can result in the initial packet being sent from a different port
>> than the rest of the stream; this is what I meant by "incorrectly split".
> For my understanding, is this just an issue for the first packet when we
> originate the flow? Once we have a response and if flowid is there, we'd
> use it, right? OR am I missing something?

Initially yes, but that can cause a whole cascading set of problems.
If the source machine sends from two different ports then the flow can
traverse the network using different paths and hence arrive at the
destination on different ports too, causing the corresponding issue on
the other side.

> And with this change, we'd always go and do manual calculation even when
> we have a valid flowid (i.e. we didn't initiate a connection)?

Correct, but there's potentially no easy way to correctly determine what
the flowid, and hence the hash, should be in this case; it's likely
impossible if the lagg consists of different interface types. In
addition, if the hardware hash doesn't match the requested one as per
laggproto then additional issues could also be triggered.

Our TCP stack seems fragile during setup to out-of-order packets, which
this multipath behavior causes. We've seen this on our loadbalancers,
which is what triggered the investigation. The concrete result is many
aborted TCP connections: over 300k (~2%) on the machine I'm looking at.

I hope there are some improvements that can be made. For example, if we
can determine the stream was instigated remotely then the flowid would
always be valid, so we could use it, assuming it matches the requested
spec. Alternatively, we could make it clear to the user that laggproto
is not the one they requested. I'm open to ideas.

    Regards
    Steve
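
For anyone following the thread, the split described above can be
illustrated with a small Python model. This is only a sketch under
stated assumptions and not the lagg(4) code: `software_hash` uses CRC32
as a stand-in for the manual hash, the two-port lagg, the flow tuple and
the `nic_flowid` value are all hypothetical, and the flowid is
deliberately chosen to disagree with the software hash, as an unrelated
hardware hash may.

```python
import zlib

NUM_PORTS = 2  # hypothetical two-port lagg


def software_hash(flow):
    """Stand-in for the manual hash over the flow tuple (CRC32, illustrative only)."""
    return zlib.crc32(repr(flow).encode())


def pick_port(flow, flowid, use_flowid):
    """Choose the egress port; flowid is None until a reply has been received."""
    if use_flowid and flowid is not None:
        return flowid % NUM_PORTS
    return software_hash(flow) % NUM_PORTS


def egress_ports(flow, nic_flowid, use_flowid, packets=4):
    """Ports used by successive outbound packets of a locally originated flow."""
    ports = []
    for i in range(packets):
        # Only the first packet lacks a flowid; later ones carry the NIC's hash.
        flowid = None if i == 0 else nic_flowid
        ports.append(pick_port(flow, flowid, use_flowid))
    return ports


# Hypothetical flow: (source address, source port, destination address, destination port).
flow = ("10.0.0.1", 40000, "10.0.0.2", 80)
# Assumed hardware hash, picked so that it maps to the other port.
nic_flowid = software_hash(flow) + 1

print("use_flowid=1:", egress_ports(flow, nic_flowid, use_flowid=True))
print("use_flowid=0:", egress_ports(flow, nic_flowid, use_flowid=False))
```

With the flowid in use, the first packet leaves on a different port from
the rest of the stream. With it disabled, every packet is hashed the
same way and stays on one port, which is the behaviour the new default
gives.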