From: "Scheffenegger, Richard"
To: Liang Tian
CC: FreeBSD Transport
Subject: RE: Fast recovery ssthresh value
Date: Thu, 10 Sep 2020 09:49:13 +0000
List-Id: Discussions of transport level network protocols in FreeBSD

Hi Liang,

Yes, you are absolutely correct about this observation. SACK loss recovery currently sends only one MSS per received ACK, and when ACK thinning is present, it fails to recover all the missing packets in time, eventually receiving no more ACKs to clock out further retransmissions...

I have a Diff in review to implement Proportional Rate Reduction:

https://reviews.freebsd.org/D18892

This should address not only the ACK thinning issue, but also the issue that current SACK loss recovery has to wait until pipe drops below ssthresh before the retransmissions are clocked out. And then, they would actually be clocked out at the same rate as the incoming ACKs. This would be the same rate as when the overload happened (barring any ACK thinning), and as a secondary effect, it was observed that this behavior too can lead to self-inflicted loss of retransmissions.

If you have the ability to patch your kernel with D18892 and observe how it reacts in your dramatic ACK thinning scenario, that would be good to know! The assumption of the patch was that, as per TCP RFC requirements, there is one ACK for each received out-of-sequence data segment, and that ACK drops / thinning do not happen on such a massive scale as you describe.

Best regards,

Richard Scheffenegger

-----Original Message-----
From: owner-freebsd-transport@freebsd.org On Behalf Of Liang Tian
Sent: Wednesday, 9 September 2020 19:16
To: Scheffenegger, Richard
Cc: FreeBSD Transport
Subject: Re: Fast recovery ssthresh value

Hi Richard,

Thanks for the explanation and sorry for the late reply.
I've been investigating SACK loss recovery and I think I'm seeing an issue similar to the ABC L value issue that I reported previously (https://reviews.freebsd.org/D26120), and I do believe there is a deviation from RFC3517:

The issue happens when a DupAck is received during SACK loss recovery in the presence of ACK thinning or a receiver with LRO enabled, which means the SACK block edges could expand by more than 1 SMSS (we've seen 30*SMSS), i.e. a single DupAck could decrement `pipe` by more than 1 SMSS.

In RFC3517:

    (C) If cwnd - pipe >= 1 SMSS, the sender SHOULD transmit one or more segments...
    (C.5) If cwnd - pipe >= 1 SMSS, return to (C.1)

So based on the RFC, the sender should be able to send more segments when such a DupAck is received, because of the big change to `pipe`.

In the current implementation, the cwin variable, which controls the amount of data that can be transmitted based on the new information, is dictated by snd_cwnd. snd_cwnd is incremented by 1 SMSS for each DupAck received. I believe this effectively limits the retransmission triggered by each DupAck to 1 SMSS, which is the deviation.

    cwin =
        imax(min(tp->snd_wnd, tp->snd_cwnd) - sack_bytes_rxmt, 0);

As a result, SACK is not doing enough recovery in this scenario and the loss has to be recovered by RTO.
Again, I'd appreciate feedback from the community.

Regards,
Liang Tian

On Sun, Aug 23, 2020 at 3:56 PM Scheffenegger, Richard wrote:
>
> Hi Liang,
>
> In SACK loss recovery, you can recover up to ssthresh (prior cwnd/2 [or 70% in case of cubic]) lost bytes - at least in theory.
>
> In comparison, (New)Reno can only recover one lost packet per window, and then keeps on transmitting new segments (ack + cwnd), even before the receipt of the retransmitted packet is acked.
>
> For historic reasons, the semantic of the variable cwnd is overloaded during loss recovery, and it doesn't "really" indicate cwnd, but rather indicates if/when retransmissions can happen.
>
>
> In both cases (also the simple one, with only one packet loss), cwnd should be equal (or near equal) to ssthresh by the time loss recovery is finished - but NOT before! While it may appear like slow-start, the value of the cwnd variable really increases by acked_bytes only per ACK (not acked_bytes + SMSS), since the left edge (snd_una) doesn't move right - unlike during slow-start. But numerically, these different phases (slow-start / SACK loss recovery) may appear very similar.
>
> You could check this using the (loadable) SIFTR module, which captures t_flags (indicating if cong/loss recovery is active), ssthresh, cwnd, and other parameters.
>
> That is at least how things are supposed to work; or have you investigated the timing and behavior of SACK loss recovery and found a deviation from RFC3517? Note that FBSD currently has not fully implemented RFC6675 support (which deviates slightly from 3517 under specific circumstances); I have a patch pending to implement 6675 rescue retransmissions, but haven't tweaked the other aspects of 6675 vs. 3517.
>
> BTW: While freebsd-net is not the wrong DL per se, TCP, UDP, SCTP specific questions can also be posted to freebsd-transport, which is more narrowly focused.
>
> Best regards,
>
> Richard Scheffenegger
>
> -----Original Message-----
> From: owner-freebsd-net@freebsd.org On Behalf Of Liang Tian
> Sent: Sunday, 23 August 2020 00:14
> To: freebsd-net
> Subject: Fast recovery ssthresh value
>
> Hi all,
>
> When 3 dupacks are received and TCP enters fast recovery, if SACK is used, the CWND is set to maxseg:
>
> 2593         if (tp->t_flags & TF_SACK_PERMIT) {
> 2594                 TCPSTAT_INC(
> 2595                     tcps_sack_recovery_episode);
> 2596                 tp->snd_recover = tp->snd_nxt;
> 2597                 tp->snd_cwnd = maxseg;
> 2598                 (void) tp->t_fb->tfb_tcp_output(tp);
> 2599                 goto drop;
> 2600         }
>
> Otherwise (SACK is not in use), CWND is set to maxseg before tcp_output() and then set back to snd_ssthresh+inflation:
>
> 2601         tp->snd_nxt = th->th_ack;
> 2602         tp->snd_cwnd = maxseg;
> 2603         (void) tp->t_fb->tfb_tcp_output(tp);
> 2604         KASSERT(tp->snd_limited <= 2,
> 2605             ("%s: tp->snd_limited too big",
> 2606             __func__));
> 2607         tp->snd_cwnd = tp->snd_ssthresh +
> 2608             maxseg *
> 2609             (tp->t_dupacks - tp->snd_limited);
> 2610         if (SEQ_GT(onxt, tp->snd_nxt))
> 2611                 tp->snd_nxt = onxt;
> 2612         goto drop;
>
> I'm wondering, in the SACK case, should CWND be set back to ssthresh (which has been slashed in cc_cong_signal() a few lines above) before line 2599, like the non-SACK case, instead of doing slow start from maxseg?
> I read rfc6675 and a few others, and it looks like that's the case. I appreciate your opinion, again.
>
> Thanks,
> Liang
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
_______________________________________________
freebsd-transport@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-transport
To unsubscribe, send any mail to "freebsd-transport-unsubscribe@freebsd.org"
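
The gap discussed in this thread, between RFC 3517 step (C) and a budget that grows by one SMSS per DupAck, can be illustrated with a small user-space sketch. This is not FreeBSD kernel code: struct sack_model and both function names are hypothetical, and the second function only models the behaviour as it is described in the thread (snd_cwnd grows by one SMSS per DupAck, and the budget is max(min(snd_wnd, snd_cwnd) - sack_bytes_rxmt, 0)).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sender state for this sketch; field names mirror the thread. */
struct sack_model {
	uint32_t snd_wnd;		/* peer's advertised window, bytes */
	uint32_t snd_cwnd;		/* overloaded cwnd during recovery, bytes */
	uint32_t sack_bytes_rxmt;	/* bytes retransmitted in this episode */
};

/* RFC 3517 step (C): keep sending while cwnd - pipe >= 1 SMSS. */
static uint32_t
rfc3517_segments(uint32_t cwnd, uint32_t pipe, uint32_t smss)
{
	uint32_t nseg = 0;

	assert(smss > 0);
	while (cwnd > pipe && cwnd - pipe >= smss) {
		pipe += smss;	/* (C.4): count the segment just sent */
		nseg++;
	}
	return (nseg);
}

/*
 * Model of the behaviour described in the thread (an assumption, not the
 * kernel source): a DupAck adds one SMSS to snd_cwnd, and the send budget is
 * max(min(snd_wnd, snd_cwnd) - sack_bytes_rxmt, 0).
 */
static uint32_t
model_dupack_segments(struct sack_model *m, uint32_t smss)
{
	uint32_t win, cwin, nseg;

	assert(smss > 0);
	m->snd_cwnd += smss;
	win = m->snd_wnd < m->snd_cwnd ? m->snd_wnd : m->snd_cwnd;
	cwin = win > m->sack_bytes_rxmt ? win - m->sack_bytes_rxmt : 0;
	nseg = cwin / smss;
	m->sack_bytes_rxmt += nseg * smss;	/* charge what was sent */
	return (nseg);
}
```

With an assumed SMSS of 1460 bytes, cwnd = 40 SMSS and pipe = 10 SMSS (a single DupAck that SACKed a large range), rfc3517_segments() allows 30 segments. The model, started from snd_cwnd = SMSS with one SMSS already retransmitted, allows one segment per DupAck no matter how far pipe dropped.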