Date: Mon, 13 Jan 2020 23:39:54 +0000
From: "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>
To: "freebsd-transport@freebsd.org" <freebsd-transport@freebsd.org>
Cc: "Cui, Cheng" <Cheng.Cui@netapp.com>, Christoph Paasch <cpaasch@apple.com>, Vidhi Goel <vidhi_goel@apple.com>, 'Michael Tuexen' <tuexen@freebsd.org>
Subject: SACK + RTO interaction
Message-ID: <SN4PR0601MB3728AC5A0466DBBEB1FA53B486350@SN4PR0601MB3728.namprd06.prod.outlook.com>
Hi guys,

I believe Cheng has uncovered another long-lurking bug, this time in the interaction between RTO and SACK.

Apparently, since its inception, tcp_sack_partialack has stopped the RTO timer, which doesn't seem right.

What he observed is this: if you run into a lost retransmission (currently only recoverable by RTO) twice during the same SACK loss recovery episode, the first loss is recovered by the RTO (unless a partial ACK disabled the timer before it fired), and the second twice-lost segment is at the mercy of whatever other TCP timer happens to still be active (keepalive, persist, ...).

https://reviews.freebsd.org/source/src/browse/head/sys/netinet/tcp_sack.c#782

I strongly suspect that this code should never have cancelled the RTO timer, but should instead restart it after a partial ACK. At least that would be more logical: restart the timeout if you are making some forward progress, rather than stop it completely because one (of possibly many) retransmissions went through. If SACK loss recovery doesn't complete within an RTO timeout (which is many more RTTs than the single RTT a SACK loss recovery should take), it would be prudent to give up and fall back to RTO, no?

This effect may also explain some of the other sporadic, very lengthy SACK recoveries we couldn't really pin down so far...

The patch should be easy enough:

    tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur);

instead of

    tcp_timer_activate(tp, TT_REXMT, 0);

Any comments?

BTW, I found the same in Darwin.

Richard Scheffenegger
Consulting Solution Architect
NAS & Networking
NetApp
+43 1 3676 811 3157 Direct Phone
+43 664 8866 1857 Mobile Phone
Richard.Scheffenegger@netapp.com
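To make the difference between the two calls concrete, here is a minimal user-space sketch. It is a toy model, not the kernel code: `toy_timer`, `toy_timer_activate`, `partial_ack_policy` and `toy_rto_fire_tick` are hypothetical names made up for illustration, and the only behaviour borrowed from the discussion above is that a delta of 0 stops the timer while a non-zero delta re-arms it. The scenario assumes recovery starts at tick 0, one partial ACK arrives, and the following retransmission is lost so nothing else ever arrives.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the retransmit timer; NOT the kernel's data structure. */
struct toy_timer {
    bool armed;
    int  deadline;   /* absolute tick at which the timer would fire */
};

/* Assumed convention from the discussion: delta == 0 stops the timer,
 * any other delta (re)arms it to fire delta ticks from now. */
static void toy_timer_activate(struct toy_timer *t, int now, int delta)
{
    if (delta == 0) {
        t->armed = false;
    } else {
        t->armed = true;
        t->deadline = now + delta;
    }
}

/* What the (hypothetical) partial-ACK handler does with the timer. */
enum partial_ack_policy { PA_STOP_TIMER, PA_REARM_TIMER };

/*
 * Recovery starts at tick 0 with the timer armed for rto_ticks.
 * A single partial ACK arrives at ack_tick; the next retransmission is
 * lost, so no further ACKs arrive. Returns the tick at which the RTO
 * fires, or -1 if it never fires within horizon ticks.
 */
static int toy_rto_fire_tick(enum partial_ack_policy policy,
                             int ack_tick, int rto_ticks, int horizon)
{
    struct toy_timer rexmt = { false, 0 };

    assert(rto_ticks > 0 && ack_tick >= 0);
    toy_timer_activate(&rexmt, 0, rto_ticks);

    for (int now = 0; now <= horizon; now++) {
        /* An expired timer fires before any ACK at the same tick is handled. */
        if (rexmt.armed && now >= rexmt.deadline)
            return now;
        if (now == ack_tick)
            toy_timer_activate(&rexmt, now,
                               policy == PA_REARM_TIMER ? rto_ticks : 0);
    }
    return -1;
}
```

With a partial ACK at tick 3 and an RTO of 10 ticks, `toy_rto_fire_tick(PA_STOP_TIMER, 3, 10, 100)` returns -1 (the connection is left waiting on some other timer), while `toy_rto_fire_tick(PA_REARM_TIMER, 3, 10, 100)` returns 13.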