Date: Tue, 31 May 2011 09:39:51 +0100
From: "Scheffenegger, Richard" <rs@netapp.com>
To: <freebsd-net@freebsd.org>
Subject: Re: [CFT] Early Retransmit for TCP (rfc5827) patch
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99C887@LDCMVEXC1-PRD.hq.netapp.com>
Hi Weongyo,

Good to know that you are addressing the primary reason for retransmission timeouts with SACK. (Small window (early retransmit) is ~70%, lost retransmission ~25%, end-of-stream loss ~5% of all addressable causes for an RTO.)

I looked at your code to enable RFC5827 Early Retransmits. There is one minor nit-pick: tcp_input is calling tcp_getrexmtthresh for every duplicate ACK. When SACK is enabled (over 90% of all sessions today), the byte-based tcp_sack_ownd routine cycles over the entire SACK scoreboard. As the scoreboard can become huge with fat, long pipes, this appears to be suboptimal.

Perhaps something along these lines:

    ackedbyte = 0;
    int mark = tp->snd_una;
    TAILQ_FOREACH(p, &tp->snd_holes, scblink) {
        /* bytes between the previous hole and this one were SACKed */
        ackedbyte += p->start - mark;
        if (ackedbyte >= amout)
            return(TRUE);
        mark = p->end;
    }
    ackedbyte += tp->snd_fack - mark;
    if (ackedbyte >= amout)
        return(TRUE);
    return(FALSE);

This would be more scalable (only the holes at the start need to be cycled, increasing the chances that they stick close to the CPU)...

Perhaps adding a variable to track the number of bytes SACKed to the scoreboard (and updated with the receipt of a new SACK block) would be even more efficient....

Best regards,
Richard Scheffenegger

From: weongyo@freebsd.org
Date: Sat May 7 00:19:38 UTC 2011

Hello all,

I'd like to send another patch to support RFC5827 in the TCP stack, which can be found at:

http://people.freebsd.org/~weongyo/patch_20110506_rfc5827.diff

This patch supports all Early Retransmit logic (Byte-Based Early Retransmit and Segment-Based Early Retransmit) when the net.inet.tcp.rfc5827 sysctl knob is turned on.

Please note that the Segment-Based Early Retransmit logic is separated out into a khelp module because it adds additional operations and requires variable space to track segment boundaries on the right side of the window.
So if the khelp module is loaded, it is preferred; if not, the default logic is `Byte-Based Early Retransmit'.

I based the implementation on DragonflyBSD's, but it did not look the same as the RFC specification as I understood it, so I changed most parts. In my test environments it looks like it's working correctly.

Please review and test my work and tell me if you have any concerns or questions.

regards,
Weongyo Jeong
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch_20110506_rfc5827.diff
Type: text/x-diff
Size: 18455 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20110507/90f2f164/patch_20110506_rfc5827.bin
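
The early-exit scoreboard walk suggested in the reply can be tried outside the kernel with a small user-space sketch. Everything in it is an assumption made for illustration: struct hole and struct scoreboard are simplified stand-ins for the kernel's hole list and the tcpcb fields, and sacked_at_least() and its visited counter are hypothetical names that are not part of the patch. Unsigned 32-bit arithmetic is used so the byte counts stay correct when sequence numbers wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tcp_seq;   /* TCP sequence numbers wrap modulo 2^32 */

/* Hypothetical, simplified stand-in for one entry of tp->snd_holes. */
struct hole {
    tcp_seq start;          /* first sequence number still missing */
    tcp_seq end;            /* one past the last missing sequence number */
    struct hole *next;      /* next hole, in increasing sequence order */
};

/* Hypothetical stand-in for the tcpcb fields the walk needs. */
struct scoreboard {
    tcp_seq snd_una;        /* oldest unacknowledged sequence number */
    tcp_seq snd_fack;       /* forward-most SACKed sequence number, plus one */
    struct hole *holes;     /* head of the ordered hole list */
};

/*
 * Return true as soon as at least `amount` bytes are known to be SACKed.
 * SACKed bytes are the gaps between consecutive holes, plus the range
 * from the end of the last hole up to snd_fack.  If `visited` is not
 * NULL it receives the number of holes that were examined.
 */
static bool
sacked_at_least(const struct scoreboard *sb, uint32_t amount, size_t *visited)
{
    uint32_t sacked = 0;
    size_t seen = 0;
    tcp_seq mark;
    bool enough = false;

    assert(sb != NULL);
    mark = sb->snd_una;
    for (const struct hole *p = sb->holes; p != NULL; p = p->next) {
        seen++;
        sacked += p->start - mark;      /* SACKed bytes before this hole */
        if (sacked >= amount) {
            enough = true;              /* stop early, skip the rest */
            break;
        }
        mark = p->end;
    }
    if (!enough) {
        sacked += sb->snd_fack - mark;  /* SACKed bytes after the last hole */
        enough = (sacked >= amount);
    }
    if (visited != NULL)
        *visited = seen;
    return enough;
}
```

With three holes separated by 1000-byte SACKed ranges, a threshold of 1000 bytes is reached after examining only the first two holes, while a threshold equal to the full SACKed total still requires walking the whole list.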