Date: Wed, 7 Oct 2015 10:26:10 -0700
From: hiren panchasara <hiren@strugglingcoder.info>
To: transport@FreeBSD.org
Subject: Correct inflight/pipe calculation
Message-ID: <20151007172610.GA42742@strugglingcoder.info>
Randall and I have been poking at different ways to improve FreeBSD TCP's reaction to loss. One of the major issues we found is that we do not use the information provided by SACK efficiently, even though we keep the SACK scoreboard in good shape. Knowing the amount of data in flight can be crucial and can help us use the available capacity of the path more efficiently. We currently do not have an accurate way of knowing this.

For example, inside tcp_do_segment(), while processing duplicate ACKs, we try to compute the amount of data in flight with:

    awnd = (tp->snd_nxt - tp->snd_fack) + tp->sackhint.sack_bytes_rexmit;

This is incorrect, as it doesn't take into account what has already been sacked by the receiver. There are definitely other places in the stack where we do this incorrectly.

RFC 6675 provides guidance on how to calculate bytes in flight at any point in time. Randall and I came to the conclusion that the following can give us inflight information almost(!) accurately with the least amount of code changes:

    pipe = snd_max - snd_una - sackhint.sacked_bytes + sackhint.sack_bytes_rexmit;

Here,
    snd_max: highest sequence number sent
    snd_una: lowest sequence number sent but not yet cumulatively acked
    sacked_bytes: total bytes sacked by the receiver, reported via SACK holes
    sack_bytes_rexmit: total bytes retransmitted from holes in this recovery period

The only missing piece in FreeBSD is sacked_bytes. This is basically the total bytes sacked by the receiver, and it can be extracted from the SACK holes reported by the receiver.

The approach we've decided to take is pretty simple: we already process each ACK with SACK holes in tcp_sack_doack() and extract SACK blocks out of it. We'd now also maintain this new variable there, which keeps track of the total sacked bytes reported.
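To make the proposed calculation concrete, here is a minimal standalone sketch in C. It is not the patch under test: the sketch_* names and the two structs are hypothetical stand-ins for the real tcpcb/sackhint fields, and it assumes the SACK blocks in a single ACK do not overlap and all lie above snd_una.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Wraparound-safe sequence comparisons, in the style of the stack's SEQ_* macros. */
#define SEQ_LT(a, b)	((int32_t)((a) - (b)) < 0)
#define SEQ_GEQ(a, b)	((int32_t)((a) - (b)) >= 0)

/* Hypothetical stand-in for one SACK block covering [start, end). */
struct sketch_sackblk {
	tcp_seq start;
	tcp_seq end;
};

/* Hypothetical subset of sender state; field names follow the mail. */
struct sketch_tcpcb {
	tcp_seq  snd_una;		/* lowest seq sent but not cumulatively acked */
	tcp_seq  snd_max;		/* highest seq sent */
	uint32_t sacked_bytes;		/* bytes sacked, as reported by the last ACK */
	uint32_t sack_bytes_rexmit;	/* bytes retransmitted from holes */
};

/*
 * Recompute sacked_bytes from the SACK blocks carried in one ACK.
 * Nothing is remembered across ACKs, which is exactly the limitation
 * of the approach: blocks the receiver stops reporting are forgotten.
 * Assumption: blocks do not overlap and lie above snd_una.
 */
static void
sketch_update_sacked_bytes(struct sketch_tcpcb *tp,
    const struct sketch_sackblk *blk, size_t nblk)
{
	uint32_t total = 0;

	for (size_t i = 0; i < nblk; i++) {
		/* Skip empty or inverted blocks. */
		if (SEQ_LT(blk[i].start, blk[i].end))
			total += blk[i].end - blk[i].start;
	}
	tp->sacked_bytes = total;
}

/* pipe = snd_max - snd_una - sacked_bytes + sack_bytes_rexmit */
static uint32_t
sketch_pipe(const struct sketch_tcpcb *tp)
{
	assert(SEQ_GEQ(tp->snd_max, tp->snd_una));
	return ((tp->snd_max - tp->snd_una) - tp->sacked_bytes +
	    tp->sack_bytes_rexmit);
}
```

As a worked case: with snd_una = 1000, snd_max = 11000, SACK blocks [3000, 5000) and [7000, 8000), and 1000 bytes already retransmitted, sacked_bytes comes to 3000 and pipe to 10000 - 3000 + 1000 = 8000 bytes.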
The downside of this approach is that there is no persistent information about sacked bytes. We recalculate it every time we get an ACK with SACK holes in it. So if, for any reason, the receiver decides to drop SACK info, then we get an incorrect value for inflight. This may also be true when there are more holes than the receiver can report, since it can only report 3 at a time.

I have actual code that I've been testing, and if people see no major problems with this approach, I can put it up for review in Phabricator.

Cheers,
Hiren