From owner-svn-src-all@freebsd.org Thu Sep 3 08:38:05 2020 Return-Path: Delivered-To: svn-src-all@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 51D9D3DF523; Thu, 3 Sep 2020 08:38:05 +0000 (UTC) (envelope-from rscheff@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4BhvMd1Tqyz45c4; Thu, 3 Sep 2020 08:38:05 +0000 (UTC) (envelope-from rscheff@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 15FB118B60; Thu, 3 Sep 2020 08:38:05 +0000 (UTC) (envelope-from rscheff@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id 0838c4VA038103; Thu, 3 Sep 2020 08:38:04 GMT (envelope-from rscheff@FreeBSD.org) Received: (from rscheff@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id 0838c4Ef038100; Thu, 3 Sep 2020 08:38:04 GMT (envelope-from rscheff@FreeBSD.org) Message-Id: <202009030838.0838c4Ef038100@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: rscheff set sender to rscheff@FreeBSD.org using -f From: Richard Scheffenegger Date: Thu, 3 Sep 2020 08:38:04 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org Subject: svn commit: r365292 - stable/12/sys/netinet X-SVN-Group: stable-12 X-SVN-Commit-Author: rscheff X-SVN-Commit-Paths: stable/12/sys/netinet X-SVN-Commit-Revision: 365292 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 Sep 2020 08:38:05 -0000 Author: rscheff Date: Thu Sep 3 08:38:04 2020 New Revision: 365292 URL: https://svnweb.freebsd.org/changeset/base/365292 Log: MFC r364195: Improve SACK support code for RFC6675 and PRR Adding proper accounting of sacked_bytes and (per-ACK) delivered data to the SACK scoreboard. This will allow more aspects of RFC6675 to be implemented as well as Proportional Rate Reduction (RFC6937). Prior to this change, the pipe calculation controlled with net.inet.tcp.rfc6675_pipe was also susceptible to incorrect results when more than 3 (or 4) holes in the sequence space were present, which can no longer all fit into a single ACK's SACK option. Reviewed by: kbowling, rgrimes (mentor) Approved by: rgrimes (blanket) MFC after: 3 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D18624 Modified: stable/12/sys/netinet/tcp_input.c stable/12/sys/netinet/tcp_sack.c stable/12/sys/netinet/tcp_var.h Directory Properties: stable/12/ (props changed) Modified: stable/12/sys/netinet/tcp_input.c ============================================================================== --- stable/12/sys/netinet/tcp_input.c Thu Sep 3 08:16:57 2020 (r365291) +++ stable/12/sys/netinet/tcp_input.c Thu Sep 3 08:38:04 2020 (r365292) @@ -2715,9 +2715,16 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, stru tp->t_dupacks = 0; /* * If this ack also has new SACK info, increment the - * counter as per rfc6675. + * counter as per rfc6675. The variable + * sack_changed tracks all changes to the SACK + * scoreboard, including when partial ACKs without + * SACK options are received, and clear the scoreboard + * from the left side. Such partial ACKs should not be + * counted as dupacks here. */ - if ((tp->t_flags & TF_SACK_PERMIT) && sack_changed) + if ((tp->t_flags & TF_SACK_PERMIT) && + (to.to_flags & TOF_SACK) && + sack_changed) tp->t_dupacks++; } Modified: stable/12/sys/netinet/tcp_sack.c ============================================================================== --- stable/12/sys/netinet/tcp_sack.c Thu Sep 3 08:16:57 2020 (r365291) +++ stable/12/sys/netinet/tcp_sack.c Thu Sep 3 08:38:04 2020 (r365292) @@ -534,9 +534,7 @@ tcp_sackhole_remove(struct tcpcb *tp, struct sackhole * tp->snd_holes is an ordered list of holes (oldest to newest, in terms of * the sequence space). * Returns 1 if incoming ACK has previously unknown SACK information, - * 0 otherwise. Note: We treat (snd_una, th_ack) as a sack block so any changes - * to that (i.e. left edge moving) would also be considered a change in SACK - * information which is slightly different than rfc6675. + * 0 otherwise. */ int tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) @@ -544,16 +542,21 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc struct sackhole *cur, *temp; struct sackblk sack, sack_blocks[TCP_MAX_SACK + 1], *sblkp; int i, j, num_sack_blks, sack_changed; + int delivered_data, left_edge_delta; INP_WLOCK_ASSERT(tp->t_inpcb); num_sack_blks = 0; sack_changed = 0; + delivered_data = 0; + left_edge_delta = 0; /* * If SND.UNA will be advanced by SEG.ACK, and if SACK holes exist, * treat [SND.UNA, SEG.ACK) as if it is a SACK block. + * Account changes to SND.UNA always in delivered data. */ if (SEQ_LT(tp->snd_una, th_ack) && !TAILQ_EMPTY(&tp->snd_holes)) { + left_edge_delta = th_ack - tp->snd_una; sack_blocks[num_sack_blks].start = tp->snd_una; sack_blocks[num_sack_blks++].end = th_ack; } @@ -562,7 +565,6 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc * received new blocks from the other side. */ if (to->to_flags & TOF_SACK) { - tp->sackhint.sacked_bytes = 0; /* reset */ for (i = 0; i < to->to_nsacks; i++) { bcopy((to->to_sacks + i * TCPOLEN_SACK), &sack, sizeof(sack)); @@ -575,8 +577,6 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc SEQ_GT(sack.end, tp->snd_una) && SEQ_LEQ(sack.end, tp->snd_max)) { sack_blocks[num_sack_blks++] = sack; - tp->sackhint.sacked_bytes += - (sack.end-sack.start); } } } @@ -601,7 +601,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc } } } - if (TAILQ_EMPTY(&tp->snd_holes)) + if (TAILQ_EMPTY(&tp->snd_holes)) { /* * Empty scoreboard. Need to initialize snd_fack (it may be * uninitialized or have a bogus value). Scoreboard holes @@ -610,6 +610,8 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc * scoreboard). */ tp->snd_fack = SEQ_MAX(tp->snd_una, th_ack); + tp->sackhint.sacked_bytes = 0; /* reset */ + } /* * In the while-loop below, incoming SACK blocks (sack_blocks[]) and * SACK holes (snd_holes) are traversed from their tails with just @@ -633,6 +635,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc */ temp = tcp_sackhole_insert(tp, tp->snd_fack,sblkp->start,NULL); if (temp != NULL) { + delivered_data += sblkp->end - sblkp->start; tp->snd_fack = sblkp->end; /* Go to the previous sack block. */ sblkp--; @@ -650,11 +653,15 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc SEQ_LT(tp->snd_fack, sblkp->start)) sblkp--; if (sblkp >= sack_blocks && - SEQ_LT(tp->snd_fack, sblkp->end)) + SEQ_LT(tp->snd_fack, sblkp->end)) { + delivered_data += sblkp->end - tp->snd_fack; tp->snd_fack = sblkp->end; + sack_changed = 1; + } } } else if (SEQ_LT(tp->snd_fack, sblkp->end)) { /* fack is advanced. */ + delivered_data += sblkp->end - tp->snd_fack; tp->snd_fack = sblkp->end; sack_changed = 1; } @@ -688,6 +695,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc /* Data acks at least the beginning of hole. */ if (SEQ_GEQ(sblkp->end, cur->end)) { /* Acks entire hole, so delete hole. */ + delivered_data += (cur->end - cur->start); temp = cur; cur = TAILQ_PREV(cur, sackhole_head, scblink); tcp_sackhole_remove(tp, temp); @@ -699,6 +707,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc continue; } else { /* Move start of hole forward. */ + delivered_data += (sblkp->end - cur->start); cur->start = sblkp->end; cur->rxmit = SEQ_MAX(cur->rxmit, cur->start); } @@ -706,6 +715,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc /* Data acks at least the end of hole. */ if (SEQ_GEQ(sblkp->end, cur->end)) { /* Move end of hole backward. */ + delivered_data += (cur->end - sblkp->start); cur->end = sblkp->start; cur->rxmit = SEQ_MIN(cur->rxmit, cur->end); } else { @@ -725,6 +735,7 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc cur->end = sblkp->start; cur->rxmit = SEQ_MIN(cur->rxmit, cur->end); + delivered_data += (sblkp->end - sblkp->start); } } } @@ -739,6 +750,10 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tc else sblkp--; } + tp->sackhint.delivered_data = delivered_data; + tp->sackhint.sacked_bytes += delivered_data - left_edge_delta; + KASSERT((delivered_data >= 0), ("delivered_data < 0")); + KASSERT((tp->sackhint.sacked_bytes >= 0), ("sacked_bytes < 0")); return (sack_changed); } Modified: stable/12/sys/netinet/tcp_var.h ============================================================================== --- stable/12/sys/netinet/tcp_var.h Thu Sep 3 08:16:57 2020 (r365291) +++ stable/12/sys/netinet/tcp_var.h Thu Sep 3 08:38:04 2020 (r365292) @@ -70,12 +70,12 @@ struct sackhole { struct sackhint { struct sackhole *nexthole; - int sack_bytes_rexmit; + int32_t sack_bytes_rexmit; tcp_seq last_sack_ack; /* Most recent/largest sacked ack */ - int ispare; /* explicit pad for 64bit alignment */ - int sacked_bytes; /* - * Total sacked bytes reported by the + int32_t delivered_data; /* Newly acked data from last SACK */ + + int32_t sacked_bytes; /* Total sacked bytes reported by the * receiver via sack option */ uint32_t _pad1[1]; /* TBD */