From nobody Fri Apr 7 14:15:39 2023
Date: Fri, 7 Apr 2023 14:15:39 GMT
Message-Id: <202304071415.337EFdna097447@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Randall Stewart
Subject: git: 945f9a7cc9dc - main - tcp: misc cleanup of options for rack as well as socket option logging.
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@freebsd.org
X-BeenThere: dev-commits-src-main@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: rrs
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 945f9a7cc9dcc071bfcc702748fbbb11087ae773
Auto-Submitted: auto-generated
X-ThisMailContainsUnwantedMimeParts: N

The branch main has been updated by rrs:

URL: https://cgit.FreeBSD.org/src/commit/?id=945f9a7cc9dcc071bfcc702748fbbb11087ae773

commit 945f9a7cc9dcc071bfcc702748fbbb11087ae773
Author:     Randall Stewart
AuthorDate: 2023-04-07 14:15:29 +0000
Commit:     Randall Stewart
CommitDate: 2023-04-07 14:15:29 +0000

    tcp: misc cleanup of options for rack as well as socket option logging.

    Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack has an experimental
    SaD (Sack Attack Detection) algorithm that should be made available. Also there is a t_maxpeak_rate that needs to be
    removed (its un-used).
    Reviewed by:    tuexen, cc
    Sponsored by:   Netflix Inc
    Differential Revision:  https://reviews.freebsd.org/D39427
---
 sys/conf/options              |  6 +++
 sys/netinet/tcp.h             |  3 ++
 sys/netinet/tcp_log_buf.h     |  2 +-
 sys/netinet/tcp_stacks/bbr.c  | 36 +---------------
 sys/netinet/tcp_stacks/rack.c | 99 +++++++++----------------------------------
 sys/netinet/tcp_subr.c        | 26 +++++++++---
 sys/netinet/tcp_usrreq.c      |  1 -
 sys/netinet/tcp_var.h         | 16 ++++++-
 8 files changed, 65 insertions(+), 124 deletions(-)

diff --git a/sys/conf/options b/sys/conf/options
index 40bb1e56e8b0..a8b441e320cb 100644
--- a/sys/conf/options
+++ b/sys/conf/options
@@ -229,6 +229,12 @@ SW_WATCHDOG	opt_watchdog.h
 TCPHPTS		opt_inet.h
 TCP_REQUEST_TRK	opt_global.h
 TCP_ACCOUNTING	opt_inet.h
+#
+# TCP SaD Detection is an experimental Sack attack Detection (SaD)
+# algorithm that uses "normal" behaviour with SACK's to detect
+# a possible attack. It is strictly experimental at this point.
+#
+TCP_SAD_DETECTION	opt_inet.h
 TURNSTILE_PROFILING
 UMTX_PROFILING
 UMTX_CHAINS	opt_global.h
diff --git a/sys/netinet/tcp.h b/sys/netinet/tcp.h
index bec1dc3552d1..51156e9ec76a 100644
--- a/sys/netinet/tcp.h
+++ b/sys/netinet/tcp.h
@@ -424,6 +424,9 @@ struct tcp_info {
 	u_int32_t	__tcpi_received_e0_bytes;
 	u_int32_t	__tcpi_received_ce_bytes;
 
+	u_int32_t	tcpi_total_tlp;		/* tail loss probes sent */
+	u_int64_t	tcpi_total_tlp_bytes;	/* tail loss probe bytes sent */
+
 	/* Padding to grow without breaking ABI. */
 	u_int32_t	__tcpi_pad[19];		/* Padding. */
 };
diff --git a/sys/netinet/tcp_log_buf.h b/sys/netinet/tcp_log_buf.h
index 2b708c6545ce..4507507e5f63 100644
--- a/sys/netinet/tcp_log_buf.h
+++ b/sys/netinet/tcp_log_buf.h
@@ -255,7 +255,7 @@ enum tcp_log_events {
 	TCP_LOG_CONNEND,	/* End of connection 54 */
 	TCP_LOG_LRO,		/* LRO entry 55 */
 	TCP_SACK_FILTER_RES,	/* Results of SACK Filter 56 */
-	TCP_SAD_DETECTION,	/* Sack Attack Detection 57 */
+	TCP_SAD_DETECT,		/* Sack Attack Detection 57 */
 	TCP_TIMELY_WORK,	/* Logs regarding Timely CC tweaks 58 */
 	TCP_LOG_USER_EVENT,	/* User space event data 59 */
 	TCP_LOG_SENDFILE,	/* sendfile() logging for TCP connections 60 */
diff --git a/sys/netinet/tcp_stacks/bbr.c b/sys/netinet/tcp_stacks/bbr.c
index 621357494a02..cf9f71d7851b 100644
--- a/sys/netinet/tcp_stacks/bbr.c
+++ b/sys/netinet/tcp_stacks/bbr.c
@@ -2991,13 +2991,6 @@ use_initial_window:
 		bw = bbr->r_ctl.red_bw;
 	else
 		bw = get_filter_value(&bbr->r_ctl.rc_delrate);
-	if (bbr->rc_tp->t_peakrate_thr && (bbr->rc_use_google == 0)) {
-		/*
-		 * Enforce user set rate limit, keep in mind that
-		 * t_peakrate_thr is in B/s already
-		 */
-		bw = uqmin((uint64_t)bbr->rc_tp->t_peakrate_thr, bw);
-	}
 	if (bw == 0) {
 		/* We should not be at 0, go to the initial window then */
 		goto use_initial_window;
@@ -10071,9 +10064,6 @@ bbr_init(struct tcpcb *tp, void **ptr)
 	bbr->r_ctl.rc_initial_hptsi_bw = bbr_initial_bw_bps;
 	if (bbr_resends_use_tso)
 		bbr->rc_resends_use_tso = 1;
-#ifdef NETFLIX_PEAKRATE
-	tp->t_peakrate_thr = tp->t_maxpeakrate;
-#endif
 	if (tp->snd_una != tp->snd_max) {
 		/* Create a send map for the current outstanding data */
 		struct bbr_sendmap *rsm;
@@ -11668,20 +11658,10 @@ bbr_what_can_we_send(struct tcpcb *tp, struct tcp_bbr *bbr, uint32_t sendwin,
 	return (len);
 }
 
-static inline void
-bbr_do_error_accounting(struct tcpcb *tp, struct tcp_bbr *bbr, struct bbr_sendmap *rsm, int32_t len, int32_t error)
-{
-#ifdef NETFLIX_STATS
-	KMOD_TCPSTAT_INC(tcps_sndpack_error);
-	KMOD_TCPSTAT_ADD(tcps_sndbyte_error, len);
-#endif
-}
-
 static inline void
 bbr_do_send_accounting(struct tcpcb *tp, struct tcp_bbr *bbr, struct bbr_sendmap *rsm, int32_t len, int32_t error)
 {
 	if (error) {
-		bbr_do_error_accounting(tp, bbr, rsm, len, error);
 		return;
 	}
 	if (rsm) {
@@ -11690,10 +11670,8 @@ bbr_do_send_accounting(struct tcpcb *tp, struct tcp_bbr *bbr, struct bbr_sendmap
 			 * TLP should not count in retran count, but in its
 			 * own bin
 			 */
-#ifdef NETFLIX_STATS
 			KMOD_TCPSTAT_INC(tcps_tlpresends);
 			KMOD_TCPSTAT_ADD(tcps_tlpresend_bytes, len);
-#endif
 		} else {
 			/* Retransmit */
 			tp->t_sndrexmitpack++;
@@ -14206,9 +14184,6 @@ bbr_set_sockopt(struct inpcb *inp, struct sockopt *sopt)
 	case TCP_BBR_PACE_SEG_MIN:
 	case TCP_BBR_PACE_CROSS:
 	case TCP_BBR_PACE_OH:
-#ifdef NETFLIX_PEAKRATE
-	case TCP_MAXPEAKRATE:
-#endif
 	case TCP_BBR_TMR_PACE_OH:
 	case TCP_BBR_RACK_RTT_USE:
 	case TCP_BBR_RETRAN_WTSO:
@@ -14474,14 +14449,7 @@ bbr_set_sockopt(struct inpcb *inp, struct sockopt *sopt)
 		BBR_OPTS_INC(tcp_rack_pkt_delay);
 		bbr->r_ctl.rc_pkt_delay = optval;
 		break;
-#ifdef NETFLIX_PEAKRATE
-	case TCP_MAXPEAKRATE:
-		BBR_OPTS_INC(tcp_maxpeak);
-		error = tcp_set_maxpeakrate(tp, optval);
-		if (!error)
-			tp->t_peakrate_thr = tp->t_maxpeakrate;
-		break;
-#endif
+
 	case TCP_BBR_RETRAN_WTSO:
 		BBR_OPTS_INC(tcp_retran_wtso);
 		if (optval)
@@ -14553,9 +14521,7 @@ bbr_set_sockopt(struct inpcb *inp, struct sockopt *sopt)
 		return (tcp_default_ctloutput(inp, sopt));
 		break;
 	}
-#ifdef NETFLIX_STATS
 	tcp_log_socket_option(tp, sopt->sopt_name, optval, error);
-#endif
 	INP_WUNLOCK(inp);
 	return (error);
 }
diff --git a/sys/netinet/tcp_stacks/rack.c b/sys/netinet/tcp_stacks/rack.c
index 3fc0bb65bbf8..63d8c27e4c6d 100644
--- a/sys/netinet/tcp_stacks/rack.c
+++ b/sys/netinet/tcp_stacks/rack.c
@@ -746,18 +746,6 @@ rack_log_gpset(struct tcp_rack *rack, uint32_t seq_end, uint32_t ack_end_t,
 	}
 }
 
-#ifdef NETFLIX_PEAKRATE
-static inline void
-rack_update_peakrate_thr(struct tcpcb *tp)
-{
-	/* Keep in mind that t_maxpeakrate is in B/s. */
-	uint64_t peak;
-	peak = uqmax((tp->t_maxseg * 2),
-	    (((uint64_t)tp->t_maxpeakrate * (uint64_t)(tp->t_srtt)) / (uint64_t)HPTS_USEC_IN_SEC));
-	tp->t_peakrate_thr = (uint32_t)uqmin(peak, UINT32_MAX);
-}
-#endif
-
 static int
 sysctl_rack_clear(SYSCTL_HANDLER_ARGS)
 {
@@ -2346,15 +2334,6 @@ rack_get_bw(struct tcp_rack *rack)
 		return (rack_get_fixed_pacing_bw(rack));
 	}
 	bw = rack_get_gp_est(rack);
-#ifdef NETFLIX_PEAKRATE
-	if ((rack->rc_tp->t_maxpeakrate) &&
-	    (bw > rack->rc_tp->t_maxpeakrate)) {
-		/* The user has set a peak rate to pace at
-		 * don't allow us to pace faster than that.
-		 */
-		return (rack->rc_tp->t_maxpeakrate);
-	}
-#endif
 	return (bw);
 }
 
@@ -3187,7 +3166,7 @@ rack_log_to_prr(struct tcp_rack *rack, int frm, int orig_cwnd, int line)
 	}
 }
 
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 static void
 rack_log_sad(struct tcp_rack *rack, int event)
 {
@@ -3215,7 +3194,7 @@ rack_log_sad(struct tcp_rack *rack, int event)
 		TCP_LOG_EVENTP(rack->rc_tp, NULL,
 		    &rack->rc_inp->inp_socket->so_rcv,
 		    &rack->rc_inp->inp_socket->so_snd,
-		    TCP_SAD_DETECTION, 0,
+		    TCP_SAD_DETECT, 0,
 		    0, &log, false, &tv);
 	}
 }
@@ -3358,7 +3337,7 @@ rack_alloc_limit(struct tcp_rack *rack, uint8_t limit_type)
 				counter_u64_add(rack_alloc_limited_conns, 1);
 			}
 			return (NULL);
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 		} else if ((tcp_sad_limit != 0) &&
 			   (rack->do_detection == 1) &&
 			   (rack->r_ctl.rc_num_split_allocs >= tcp_sad_limit)) {
@@ -5274,18 +5253,6 @@ rack_ack_received(struct tcpcb *tp, struct tcp_rack *rack, uint32_t th_ack, uint
 		    rack_enough_for_measurement(tp, rack, th_ack, &quality)) {
 			/* Measure the Goodput */
 			rack_do_goodput_measurement(tp, rack, th_ack, __LINE__, quality);
-#ifdef NETFLIX_PEAKRATE
-			if ((type == CC_ACK) &&
-			    (tp->t_maxpeakrate)) {
-				/*
-				 * We update t_peakrate_thr. This gives us roughly
-				 * one update per round trip time. Note
-				 * it will only be used if pace_always is off i.e
-				 * we don't do this for paced flows.
-				 */
-				rack_update_peakrate_thr(tp);
-			}
-#endif
 		}
 		/* Which way our we limited, if not cwnd limited no advance in CA */
 		if (tp->snd_cwnd <= tp->snd_wnd)
@@ -5366,14 +5333,6 @@ rack_ack_received(struct tcpcb *tp, struct tcp_rack *rack, uint32_t th_ack, uint
 	if (rack->r_ctl.rc_rack_largest_cwnd < rack->r_ctl.cwnd_to_use) {
 		rack->r_ctl.rc_rack_largest_cwnd = rack->r_ctl.cwnd_to_use;
 	}
-#ifdef NETFLIX_PEAKRATE
-	/* we enforce max peak rate if it is set and we are not pacing */
-	if ((rack->rc_always_pace == 0) &&
-	    tp->t_peakrate_thr &&
-	    (tp->snd_cwnd > tp->t_peakrate_thr)) {
-		tp->snd_cwnd = tp->t_peakrate_thr;
-	}
-#endif
 }
 
 static void
@@ -5926,11 +5885,6 @@ rack_cc_after_idle(struct tcp_rack *rack, struct tcpcb *tp)
 
 	INP_WLOCK_ASSERT(tptoinpcb(tp));
 
-#ifdef NETFLIX_STATS
-	KMOD_TCPSTAT_INC(tcps_idle_restarts);
-	if (tp->t_state == TCPS_ESTABLISHED)
-		KMOD_TCPSTAT_INC(tcps_idle_estrestarts);
-#endif
 	if (CC_ALGO(tp)->after_idle != NULL)
 		CC_ALGO(tp)->after_idle(&tp->t_ccv);
 
@@ -6744,7 +6698,7 @@ rack_start_hpts_timer(struct tcp_rack *rack, struct tcpcb *tp, uint32_t cts,
 		}
 	}
 	hpts_timeout = rack_timer_start(tp, rack, cts, sup_rack);
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 	if (rack->sack_attack_disable &&
 	    (rack->r_ctl.ack_during_sd > 0) &&
 	    (slot < tcp_sad_pacing_interval)) {
@@ -7662,7 +7616,7 @@ rack_remxt_tmr(struct tcpcb *tp)
 	rack_log_to_prr(rack, 6, 0, __LINE__);
 	rack->r_timer_override = 1;
 	if ((((tp->t_flags & TF_SACK_PERMIT) == 0)
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 	    || (rack->sack_attack_disable != 0)
 #endif
 		    ) && ((tp->t_flags & TF_SENTFIN) == 0)) {
@@ -9343,7 +9297,7 @@ rack_proc_sack_blk(struct tcpcb *tp, struct tcp_rack *rack, struct sackblk *sack
 	int insret __diagused;
 	int32_t used_ref = 1;
 	int moved = 0;
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 	int allow_segsiz;
 	int first_time_through = 1;
 #endif
@@ -9353,7 +9307,8 @@ rack_proc_sack_blk(struct tcpcb *tp, struct tcp_rack *rack, struct sackblk *sack
 	start = sack->start;
 	end = sack->end;
 	rsm = *prsm;
-#ifdef NETFLIX_EXP_DETECTION
+
+#ifdef TCP_SAD_DETECTION
 	/*
 	 * There are a strange number of proxys and meddle boxes in the world
 	 * that seem to cut up segments on different boundaries. This gets us
@@ -9384,7 +9339,7 @@ do_rest_ofb:
 		/* TSNH */
 		goto out;
 	}
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 	/* Now we must check for suspicous activity */
 	if ((first_time_through == 1) &&
 	    ((end - start) < min((rsm->r_end - rsm->r_start), allow_segsiz)) &&
@@ -10252,7 +10207,7 @@ rack_do_decay(struct tcp_rack *rack)
 		 * Current default is 800 so it decays
 		 * 80% every second.
 		 */
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 		uint32_t pkt_delta;
 
 		pkt_delta = rack->r_ctl.input_pkt - rack->r_ctl.saved_input_pkt;
@@ -10261,7 +10216,7 @@ rack_do_decay(struct tcp_rack *rack)
 		rack->r_ctl.saved_input_pkt = rack->r_ctl.input_pkt;
 		rack->r_ctl.rc_last_time_decay = rack->r_ctl.act_rcv_time;
 		/* Now do we escape without decay? */
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 		if (rack->rc_in_persist ||
 		    (rack->rc_tp->snd_max == rack->rc_tp->snd_una) ||
 		    (pkt_delta < tcp_sad_low_pps)){
@@ -10706,7 +10661,7 @@ rack_handle_might_revert(struct tcpcb *tp, struct tcp_rack *rack)
 	}
 }
 
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 
 static void
 rack_merge_out_sacks(struct tcp_rack *rack)
@@ -11384,7 +11339,7 @@ out_with_totals:
 		counter_u64_add(rack_move_some, 1);
 	}
 out:
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 	rack_do_detection(tp, rack, BYTES_THIS_ACK(tp, th), ctf_fixed_maxseg(rack->rc_tp));
 #endif
 	if (changed) {
@@ -14275,9 +14230,6 @@ rack_set_pace_segments(struct tcpcb *tp, struct tcp_rack *rack, uint32_t line, u
 		}
 	} else if (rack->rc_always_pace) {
 		if (rack->r_ctl.gp_bw ||
-#ifdef NETFLIX_PEAKRATE
-		    rack->rc_tp->t_maxpeakrate ||
-#endif
 		    rack->r_ctl.init_rate) {
 			/* We have a rate of some sort set */
 			uint32_t orig;
@@ -15034,7 +14986,7 @@ rack_init(struct tcpcb *tp, void **ptr)
 		rack->rack_hdw_pace_ena = 1;
 	if (rack_hw_rate_caps)
 		rack->r_rack_hw_rate_caps = 1;
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 	rack->do_detection = 1;
 #else
 	rack->do_detection = 0;
@@ -15604,7 +15556,7 @@ rack_log_input_packet(struct tcpcb *tp, struct tcp_rack *rack, struct tcp_ackent
 		uint32_t orig_snd_una;
 		uint8_t xx = 0;
 
-#ifdef NETFLIX_HTTP_LOGGING
+#ifdef TCP_REQUEST_TRK
 		struct http_sendfile_track *http_req;
 
 		if (SEQ_GT(ae->ack, tp->snd_una)) {
@@ -15651,7 +15603,7 @@ rack_log_input_packet(struct tcpcb *tp, struct tcp_rack *rack, struct tcp_ackent
 		log.u_bbr.timeStamp = tcp_get_usecs(&ltv);
 		/* Log the rcv time */
 		log.u_bbr.delRate = ae->timestamp;
-#ifdef NETFLIX_HTTP_LOGGING
+#ifdef TCP_REQUEST_TRK
 		log.u_bbr.applimited = tp->t_http_closed;
 		log.u_bbr.applimited <<= 8;
 		log.u_bbr.applimited |= tp->t_http_open;
@@ -16163,7 +16115,7 @@ rack_do_compressed_ack_processing(struct tcpcb *tp, struct socket *so, struct mb
 	}
 	if (acked > sbavail(&so->so_snd))
 		acked_amount = sbavail(&so->so_snd);
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 	/*
 	 * We only care on a cum-ack move if we are in a sack-disabled
 	 * state. We have already added in to the ack_count, and we never
@@ -16641,7 +16593,7 @@ rack_do_segment_nounlock(struct mbuf *m, struct tcphdr *th, struct socket *so,
 	if (tcp_bblogging_on(rack->rc_tp)) {
 		union tcp_log_stackspecific log;
 		struct timeval ltv;
-#ifdef NETFLIX_HTTP_LOGGING
+#ifdef TCP_REQUEST_TRK
 		struct http_sendfile_track *http_req;
 
 		if (SEQ_GT(th->th_ack, tp->snd_una)) {
@@ -16687,7 +16639,7 @@ rack_do_segment_nounlock(struct mbuf *m, struct tcphdr *th, struct socket *so,
 		log.u_bbr.timeStamp = tcp_get_usecs(&ltv);
 		/* Log the rcv time */
 		log.u_bbr.delRate = m->m_pkthdr.rcv_tstmp;
-#ifdef NETFLIX_HTTP_LOGGING
+#ifdef TCP_REQUEST_TRK
 		log.u_bbr.applimited = tp->t_http_closed;
 		log.u_bbr.applimited <<= 8;
 		log.u_bbr.applimited |= tp->t_http_open;
@@ -17474,9 +17426,6 @@ rack_get_pacing_delay(struct tcp_rack *rack, struct tcpcb *tp, uint32_t len, str
 	if (rack->use_fixed_rate) {
 		rate_wanted = bw_est = rack_get_fixed_pacing_bw(rack);
 	} else if ((rack->r_ctl.init_rate == 0) &&
-#ifdef NETFLIX_PEAKRATE
-		   (rack->rc_tp->t_maxpeakrate == 0) &&
-#endif
 		   (rack->r_ctl.gp_bw == 0)) {
 		/* no way to yet do an estimate */
 		bw_est = rate_wanted = 0;
@@ -17717,9 +17666,6 @@ rack_get_pacing_delay(struct tcp_rack *rack, struct tcpcb *tp, uint32_t len, str
 done_w_hdwr:
 		if (rack_limit_time_with_srtt &&
 		    (rack->use_fixed_rate == 0) &&
-#ifdef NETFLIX_PEAKRATE
-		    (rack->rc_tp->t_maxpeakrate == 0) &&
-#endif
 		    (rack->rack_hdrw_pacing == 0)) {
 			/*
 			 * Sanity check, we do not allow the pacing delay
@@ -23043,9 +22989,6 @@ rack_process_option(struct tcpcb *tp, struct tcp_rack *rack, int sopt_name,
 			snt = 0;
 		if ((snt < win) &&
 		    (tp->t_srtt |
-#ifdef NETFLIX_PEAKRATE
-		     tp->t_maxpeakrate |
-#endif
 		     rack->r_ctl.init_rate)) {
 			/*
 			 * We are not past the initial window
@@ -23324,9 +23267,7 @@ rack_process_option(struct tcpcb *tp, struct tcp_rack *rack, int sopt_name,
 	default:
 		break;
 	}
-#ifdef NETFLIX_STATS
 	tcp_log_socket_option(tp, sopt_name, optval, error);
-#endif
 	return (error);
 }
 
@@ -23668,9 +23609,9 @@ rack_fill_info(struct tcpcb *tp, struct tcp_info *ti)
 	ti->tcpi_snd_rexmitpack = tp->t_sndrexmitpack;
 	ti->tcpi_rcv_ooopack = tp->t_rcvoopack;
 	ti->tcpi_snd_zerowin = tp->t_sndzerowin;
-#ifdef NETFLIX_STATS
 	ti->tcpi_total_tlp = tp->t_sndtlppack;
 	ti->tcpi_total_tlp_bytes = tp->t_sndtlpbyte;
+#ifdef NETFLIX_STATS
 	memcpy(&ti->tcpi_rxsyninfo, &tp->t_rxsyninfo, sizeof(struct tcpsyninfo));
 #endif
 #ifdef TCP_OFFLOAD
diff --git a/sys/netinet/tcp_subr.c b/sys/netinet/tcp_subr.c
index fcd430f270f3..80202bc3a416 100644
--- a/sys/netinet/tcp_subr.c
+++ b/sys/netinet/tcp_subr.c
@@ -143,7 +143,7 @@ VNET_DEFINE(int, tcp_mssdflt) = TCP_MSS;
 VNET_DEFINE(int, tcp_v6mssdflt) = TCP6_MSS;
 #endif
 
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 /* Sack attack detection thresholds and such */
 SYSCTL_NODE(_net_inet_tcp, OID_AUTO, sack_attack,
     CTLFLAG_RW | CTLFLAG_MPSAFE, 0,
@@ -154,11 +154,6 @@ SYSCTL_INT(_net_inet_tcp_sack_attack, OID_AUTO, force_detection,
     &tcp_force_detection, 0,
     "Do we force detection even if the INP has it off?");
 int32_t tcp_sad_limit = 10000;
-SYSCTL_INT(_net_inet_tcp_sack_attack, OID_AUTO, limit,
-    CTLFLAG_RW,
-    &tcp_sad_limit, 10000,
-    "If SaD is enabled, what is the limit to sendmap entries (0 = unlimited)?");
-int32_t tcp_sad_limit = 10000;
 SYSCTL_INT(_net_inet_tcp_sack_attack, OID_AUTO, limit,
     CTLFLAG_RW,
     &tcp_sad_limit, 10000,
@@ -4579,3 +4574,22 @@ tcp_http_alloc_req(struct tcpcb *tp, union tcp_log_userdata *user, uint64_t ts)
 	(void)tcp_http_alloc_req_full(tp, &user->http_req, ts, 1);
 }
 #endif
+
+void
+tcp_log_socket_option(struct tcpcb *tp, uint32_t option_num, uint32_t option_val, int err)
+{
+	if (tcp_bblogging_on(tp)) {
+		struct tcp_log_buffer *l;
+
+		l = tcp_log_event(tp, NULL,
+		        &tptosocket(tp)->so_rcv,
+		        &tptosocket(tp)->so_snd,
+		        TCP_LOG_SOCKET_OPT,
+		        err, 0, NULL, 1,
+		        NULL, NULL, 0, NULL);
+		if (l) {
+			l->tlb_flex1 = option_num;
+			l->tlb_flex2 = option_val;
+		}
+	}
+}
diff --git a/sys/netinet/tcp_usrreq.c b/sys/netinet/tcp_usrreq.c
index 7abf4c215102..f27810e14f0d 100644
--- a/sys/netinet/tcp_usrreq.c
+++ b/sys/netinet/tcp_usrreq.c
@@ -1710,7 +1710,6 @@ tcp_ctloutput_set(struct inpcb *inp, struct sockopt *sopt)
 			 * Ensure the new stack takes ownership with a
 			 * clean slate on peak rate threshold.
 			 */
-			tp->t_peakrate_thr = 0;
 #ifdef TCPHPTS
 			/* Assure that we are not on any hpts */
 			tcp_hpts_remove(tptoinpcb(tp));
diff --git a/sys/netinet/tcp_var.h b/sys/netinet/tcp_var.h
index a86c52ad90a0..6018e84bfe64 100644
--- a/sys/netinet/tcp_var.h
+++ b/sys/netinet/tcp_var.h
@@ -332,7 +332,6 @@ struct tcpcb {
 	tcp_seq	snd_up;			/* send urgent pointer */
 	uint32_t snd_wnd;		/* send window */
 	uint32_t snd_cwnd;		/* congestion-controlled window */
-	uint32_t t_peakrate_thr;	/* pre-calculated peak rate threshold */
 	uint32_t ts_offset;		/* our timestamp offset */
 	uint32_t rfbuf_ts;		/* recv buffer autoscaling timestamp */
 	int	rcv_numsacks;		/* # distinct sack blks present */
@@ -1086,6 +1085,16 @@ struct tcpstat {
 	uint64_t tcps_ecn_sndect0;	/* ECN Capable Transport */
 	uint64_t tcps_ecn_sndect1;	/* ECN Capable Transport */
 
+	/*
+	 * BBR and Rack implement TLP's these values count TLP bytes in
+	 * two catagories, bytes that were retransmitted and bytes that
+	 * were newly transmited. Both types can serve as TLP's but they
+	 * are accounted differently.
+	 */
+	uint64_t tcps_tlpresends;	/* number of tlp resends */
+	uint64_t tcps_tlpresend_bytes;	/* number of bytes resent by tlp */
+
+
 	uint64_t _pad[4];		/* 4 TBD placeholder for STABLE */
 };
 
@@ -1390,6 +1399,9 @@ struct tcp_function_block *
 find_and_ref_tcp_fb(struct tcp_function_block *fs);
 int tcp_default_ctloutput(struct inpcb *inp, struct sockopt *sopt);
 int tcp_ctloutput_set(struct inpcb *inp, struct sockopt *sopt);
+void tcp_log_socket_option(struct tcpcb *tp, uint32_t option_num,
+	uint32_t option_val, int err);
+
 
 extern counter_u64_t tcp_inp_lro_direct_queue;
 extern counter_u64_t tcp_inp_lro_wokeup_queue;
@@ -1401,7 +1413,7 @@ extern counter_u64_t tcp_comp_total;
 extern counter_u64_t tcp_uncomp_total;
 extern counter_u64_t tcp_bad_csums;
 
-#ifdef NETFLIX_EXP_DETECTION
+#ifdef TCP_SAD_DETECTION
 /* Various SACK attack thresholds */
 extern int32_t tcp_force_detection;
 extern int32_t tcp_sad_limit;
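
Note on the new tcp_log_socket_option() helper in tcp_subr.c: its pattern is to do nothing unless black-box logging is on for the connection, then allocate one log record and store the option number and value in its two flex fields. The following is a minimal userland sketch of that pattern, not kernel code. Every sketch_* name, the fixed-size log array, and SKETCH_LOG_SLOTS are hypothetical stand-ins for struct tcpcb, struct tcp_log_buffer, tcp_bblogging_on() and tcp_log_event(), invented here only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LOG_SLOTS 8	/* hypothetical capacity, not a kernel value */

/* Hypothetical stand-in for the fields of struct tcp_log_buffer used here. */
struct sketch_log_entry {
	int	 err;	/* error code returned by the option handler */
	uint32_t flex1;	/* option number (tlb_flex1 in the commit) */
	uint32_t flex2;	/* option value (tlb_flex2 in the commit) */
};

/* Hypothetical stand-in for the per-connection state. */
struct sketch_conn {
	bool	logging_on;	/* models tcp_bblogging_on(tp) */
	struct sketch_log_entry log[SKETCH_LOG_SLOTS];
	size_t	nlogged;
};

/* Models tcp_log_event(): hand out a record, or NULL when none is free. */
static struct sketch_log_entry *
sketch_log_event(struct sketch_conn *c, int err)
{
	struct sketch_log_entry *l;

	if (c->nlogged >= SKETCH_LOG_SLOTS)
		return (NULL);
	l = &c->log[c->nlogged++];
	l->err = err;
	l->flex1 = 0;
	l->flex2 = 0;
	return (l);
}

/* Same shape as the committed helper: gate on logging, then fill flex1/flex2. */
static void
sketch_log_socket_option(struct sketch_conn *c, uint32_t option_num,
    uint32_t option_val, int err)
{
	struct sketch_log_entry *l;

	assert(c != NULL);
	if (!c->logging_on)
		return;
	l = sketch_log_event(c, err);
	if (l != NULL) {
		l->flex1 = option_num;
		l->flex2 = option_val;
	}
}
```

A caller would invoke sketch_log_socket_option() once at the end of its option handler, after the option has been processed, which mirrors where bbr_set_sockopt() and rack_process_option() call the real helper in the diff above. Because the helper checks the logging gate itself, the call sites no longer need an #ifdef around them.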