Date: Mon, 19 Mar 2018 16:37:48 +0000 (UTC) From: Lawrence Stewart <lstewart@FreeBSD.org> To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r331214 - in head: share/man/man4 sys/netinet/cc Message-ID: <201803191637.w2JGbmON093556@repo.freebsd.org>
Author: lstewart Date: Mon Mar 19 16:37:47 2018 New Revision: 331214 URL: https://svnweb.freebsd.org/changeset/base/331214 Log: Add support for the experimental Internet-Draft "TCP Alternative Backoff with ECN (ABE)" proposal to the New Reno congestion control algorithm module. ABE reduces the amount of congestion window reduction in response to ECN-signalled congestion relative to the loss-inferred congestion response. More details about ABE can be found in the Internet-Draft: https://tools.ietf.org/html/draft-ietf-tcpm-alternativebackoff-ecn The implementation introduces four new sysctls: - net.inet.tcp.cc.abe defaults to 0 (disabled) and can be set to non-zero to enable ABE for ECN-enabled TCP connections. - net.inet.tcp.cc.newreno.beta and net.inet.tcp.cc.newreno.beta_ecn set the multiplicative window decrease factor, specified as a percentage, applied to the congestion window in response to a loss-based or ECN-based congestion signal respectively. They default to the values specified in the draft i.e. beta=50 and beta_ecn=80. - net.inet.tcp.cc.abe_frlossreduce defaults to 0 (disabled) and can be set to non-zero to enable the use of standard beta (50% by default) when repairing loss during an ECN-signalled congestion recovery episode. It enables a more conservative congestion response and is provided for the purposes of experimentation as a result of some discussion at IETF 100 in Singapore. The values of beta and beta_ecn can also be set per-connection by way of the TCP_CCALGOOPT TCP-level socket option and the new CC_NEWRENO_BETA or CC_NEWRENO_BETA_ECN CC algo sub-options. 
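The window arithmetic described in the log can be sketched in userland C as follows. This is a minimal illustration, not the kernel code: the names signal_type, pick_factor and reduce_window are hypothetical, and the sketch assumes the result is rounded down to whole segments with a floor of two segments, as the diff to newreno_cong_signal() further down does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical signal types for illustration; the kernel uses CC_NDUPACK and CC_ECN. */
enum signal_type { SIGNAL_LOSS, SIGNAL_ECN };

/*
 * Choose the multiplicative decrease factor (a percentage).
 * beta_ecn is used only for ECN signals while ABE is enabled;
 * every other case falls back to the standard beta.
 */
static uint32_t
pick_factor(enum signal_type type, int abe_enabled, uint32_t beta,
    uint32_t beta_ecn)
{
	if (abe_enabled && type == SIGNAL_ECN)
		return (beta_ecn);
	return (beta);
}

/*
 * Scale cwnd by factor/100, round down to a whole number of
 * segments, and never go below two segments.
 */
static uint32_t
reduce_window(uint32_t cwnd, uint32_t mss, uint32_t factor)
{
	uint64_t segs;

	assert(mss > 0);
	assert(factor >= 1 && factor <= 100);

	/* 64-bit intermediate so cwnd * factor cannot overflow. */
	segs = ((uint64_t)cwnd * factor) / (100ULL * mss);
	if (segs < 2)
		segs = 2;
	return ((uint32_t)(segs * mss));
}
```

With the default values, a window of 100 segments of 1460 bytes (146000 bytes) shrinks to 73000 bytes on a loss signal (beta=50) and to 116800 bytes on an ECN signal when ABE is enabled (beta_ecn=80).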
Submitted by: Tom Jones <tj@enoti.me> Tested by: Tom Jones <tj@enoti.me>, Grenville Armitage <garmitage@swin.edu.au> Relnotes: Yes Differential Revision: https://reviews.freebsd.org/D11616 Added: head/sys/netinet/cc/cc_newreno.h (contents, props changed) Modified: head/share/man/man4/cc_newreno.4 head/share/man/man4/mod_cc.4 head/sys/netinet/cc/cc.c head/sys/netinet/cc/cc.h head/sys/netinet/cc/cc_newreno.c Modified: head/share/man/man4/cc_newreno.4 ============================================================================== --- head/share/man/man4/cc_newreno.4 Mon Mar 19 16:17:10 2018 (r331213) +++ head/share/man/man4/cc_newreno.4 Mon Mar 19 16:37:47 2018 (r331214) @@ -30,17 +30,69 @@ .\" .\" $FreeBSD$ .\" -.Dd September 15, 2011 +.Dd March 19, 2018 .Dt CC_NEWRENO 4 .Os .Sh NAME .Nm cc_newreno .Nd NewReno Congestion Control Algorithm +.Sh SYNOPSIS +.In netinet/cc/cc_newreno.h .Sh DESCRIPTION The NewReno congestion control algorithm is the default for TCP. Details about the algorithm can be found in RFC5681. +.Sh Socket Options +The +.Nm +module supports a number of socket options under TCP_CCALGOOPT (refer to +.Xr tcp 4 +and +.Xr moc_cc 9 for details) +which can +be set with +.Xr setsockopt 2 +and tested with +.Xr getsockopt 2 . +The +.Nm +socket options use this structure defined in +<sys/netinet/cc/cc_newreno.h>: +.Bd -literal +struct cc_newreno_opts { + int name; + uint32_t val; +} +.Ed +.Bl -tag -width ".Va CC_NEWRENO_BETA_ECN" +.It Va CC_NEWRENO_BETA +Multiplicative window decrease factor, specified as a percentage, applied to +the congestion window in response to a congestion signal per: cwnd = (cwnd * +CC_NEWRENO_BETA) / 100. +Default is 50. +.It Va CC_NEWRENO_BETA_ECN +Multiplicative window decrease factor, specified as a percentage, applied to +the congestion window in response to an ECN congestion signal when +.Va net.inet.tcp.cc.abe=1 +per: cwnd = (cwnd * CC_NEWRENO_BETA_ECN) / 100. +Default is 80. 
.Sh MIB Variables -There are currently no tunable MIB variables. +The algorithm exposes these variables in the +.Va net.inet.tcp.cc.newreno +branch of the +.Xr sysctl 3 +MIB: +.Bl -tag -width ".Va beta_ecn" +.It Va beta +Multiplicative window decrease factor, specified as a percentage, applied to +the congestion window in response to a congestion signal per: cwnd = (cwnd * +beta) / 100. +Default is 50. +.It Va beta_ecn +Multiplicative window decrease factor, specified as a percentage, applied to +the congestion window in response to an ECN congestion signal when +.Va net.inet.tcp.cc.abe=1 +per: cwnd = (cwnd * beta_ecn) / 100. +Default is 80. .Sh SEE ALSO .Xr cc_chd 4 , .Xr cc_cubic 4 , @@ -50,6 +102,24 @@ There are currently no tunable MIB variables. .Xr mod_cc 4 , .Xr tcp 4 , .Xr mod_cc 9 +.Rs +.%A "Mark Allman" +.%A "Vern Paxson" +.%A "Ethan Blanton" +.%T "TCP Congestion Control" +.%O "RFC 5681" +.Re +.Rs +.%A "Naeem Khademi" +.%A "Michael Welzl" +.%A "Grenville Armitage" +.%A "Gorry Fairhurst" +.%T "TCP Alternative Backoff with ECN (ABE)" +.%R "internet draft" +.%D "February 2018" +.%N "draft-ietf-tcpm-alternativebackoff-ecn" +.%O "work in progress" +.Re .Sh ACKNOWLEDGEMENTS Development and testing of this software were made possible in part by grants from the FreeBSD Foundation and Cisco University Research Program Fund at @@ -77,6 +147,9 @@ congestion control module was written by .An Lawrence Stewart Aq Mt lstewart@FreeBSD.org and .An David Hayes Aq Mt david.hayes@ieee.org . +.Pp +Support for TCP ABE was added by +.An Tom Jones Aq Mt tj@enoti.me . .Pp This manual page was written by .An Lawrence Stewart Aq Mt lstewart@FreeBSD.org . 
Modified: head/share/man/man4/mod_cc.4 ============================================================================== --- head/share/man/man4/mod_cc.4 Mon Mar 19 16:17:10 2018 (r331213) +++ head/share/man/man4/mod_cc.4 Mon Mar 19 16:37:47 2018 (r331214) @@ -30,7 +30,7 @@ .\" .\" $FreeBSD$ .\" -.Dd January 21, 2016 +.Dd March 19, 2018 .Dt MOD_CC 4 .Os .Sh NAME @@ -73,7 +73,7 @@ The framework exposes the following variables in the branch of the .Xr sysctl 3 MIB: -.Bl -tag -width ".Va available" +.Bl -tag -width ".Va abe_frlossreduce" .It Va available Read-only list of currently available congestion control algorithms by name. .It Va algorithm @@ -83,6 +83,15 @@ When attempting to change the default algorithm, this one of the names listed by the .Va net.inet.tcp.cc.available MIB variable. +.It Va abe +Enable support for draft-ietf-tcpm-alternativebackoff-ecn, +which alters the window decrease factor applied to the congestion window in +response to an ECN congestion signal. +Refer to individual congestion control man pages to determine if they implement +support for ABE and for configuration details. +.It Va abe_frlossreduce +If non-zero, apply standard beta instead of ABE-beta during ECN-signalled +congestion recovery episodes if loss also needs to be repaired. 
.El .Sh SEE ALSO .Xr cc_cdg 4 , Modified: head/sys/netinet/cc/cc.c ============================================================================== --- head/sys/netinet/cc/cc.c Mon Mar 19 16:17:10 2018 (r331213) +++ head/sys/netinet/cc/cc.c Mon Mar 19 16:37:47 2018 (r331214) @@ -327,3 +327,14 @@ SYSCTL_PROC(_net_inet_tcp_cc, OID_AUTO, algorithm, SYSCTL_PROC(_net_inet_tcp_cc, OID_AUTO, available, CTLTYPE_STRING|CTLFLAG_RD, NULL, 0, cc_list_available, "A", "List available congestion control algorithms"); + +VNET_DEFINE(int, cc_do_abe) = 0; +SYSCTL_INT(_net_inet_tcp_cc, OID_AUTO, abe, CTLFLAG_VNET | CTLFLAG_RW, + &VNET_NAME(cc_do_abe), 0, + "Enable draft-ietf-tcpm-alternativebackoff-ecn (TCP Alternative Backoff with ECN)"); + +VNET_DEFINE(int, cc_abe_frlossreduce) = 0; +SYSCTL_INT(_net_inet_tcp_cc, OID_AUTO, abe_frlossreduce, CTLFLAG_VNET | CTLFLAG_RW, + &VNET_NAME(cc_abe_frlossreduce), 0, + "Apply standard beta instead of ABE-beta during ECN-signalled congestion " + "recovery episodes if loss also needs to be repaired"); Modified: head/sys/netinet/cc/cc.h ============================================================================== --- head/sys/netinet/cc/cc.h Mon Mar 19 16:17:10 2018 (r331213) +++ head/sys/netinet/cc/cc.h Mon Mar 19 16:37:47 2018 (r331214) @@ -64,6 +64,12 @@ extern struct cc_algo newreno_cc_algo; VNET_DECLARE(struct cc_algo *, default_cc_ptr); #define V_default_cc_ptr VNET(default_cc_ptr) +VNET_DECLARE(int, cc_do_abe); +#define V_cc_do_abe VNET(cc_do_abe) + +VNET_DECLARE(int, cc_abe_frlossreduce); +#define V_cc_abe_frlossreduce VNET(cc_abe_frlossreduce) + /* Define the new net.inet.tcp.cc sysctl tree. 
*/ SYSCTL_DECL(_net_inet_tcp_cc); Modified: head/sys/netinet/cc/cc_newreno.c ============================================================================== --- head/sys/netinet/cc/cc_newreno.c Mon Mar 19 16:17:10 2018 (r331213) +++ head/sys/netinet/cc/cc_newreno.c Mon Mar 19 16:37:47 2018 (r331214) @@ -3,7 +3,7 @@ * * Copyright (c) 1982, 1986, 1988, 1990, 1993, 1994, 1995 * The Regents of the University of California. - * Copyright (c) 2007-2008,2010 + * Copyright (c) 2007-2008,2010,2014 * Swinburne University of Technology, Melbourne, Australia. * Copyright (c) 2009-2010 Lawrence Stewart <lstewart@freebsd.org> * Copyright (c) 2010 The FreeBSD Foundation @@ -48,6 +48,11 @@ * University Research Program Fund at Community Foundation Silicon Valley. * More details are available at: * http://caia.swin.edu.au/urp/newtcp/ + * + * Dec 2014 garmitage@swin.edu.au + * Borrowed code fragments from cc_cdg.c to add modifiable beta + * via sysctls. + * */ #include <sys/cdefs.h> @@ -69,20 +74,54 @@ __FBSDID("$FreeBSD$"); #include <netinet/tcp_var.h> #include <netinet/cc/cc.h> #include <netinet/cc/cc_module.h> +#include <netinet/cc/cc_newreno.h> +static MALLOC_DEFINE(M_NEWRENO, "newreno data", + "newreno beta values"); + +#define CAST_PTR_INT(X) (*((int*)(X))) + +static int newreno_cb_init(struct cc_var *ccv); static void newreno_ack_received(struct cc_var *ccv, uint16_t type); static void newreno_after_idle(struct cc_var *ccv); static void newreno_cong_signal(struct cc_var *ccv, uint32_t type); static void newreno_post_recovery(struct cc_var *ccv); +static int newreno_ctl_output(struct cc_var *ccv, struct sockopt *sopt, void *buf); +static VNET_DEFINE(uint32_t, newreno_beta) = 50; +static VNET_DEFINE(uint32_t, newreno_beta_ecn) = 80; +#define V_newreno_beta VNET(newreno_beta) +#define V_newreno_beta_ecn VNET(newreno_beta_ecn) + struct cc_algo newreno_cc_algo = { .name = "newreno", + .cb_init = newreno_cb_init, .ack_received = newreno_ack_received, .after_idle = 
newreno_after_idle, .cong_signal = newreno_cong_signal, .post_recovery = newreno_post_recovery, + .ctl_output = newreno_ctl_output, }; +struct newreno { + uint32_t beta; + uint32_t beta_ecn; +}; + +int +newreno_cb_init(struct cc_var *ccv) +{ + struct newreno *nreno; + + nreno = malloc(sizeof(struct newreno), M_NEWRENO, M_NOWAIT|M_ZERO); + if (nreno != NULL) { + nreno->beta = V_newreno_beta; + nreno->beta_ecn = V_newreno_beta_ecn; + } + + return (0); +} + static void newreno_ack_received(struct cc_var *ccv, uint16_t type) { @@ -184,27 +223,48 @@ newreno_after_idle(struct cc_var *ccv) static void newreno_cong_signal(struct cc_var *ccv, uint32_t type) { - u_int win; + struct newreno *nreno; + uint32_t cwin, factor; + u_int mss; + factor = V_newreno_beta; + nreno = ccv->cc_data; + if (nreno != NULL) { + if (V_cc_do_abe) + factor = (type == CC_ECN ? nreno->beta_ecn: nreno->beta); + else + factor = nreno->beta; + } + + cwin = CCV(ccv, snd_cwnd); + mss = CCV(ccv, t_maxseg); + /* Catch algos which mistakenly leak private signal types. 
*/ KASSERT((type & CC_SIGPRIVMASK) == 0, ("%s: congestion signal type 0x%08x is private\n", __func__, type)); - win = max(CCV(ccv, snd_cwnd) / 2 / CCV(ccv, t_maxseg), 2) * - CCV(ccv, t_maxseg); + cwin = max(((uint64_t)cwin * (uint64_t)factor) / (100ULL * (uint64_t)mss), + 2) * mss; switch (type) { case CC_NDUPACK: if (!IN_FASTRECOVERY(CCV(ccv, t_flags))) { + if (IN_CONGRECOVERY(CCV(ccv, t_flags) && + V_cc_do_abe && V_cc_abe_frlossreduce)) { + CCV(ccv, snd_ssthresh) = + ((uint64_t)CCV(ccv, snd_ssthresh) * + (uint64_t)nreno->beta) / + (100ULL * (uint64_t)nreno->beta_ecn); + } if (!IN_CONGRECOVERY(CCV(ccv, t_flags))) - CCV(ccv, snd_ssthresh) = win; + CCV(ccv, snd_ssthresh) = cwin; ENTER_RECOVERY(CCV(ccv, t_flags)); } break; case CC_ECN: if (!IN_CONGRECOVERY(CCV(ccv, t_flags))) { - CCV(ccv, snd_ssthresh) = win; - CCV(ccv, snd_cwnd) = win; + CCV(ccv, snd_ssthresh) = cwin; + CCV(ccv, snd_cwnd) = cwin; ENTER_CONGRECOVERY(CCV(ccv, t_flags)); } break; @@ -242,5 +302,75 @@ newreno_post_recovery(struct cc_var *ccv) } } +int +newreno_ctl_output(struct cc_var *ccv, struct sockopt *sopt, void *buf) +{ + struct newreno *nreno; + struct cc_newreno_opts *opt; + + if (sopt->sopt_valsize != sizeof(struct cc_newreno_opts)) + return (EMSGSIZE); + + nreno = ccv->cc_data; + opt = buf; + + switch (sopt->sopt_dir) { + case SOPT_SET: + switch (opt->name) { + case CC_NEWRENO_BETA: + nreno->beta = opt->val; + break; + case CC_NEWRENO_BETA_ECN: + if (!V_cc_do_abe) + return (EACCES); + nreno->beta_ecn = opt->val; + break; + default: + return (ENOPROTOOPT); + } + case SOPT_GET: + switch (opt->name) { + case CC_NEWRENO_BETA: + opt->val = nreno->beta; + break; + case CC_NEWRENO_BETA_ECN: + opt->val = nreno->beta_ecn; + break; + default: + return (ENOPROTOOPT); + } + default: + return (EINVAL); + } + + return (0); +} + +static int +newreno_beta_handler(SYSCTL_HANDLER_ARGS) +{ + if (req->newptr != NULL ) { + if (arg1 == &VNET_NAME(newreno_beta_ecn) && !V_cc_do_abe) + return (EACCES); + if 
(CAST_PTR_INT(req->newptr) <= 0 || CAST_PTR_INT(req->newptr) > 100) + return (EINVAL); + } + + return (sysctl_handle_int(oidp, arg1, arg2, req)); +} + +SYSCTL_DECL(_net_inet_tcp_cc_newreno); +SYSCTL_NODE(_net_inet_tcp_cc, OID_AUTO, newreno, CTLFLAG_RW, NULL, + "New Reno related settings"); + +SYSCTL_PROC(_net_inet_tcp_cc_newreno, OID_AUTO, beta, + CTLFLAG_VNET | CTLTYPE_UINT | CTLFLAG_RW, + &VNET_NAME(newreno_beta), 3, &newreno_beta_handler, "IU", + "New Reno beta, specified as number between 1 and 100"); + +SYSCTL_PROC(_net_inet_tcp_cc_newreno, OID_AUTO, beta_ecn, + CTLFLAG_VNET | CTLTYPE_UINT | CTLFLAG_RW, + &VNET_NAME(newreno_beta_ecn), 3, &newreno_beta_handler, "IU", + "New Reno beta ecn, specified as number between 1 and 100"); DECLARE_CC_MODULE(newreno, &newreno_cc_algo); Added: head/sys/netinet/cc/cc_newreno.h ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ head/sys/netinet/cc/cc_newreno.h Mon Mar 19 16:37:47 2018 (r331214) @@ -0,0 +1,42 @@ +/*- + * Copyright (c) 2017 Tom Jones <tj@enoti.me> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _CC_NEWRENO_H +#define _CC_NEWRENO_H + +#define CCALGONAME_NEWRENO "newreno" + +struct cc_newreno_opts { + int name; + uint32_t val; +}; + +#define CC_NEWRENO_BETA 1 +#define CC_NEWRENO_BETA_ECN 2 + +#endif /* _CC_NEWRENO_H */