From owner-svn-src-head@freebsd.org  Thu Dec 7 22:36:59 2017
From: Gleb Smirnoff
Date: Thu, 7 Dec 2017 22:36:58 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r326673 - head/sys/netinet
Message-Id: <201712072236.vB7MawIY059698@repo.freebsd.org>
X-SVN-Group: head
X-SVN-Commit-Author: glebius
X-SVN-Commit-Paths: head/sys/netinet
X-SVN-Commit-Revision: 326673
X-SVN-Commit-Repository: base
List-Id: SVN commit messages for the src tree for head/-current

Author: glebius
Date: Thu Dec 7 22:36:58 2017
New Revision: 326673
URL: https://svnweb.freebsd.org/changeset/base/326673

Log:
  Separate out the send buffer autoscaling code into a function, so that
  alternative TCP stacks may reuse it instead of pasting their own copy.

  Obtained from:  Netflix

Modified:
  head/sys/netinet/tcp_output.c
  head/sys/netinet/tcp_var.h

Modified: head/sys/netinet/tcp_output.c
==============================================================================
--- head/sys/netinet/tcp_output.c       Thu Dec 7 22:19:08 2017        (r326672)
+++ head/sys/netinet/tcp_output.c       Thu Dec 7 22:36:58 2017        (r326673)
@@ -487,60 +487,8 @@ after_sack_rexmit:
         /* len will be >= 0 after this point. */
         KASSERT(len >= 0, ("[%s:%d]: len < 0", __func__, __LINE__));
 
-        /*
-         * Automatic sizing of send socket buffer. Often the send buffer
-         * size is not optimally adjusted to the actual network conditions
-         * at hand (delay bandwidth product). Setting the buffer size too
-         * small limits throughput on links with high bandwidth and high
-         * delay (e.g. trans-continental/oceanic links). Setting the
-         * buffer size too big consumes too much real kernel memory,
-         * especially with many connections on busy servers.
-         *
-         * The criteria to step up the send buffer one notch are:
-         *  1. receive window of remote host is larger than send buffer
-         *     (with a fudge factor of 5/4th);
-         *  2. send buffer is filled to 7/8th with data (so we actually
-         *     have data to make use of it);
-         *  3. send buffer fill has not hit maximal automatic size;
-         *  4. our send window (slow start and congestion controlled) is
-         *     larger than sent but unacknowledged data in send buffer.
-         *
-         * The remote host receive window scaling factor may limit the
-         * growing of the send buffer before it reaches its allowed
-         * maximum.
-         *
-         * It scales directly with slow start or congestion window
-         * and does at most one step per received ACK. This fast
-         * scaling has the drawback of growing the send buffer beyond
-         * what is strictly necessary to make full use of a given
-         * delay*bandwidth product. However, testing has shown this not
-         * to be much of a problem. At worst we are trading wasting
-         * of available bandwidth (the non-use of it) for wasting some
-         * socket buffer memory.
-         *
-         * TODO: Shrink send buffer during idle periods together
-         * with congestion window. Requires another timer. Has to
-         * wait for upcoming tcp timer rewrite.
-         *
-         * XXXGL: should there be used sbused() or sbavail()?
-         */
-        if (V_tcp_do_autosndbuf && so->so_snd.sb_flags & SB_AUTOSIZE) {
-                int lowat;
+        tcp_sndbuf_autoscale(tp, so, sendwin);
 
-                lowat = V_tcp_sendbuf_auto_lowat ? so->so_snd.sb_lowat : 0;
-                if ((tp->snd_wnd / 4 * 5) >= so->so_snd.sb_hiwat - lowat &&
-                    sbused(&so->so_snd) >=
-                    (so->so_snd.sb_hiwat / 8 * 7) - lowat &&
-                    sbused(&so->so_snd) < V_tcp_autosndbuf_max &&
-                    sendwin >= (sbused(&so->so_snd) -
-                    (tp->snd_nxt - tp->snd_una))) {
-                        if (!sbreserve_locked(&so->so_snd,
-                            min(so->so_snd.sb_hiwat + V_tcp_autosndbuf_inc,
-                            V_tcp_autosndbuf_max), so, curthread))
-                                so->so_snd.sb_flags &= ~SB_AUTOSIZE;
-                }
-        }
-
         /*
          * Decide if we can use TCP Segmentation Offloading (if supported by
          * hardware).
@@ -1857,4 +1805,63 @@ tcp_addoptions(struct tcpopt *to, u_char *optp)
         KASSERT(optlen <= TCP_MAXOLEN, ("%s: TCP options too long", __func__));
 
         return (optlen);
+}
+
+void
+tcp_sndbuf_autoscale(struct tcpcb *tp, struct socket *so, uint32_t sendwin)
+{
+
+        /*
+         * Automatic sizing of send socket buffer. Often the send buffer
+         * size is not optimally adjusted to the actual network conditions
+         * at hand (delay bandwidth product). Setting the buffer size too
+         * small limits throughput on links with high bandwidth and high
+         * delay (e.g. trans-continental/oceanic links). Setting the
+         * buffer size too big consumes too much real kernel memory,
+         * especially with many connections on busy servers.
+         *
+         * The criteria to step up the send buffer one notch are:
+         *  1. receive window of remote host is larger than send buffer
+         *     (with a fudge factor of 5/4th);
+         *  2. send buffer is filled to 7/8th with data (so we actually
+         *     have data to make use of it);
+         *  3. send buffer fill has not hit maximal automatic size;
+         *  4. our send window (slow start and congestion controlled) is
+         *     larger than sent but unacknowledged data in send buffer.
+         *
+         * The remote host receive window scaling factor may limit the
+         * growing of the send buffer before it reaches its allowed
+         * maximum.
+         *
+         * It scales directly with slow start or congestion window
+         * and does at most one step per received ACK. This fast
+         * scaling has the drawback of growing the send buffer beyond
+         * what is strictly necessary to make full use of a given
+         * delay*bandwidth product. However, testing has shown this not
+         * to be much of a problem. At worst we are trading wasting
+         * of available bandwidth (the non-use of it) for wasting some
+         * socket buffer memory.
+         *
+         * TODO: Shrink send buffer during idle periods together
+         * with congestion window. Requires another timer. Has to
+         * wait for upcoming tcp timer rewrite.
+         *
+         * XXXGL: should there be used sbused() or sbavail()?
+         */
+        if (V_tcp_do_autosndbuf && so->so_snd.sb_flags & SB_AUTOSIZE) {
+                int lowat;
+
+                lowat = V_tcp_sendbuf_auto_lowat ? so->so_snd.sb_lowat : 0;
+                if ((tp->snd_wnd / 4 * 5) >= so->so_snd.sb_hiwat - lowat &&
+                    sbused(&so->so_snd) >=
+                    (so->so_snd.sb_hiwat / 8 * 7) - lowat &&
+                    sbused(&so->so_snd) < V_tcp_autosndbuf_max &&
+                    sendwin >= (sbused(&so->so_snd) -
+                    (tp->snd_nxt - tp->snd_una))) {
+                        if (!sbreserve_locked(&so->so_snd,
+                            min(so->so_snd.sb_hiwat + V_tcp_autosndbuf_inc,
+                            V_tcp_autosndbuf_max), so, curthread))
+                                so->so_snd.sb_flags &= ~SB_AUTOSIZE;
+                }
+        }
 }

Modified: head/sys/netinet/tcp_var.h
==============================================================================
--- head/sys/netinet/tcp_var.h       Thu Dec 7 22:19:08 2017        (r326672)
+++ head/sys/netinet/tcp_var.h       Thu Dec 7 22:36:58 2017        (r326673)
@@ -884,6 +884,7 @@ void tcp_sack_partialack(struct tcpcb *, struct tcphd
 void tcp_free_sackholes(struct tcpcb *tp);
 int tcp_newreno(struct tcpcb *, struct tcphdr *);
 int tcp_compute_pipe(struct tcpcb *);
+void tcp_sndbuf_autoscale(struct tcpcb *, struct socket *, uint32_t);
 
 static inline void
 tcp_fields_to_host(struct tcphdr *th)
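
As a usage sketch, the following shows how an alternative TCP stack's output
routine could now call the shared helper instead of keeping a private copy of
the autoscaling block. This is illustrative only: alt_stack_output_snippet()
and the way sendwin is derived here are assumptions for the example, not part
of r326673; the commit itself only adds tcp_sndbuf_autoscale() and its
prototype.

/*
 * Illustrative sketch (not from r326673): reusing tcp_sndbuf_autoscale()
 * from an alternative TCP stack's output path. The function name and the
 * sendwin computation are assumptions made for this example.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/socket.h>
#include <sys/socketvar.h>

#include <netinet/in.h>
#include <netinet/in_pcb.h>
#include <netinet/tcp.h>
#include <netinet/tcp_var.h>

static void
alt_stack_output_snippet(struct tcpcb *tp, struct socket *so)
{
        uint32_t sendwin;

        /*
         * The stack derives its effective send window first; here it is
         * simply the minimum of the peer's advertised window and the
         * congestion window.
         */
        sendwin = min(tp->snd_wnd, tp->snd_cwnd);

        /*
         * Grow the send buffer through the shared helper instead of
         * duplicating the block removed from tcp_output() above. The
         * helper does nothing unless V_tcp_do_autosndbuf is enabled and
         * SB_AUTOSIZE is still set on so->so_snd.
         */
        tcp_sndbuf_autoscale(tp, so, sendwin);
}

Keeping the policy in one function means the thresholds visible above (the
5/4 remote-window fudge factor, the 7/8 fill requirement, and the
V_tcp_autosndbuf_max cap) stay consistent for every stack that calls it.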