From owner-freebsd-transport@freebsd.org Mon Feb 20 08:37:05 2017
Subject: Re: TCP Receive buffer scaling without Timestamps
To: Steven Hartland, freebsd-transport@freebsd.org
References: <2cc38c5d-78a9-4ce1-902e-70f2e55f14a2@multiplay.co.uk>
From: Lawrence Stewart
Message-ID: <9d615fdb-c507-b6ef-2abe-e464a7849748@freebsd.org>
Date: Mon, 20 Feb 2017 19:36:47 +1100
In-Reply-To: <2cc38c5d-78a9-4ce1-902e-70f2e55f14a2@multiplay.co.uk>
List-Id: Discussions of transport level network protocols in FreeBSD

On 19/02/2017 12:37, Steven
Hartland wrote:
> On 18/02/2017 00:33, Lawrence Stewart wrote:
>> On 18/02/2017 09:37, Steven Hartland wrote:
>> [snip]
>>> So the questions:
>>>
>>> 1. Has anyone looked at this issue before or is working on it?
>>> 2. Do people think we could add support for RTT estimates and enable
>>> receive scaling without major tcp stack changes?
>> We already have a non-TS-enabled RTT estimator in place c.f. the
>> relevant code in tcp_input.c:
>>
>>         if ((to.to_flags & TOF_TS) != 0 &&
>>             to.to_tsecr) {
>>                 uint32_t t;
>>
>>                 t = tcp_ts_getticks() - to.to_tsecr;
>>                 if (!tp->t_rttlow || tp->t_rttlow > t)
>>                         tp->t_rttlow = t;
>>                 tcp_xmit_timer(tp,
>>                     TCP_TS_TO_TICKS(t) + 1);
>>         } else if (tp->t_rtttime &&
>>             SEQ_GT(th->th_ack, tp->t_rtseq)) {
>>                 if (!tp->t_rttlow ||
>>                     tp->t_rttlow > ticks - tp->t_rtttime)
>>                         tp->t_rttlow = ticks - tp->t_rtttime;
>>                 tcp_xmit_timer(tp,
>>                     ticks - tp->t_rtttime);
>>         }
>>
>>
>> The autoscaling implementation just needs to have its insistence on
>> using timestamps stuffed into a cannon and shot in the direction of the
>> sun. This is the relevant block of rcvbuf code from tcp_input.c:
>>
>>
>>         if (V_tcp_do_autorcvbuf &&
>>             (to.to_flags & TOF_TS) &&
>>             to.to_tsecr &&
>>             (so->so_rcv.sb_flags & SB_AUTOSIZE)) {
>>                 if (TSTMP_GT(to.to_tsecr, tp->rfbuf_ts) &&
>>                     to.to_tsecr - tp->rfbuf_ts < hz) {
>>                         if (tp->rfbuf_cnt >
>>                             (so->so_rcv.sb_hiwat / 8 * 7) &&
>>                             so->so_rcv.sb_hiwat <
>>                             V_tcp_autorcvbuf_max) {
>>                                 newsize =
>>                                     min(so->so_rcv.sb_hiwat +
>>                                     V_tcp_autorcvbuf_inc,
>>                                     V_tcp_autorcvbuf_max);
>>                         }
>>                         /* Start over with next RTT. */
>>                         tp->rfbuf_ts = 0;
>>                         tp->rfbuf_cnt = 0;
>>                 } else
>>                         tp->rfbuf_cnt += tlen; /* add up */
>>         }
>>
>> It's pretty trivial to fix by anyone so inclined.
> Thanks for the pointers Lawrence, I'm not very familiar with network
> stack but I've had a stab at this, after reading through the current
> code in the blocks you pointed out, along side some experimentation ;-)

Nice, thanks for running with it.
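
For anyone following along, here is a minimal userland sketch of the idea discussed above: measure how many bytes arrive in roughly one RTT and grow the receive buffer when that exceeds 7/8 of its size, but take the RTT from whatever estimate is available rather than from an echoed timestamp. Everything in it is an assumption made for illustration: struct rcvbuf_model, rcvbuf_autosize(), the window_open flag and the two constants are hypothetical names, not kernel symbols, and this is not the change under review.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits, loosely modelled on the autorcvbuf tunables. */
#define AUTORCVBUF_INC (16u * 1024u)
#define AUTORCVBUF_MAX (2u * 1024u * 1024u)

/* Hypothetical userland model of the per-connection state involved. */
struct rcvbuf_model {
	uint32_t hiwat;       /* current receive buffer size in bytes */
	uint32_t rfbuf_ts;    /* tick at which the measurement window opened */
	uint32_t rfbuf_cnt;   /* bytes received in the current window */
	uint32_t srtt_ticks;  /* RTT estimate in ticks, 0 if none yet */
	bool     window_open; /* true while a measurement window is running */
};

/*
 * Account for tlen bytes arriving at tick 'now'.  Returns the new buffer
 * size if the buffer was grown, or 0 if it was left unchanged.
 */
static uint32_t
rcvbuf_autosize(struct rcvbuf_model *rb, uint32_t now, uint32_t tlen)
{
	uint32_t newsize = 0;

	/* Without any RTT estimate there is nothing to measure against. */
	if (rb->srtt_ticks == 0)
		return (0);

	/* Open a new measurement window with this segment. */
	if (!rb->window_open) {
		rb->window_open = true;
		rb->rfbuf_ts = now;
		rb->rfbuf_cnt = tlen;
		return (0);
	}

	rb->rfbuf_cnt += tlen; /* add up */

	/* Keep counting until about one RTT has elapsed. */
	if ((uint32_t)(now - rb->rfbuf_ts) < rb->srtt_ticks)
		return (0);

	/* More than 7/8 of the buffer arrived within one RTT: grow it. */
	if (rb->rfbuf_cnt > (rb->hiwat / 8 * 7) &&
	    rb->hiwat < AUTORCVBUF_MAX) {
		newsize = rb->hiwat + AUTORCVBUF_INC;
		if (newsize > AUTORCVBUF_MAX)
			newsize = AUTORCVBUF_MAX;
		rb->hiwat = newsize;
	}

	/* Start over with next RTT. */
	rb->window_open = false;
	rb->rfbuf_cnt = 0;
	return (newsize);
}
```

With hiwat at 65536 and srtt_ticks at 17, feeding it 60000 bytes spread over 17 ticks should bump the buffer by one increment, while a model with srtt_ticks left at 0 is never resized.
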

> The results seem pretty encouraging, with S3 downloads with a ~17ms
> latency on a 1Gbps jumping from ~3MB/s to ~80MB/s.

Amazing what a bit of tuning can do :)

> I'd appreciate feedback on the review which can be found here:
> https://reviews.freebsd.org/D9668

ACK, will follow up there.

Cheers,
Lawrence