From: Lawrence Stewart
To: John Baldwin
Cc: net@freebsd.org
Subject: Re: [PATCH] Add a new TCP_IGNOREIDLE socket option
Date: Wed, 13 Feb 2013 19:25:19 +1100
Message-ID: <511B4DEF.8000500@freebsd.org>
In-Reply-To: <201301221511.02496.jhb@freebsd.org>

FYI I've read the whole thread as of this reply and plan to follow up to
a few of the other posts separately, but first for my initial
thoughts...

On 01/23/13 07:11, John Baldwin wrote:
> As I mentioned in an earlier thread, I recently had to debug an issue we were
> seeing across a link with a high bandwidth-delay product (both high bandwidth
> and high RTT).
> Our specific use case was to use a TCP connection to reliably
> forward a latency-sensitive datagram stream across a WAN connection. We would
> often see spikes in the latency of individual datagrams. I eventually tracked
> this down to the connection entering slow start when it would transmit data
> after being idle. The data stream was quite bursty and would often attempt to
> transmit a burst of data after being idle for far longer than a retransmit
> timeout.

Got it.

> In 7.x we had worked around this in the past by disabling RFC 3390 and jacking
> the slow start window size up via a sysctl. On 8.x this no longer worked.

I can't think of, nor have I read any convincing argument why we
shouldn't support your use case out of the box. You're not the only user
of FreeBSD over dedicated lines who knows what you're doing. We should
provide some way to support this use case.

We're therefore left with the question of how to implement this. As
noted in the "Some questions about the new TCP congestion control code"
thread [1], it was always my intention to axe the ss_flightsize
variables and replace them with a better mechanism. Andre swung the axe
before I did and 10.x is looming so it's a good time to discuss all of
this.

> The solution I came up with was to add a new socket option to disable idle
> handling completely. That is, when an idle connection restarts with this new
> option enabled, it keeps its current congestion window and doesn't enter slow
> start.

rwatson@ mentioned an idea in private discussion which I've also thought
about over the years. The real goal here should be to subsume your use
case (and others) into a much richer framework for hinting desired
behaviour/tradeoff preferences (some aspects of which relate to parts of
my PhD work, which will hopefully be coming to a kernel near you in 2013
;).

My main concern with your patch is that I'm a bit uneasy about
enshrining a socket option in a public API and documentation that is so
specific.
I suspect apps probably want to set higher level goals like "low latency
*at any cost*" and have the stack opaquely interpret that as "this guy
is willing to blow his foot off, so let's disable idle window reset,
tweak X, disable Y and hand the man his loaded shotgun". TCP_IGNOREIDLE
as currently proposed misses this bigger picture, though doesn't
preclude it either.

I would also echo Kevin/Grenville's thoughts about keying the socket
option's activation off a tunable (sysctl or kernel option is up for
discussion, though I'd be leaning towards sysctl) that is disabled by
default, i.e. only skip the after-idle window reset if the app sets the
option *and* the sysadmin has pulled the "I like me some bursty network"
lever.

> There are only a few cases where such an option is useful, but if anyone else
> thinks this might be useful I'd be happy to add the option to FreeBSD.

The idea is useful. I'd just like to discuss the implementation
specifics a little further before recommending whether the patch should
go in as is to provide a stop gap, or we rework the patch to be a little
less specific in readiness for the future work I have in mind.

Cheers,
Lawrence

[1] http://lists.freebsd.org/pipermail/freebsd-net/2013-January/034297.html
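
P.S. To make the "option *and* sysctl" gating above concrete, here is a
rough model of the decision, not the patch and not FreeBSD's actual
code. The names Conn, ignore_idle, admin_allows_ignore_idle and
cwnd_after_idle are hypothetical, made up for illustration. The reset
rule assumed is the standard one: after more than one RTO of idleness
cwnd drops to the restart window min(IW, cwnd), with IW taken from
RFC 3390.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    cwnd: int                  # current congestion window, bytes
    mss: int                   # maximum segment size, bytes
    rto: float                 # retransmission timeout, seconds
    idle_for: float            # time since the connection last sent, seconds
    ignore_idle: bool = False  # hypothetical: app set the proposed option


def initial_window(mss: int) -> int:
    # RFC 3390 initial window: min(4*MSS, max(2*MSS, 4380 bytes)).
    return min(4 * mss, max(2 * mss, 4380))


def cwnd_after_idle(conn: Conn, admin_allows_ignore_idle: bool) -> int:
    """Return the cwnd to use when an idle connection starts sending again.

    admin_allows_ignore_idle stands in for the proposed tunable
    (hypothetical name), which would be disabled by default.
    """
    if conn.idle_for <= conn.rto:
        # Not idle for more than one RTO: nothing to reset.
        return conn.cwnd
    if conn.ignore_idle and admin_allows_ignore_idle:
        # Both the app and the sysadmin opted in: keep the old window.
        return conn.cwnd
    # Default behaviour: fall back to the restart window, min(IW, cwnd).
    return min(conn.cwnd, initial_window(conn.mss))
```

With an MSS of 1460 bytes the restart window comes out at 4380 bytes, so
a bursty sender that had built up a 100000 byte window loses almost all
of it after an idle period unless both levers are pulled.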