Date: Wed, 13 Feb 2013 19:25:19 +1100
From: Lawrence Stewart <lstewart@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Cc: net@freebsd.org
Subject: Re: [PATCH] Add a new TCP_IGNOREIDLE socket option
Message-ID: <511B4DEF.8000500@freebsd.org>
In-Reply-To: <201301221511.02496.jhb@freebsd.org>
References: <201301221511.02496.jhb@freebsd.org>

FYI I've read the whole thread as of this reply and plan to follow up to a few of the other posts separately, but first for my initial thoughts...

On 01/23/13 07:11, John Baldwin wrote:
> As I mentioned in an earlier thread, I recently had to debug an issue we were
> seeing across a link with a high bandwidth-delay product (both high bandwidth
> and high RTT). Our specific use case was to use a TCP connection to reliably
> forward a latency-sensitive datagram stream across a WAN connection. We would
> often see spikes in the latency of individual datagrams. I eventually tracked
> this down to the connection entering slow start when it would transmit data
> after being idle. The data stream was quite bursty and would often attempt to
> transmit a burst of data after being idle for far longer than a retransmit
> timeout.

Got it.

> In 7.x we had worked around this in the past by disabling RFC 3390 and jacking
> the slow start window size up via a sysctl. On 8.x this no longer worked.

I can't think of, nor have I read any convincing argument why we shouldn't support your use case out of the box. You're not the only user of FreeBSD over dedicated lines who knows what you're doing. We should provide some way to support this use case.

We're therefore left with the question of how to implement this. As noted in the "Some questions about the new TCP congestion control code" thread [1], it was always my intention to axe the ss_flightsize variables and replace them with a better mechanism. Andre swung the axe before I did and 10.x is looming so it's a good time to discuss all of this.

> The solution I came up with was to add a new socket option to disable idle
> handling completely. That is, when an idle connection restarts with this new
> option enabled, it keeps its current congestion window and doesn't enter slow
> start.

rwatson@ mentioned an idea in private discussion which I've also thought about over the years.
The real goal here should be to subsume your use case (and others) into a much richer framework for hinting desired behaviour/tradeoff preferences (some aspects of which relate to parts of my PhD work, which will hopefully be coming to a kernel near you in 2013 ;).

My main concern with your patch is that I'm a bit uneasy about enshrining a socket option in a public API and documentation that is so specific. I suspect apps probably want to set higher level goals like "low latency *at any cost*" and have the stack opaquely interpret that as "this guy is willing to blow his foot off, so let's disable idle window reset, tweak X, disable Y and hand the man his loaded shotgun".

TCP_IGNOREIDLE as currently proposed misses this bigger picture, though doesn't preclude it either.

I would also echo Kevin/Grenville's thoughts about keying the socket option's activation off a tunable (sysctl or kernel option is up for discussion, though I'd be leaning towards sysctl) that is disabled by default i.e. only skip after idle window reset if the app sets the option *and* the sysadmin has pulled the "I like me some bursty network" lever.

> There are only a few cases where such an option is useful, but if anyone else
> thinks this might be useful I'd be happy to add the option to FreeBSD.

The idea is useful. I'd just like to discuss the implementation specifics a little further before recommending whether the patch should go in as is to provide a stop gap, or we rework the patch to be a little less specific in readiness for the future work I have in mind.

Cheers,
Lawrence

[1] http://lists.freebsd.org/pipermail/freebsd-net/2013-January/034297.html
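
For illustration only, the "option *and* sysadmin lever" gating discussed above could be sketched roughly as follows. This is not code from the patch or from the FreeBSD stack. Every name here (conn_sketch, skip_idle_reset, cwnd_after_idle, the sysadmin_allows flag standing in for a sysctl) is hypothetical. The idle test (idle time at least one retransmit timeout) and the fallback of clamping to a restart window are simplifying assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-connection state. None of these fields are taken
 * from the real TCP control block.
 */
struct conn_sketch {
	uint32_t cwnd;         /* current congestion window, in bytes */
	uint32_t restart_cwnd; /* window to fall back to after idle (assumed) */
	uint32_t rto_ticks;    /* current retransmit timeout, in ticks */
	uint32_t idle_ticks;   /* ticks since the connection last saw traffic */
	bool     ignore_idle;  /* app set the per-socket option */
};

/*
 * The idle reset is skipped only if BOTH the application asked for it
 * and the system-wide tunable (off by default) permits it.
 */
static bool
skip_idle_reset(const struct conn_sketch *c, bool sysadmin_allows)
{
	return (c->ignore_idle && sysadmin_allows);
}

/* Congestion window to use when the connection is about to send again. */
static uint32_t
cwnd_after_idle(const struct conn_sketch *c, bool sysadmin_allows)
{
	/* Assumed idle test: quiet for less than one RTO means no reset. */
	if (c->idle_ticks < c->rto_ticks)
		return (c->cwnd);

	/* Opted in at both levels: keep the window, no slow start. */
	if (skip_idle_reset(c, sysadmin_allows))
		return (c->cwnd);

	/* Default behaviour: never grow, clamp down to the restart window. */
	return (c->cwnd < c->restart_cwnd ? c->cwnd : c->restart_cwnd);
}
```

With a connection that has cwnd 100000, restart_cwnd 4380 and has been idle well past its RTO, cwnd_after_idle() keeps 100000 only when both ignore_idle and sysadmin_allows are true. Clearing either one gives back 4380, which is the "disabled by default" behaviour suggested above.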