From: Stefan Eßer
Date: Mon, 23 Jun 2008 12:05:42 +0200
To: Matthew Dillon
Cc: Jerahmy Pocott, freebsd-stable@freebsd.org
Subject: Re: Sysctl knob(s) to set TCP 'nagle' time-out?
Message-ID: <485F7576.5070104@freebsd.org>
In-Reply-To: <200806230827.m5N8RBlW085475@apollo.backplane.com>

Matthew Dillon wrote:
> In any case, the usual solution is to disable Nagle rather than mess
> with delayed acks.
> What we need is a new Nagle that understands the
> new reality for interactive connections... something that doesn't break
> performance in the 'server in the middle' data relaying case.

One possibility I see is to keep a per-connection statistic on delayed
ACKs, counting those that were, with hindsight, rightfully delayed.
That is, if an ACK is delayed but there was no chance to piggy-back it
or to combine it with another ACK, it could just as well have been sent
without delay. Only those delayed ACKs that reduce load are "good"; all
others cause additional state to be maintained and may increase
latencies for no good reason.

Therefore, I thought about starting with Nagle enabled, but giving up
on delaying ACKs when doing so is found to be ineffective. The only
problem with this approach is that once TCP_NODELAY is implicitly set
due to the measured behavior of the communication, a situation that
would benefit from delayed ACKs can no longer be detected. (Well, you
could measure the delay between an ACK and the next data sent to the
same destination, and disable TCP_NODELAY if ACKs could have been
piggy-backed on data packets without too much delay.)

Maybe we could really have TCP auto-tune with respect to the use of
delayed ACKs ... I had suggested this years back, when the issue was
discussed, but the consensus was that you should just set TCP_NODELAY.
But automatic adjustment could also (implicitly) take RTT and window
size into consideration. And to me, automatic setting of TCP_NODELAY
(after delayed ACKs have been found to be of no use over a window of,
say, 8 or 16 ACKs) seems more useful than automatic clearing.

The implementation would be quite simple: Whenever a delayed ACK is
sent, check whether it is sent on its own (bad) or whether it could be
piggy-backed (good). If, say, 7 of 8 delayed ACKs had to be sent as
ACK-only packets anyway, set TCP_NODELAY and do not bother to keep on
deciding whether delayed ACKs have become useful in a different phase
of the communication.
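
A minimal sketch of the 7-of-8 check described above, as standalone C
rather than kernel code. All names here (struct delack_stats,
delack_record, the two constants) are hypothetical and do not exist in
the FreeBSD TCP stack; the boolean `nodelay` merely stands in for the
implicit TCP_NODELAY setting, and the use of a sliding window over the
last 8 delayed ACKs is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables taken from the "7 of 8" example in the text. */
#define DELACK_WINDOW    8   /* delayed ACKs considered per decision */
#define DELACK_BAD_LIMIT 7   /* ACK-only count that triggers the switch */

/* Hypothetical per-connection state (not a real FreeBSD structure). */
struct delack_stats {
    uint8_t history;  /* one bit per delayed ACK, 1 = sent as ACK-only */
    uint8_t samples;  /* valid bits in history, at most DELACK_WINDOW */
    bool    nodelay;  /* stands in for the implicit TCP_NODELAY flag */
};

/* Count the bits set in an 8-bit value. */
static unsigned popcount8(uint8_t v)
{
    unsigned n = 0;
    while (v != 0) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/*
 * Record the outcome of one delayed ACK.
 * ack_only is true if the ACK went out on its own ("bad"), false if it
 * was piggy-backed on data ("good").
 * Returns true once the connection has switched to no-delay mode.
 */
static bool delack_record(struct delack_stats *s, bool ack_only)
{
    if (s->nodelay)
        return true;  /* one-way switch: stop evaluating */

    /* Shift in the newest outcome; the oldest of 8 falls off the top. */
    s->history = (uint8_t)((s->history << 1) | (ack_only ? 1u : 0u));
    if (s->samples < DELACK_WINDOW)
        s->samples++;

    /* Decide only once a full window of outcomes is available. */
    if (s->samples == DELACK_WINDOW &&
        popcount8(s->history) >= DELACK_BAD_LIMIT)
        s->nodelay = true;

    return s->nodelay;
}
```

The caller would invoke delack_record() once per delayed ACK, at the
point where it is known whether the ACK was piggy-backed. Keeping the
history in a single byte keeps the per-connection state small, which
matters if this is to be cheaper than the delayed ACKs it replaces.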

If you want to be able to automatically disable TCP_NODELAY, then just
set a time-stamp whenever an ACK is sent, and when the next data is
sent through the same socket, check whether delaying the ACK would have
allowed it to be sent with that data packet (i.e. the delay was less
than the maximum hold time of a delayed ACK). If delaying ACKs would
have been beneficial (say, for 3 out of a window of 4), then clear
TCP_NODELAY.

I have no idea whether SMP locking would be problematic, but I guess
the checks and counter updates could be put in sections that are
appropriately locked anyway.

Regards, STefan
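
For the reverse direction (clearing TCP_NODELAY automatically), the
time-stamp check could be sketched roughly as follows. Again, this is
standalone C with hypothetical names; the 100 ms hold time, the
millisecond clock passed in by the caller, and the evaluation in
non-overlapping blocks of 4 are all assumptions, not existing kernel
behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables taken from the "3 out of 4" example. */
#define PIGGY_WINDOW     4    /* ACK/data pairs examined per decision */
#define PIGGY_GOOD_LIMIT 3    /* pairs that must have been combinable */
#define DELACK_HOLD_MS   100  /* assumed maximum delayed-ACK hold time */

/* Hypothetical per-connection state (not a real FreeBSD structure). */
struct piggy_stats {
    uint64_t last_ack_ms;  /* time stamp of the most recent ACK sent */
    bool     ack_pending;  /* an ACK was sent, no data has followed yet */
    uint8_t  good;         /* pairs in this window that could be merged */
    uint8_t  seen;         /* pairs examined in this window */
    bool     nodelay;      /* stands in for the TCP_NODELAY flag */
};

/* Called whenever an ACK-only packet is sent immediately. */
static void piggy_ack_sent(struct piggy_stats *s, uint64_t now_ms)
{
    s->last_ack_ms = now_ms;
    s->ack_pending = true;
}

/*
 * Called whenever data is sent on the same socket.
 * Returns the resulting no-delay state (false = delayed ACKs are back).
 */
static bool piggy_data_sent(struct piggy_stats *s, uint64_t now_ms)
{
    if (!s->nodelay || !s->ack_pending)
        return s->nodelay;  /* nothing to evaluate */

    s->ack_pending = false;
    s->seen++;

    /* Would a delayed ACK still have been held when this data left? */
    if (now_ms - s->last_ack_ms < DELACK_HOLD_MS)
        s->good++;

    if (s->seen == PIGGY_WINDOW) {
        if (s->good >= PIGGY_GOOD_LIMIT)
            s->nodelay = false;  /* delaying would have paid off */
        s->seen = 0;             /* start a fresh window */
        s->good = 0;
    }
    return s->nodelay;
}
```

If several ACKs go out before the next data packet, only the last one
is measured in this sketch, which slightly favors clearing the flag.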