From owner-freebsd-net@FreeBSD.ORG Fri Oct 12 16:13:30 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7ABEAD48 for ; Fri, 12 Oct 2012 16:13:30 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 058C68FC1B for ; Fri, 12 Oct 2012 16:13:27 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id B6AFDB924 for ; Fri, 12 Oct 2012 12:13:11 -0400 (EDT) From: John Baldwin To: net@freebsd.org Subject: Dropping TCP options from retransmitted SYNs considered harmful Date: Fri, 12 Oct 2012 12:13:11 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; ) MIME-Version: 1.0 Content-Type: Text/Plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Message-Id: <201210121213.11152.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 12 Oct 2012 12:13:11 -0400 (EDT) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Oct 2012 16:13:30 -0000 Back in 2001 FreeBSD added a hack to strip TCP options from retransmitted SYNs starting with the 3rd SYN in this block in tcp_timer.c: /* * Disable rfc1323 if we haven't got any response to * our third SYN to work-around some broken terminal servers * (most of which have hopefully been retired) that have bad VJ * header compression code which trashes TCP segments containing * unknown-to-them TCP options. */ if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3)) tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP); There is even a PR for the original bug report: kern/1689 However, there is an unintended consequence of this change that can be disastrous. Specifically, suppose you have a FreeBSD client connecting to a server, and that the SYNs are arriving at the server successfully, but the first few return SYN/ACKs are dropped. Eventually a SYN/ACK makes it through and the connection is established. The server (based on the first SYN it saw) believes it has negotiated window scaling with the client. The client, however, has broken what it promised in that first SYN and believes it is not using any window scaling at all. This causes two forms of breakage: 1) When the server advertises a scaled window (e.g. '8' for a 64k window scaled at 13), the client thinks it is an unscaled window ('8') and sends data to the server very slowly. 2) When the client advertises an unscaled window (e.g. '65535' for a 64k window), the server thinks it has a huge window (65535 << 13 == 511MB) to send into. I'm not sure that 2) is a problem per se, but I have definitely seen instances of 1) (and examined the 'struct tcpcb' in kgdb on both the server and client end of the connections to verify they disagreed on the scaling). The original motivation of this change is to work around broken terminal servers that were old when this change was added in 2001. Over 10 years later I think we should at least have an option to turn this work-around off, and possibly disable it by default. Thoughts? -- John Baldwin