From: "d.s. al coda"
To: freebsd-net@freebsd.org
Date: Tue, 11 Mar 2008 20:56:37 -0400
Subject: TCP options order changed in FreeBSD 7, incompatible with some routers
List-Id: Networking and TCP/IP with FreeBSD

Hi,

We recently upgraded one of our webservers to FreeBSD 7, and we started
receiving complaints from some users who could no longer connect to that
server. On top of that, users were saying that the problem only occurred
on Windows (at least, the ones who had more than one OS to try it out).

After managing to get a user who had the problem running windump, running
tcpdump on the new server, and comparing that to the windump & tcpdump
output for a "control" user (me) that could connect, we managed to figure
out the following:

- For the user with this problem, ping works fine, but all TCP
  connections to the server fail.
- The user, trying to connect, sends out a SYN packet, receives no
  response, and retries a few times until timing out.
- The server sees a bunch of SYN packets and responds with SYN-ACK each
  time.
- The issue only seems to arise if the sender has RFC1323 disabled.

So, the SYN-ACK is getting lost somewhere.

- For the control user (who can connect via TCP just fine), we set the
  TCP window size and RFC1323 options the same as the user with the
  problem.
- The control user sees the SYN-ACK packet.
- We send a connection attempt to one of our other servers, running
  FreeBSD 5.5, and one to the server running FreeBSD 7.
- There is only one notable difference between the responses: the order
  of the options.
- FreeBSD 5.5 has
- FreeBSD 7 has (there is of course an aligning nop after the eol, which
  tcpdump skips)
- These options don't appear in this exact configuration when using
  RFC1323 options.

I have a hunch that the users with the problem have a router that
erroneously thinks that these options are invalid, or thinks that some
part of the byte sequence (e.g. 0204 05b4 0101 0402) is an attack.

Just to try it out, I patched tcp_output.c so that the SACK permitted
option was aligned on a 4-byte boundary, preventing the "sackOK, eol"
pattern from ever occurring. Looking through previous versions, I found
where the tcp option code had changed, and there used to be a comment
about putting SACK permitted last, but I can't tell if it's relevant.

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_output.c.diff?r1=1.125;r2=1.126

The one-line patch to tcp_output.c is attached. Sure enough, it fixed the
problem. Afterwards, we collected some information about the routers used
by the users who had the problem. They didn't all have the same
manufacturer, but several mentioned that their router had a built-in
firewall, and that disabling it allowed them to access the server.

Does all of this sound reasonable? And if so, would it be worth
submitting this patch? I don't know if this particular change in options
order was intentional, or just a side-effect of the new code, but the
patch certainly works around an extremely hard-to-diagnose problem.
-coda

Attachment: tcp_output.c.patch.txt

--- tcp_output_orig.c   2008-03-03 00:13:06.000000000 -0500
+++ tcp_output_new.c    2008-03-03 00:16:06.000000000 -0500
@@ -1304,7 +1304,7 @@
                        *optp++ = to->to_wscale;
                        break;
                case TOF_SACKPERM:
-                       while (optlen % 2) {
+                       while (!optlen || optlen % 4 != 2) {
                                optlen += TCPOLEN_NOP;
                                *optp++ = TCPOPT_NOP;
                        }