From owner-freebsd-net@FreeBSD.ORG Tue May 19 21:26:10 2009 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 95C7D106566B for ; Tue, 19 May 2009 21:26:10 +0000 (UTC) (envelope-from zachary.loafman@isilon.com) Received: from seaxch10.isilon.com (seaxch10.isilon.com [74.85.160.26]) by mx1.freebsd.org (Postfix) with ESMTP id 75EA98FC17 for ; Tue, 19 May 2009 21:26:09 +0000 (UTC) (envelope-from zachary.loafman@isilon.com) Received: from famine.isilon.com ([10.54.190.95]) by seaxch10.isilon.com with Microsoft SMTPSVC(6.0.3790.1830); Tue, 19 May 2009 14:13:49 -0700 Received: from zloafman by famine.isilon.com with local (Exim 4.69) (envelope-from ) id 1M6Wd8-0000uK-Gj for net@freebsd.org; Tue, 19 May 2009 14:13:46 -0700 Date: Tue, 19 May 2009 14:13:46 -0700 From: Zachary Loafman To: net@freebsd.org Message-ID: <20090519211346.GC675@isilon.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="LwW0XdcUbUexiWVK" Content-Disposition: inline User-Agent: Mutt/1.5.18 (2008-05-17) X-OriginalArrivalTime: 19 May 2009 21:13:49.0709 (UTC) FILETIME=[B450D3D0:01C9D8C6] Cc: Subject: [PATCH] SYN issue X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 May 2009 21:26:11 -0000 --LwW0XdcUbUexiWVK Content-Type: text/plain; charset=us-ascii Content-Disposition: inline net@ - A short patch attached that requires 3 paragraphs of explanation. We found an issue in TCP when the a client connects to our server, establishes a connection, reboots and chooses the same source port to re-establish the connection. This isn't hard from other vendors' clients. On Solaris, the same NFS mount order at boot time will frequently result in source port re-use for the NFS connections. In this case, the customer was seeing mounts hang until the keepalive on our side would kick the established connection. The problem in the code is probably best explained using the patch itself: --- Index: sys/netinet/tcp_input.c =================================================================== --- sys/netinet/tcp_input.c (revision xxxx) +++ sys/netinet/tcp_input.c (working copy) @@ -1818,7 +1818,11 @@ tcp_do_segment(struct mbuf *m, struct tc todrop = tp->rcv_nxt - th->th_seq; if (todrop > 0) { - if (thflags & TH_SYN) { + /* + * If this is a duplicate SYN for our current connection, + * advance over it and pretend and it's not a SYN. + */ + if (thflags & TH_SYN && th->th_seq == tp->irs) { thflags &= ~TH_SYN; th->th_seq++; if (th->th_urp > 1) --- The problem is that when our TCP stack gets a SYN packet for a connection that's already in ESTABLISHED state, it runs through the above code. The above code is basically noticing that the packet is coming in left of the receive window and then saying "Ah, a SYN! This must be a duplicate SYN for our existing connect." After that, it just turns off SYN and treats it as a normal packet (after advancing past the SYN seq number). The code is broken, though: the only condition under which this is a duplicate SYN is if the th_seq matches the irs, the initial receive sequence. After correcting the above, any SYN that doesn't exactly match the initial sequence number results in a RST|ACK response and the ESTABLISHED connection being dropped. Before this change, this is also what happened if a SYN arrived within or past the window, so I'm basically making the before-window behavior match the other behavior. I tested this using telnet to establish a TCP connection and raw packet injection to throw SYNs at it. Comments? -- Zach Loafman | Staff Engineer | Isilon Systems --LwW0XdcUbUexiWVK Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="syn.patch" --- a/sys/netinet/tcp_input.c +++ b/sys/netinet/tcp_input.c @@ -1818,7 +1818,11 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, struct socket *so, todrop = tp->rcv_nxt - th->th_seq; if (todrop > 0) { - if (thflags & TH_SYN) { + /* + * If this is a duplicate SYN for our current connection, + * advance over it and pretend and it's not a SYN. + */ + if (thflags & TH_SYN && th->th_seq == tp->irs) { thflags &= ~TH_SYN; th->th_seq++; if (th->th_urp > 1) --LwW0XdcUbUexiWVK--