From owner-freebsd-net@FreeBSD.ORG Fri Oct 14 10:04:34 2005
Return-Path:
X-Original-To: freebsd-net@freebsd.org
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id AF06A16A420
	for ; Fri, 14 Oct 2005 10:04:34 +0000 (GMT)
	(envelope-from silby@silby.com)
Received: from relay03.pair.com (relay03.pair.com [209.68.5.17])
	by mx1.FreeBSD.org (Postfix) with SMTP id AD11243D5C
	for ; Fri, 14 Oct 2005 10:04:33 +0000 (GMT)
	(envelope-from silby@silby.com)
Received: (qmail 12889 invoked from network); 14 Oct 2005 10:04:31 -0000
Received: from unknown (HELO localhost) (unknown)
	by unknown with SMTP; 14 Oct 2005 10:04:31 -0000
X-pair-Authenticated: 209.68.2.70
Date: Fri, 14 Oct 2005 05:04:29 -0500 (CDT)
From: Mike Silbersack
To: on@cs.ait.ac.th
In-Reply-To: <20051014160128.hev160v52ossokg0@wwws.cs.ait.ac.th>
Message-ID: <20051014045824.V5343@odysseus.silby.com>
References: <20051014160128.hev160v52ossokg0@wwws.cs.ait.ac.th>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: FreeBSD NFS server not responding to TCP SYN packets from
	Linux/SunOS clients
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 14 Oct 2005 10:04:34 -0000

On Fri, 14 Oct 2005, on@cs.ait.ac.th wrote:

> Nicolas KOWALSKI wrote:
>> Our FreeBSD 4.10 NFS server has some problems serving files by NFS on
>> TCP (no problem with UDP) when the Linux (2.6) or Solaris (5.9)
>> clients shut down in an unclean manner (power failure). When the clients
>> try to mount the shares from the server after an
>> unclean shutdown, the mount process hang during several minutes (delay
>> is varying), then succeeds.
>
> That is just a wild guess, but NFS mounting would happen always at the same
> stage of the boot, so maybe with the same source port number and you could be
> facing the problem that the connection is waiting for termination on the
> server
> (close_wait or fin_wait or something)... The source port in working example is
> 798 and source port in failing example is 799, certainly not random.
>
> Olivier

The socket on the server would still be in the ESTABLISHED state, which 
is even worse than the close_wait or fin_wait states in this case.  The 
SYN will be accepted if its sequence number is greater than the previous 
sequence number, so there's a 50% chance it'll work.

Assuming that port reuse is the problem, there is no quick fix for this; 
just resetting connections when a SYN comes in would be a really big 
security problem.

Actually, there may be a quick fix for this specific machine.  If you set 
net.inet.tcp.keepidle to 1 minute (60*whatever kern.hz is), that'll cause 
keepalive packets to be sent every minute to an idle connection, rather 
than every 2 hours.  That would kill the stuck connections much quicker. 
However, it's also possible that the extra keepalive packets could cause 
problems in normal operation.  So, give it a shot, but be careful.

Mike "Silby" Silbersack
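
A minimal sketch of where the 50% figure above comes from, assuming the client's new initial sequence number is uniformly random over the 32-bit space and that "greater than" means the usual wraparound comparison (the same idea as the BSD SEQ_GT macro). The names seq_gt and fraction_accepted are made up for this illustration, and the real kernel check involves more state than this.

```python
import random

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def seq_gt(a: int, b: int) -> bool:
    """Return True if sequence number a is 'after' b, modulo 2**32.

    Mirrors the signed-difference test ((int)(a - b) > 0): a counts as
    greater when it lies in the half of the circle just ahead of b.
    """
    diff = (a - b) % SEQ_SPACE
    return 0 < diff < 2 ** 31


def fraction_accepted(old_seq: int, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate how often a random new ISN compares greater than old_seq."""
    rng = random.Random(seed)
    hits = sum(seq_gt(rng.randrange(SEQ_SPACE), old_seq) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # Wraparound: 1 is "after" 0xFFFFFFFF, not before it.
    print(seq_gt(1, 0xFFFFFFFF))
    # Roughly half of all random ISNs land ahead of the stale one.
    print(fraction_accepted(old_seq=123456789))
```

Under those assumptions about half of the reboots would get through on the first SYN, which matches the "sometimes works, sometimes hangs" behaviour reported.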