Date: Mon, 2 Nov 2009 10:37:24 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Olaf Seibert <O.Seibert@cs.ru.nl>
Cc: danny@cs.huji.ac.il, dfr@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.0-RC1 NFS client timeout issue
Message-ID: <Pine.GSO.4.63.0911021028140.10631@muncher.cs.uoguelph.ca>
In-Reply-To: <20091102100958.GY841@twoquid.cs.ru.nl>
References: <20091027164159.GU841@twoquid.cs.ru.nl> <Pine.GSO.4.63.0910281624440.18390@muncher.cs.uoguelph.ca> <20091029135239.GX841@twoquid.cs.ru.nl> <Pine.GSO.4.63.0911011713290.23081@muncher.cs.uoguelph.ca> <20091102100958.GY841@twoquid.cs.ru.nl>

On Mon, 2 Nov 2009, Olaf Seibert wrote:

>> Although I think the patch does avoid sending the request on the
>> partially closed connection, it doesn't fix the "real problem",
>> so I don't know if it is worth testing?
>
> Well, I tested it anyway, just in case. It seems to work fine for me, so
> far.
>
Yes, I think the patch is ok, but it doesn't completely resolve the
reconnect issue. It's good to hear that it helps for your case.

> I don't see your extra RSTs either. Maybe that is because in my case the
> client used a different port number for the new connection. (Usually,
> this is controlled by the socket option SO_REUSEADDR from <sys/socket.h>).
>
For my packet trace, it is using different port#s. The problem is that,
for some reason, it sends the RST from the new port# instead of the
port# for the old connection just closed via soclose().

I don't know why you don't see the extra RSTs, but consider yourself
lucky, since you should be ok without them. (It may simply be that your
server isn't Solaris10 --> a different TCP stack in it.) Do you happen
to know what your server is?

>> I'm hoping that the "Help TCP Wizards..." thread I just started
>> on freebsd-current comes up with something.
>>
>> At least I can reproduce the problem now. (For some reason, I have
>> to reboot the Solaris10 server before the problem appears for me.
>> I can't think why this matters, but that's networking for you:-)
>
> Maybe it depends on server load or something. This particular server is
> a central file server at a university, it may have some more pressure to
> terminate unused connections.
>
Or type of server (i.e. not Solaris10). It definitely depends upon timing
in the client. (I'm about to try introducing a 1sec delay before the
soconnect() call and see if that makes the RSTs go away. Not much of a
fix, but...)
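
For illustration only, here is a rough userland sketch of the
"delay before reconnect" idea. It is not the kernel patch: the function
name reconnect_with_delay() and its parameters are made up for this
example, and it uses the ordinary sockets API where the NFS client would
call soclose() and soconnect(). It only shows the ordering (close, pause,
connect), and makes no claim that the pause cures the stray RSTs.

```python
import socket
import time


def reconnect_with_delay(old_sock, server_addr, delay_s=1.0, timeout_s=5.0):
    """Close old_sock, wait delay_s seconds, then open a new TCP connection.

    Hypothetical helper: a userland analogue of soclose() followed by a
    delayed soconnect(). The 1 second default mirrors the delay discussed
    in this thread; it is a workaround, not a real fix.
    """
    try:
        old_sock.close()      # analogous to soclose() on the old connection
    except OSError:
        pass                  # the old connection may already be dead
    time.sleep(delay_s)       # give the old connection time to settle
    # No bind() is done here, so the OS chooses the local port for the
    # new connection (normally a different ephemeral port).
    return socket.create_connection(server_addr, timeout=timeout_s)
```

Usage would be something like
new_sock = reconnect_with_delay(old_sock, ("server.example.org", 2049)),
where the host name is a placeholder and 2049 is the usual NFS port.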

I now recall that I ran into a similar problem (although I didn't dig
into the packet traces then) when testing my Mac OS X 10 client, which
uses essentially the reconnect code from Mac OS X 10.4 Tiger. I "fixed"
it by adding a 1sec delay before the reconnect.

Thanks for helping with testing, rick