Date: Tue, 20 Dec 2011 21:53:36 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: John De <jwd@freebsd.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: NFS client UDP retransmit timer busted for 8.n/9.n (patch)
Message-ID: <1231463684.484891.1324436016408.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20111221012343.GA86024@FreeBSD.org>

John De wrote:
> Hi Rick,
>
> ----- Rick Macklem's Original Message -----
> > Thanks to recent detective work done by jwd@, a problem w.r.t.
> > retransmit timeouts for UDP mounts (both old and new NFS clients)
> > has been identified.
> >
> > The kernel rpc has two timeouts for UDP:
> > 1 - a timeout that causes the RPC request to be retransmitted on
> >     the same socket, using the same xid. This one defaults to
> >     3 seconds and can be set via CLSET_RETRY_TIMEOUT.
> >     (This is always the default of 3 seconds for FreeBSD currently.)
> > 2 - a timeout that causes the socket to be destroyed and a fresh
> >     one created. The request is then sent on this new socket, with
> >     a different xid.
> >
> > The problem with #2 is that the retransmitted RPC request will miss
> > a server's Duplicate Request Cache (DRC), because of the different
> > xid.
> > As such, #2 should be much larger than #1. However, #2 defaults to
> > 1 second (i.e. smaller than #1 -> trouble!)
> >
> > One way to avoid this problem is to set #2 to a much larger value
> > via the "timeout=<value>" mount option. (Btw, the <value> is in
> > 1/10 seconds, so "timeout=600" sets it to 60 sec.)
> >
> > I now have a patch that I believe deals with this correctly. It sets
> > #1 to the "timeout=<value>" (default 1 second) and #2 to a much
> > larger value.
> > (#2 timeouts are what the kernel rpc counts as retries, so for
> > "soft" mounts, I set #2 to "nm_retry * nm_timeout / 2" and
> > "retries = 2", so that it fails after "nm_retry * nm_timeout",
> > which I think is the correct semantics.)
> > This patch is attached and is also available at:
> > http://people.freebsd.org/~rmacklem/udp-timer.patch
> > (jwd@, this patch is updated from what I emailed you, so you
> > probably want it:-)
>
> We've tested both the mount_nfs option change and the patch and both
> seem to work great. No recurrence of the problem we were seeing.
> We've only been able to catch one retransmit via tcpdump and it
> showed the system retransmitting with the same xid/port and recovering.
>
> +1 for getting this committed.
>
> Thanks for your time looking into this Rick.
>
And thanks for reporting it. Also, sorry everyone, this has been broken
for a long time and I never spotted it.

rick

> -John
>
> > In summary, if you are using NFS mounts over UDP on FreeBSD 8 or 9
> > systems, you either want to use "timeout=600" or try the patch. You
> > are pretty badly broken otherwise.
> >
> > Hopefully, this patch can make it into -current/head soon, rick
> > ps: jhb@, could you maybe review this, thanks, rick.
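
For readers following the arithmetic in the quoted message, here is a
minimal C sketch of how the two UDP timers might be derived for a "soft"
mount: #1 (retransmit, same xid) takes the "timeout=<value>" setting, #2
(new socket, new xid) becomes "nm_retry * nm_timeout / 2", and retries is
2. The struct and function names below are hypothetical and are not taken
from the patch; only the formulas come from the message. All values are in
1/10 second units, matching the mount option.

```c
#include <assert.h>

/* Hypothetical holder for the two kernel RPC UDP timers (1/10 s units). */
struct udp_timers {
    int retransmit_tenths; /* #1: resend on the same socket, same xid */
    int reconnect_tenths;  /* #2: destroy socket, resend with a new xid */
    int retries;           /* #2 expiries counted before the RPC fails */
};

/*
 * Derive the timers for a soft mount as described in the message:
 * #1 = nm_timeout, #2 = nm_retry * nm_timeout / 2, retries = 2.
 * nm_timeout and nm_retry mirror the names used in the message.
 */
static struct udp_timers
soft_mount_udp_timers(int nm_timeout, int nm_retry)
{
    struct udp_timers t;

    assert(nm_timeout > 0 && nm_retry > 0);
    t.retransmit_tenths = nm_timeout;
    t.reconnect_tenths = nm_retry * nm_timeout / 2; /* integer division */
    t.retries = 2;
    return t;
}

/* Total time before a soft mount gives up: retries * #2. */
static int
soft_mount_fail_after_tenths(struct udp_timers t)
{
    return t.retries * t.reconnect_tenths;
}
```

With nm_timeout = 10 (1 second) and nm_retry = 10, this gives #1 = 1 second
and #2 = 5 seconds, so the request fails after 10 seconds, which equals
"nm_retry * nm_timeout". When the product is odd, the integer division
makes the total one tenth of a second shorter.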