Date: Fri, 14 Oct 2005 12:07:23 -0400 (EDT)
From: rick@snowhite.cis.uoguelph.ca
To: fs@freebsd.org
Subject: FreeBSD NFS server not responding to TCP SYN packets from Linux/SunOS clients
Message-ID: <200510141607.MAA21757@snowhite.cis.uoguelph.ca>
As others have noted, the problem is that the connection is still established on the server and the client re-uses the same port#.

Due to the NDA, I can't tell you the details, but there was an amusing case at the last NFSv4 Bakeathon which I'll call the "Psychic Server Theory". Basically, my server had a bug (that was fixed) where it would generate a Readdir reply larger than requested for a certain case. A client would crash when it saw this reply. The interesting part was that it would crash again as soon as it was rebooted, before it even attempted a remount. The theory was that my "Evil Psychic Server" KNEW that the client might try to mount it and would crash it "somehow" before the mount took place. (What was really happening was that the server had the TCP connection still established and would resend the reply to the same port#, which this client already had configured for its first mount.)

From an NFS point of view, I can't see a security risk w.r.t. the server breaking the TCP connection more quickly. (I don't know what implications this change might have w.r.t. TCP level denial of service attacks, etc.) What it will do is increase the risk of data corruption on the server, since a TCP reconnect on the client implies it must retry all outstanding requests, including non-idempotent ones.

In the generic FreeBSD NFS server, the recent request cache is normally disabled for TCP, so a retry of a non-idempotent RPC can/will corrupt the server file system (from the point of view of what the client expects to have on the file system). My new server does have a redesigned recent request cache that would minimize this risk, although there will always be a worst case scenario where problems could still occur.

In summary, unless there is an increased risk of a "denial of service" attack at the TCP level, it would be nice if the nfsd threads could tell TCP that connections can be broken down fairly quickly for unresponsive clients.
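To illustrate why a retried non-idempotent RPC is a problem without a recent request cache, here is a toy model in Python. It is only a sketch and not the FreeBSD nfsd code: the `ToyServer` class, the `(client, xid)` cache key and the `first_and_retry` helper are all hypothetical names made up for the example, and REMOVE stands in for any non-idempotent operation.

```python
# Toy model only: not the FreeBSD nfsd code. All names are hypothetical.
import errno


class ToyServer:
    """A tiny in-memory 'file system' that handles a non-idempotent REMOVE."""

    def __init__(self, use_cache):
        self.files = {"a.txt"}
        self.use_cache = use_cache
        self.recent = {}  # (client, xid) -> reply already sent (unbounded here)

    def remove(self, client, xid, name):
        key = (client, xid)
        if self.use_cache and key in self.recent:
            # Retry of a request that was already executed: replay the old reply
            # instead of running the operation a second time.
            return self.recent[key]
        if name in self.files:
            self.files.remove(name)
            reply = 0
        else:
            reply = errno.ENOENT
        if self.use_cache:
            self.recent[key] = reply
        return reply


def first_and_retry(server):
    """Send REMOVE, then resend it with the same xid, as after a TCP reconnect."""
    first = server.remove("client1", 42, "a.txt")
    retry = server.remove("client1", 42, "a.txt")
    return first, retry
```

Without the cache the retry comes back as `ENOENT` even though the client's REMOVE actually succeeded; with the cache the retry gets the original reply. A real cache has to be bounded, so an entry can be evicted before the retry arrives.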
Breaking connections down quickly assumes a recent request cache that works well for TCP. (Related to this, having "keep alives" done more frequently should detect the rebooted client, since the connection doesn't exist at the other end?)

rick

ps: It would be nice if someone with the right expertise could explore other things in TCP specifically for NFS. For example, I don't see why a retransmit timeout should go above about 100 msec, since net delays are well below that level, even halfway around the world these days. Having said that, I don't know enough about TCP retransmit to say that one second retry intervals aren't correct?
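As a rough illustration of the "keep alives done more frequently" idea, the following Python sketch turns on TCP keep-alives on a userland socket and shortens the timers where the platform exposes per-socket options. This is an assumption-laden example: the in-kernel nfsd would not use this API, the helper name `enable_fast_keepalive` is hypothetical, and the timer values are arbitrary placeholders rather than recommendations.

```python
import socket


def enable_fast_keepalive(sock, idle=30, interval=10, count=3):
    """Turn on TCP keep-alives and, where available, shorten the timers.

    idle/interval are in seconds and count is a number of probes; the
    defaults are arbitrary illustration values.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timer options are not present on every platform,
    # so each one is applied only if the socket module defines it.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    enable_fast_keepalive(s)
    keepalive_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

If the peer has rebooted, a keep-alive probe would normally be answered with a reset, since the rebooted host has no record of the connection, and the stale connection is then dropped.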