Date: Wed, 5 Jan 2011 08:30:05 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-stable@freebsd.org, marek sal <marek_sal@wp.pl>, perryh@pluto.rain.com, milu@dat.pl, jyavenard@gmail.com
Subject: Re: NFSv4 - how to set up at FreeBSD 8.1 ?
Message-ID: <1870282066.118978.1294234205820.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <201101050757.08116.jhb@freebsd.org>
> On Wednesday, January 05, 2011 5:55:53 am perryh@pluto.rain.com wrote:
> > Rick Macklem <rmacklem@uoguelph.ca> wrote:
> >
> > > ... one of the fundamental principles for NFSv2, 3 was a stateless
> > > server ...
> >
> > Only as long as UDP transport was used. Any NFS implementation that
> > used TCP for transport had thereby abandoned the stateless server
> > principle, since a TCP connection itself requires that state be
> > maintained on both ends.
>
> Not filesystem cache coherency state, only socket state. And even NFS UDP
> mounts maintain their own set of "socket" state to manage retries and
> retransmits for UDP RPCs. The filesystem is equally incoherent for both UDP
> and TCP NFSv[23] mounts. TCP did not change any of that.
>
Unfortunately, even NFSv4 doesn't maintain cache coherency in general.
The state it maintains and recovers after a server crash consists of
opens, locks and delegations, but the opens are a Windows-like open
share lock (can't remember the Windows/Samba term for them) and not a
POSIX-like open. NFSv4 does tie cache coherency to file locking, so
that clients will get a coherent view of file data for byte ranges
they lock.

The term "stateless server" refers to the fact that the server doesn't
know anything about the file handling state in the client that needs
to be recovered after a server crash (opens, locks, ...). When an
NFSv2,3 server is rebooted, it normally knows nothing about what
clients are mounted, what clients have files open, etc. and just
services RPCs as they come in. The design avoided the complexity of
recovery after a crash, but results in a non-POSIX-compliant file
system that can't do a good job of cache coherency, knows nothing
about file locks, etc. (Sun did add a separate file locking protocol
called the NLM, or rpc.lockd if you prefer, but that protocol design
was fundamentally flawed imho and, as such, using it is in the "your
mileage may vary" category.)
Further, without any information about previous operations, retries of
non-idempotent RPCs would cause weird failures. So "soft state" in the
form of a cache of recent RPCs (typically called the Duplicate Request
Cache or DRC these days) was added, to avoid performing a
non-idempotent operation twice. A server is not required to retain the
contents of a DRC after a crash/reboot, but some vendors with
non-volatile RAM hardware may choose to do so in order to provide
"closer to correct" behaviour after a server crash/reboot.

rick
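To make the DRC idea above a bit more concrete, here is a minimal sketch in Python. It is not taken from any real NFS server: the class names (DuplicateRequestCache, ToyServer), the cache key of (client, xid, procedure) and the fixed capacity are all assumptions chosen for illustration. It only tries to show why a retried non-idempotent RPC such as a remove gives a misleading error without the cache, and why the cache is "soft" state that may be lost on reboot.

```python
from collections import OrderedDict


class DuplicateRequestCache:
    """Bounded cache of recent replies (hypothetical, for illustration)."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._replies = OrderedDict()

    def lookup(self, key):
        # Returns the cached reply, or None if the request is not known.
        return self._replies.get(key)

    def store(self, key, reply):
        self._replies[key] = reply
        self._replies.move_to_end(key)
        while len(self._replies) > self.capacity:
            self._replies.popitem(last=False)  # evict the oldest entry


class ToyServer:
    """Toy file server with a single non-idempotent operation: remove."""

    def __init__(self, files, use_drc=True):
        self.files = set(files)
        self.drc = DuplicateRequestCache() if use_drc else None

    def remove(self, client, xid, name):
        # Assumed key: a retry carries the same client and transaction id.
        key = (client, xid, "REMOVE")
        if self.drc is not None:
            cached = self.drc.lookup(key)
            if cached is not None:
                return cached  # retry: replay the reply, do not redo the op
        if name in self.files:
            self.files.remove(name)
            reply = "OK"
        else:
            reply = "ENOENT"
        if self.drc is not None:
            self.drc.store(key, reply)
        return reply

    def reboot(self):
        # Soft state: the DRC contents are dropped, the files persist.
        if self.drc is not None:
            self.drc = DuplicateRequestCache()


# Without a DRC, a retransmitted remove fails although the first one worked.
plain = ToyServer({"a.txt"}, use_drc=False)
print(plain.remove("client1", 7, "a.txt"))  # OK
print(plain.remove("client1", 7, "a.txt"))  # ENOENT

# With a DRC, the retry (same client and xid) gets the original reply.
cached = ToyServer({"a.txt"})
print(cached.remove("client1", 7, "a.txt"))  # OK
print(cached.remove("client1", 7, "a.txt"))  # OK
```

If reboot() is called between the first request and its retry, the toy server answers ENOENT again, which is the "not required to retain the contents of a DRC" case; a server keeping the cache in non-volatile RAM would correspond to not clearing it there.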