Date: Wed, 8 Oct 2014 20:43:33 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Garrett Wollman <wollman@csail.mit.edu>
Cc: freebsd-fs@freebsd.org
Subject: Re: 9.3 NFS client bug?
Message-ID: <1713580100.60978206.1412815413928.JavaMail.root@uoguelph.ca>
In-Reply-To: <21557.22365.961980.709081@khavrinen.csail.mit.edu>

Garrett Wollman wrote:
> <<On Tue, 7 Oct 2014 21:20:41 -0400 (EDT), Rick Macklem
> <rmacklem@uoguelph.ca> said:
>
> > As far as I know, this has never worked correctly for FreeBSD. The
> > unlink() invalidates the directory offset cookies and then it has
> > trouble finding the next entry.
> > To make the above loop work correctly for FreeBSD, it needs to be
> > re-written to start at the beginning of the directory after each
> > unlink().
>
> How about instead we fix FreeBSD to work properly? Clearly it is not
> impossible since the Linux NFS client does work. What exactly is the
> issue? (Forgive me, I know very little about how VOP_READDIR works
> under the hood.)
>
> -GAWollman
>
>
Well, I've never looked at Linux or OpenSolaris to see how they handle
these things, but here's a couple of ways I am aware of that could fix
this:
1 - There is a "cookie_verifier" defined for NFS, which is a 64-bit
    value that is supposed to change whenever the directory offset
    cookies are no longer valid. Implementing this requires something
    like:
    - add an attribute or new VOP_xxx() for this cookie_verifier
    - fix every file system to handle it
      --> This requires a good knowledge of the underlying file system,
          since it needs to change when the directory_offset_cookies
          are stale (I believe this is when objects are added to a
          directory for UFS. Have no idea for ZFS, etc.)
    - It needs to be stored on-disk (in the i-node or similar) since it
      is supposed to survive server crashes.
    The value for this cookie_verifier is in the readdir reply and then
    the client sends it in subsequent requests so that the server can
    reply with an error if the cookie_verifier refers to "stale"
    directory offset cookies.
    --> Unfortunately some servers haven't supported this correctly for
        a long time and it is difficult for clients to recover from the
        error. RFC-3530 strongly recommends that directory offset
        cookies not be allowed to become stale, but I don't know how to
        do this for UFS, ZFS, ...
    (There is still an ancient comment in the server code about the
    check being too strict for Solaris 2.5 clients. When was Solaris 2.5
    released? ;-)
    All in all, a mess. As such, the FreeBSD client assumes that the
    cookies are no longer valid when it sees the modify time on the
    directory change (guess what happens every time an entry is unlink'd
    from the directory).
    Unless not only the FreeBSD servers but most/all other servers (a
    lot of old BSD servers and I believe others are broken) are fixed,
    the client can't really depend on this to determine if directory
    offset cookies are still valid. (Can you now see why this has never
    been fixed?)
2 - Have readdir(3) do what fts(3) does. Read the entire directory into
    the user address space on the first readdir() after opendir() and
    then subsequent readdir() calls just return directory entries from
    memory (avoiding further getdirentries(2) calls that don't work
    correctly because of potentially stale directory offset cookies).
    --> This one is probably straightforward, but may eat a lot of
        address space for apps. that opendir(), readdir() a lot of large
        directories. This might be fine or it might break a bunch of
        apps that run out of address space and do more harm than the
        broken case of removing entries in a readdir() loop (which can
        be easily coded around)?

If someone else knows of a better way to fix this (maybe what Linux or
Solaris does) please post, because I don't think either 1 or 2 above is
a good plan.

rick
ps: This is my recollection of the problem, but I haven't worked on it
    in quite a while.
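
For illustration only, here is a rough sketch of how an application
might code around the "unlink() inside a readdir() loop" case discussed
above. It is not code from this thread or from FreeBSD. The function
names remove_all_rewind() and remove_all_snapshot() are hypothetical.
The first restarts at the beginning of the directory after each
unlink(); the second is a userland analogue of option 2, using
scandir(3) to read the whole directory into memory before removing
anything. Both assume the directory holds only non-directory entries.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* True for the "." and ".." entries, which must never be removed. */
static int
is_dot(const struct dirent *dp)
{
	return (strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0);
}

/* scandir(3) filter: keep everything except "." and "..". */
static int
not_dot(const struct dirent *dp)
{
	return (!is_dot(dp));
}

/*
 * Hypothetical helper: remove every entry, restarting the scan from the
 * beginning of the directory after each unlink so that no stale offset
 * is ever reused. Returns the number removed, or -1 on error.
 * A readdir() error is treated like end-of-directory in this sketch.
 */
int
remove_all_rewind(const char *path)
{
	DIR *dirp;
	struct dirent *dp;
	int removed = 0;

	assert(path != NULL);
	if ((dirp = opendir(path)) == NULL)
		return (-1);
	while ((dp = readdir(dirp)) != NULL) {
		if (is_dot(dp))
			continue;
		if (unlinkat(dirfd(dirp), dp->d_name, 0) == -1) {
			closedir(dirp);
			return (-1);
		}
		removed++;
		rewinddir(dirp);	/* start over; dp is no longer used */
	}
	closedir(dirp);
	return (removed);
}

/*
 * Hypothetical helper: snapshot all names into memory with scandir(3),
 * then unlink from the snapshot. Costs memory proportional to the
 * directory size. Returns the number removed, or -1 on error.
 */
int
remove_all_snapshot(const char *path)
{
	struct dirent **names;
	int dfd, i, n, removed = 0, failed = 0;

	assert(path != NULL);
	if ((dfd = open(path, O_RDONLY | O_DIRECTORY)) == -1)
		return (-1);
	if ((n = scandir(path, &names, not_dot, alphasort)) == -1) {
		close(dfd);
		return (-1);
	}
	for (i = 0; i < n; i++) {
		if (unlinkat(dfd, names[i]->d_name, 0) == 0)
			removed++;
		else
			failed = 1;
		free(names[i]);
	}
	free(names);
	close(dfd);
	return (failed ? -1 : removed);
}
```

The rewind variant re-reads the directory once per removed entry, so it
trades extra READDIR traffic for constant memory; the snapshot variant
makes the opposite trade, which is the same address-space concern
raised for option 2.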