Date: Tue, 11 Mar 1997 22:51:00 +0100
From: j@uriah.heep.sax.de (J Wunsch)
To: hackers@freebsd.org
Cc: dfr@freebsd.org
Subject: Re: NFSv3 (was: Maybe a showstopper...)
Message-ID: <19970311225100.SW24783@uriah.heep.sax.de>
In-Reply-To: <19970311194645.GS08931@uriah.heep.sax.de>; from J Wunsch on Mar 11, 1997 19:46:45 +0100
References: <Pine.NEB.3.95.970310211652.24717A-100000@mail.cdsnet.net> <Pine.BSF.3.95q.970311110509.23141G-100000@fallout.campusview.indiana.edu> <19970311194645.GS08931@uriah.heep.sax.de>

As J Wunsch wrote:

> sunny.9dca8c93 > freeboy.nfs: 124 readdirplus fh 4,12/1 1048 bytes @ 0 (DF)
> freeboy.nfs > sunny.9dca8c93: reply ok 592 readdirplus
> sunny.9dca8c94 > freeboy.nfs: 112 remove fh 4,12/1 ld.exp (DF)
> freeboy.nfs > sunny.9dca8c94: reply ok 144 remove
> sunny.9dca8c95 > freeboy.nfs: 120 readdir fh 4,12/1 1048 bytes @ 512 (DF)
> freeboy.nfs > sunny.9dca8c95: reply ok 116 readdir ERROR: 'Unknown error: 10003'

Who would have thought that Sun doesn't implement the NFSv3
specification correctly? :-))

If I read the specs correctly, the above behaviour is allowed, and the
client should be able to handle this. (Note that it's possible that
the FreeBSD server implementation is overly restrictive, however. See
below for a discussion.)

``IMPLEMENTATION

``In the NFS version 2 protocol, each directory entry returned
included a cookie identifying a point in the directory. By including
this cookie in a subsequent READDIR, the client could resume the
directory read at any point in the directory. One problem with this
scheme was that there was no easy way for a server to verify that a
cookie was valid. If two READDIRs were separated by one or more
operations that changed the directory in some way (for example,
reordering or compressing it), it was possible that the second READDIR
could miss entries, or process entries more than once. If the cookie
was no longer usable (e.g. pointing into the middle of a directory
entry), the server would have to indicate that the search had reached
the end of the directory, even though more entries remained. There was
no way that the client could distinguish this from a valid
end-of-directory; the server hadn't reached the end of the directory,
but the client remained unaware of any problem.

``In the NFS version 3 protocol, each READDIR request includes both a
cookie and a cookie verifier. For the first call, both are set to 0.
The response includes a new cookie verifier, together with a cookie
per entry.
For subsequent READDIRs, the client must present both the cookie and
the corresponding cookie verifier. Directory entry cookies may become
invalid if the directory is modified. If the server detects that the
cookie is no longer valid, it will reject the request with the status,
NFS3ERR_BAD_COOKIE. The client should be careful to avoid holding
directory entry cookies across operations that modify the directory
contents, such as REMOVE and CREATE.''

The return code 10003 is just NFS3ERR_BAD_COOKIE. (tcpdump doesn't
know about this, apparently.)

The trace above shows that the client was not ``careful to avoid
holding directory entry cookies across operations that modify the
directory contents'', since it did a remove on a directory handle, and
attempted to reuse the old cookie verifier on a subsequent READDIR
operation on the same directory handle later. Since FreeBSD's NFSv3
implementation is overly restrictive (it uses va_filerev of the i-node
as cookie verifier), it immediately rejects this attempt.

Still, it's the SunOS client that breaks the protocol. Maybe somebody
with a SunOS support contract should open a trouble ticket for
this. ;-)

The NFS specs also talk about a better implementation:

``One implementation of the cookie verifier mechanism might be for the
server to use the modification time of the directory. This might be
overly restrictive, however. A better approach would be to record the
time of the last directory modification that changed the directory
organization in a way which would make it impossible to reliably
interpret a cookie.''

Btw., the above trace contains another interesting part. The SunOS
client issues a READDIRPLUS first, but uses READDIRs later. I think
this might prevent the READDIRPLUS cache flooding behaviour as
mentioned in mount_nfs(8).

-- 
cheers, J"org

joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/ -- NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)
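
For illustration only, here is a minimal sketch of the cookie/verifier
handshake quoted above, written as a toy in Python. Everything in it
(ToyServer, list_directory, the integer change counter used as the
verifier) is hypothetical and is not taken from the FreeBSD or SunOS
sources; a real cookieverf3 is an opaque 8-byte value. The server uses
the strict policy of invalidating the verifier on every modification,
and the client copes by restarting the listing when it sees
NFS3ERR_BAD_COOKIE.

```python
NFS3_OK = 0
NFS3ERR_BAD_COOKIE = 10003  # the status tcpdump printed as "Unknown error: 10003"


class ToyServer:
    """Hypothetical in-memory directory; the verifier is a change counter."""

    def __init__(self, names):
        self.names = list(names)
        self.verf = 1  # bumped on every modification (the strict policy)

    def remove(self, name):
        self.names.remove(name)
        self.verf += 1

    def readdir(self, cookie, cookieverf, count=2):
        # The first call uses cookie 0 and verifier 0; later calls must match.
        if cookie != 0 and cookieverf != self.verf:
            return NFS3ERR_BAD_COOKIE, [], self.verf, False
        # Each entry carries a cookie naming the position just after it.
        all_entries = [(name, i + 1) for i, name in enumerate(self.names)]
        entries = all_entries[cookie:cookie + count]
        eof = cookie + len(entries) >= len(self.names)
        return NFS3_OK, entries, self.verf, eof


def list_directory(server, between_calls=None):
    """Client loop: on BAD_COOKIE, drop cookie and verifier and start over."""
    cookie, verf, seen = 0, 0, []
    while True:
        status, entries, new_verf, eof = server.readdir(cookie, verf)
        if status == NFS3ERR_BAD_COOKIE:
            cookie, verf, seen = 0, 0, []  # restart the whole listing
            continue
        seen.extend(name for name, _ in entries)
        if eof:
            return seen
        cookie, verf = entries[-1][1], new_verf
        if between_calls:
            between_calls(server)  # e.g. a REMOVE between two READDIRs
            between_calls = None   # modify the directory only once
```

Calling list_directory(ToyServer(["a", "b", "ld.exp", "c", "d"]),
lambda s: s.remove("ld.exp")) replays the shape of the trace: the
second READDIR is rejected, the client starts again from cookie 0, and
the listing completes without the removed entry.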