From owner-freebsd-current Fri Jul 21 13:43:51 1995
Return-Path: current-owner
Received: (from majordom@localhost)
          by freefall.cdrom.com (8.6.11/8.6.6) id NAA23185
          for current-outgoing; Fri, 21 Jul 1995 13:43:51 -0700
Received: from cs.weber.edu (cs.weber.edu [137.190.16.16])
          by freefall.cdrom.com (8.6.11/8.6.6) with SMTP id NAA23177
          for ; Fri, 21 Jul 1995 13:43:46 -0700
Received: by cs.weber.edu (4.1/SMI-4.1.1)
          id AA06811; Fri, 21 Jul 95 14:36:30 MDT
From: terry@cs.weber.edu (Terry Lambert)
Message-Id: <9507212036.AA06811@cs.weber.edu>
Subject: Re: what's going on here? (NFSv3 problem?)
To: dfr@render.com (Doug Rabson)
Date: Fri, 21 Jul 95 14:36:29 MDT
Cc: peter@haywire.dialix.com, freebsd-current@freebsd.org
In-Reply-To: from "Doug Rabson" at Jul 21, 95 07:46:16 pm
X-Mailer: ELM [version 2.4dev PL52]
Sender: current-owner@freebsd.org
Precedence: bulk

> NFSv3 defines a mechanism to validate the cookies used to read directory
> entries. Each readdir request returns a set of directory entries, each
> with a cookie which can be used to start another readdir just after the
> entry. To read from the beginning of the directory, one passes a NULL
> cookie.
>
> NFSv3 also returns a 'cookie verifier' which must be passed with the next
> readdir, along with the cookie representing the place to read from. If the
> directory block was compacted, then the server should use the verifier to
> detect this and can return an error to the client to force it to retry the
> read from the beginning of the directory.

Most file systems do not provide a generation count on directory
blocks with which to validate the "cookie".

With that in mind, the "cookie" is typically interpreted either as an
entry offset or as the byte offset of an entry, either within the block
or within the directory. It is this offset that is used to resynchronize
entries after block compaction, with the inevitable potential for
problems that this brings (duplicate vs. skipped entries, as you point
out).

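To make the cookie/verifier exchange concrete, here is a minimal sketch in C. It assumes exactly what most file systems lack, namely a per-directory generation count. Every name in it (toy_dir, toy_readdir, toy_remove, RD_BAD_COOKIE) is hypothetical and comes from neither the NFS nor the FreeBSD sources. The cookie is a plain entry offset, the verifier is the generation count, and RD_BAD_COOKIE stands in for the NFSv3 NFS3ERR_BAD_COOKIE status.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory directory; not taken from any real file system. */
#define DIR_MAX 8

struct toy_dir {
    const char *names[DIR_MAX]; /* live entries, packed at the front */
    size_t      count;          /* number of live entries */
    uint64_t    generation;     /* assumed counter, bumped when entries move */
};

enum rd_status { RD_OK, RD_BAD_COOKIE };

struct rd_reply {
    enum rd_status status;
    const char    *name;     /* entry returned, or NULL at end of directory */
    uint64_t       cookie;   /* resume point just after the returned entry */
    uint64_t       verifier; /* to be passed back with the next request */
};

/* Remove entry idx and compact the directory. Cookies handed out
 * earlier now refer to different entries, so the generation changes. */
void toy_remove(struct toy_dir *d, size_t idx)
{
    assert(idx < d->count);
    for (size_t i = idx + 1; i < d->count; i++)
        d->names[i - 1] = d->names[i];
    d->count--;
    d->generation++;
}

/* Read one entry starting at cookie. A cookie of 0 means "start of
 * directory" and is accepted with any verifier. */
struct rd_reply toy_readdir(const struct toy_dir *d,
                            uint64_t cookie, uint64_t verifier)
{
    struct rd_reply r = { RD_OK, NULL, cookie, d->generation };

    if (cookie != 0 && verifier != d->generation) {
        r.status = RD_BAD_COOKIE; /* client must restart from cookie 0 */
        return r;
    }
    if (cookie < d->count) {
        r.name = d->names[cookie];
        r.cookie = cookie + 1;
    }
    return r;
}
```

With entries "a", "b", "c", a client that has read "a" holds cookie 1. If "a" is then removed and the directory compacted, honouring cookie 1 blindly would return "c" and skip "b". The verifier mismatch is what lets the server refuse the stale cookie instead.
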
> > The buffer crap that got done to avoid a file system top end user
> > presentation layer is totally bogus, and remains the cause of the
> > problem. If no one is interested in fixing it, I suggest reducing
> > the transfer size to the page size or smaller.
>
> I can't parse this one.

The stat structure passed around internally is larger than the stat
structure expected by NFS. Rather than fix the view of things at the
time it was exported to NFS, the internal buffer representation for
all file systems capable of being exported was changed.

I can't say I'm not glad that this is coming back to haunt us.

> > And, of course, at the same time eat the increased and otherwise
> > unnecessary overhead in the read/write path transfers that will
> > result from doing this "fix".
>
> I don't think that any fix is needed. The NFSv2 behaviour is adequate
> and NFSv3 has the mechanism to detect this problem.

This is the "drop the buffer size" fix, not the detection fix (which
would be unnecessary if the buffer size "fix" wasn't there).

It should also be noted that NFSv3 classifies the blocked directory
entry retrieval as an *optional* implementation for the server, and the
problem would also go away were the option declined and versioning more
strictly enforced. I don't think this would be the canonically
correct thing to do.


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.