From: Terry Lambert
Reply-To: tlambert2@mindspring.com
Date: Tue, 07 Aug 2001 01:08:19 -0700
To: Michael Reifenberger
Cc: FreeBSD-Current, fs@FreeBSD.ORG
Subject: Re: Linux ls fails on DEVFS /dev
Message-ID: <3B6FA1F3.2563C15C@mindspring.com>
References: <20010805104350.A1188-100000@nihil>

Michael Reifenberger wrote:
> linux ls fails on DEVFS /dev because linux_getdents fails because
> linux_getdents uses VOP_READDIR( ..., &ncookies, &cookies ) instead of
> VOP_READDIR( ..., NULL, NULL ) because it seems to need the offsets for
> linux_dirent and sizeof(dirent) != sizeof(linux_dirent)...
>
> If I eliminate the usage of cookies, then a ls on at least
> a cd9660 mounted dir fails with not finding all direntries.
>
> So the question is if all filesystems are expected to implement
> the cookies != NULL case?

The problem is that the interface is broken by design. For it to be
correct, it needs to be split into two pieces: one to snapshot the
directory entry block, and a second to do the copy out from the on-disk
format to the "wire format", which is the NFS externalized version of
the structure. Cookies came in when the on-disk directory entry
structure changed from the representation historically used for NFS to
its current form; they exist to provide glue between the internal and
external representations.

Cookies assume that the client of their services will be the NFS
server. They are a kludge, and there is a better way to do the same
thing, which avoids the problem as far as it can be avoided when the
directory is changing out from under your server (or, in this case, the
Linux consumer).

> BTW:
> Why doesn't a call to fstat on a directory set a st_blksize != 0?
> Do directories have no preferred blocksize?

Directories aren't files, per se. Directory entries are stored in
physical disk blocks, to ensure atomicity of the directory operations.

That said, this does seem to be a compatibility issue with the Linux
ABI (see below) that should be addressed in the Linux ABI
implementation, and not in the FreeBSD generic stat implementation.

> I ask because getdents(2) explicitly states one
> should use stat(2) to get the minimum buffersize...

By "getdents(2)", I assume that you are talking about the Linux system
call man page, not the FreeBSD one. The correct thing to do is to use
opendir/readdir/closedir, and not call "getdents(2)" directly yourself.

In general, as a compatibility hedge, I suppose we could make the stat
call behave for directories as it does on Linux. In reality, the buffer
size should be large, since the standard directory reading code caches a
"snapshot" of the directory blocks, externalized into the "neutral"
format.

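For what it's worth, the opendir/readdir/closedir route recommended
above looks roughly like this from userland. This is only a minimal
sketch: the function name count_dir_entries is made up for
illustration, and error handling is reduced to a -1 return.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the entries in a directory using the portable library
 * interface instead of calling getdents(2) directly.
 * "count_dir_entries" is a hypothetical helper name.
 * Returns -1 if the directory cannot be opened.
 */
long
count_dir_entries(const char *path)
{
	DIR *dir;
	struct dirent *de;
	long n = 0;

	assert(path != NULL);

	dir = opendir(path);
	if (dir == NULL)
		return (-1);

	/* readdir() returns NULL at end of directory (or on error). */
	while ((de = readdir(dir)) != NULL) {
		/* Skip the "." and ".." entries. */
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;
		n++;
	}

	closedir(dir);
	return (n);
}
```

The library picks its own buffer size and hides the entry layout, so
the caller never has to consult st_blksize or know sizeof(dirent).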
For NFS clients of the VFS, and for the Linux system call code, which is
also a VFS client, the correct thing to do is probably to return a large
number, and then "short change" the result by backing off on the copy
out if it can't be done on a full directory entry block boundary
internal to the FS.

This is true because the cookies are really there to permit an arbitrary
restart on a non-directory-block boundary. You could achieve the same
thing for NFS by traversing the block entries to the entry following the
offset; if the offset was not on an entry boundary, then you back up one
entry (i.e. you are trying to restart based on a "snapshot" that has
changed out from under you). It's up to the client to perform duplicate
suppression.

If you did the "short change" trick, then you would always be guaranteed
that you could copy out one or more full directory entry blocks and stop
on an alignment boundary. That would eliminate the need to restart in
the middle of a block, which would mean that "NULL, NULL" would be the
correct thing to pass.

The only place this would fail unexpectedly is when the buffer passed in
to the "linux_getdents(2)" call was too small to hold all the entries
that could occur in a single FreeBSD directory block. Your code would
have to be very badly behaved (intentionally so) to do that.

Still, you can fake up the restart by copying out a FreeBSD-normal block
into a transition buffer, and then traversing from there to do the
restart (the other trick mentioned above). This would probably be best,
since it would avoid the problem recurring with any future FSs.

-- 
Terry
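A rough sketch of the restart-by-traversal trick described above, in
userland C rather than kernel code. Everything here is an assumption
made for illustration: "struct rec" is a made-up record header, not
FreeBSD's struct dirent, and put_rec/restart_offset are hypothetical
helpers. The block is assumed to be a packed run of variable-length
records, each carrying its own length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; the name bytes follow it in the block. */
struct rec {
	uint16_t reclen;	/* total record length, header included */
	uint16_t namlen;	/* name length, without the NUL */
};

/*
 * Append one record to the block at "off"; returns the offset just
 * past it.  The caller must make sure the block is large enough.
 */
size_t
put_rec(unsigned char *blk, size_t off, const char *name)
{
	struct rec r;
	size_t len = strlen(name) + 1;	/* include the NUL */

	/* Round the record length up to a multiple of 4. */
	r.reclen = (uint16_t)((sizeof(r) + len + 3) & ~(size_t)3);
	r.namlen = (uint16_t)(len - 1);
	memcpy(blk + off, &r, sizeof(r));
	memcpy(blk + off + sizeof(r), name, len);
	return (off + r.reclen);
}

/*
 * Walk the records from the start of the block until reaching or
 * passing the requested offset.  If "want" lands exactly on a record
 * boundary, resume there; otherwise back up to the record that
 * contains it, and let the client suppress the duplicate.
 */
size_t
restart_offset(const unsigned char *blk, size_t blklen, size_t want)
{
	size_t prev = 0, cur = 0;

	assert(want <= blklen);

	while (cur < blklen && cur < want) {
		struct rec r;

		memcpy(&r, blk + cur, sizeof(r));
		/* Stop on a record that cannot be valid. */
		if (r.reclen < sizeof(r) || r.reclen > blklen - cur)
			break;
		prev = cur;
		cur += r.reclen;
	}
	return (cur == want ? cur : prev);
}
```

With a transition buffer holding one FS-normal block, the same walk
gives the restart point without any cookies from the filesystem.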