Date: Sun, 27 Aug 95 1:56:26 MDT
From: terry@cs.weber.edu (Terry Lambert)
To: freebsd-current@FreeBSD.ORG
Subject: Re: nfs panic after changing cd's
Message-ID: <9508270756.AA19071@cs.weber.edu>
In-Reply-To: <199508270619.IAA21777@uriah.heep.sax.de> from "J Wunsch" at Aug 27, 95 08:19:10 am
> > The cd9660 has glossed over the idea of a generation count; in point of
> > fact, the generation count on an inode for a cdromfs is *exactly* what is
> > needed to fix this problem. This is what the UFS uses on a remount on
> > the same mountpoint of another FS to cause it to return ESTALE. The
> > ESTALE should be returned before a lot of the crap dereferences in the
> > cd9660 FS take place.
>
> Sorry, i cannot entirely follow you. :)
>
> I think the idea behind the decision as it's done now was to allow the
> remount of an identical CD without returning ESTALE. It has been
> discussed here that computing some MD5 checksum for the CD is probably
> the best way to base a decision for stale NFS file handles on.

The cache is flushed on the unmount, so keeping the generation number
the same for a CD across a mount/unmount/remount isn't really
necessary... or desirable.

The Linux non-flush of the cache in their CDFS is arguably a good thing
for disk changers. As long as you don't handle a disk change as a mount
instance (instead, you "mount" all the disks in the changer), then the
cache won't be updated when the changer is triggered by a particular LUN
access. The only other alternative is to change the mount and the
unmount code so the cache can survive the process. I don't think it's
terribly useful to do this.

Arguably, the Linux code can fail (I can give you the email address for
the guy who argued it with Linus -- I work with him). You end up with a
useless one-behind cache, since the likelihood is that you are going to
be putting in a new CDROM and you've blown the locality of reference
model that put the cached data into the LRU list in the first place.

It's silly to keep a handle across an unmount/mount for anything but an
option change on the same FS (like going async for administrative
reasons). That can be accomplished with a remount rather than an
unmount/mount -- and async doesn't apply to CDROM anyway, since you
mount them read-only.
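To make the generation-count idea concrete, here is a minimal sketch in
C. It is not the cd9660 or UFS code: every name in it (sketch_mount,
sketch_fh, sketch_make_fh, sketch_fhtovp, sketch_remount) is
hypothetical, and how the generation value gets chosen for a mount
instance is an assumption left to the caller. It only shows the ordering
that matters: compare generations first, return ESTALE on a mismatch,
and touch nothing else until that check has passed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-mount state; NOT the real struct mount. */
struct sketch_mount {
    uint32_t gen;   /* generation assigned to this mount instance */
};

/* Hypothetical file handle; NOT the real cd9660 or UFS handle layout. */
struct sketch_fh {
    uint32_t ino;   /* inode number encoded in the handle */
    uint32_t gen;   /* generation captured when the handle was issued */
};

/* Issue a handle for an inode under the current mount instance. */
static struct sketch_fh
sketch_make_fh(const struct sketch_mount *mp, uint32_t ino)
{
    struct sketch_fh fh = { ino, mp->gen };
    return fh;
}

/*
 * Validate a handle before using anything it refers to.
 * Returns 0 and stores the inode number on success, or ESTALE if the
 * handle was issued under a different mount instance.
 */
static int
sketch_fhtovp(const struct sketch_mount *mp, const struct sketch_fh *fh,
    uint32_t *ino_out)
{
    assert(mp != NULL && fh != NULL && ino_out != NULL);

    if (fh->gen != mp->gen)
        return ESTALE;      /* reject before any further dereference */

    *ino_out = fh->ino;
    return 0;
}

/* Simulate unmount followed by mount: the new instance gets a new generation. */
static void
sketch_remount(struct sketch_mount *mp, uint32_t new_gen)
{
    mp->gen = new_gen;
}
```

With this shape, a handle issued before a simulated unmount/mount comes
back as ESTALE afterwards instead of being dereferenced, provided the
new mount instance really does get a different generation value.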
The point is, the error is because the handle depends on unrecoverable
state, and you could trap the loss of that state using the mount time
stamp as a generation number, assuming the mount time stamp was unique
across all CDFS devices.

It's a lot clearer if you bring the cd9660_vfsops.c up in one window,
the ffs_vfsops.c in another, and the ufs_vfsops.c in a third so you can
look at the operation order differences. Keep the fact that UFS doesn't
have the problem firmly in mind while comparing the *_fhtovp() code and
the validation routine from the underlying ufs routine.

Or to put it another way: any code that doesn't panic is better than any
code that does. 8-).

I suspect there are several failure modes under buffer thrash conditions
from an NFS client accessing supposedly good data for which there is a
handle but for which there isn't an open instance to keep it at the top
of the LRU. Probably the handle state needs to be generated fully from
the on-disk data in all cases to fix this.

I'd hack the CD9660 FS, but my CDROM collection isn't large enough to
give me a good test base for all the FS types it supports, so any media
auto-recognition code I came up with would be a guess at best or crap at
worst -- and so I'm not really prepared to "own" that.

Terry Lambert
terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.