Date:      Sun, 27 Aug 95 1:56:26 MDT
From:      terry@cs.weber.edu (Terry Lambert)
To:        freebsd-current@FreeBSD.ORG
Subject:   Re: nfs panic after changing cd's
Message-ID:  <9508270756.AA19071@cs.weber.edu>
In-Reply-To: <199508270619.IAA21777@uriah.heep.sax.de> from "J Wunsch" at Aug 27, 95 08:19:10 am

> > The cd9660 has glossed over the idea of a generation count; in point of
> > fact, the generation count on an inode for a cdromfs is *exactly* what is
> > needed to fix this problem.  This is what the UFS uses on a remount on
> > the same mountpoint of another FS to cause it to return ESTALE.  The
> > ESTALE should be returned before a lot of the crap dereferences in the
> > cd9660 FS take place.
> 
> Sorry, i cannot entirely follow you. :)
> 
> I think the idea behind the decision as it's done now was to allow the
> remount of an identical CD without returning ESTALE.  It has been
> discussed here that computing some MD5 checksum for the CD is probably
> the best way to base a decision for stale NFS file handles on.

The cache is flushed on the unmount, so keeping the generation number the
same for a CD across a mount/unmount/remount isn't really necessary... or
desirable.

The Linux non-flush of the cache in their CDFS is arguably a good thing
for disk changers.

As long as you don't handle a disk change as a mount instance (instead,
you "mount" all the disks in the changer), then the cache won't be
updated when the changer is triggered by a particular LUN access.

The only other alternative is to change the mount and the unmount code
so the cache can survive the process.  I don't think it's terribly
useful to do this.  Arguably, the Linux code can fail (I can give you
the email address for the guy who argued it with Linus -- I work with
him).  You end up with a useless one-behind cache, since the likelihood
is that you are going to be putting in a new CDROM and you've blown the
locality of reference model that put the cached data into the LRU list
in the first place.

It's silly to keep a handle across an unmount/mount for anything but
an option change on the same FS (like going async for administrative
reasons).  That can be accomplished with a remount rather than an
unmount/mount -- and async doesn't apply to CDROM anyway, since you
mount them read-only.

The point is, the error is because the handle depends on unrecoverable
state, and you could trap the loss of that state using the mount time
stamp as a generation number, assuming the mount time stamp was unique
across all CDFS devices.
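
A minimal userland sketch of that idea follows.  It is not the cd9660
or UFS kernel code: the structures and the sketch_* names are
hypothetical, invented only for illustration, and it assumes the mount
time stamp is unique per mount instance.  The handle carries the stamp
it was issued under, and the lookup compares stamps and returns ESTALE
before anything derived from the handle is touched.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified structures; NOT the real kernel definitions. */
struct sketch_mount {
	int64_t mount_stamp;	/* generation: time stamp taken at mount */
};

struct sketch_fh {
	uint32_t ino;		/* inode number encoded in the handle */
	int64_t  gen;		/* mount stamp when the handle was issued */
};

/* Issue a handle for an inode under the current mount instance. */
static struct sketch_fh
sketch_make_fh(const struct sketch_mount *mp, uint32_t ino)
{
	struct sketch_fh fh;

	fh.ino = ino;
	fh.gen = mp->mount_stamp;
	return (fh);
}

/*
 * Validate first: a handle issued under another mount instance gets
 * ESTALE, and *inop is left untouched in that case.
 */
static int
sketch_fhtovp(const struct sketch_mount *mp, const struct sketch_fh *fh,
    uint32_t *inop)
{
	if (fh->gen != mp->mount_stamp)
		return (ESTALE);
	*inop = fh->ino;
	return (0);
}

/* Usage: a handle survives on its own mount, goes stale after a remount. */
int
sketch_demo(void)
{
	struct sketch_mount before = { 1000 };	/* made-up stamps */
	struct sketch_mount after = { 2000 };
	struct sketch_fh fh = sketch_make_fh(&before, 42);
	uint32_t ino = 0;

	assert(sketch_fhtovp(&before, &fh, &ino) == 0 && ino == 42);
	assert(sketch_fhtovp(&after, &fh, &ino) == ESTALE);
	return (0);
}
```

The design choice here is simply ordering: the comparison is the first
thing the lookup does, so a stale handle never reaches code that
assumes the old mount's state is still around.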

It's a lot clearer if you bring the cd9660_vfsops.c up in one window,
the ffs_vfsops.c in another, and the ufs_vfsops.c in a third so you
can look at the operation order differences.  Keep the fact that UFS
doesn't have the problem firmly in mind while comparing the *_fhtovp()
code and the validation routine from the underlying ufs routine.

Or to put it another way: any code that doesn't panic is better than
any code that does.  8-).


I suspect there are several failure modes under buffer thrash conditions
from an NFS client accessing supposedly good data for which there is
a handle but for which there isn't an open instance to keep it at the
top of the LRU.  Probably the handle state needs to be generated fully
from the on-disk data in all cases to fix this.

I'd hack the CD9660 FS, but my CDROM collection isn't large enough to
give me a good test base for all the FS types it supports, so any
media auto-recognition code I came up with would be a guess at best or
crap at worst -- and so I'm not really prepared to "own" that.


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.


