From owner-freebsd-current  Sat Aug 26 15:04:46 1995
Return-Path: current-owner
Received: (from majordom@localhost)
          by freefall.FreeBSD.org (8.6.11/8.6.6) id PAA04062
          for current-outgoing; Sat, 26 Aug 1995 15:04:46 -0700
Received: from cs.weber.edu (cs.weber.edu [137.190.16.16])
          by freefall.FreeBSD.org (8.6.11/8.6.6) with SMTP id PAA04047
          for <current@freebsd.org>; Sat, 26 Aug 1995 15:04:43 -0700
Received: by cs.weber.edu (4.1/SMI-4.1.1)
	id AA16363; Sat, 26 Aug 95 16:04:15 MDT
From: terry@cs.weber.edu (Terry Lambert)
Message-Id: <9508262204.AA16363@cs.weber.edu>
Subject: Re: nfs panic after changing cd's
To: bde@zeta.org.au (Bruce Evans)
Date: Sat, 26 Aug 95 16:04:14 MDT
Cc: current@freebsd.org, dfr@render.com
In-Reply-To: <199508262106.HAA17016@godzilla.zeta.org.au> from "Bruce Evans" at Aug 27, 95 07:06:11 am
X-Mailer: ELM [version 2.4dev PL52]
Sender: current-owner@freebsd.org
Precedence: bulk

> My nfs client panicked with a null pointer somewhere in nfs_statfs() or
> thereabouts.  The server printed:
> 
> Aug 26 23:13:48 alphplex /kernel: fhtovp: file start miss 142213208 vs 60
> Aug 26 23:13:51 alphplex last message repeated 2 times
> 
> This message is from cd9660_vfsops.c (which prints a lot of similar
> messages without identifying itself).
> 
> My cdrom had been changed.  Usually I mount it at boot time and export
> it and never change it and have no problems with cd9660 vs nfs.
> 
> I think this bug is known.  It may be the same as the one reported a day
> or two ago where the panic occurred in an lkm'ed file system so that it
> wasn't obvious what it was for.

If you look at cd9660_vfsops.c, cd9660_fhtovp() does a lot more work
than its counterpart in ffs_vfsops.c.

In reality, the ufs_check_export() style checks that the FFS makes via
the UFS code should be done BEFORE this failure opportunity exists;
that is, the ordering of the checks is wrong.

The cd9660 has glossed over the idea of a generation count; in point of
fact, the generation count on an inode for a cdromfs is *exactly* what is
needed to fix this problem.  This is what the UFS uses to return ESTALE
when a different FS has been remounted on the same mount point.  The
ESTALE should be returned before a lot of the crap dereferences in the
cd9660 FS take place.

I would suggest that if you are serious about this, then the correct thing
to do is to maintain a monotonically increasing 32-bit counter for the
generation count.

Specifically, the system time should be stored in the mount point at mount
time, with a guarantee that multiple CDROM mounts will be delayed to avoid
identical time stamps.  The timeval rollover isn't a problem, since systems
don't remain up that long.  8-).

This number should then be used as the generation number for the in-core
inodes allocated for the cd9660fs.  The generation of
all inodes owned by a cd9660 will be the same for each mount instance
and different for subsequent mounts on the same mount point (the source
of your problem).

At which point you return ESTALE before hitting the bitchy code in the
first place.


Really, many aspects of the cd9660fs want to be rewritten.  The major
one is the use of mount options rather than examination of the CDROM
itself in order to determine CDROM format at all.  This turns out to
be a much more critical problem, since it prevents clean root mounts
of different CDROM formats without user intervention on the command
line.  There is code in the FS to force a rollover to High Sierra, but
there are still options and a lot of legacy code there, and it's not
that hard to "do the right thing" for Rock Ridge extensions as well
(though one might want a namespace switch for exporting the FS as a
net server to DOS machines).
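
As a sketch of what "examine the CDROM itself" could look like for the
ISO 9660 vs. High Sierra case: the standard identifier in a volume
descriptor is "CD001" at byte offset 1 for ISO 9660 and "CDROM" at byte
offset 9 for High Sierra.  The function and enum names below are
hypothetical, the buffer is assumed to hold a volume descriptor already
read from sector 16, and Rock Ridge detection (which means looking at
the system use fields of directory records) is not attempted:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for the probe. */
enum cd_format { CD_UNKNOWN, CD_ISO9660, CD_HIGH_SIERRA };

/*
 * Classify a volume descriptor by its standard identifier.
 * vd points at the descriptor, len is the number of valid bytes.
 */
enum cd_format
cd_probe_format(const unsigned char *vd, size_t len)
{
	assert(vd != NULL);
	/* ISO 9660: type byte, then "CD001" at offset 1. */
	if (len >= 6 && memcmp(vd + 1, "CD001", 5) == 0)
		return (CD_ISO9660);
	/* High Sierra: 8 byte block number, type byte, "CDROM" at 9. */
	if (len >= 14 && memcmp(vd + 9, "CDROM", 5) == 0)
		return (CD_HIGH_SIERRA);
	return (CD_UNKNOWN);
}
```

With something like this run at mount time, the format would not need
to come from a mount option, which is what a root mount wants.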


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.