Date:      Fri, 24 Nov 1995 11:48:54 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        julian@ref.tfs.com (Julian Elischer)
Cc:        hackers@FreeBSD.org
Subject:   Re: VOP_RECLAIM and  vnode references..
Message-ID:  <199511241848.LAA10011@phaeton.artisoft.com>
In-Reply-To: <199511232115.NAA11760@ref.tfs.com> from "Julian Elischer" at Nov 23, 95 01:15:11 pm

> In vclean() (the only usage of VOP_RECLAIM I have found) the following
> happens:
> 
> If a vnode has no references, it's left that way..
> if it DOES have a reference (v_usecount > 0) then it's artificially
> incremented by one, to stop it going to 0 while the VOP_RECLAIM is  active..
> 
> this is a bit stupid in my opinion...
> 
> firstly, in devfs, my sanity code that checks that a vnode is referenced
> before it uses it, throws a fit when asked by devfs_reclaim to
> find the devfs_node associated with the vnode..
> 
> (this is why those of you that have made devfs report seeing
> "!no reference!" on the console..) secondly, it's unknown how long
> a VOP_RECLAIM might take and something else may
> come in and raise and lower the reference count while it's
> happening.. (not devfs, but there might be slower fs's)
> 
> surely the v_usecount should be raised by one
> regardless of whether it is already non-zero.?
> 
> 
> the VOP_LOCK might make this un-needed, but I am loath
> to take the sanity check code out of my devfs code..
> I figure that if I'm doing a vntodn() (vnode-to-devfsnode)
> then I should be doing it on a referenced vnode..
> I don't like the fact that there is ONE exception..
> 
> "Except when asked to do it in a reclaim operation.."
> I guess I could do a vref in devfs_reclaim before calling
> vntodn() but why make a specific fix if a general one
> is just as easy?
> does anyone know why it's like it is?
> I BELIEVE (but don't know) that this is the only time when
> a vnode is passed to a filesystem without a reference..

I was looking at exactly this code last Wednesday.
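
As a rough illustration of the behaviour described in the quoted message,
here is a small userland model. It is only a sketch: the struct and the
function names (model_vnode, hold_if_active, hold_always, drop_hold) are
made up for this example and are not kernel interfaces. Only the
v_usecount field name comes from the thread.

```c
/*
 * Userland model only.  All names here are hypothetical; they mirror
 * the v_usecount field discussed in the thread, nothing more.
 */
struct model_vnode {
	int v_usecount;		/* number of active references */
};

/*
 * Behaviour as described in the quoted message: a hold is taken only
 * if the vnode is already referenced.  Returns 1 if a hold was taken
 * and must be dropped later, 0 otherwise.
 */
static int
hold_if_active(struct model_vnode *vp)
{
	if (vp->v_usecount > 0) {
		vp->v_usecount++;
		return (1);
	}
	return (0);
}

/* The suggested alternative: take the hold regardless of the count. */
static int
hold_always(struct model_vnode *vp)
{
	vp->v_usecount++;
	return (1);
}

/* Drop the hold again, but only if one was actually taken. */
static void
drop_hold(struct model_vnode *vp, int held)
{
	if (held)
		vp->v_usecount--;
}
```

With hold_if_active() an unreferenced vnode reaches the reclaim code with
a count of zero, which is the case the devfs sanity check trips over;
with hold_always() the count is at least one for the duration.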

If you look at the VOP_LOCK code (ufs_lock in /sys/ufs/ufs/ufs_vnops.c),
you'll see that VXLOCK is observed.  This actually complicates the
lock process immensely: from profiling, as much time is spent here
as in the reverse lookup of the op in the VOP table when resolving
the descriptor (which is slow because the VOP table isn't ordered).

It turns out that most other FS's, including MSDOS, don't respect
the cleaner this way.

Since the intention is to disassociate the inode from the vnode,
this is really broken.

The short term fix is to do the VXLOCK/VXWANT dancing in all the
other FS's.
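
To make the "dance" concrete, here is a userland model of the general
pattern: a locker that finds the vnode being cleaned marks that it wants
a wakeup and sleeps, and the cleaner wakes it when done.  This is a
sketch under assumptions, not the ufs_lock source.  The XL_* flag values,
the v_dead field (standing in for "the vnode no longer belongs to this
FS") and both function names are invented for the example, and the
actual sleep and wakeup calls are replaced by return values.

```c
/* Placeholder flag values for the model; not the kernel's definitions. */
#define XL_VXLOCK	0x1	/* vnode is being cleaned */
#define XL_VXWANT	0x2	/* someone is waiting for the clean to end */

enum lock_step { LOCK_PROCEED, LOCK_MUST_SLEEP, LOCK_GONE };

struct xl_vnode {
	int v_flag;
	int v_dead;	/* hypothetical: set once the vnode was reclaimed */
};

/*
 * One pass of the locker's side.  A real implementation would sleep on
 * LOCK_MUST_SLEEP and then retry from the top.
 */
static enum lock_step
xl_lock_step(struct xl_vnode *vp)
{
	if (vp->v_flag & XL_VXLOCK) {
		vp->v_flag |= XL_VXWANT;	/* ask the cleaner for a wakeup */
		return (LOCK_MUST_SLEEP);
	}
	if (vp->v_dead)
		return (LOCK_GONE);		/* reclaimed while we waited */
	return (LOCK_PROCEED);
}

/*
 * The cleaner's side when it finishes.  Returns 1 if a waiter asked to
 * be woken, so the caller knows a wakeup is owed.
 */
static int
xl_clean_finish(struct xl_vnode *vp)
{
	int wake = (vp->v_flag & XL_VXWANT) != 0;

	vp->v_dead = 1;
	vp->v_flag &= ~(XL_VXLOCK | XL_VXWANT);
	return (wake);
}
```

The point of the retry is that after the wakeup the vnode may no longer
be associated with the FS at all, so the locker has to recheck before
touching any FS-private data.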


The real issue is the flawed interlock mechanism in the vget(), vclean(),
vgoneall(), vgone() interaction, and the use of vnodes without regard
to the underlying file system type (i.e., an associated inode).

To solve this, the vnode reference count would need to be incremented
as if for an open instance when it is put in the cache, or opened, etc.,
and the reference count would need to be guarded by the lock.  This
would mean a change to the lock and vget() architecture.  In particular,
vget() currently guards the freelist by putting the process to sleep
until the bogus entry is removed and then returning an error, instead of
moving on to a non-bogus entry and returning it immediately.  Changing
that would save a bunch of waits and unnecessary voluntary context
switches!
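
A minimal sketch of the "move on to a non-bogus entry" idea, as a
userland model.  The fl_vnode struct, its being_cleaned field (standing
in for an entry that is mid-clean) and fl_first_usable() are hypothetical
names for this example; this is not how vget() or the freelist are
actually laid out.

```c
#include <stddef.h>

/* Hypothetical free-list entry for the model. */
struct fl_vnode {
	int being_cleaned;		/* entry is "bogus" right now */
	struct fl_vnode *next;		/* next entry on the model list */
};

/*
 * Walk the list and return the first entry that is not being cleaned,
 * instead of sleeping on the first bogus one.  Returns NULL only if
 * every entry is currently unusable.
 */
static struct fl_vnode *
fl_first_usable(struct fl_vnode *head)
{
	struct fl_vnode *vp;

	for (vp = head; vp != NULL; vp = vp->next)
		if (!vp->being_cleaned)
			return (vp);
	return (NULL);
}
```

Only the all-busy case would then need to sleep or fail, which is where
the saved waits and context switches would come from.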


In my opinion, there is no need for the underlying file system to
support a VOP_LOCK/VOP_UNLOCK mechanism if the locking is instituted
with a hierarchy lock based on the vnode.  The underlying FS must
support locking of its inode hash list during list access to prevent
multiple allocation, but there is no need for a separate FS-level lock.

Presumably, the lock mechanism was introduced to allow the underlying
FS to overlay on top of multiple other FS's: like the union FS.  This
is because the underlying vnode lock is not allowed to be recursive.


If we consider a union FS on top of FS 'a' and FS 'b', a lookup will
return a referenced vnode.  The underlying vnode for 'a' and 'b' for
the object (say it is a directory existing in both) will be locked by
the union FS.

Because this lock is allowed to recurse, VOP's that require a lock to
ensure exclusive access will be allowed to complete.
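
For illustration, a lock that is allowed to recurse can be modelled with
an owner and a depth count.  This is a single-threaded userland sketch
under assumptions: rl_lock, RL_NOBODY, rl_acquire() and rl_release() are
names invented here, the "who" argument stands in for whatever identifies
the lock holder, and a refused acquire is reported by return value where
a kernel would sleep.

```c
#define RL_NOBODY	(-1)	/* hypothetical "no owner" marker */

struct rl_lock {
	int owner;	/* current holder, or RL_NOBODY */
	int depth;	/* how many times the holder has acquired it */
};

/*
 * Grant the lock if it is free or already held by the same owner.
 * Returns 1 if granted, 0 if the caller would have to wait.
 */
static int
rl_acquire(struct rl_lock *lk, int who)
{
	if (lk->depth > 0 && lk->owner != who)
		return (0);
	lk->owner = who;
	lk->depth++;
	return (1);
}

/* Undo one acquire; the lock becomes free when the depth reaches 0. */
static void
rl_release(struct rl_lock *lk, int who)
{
	if (lk->depth > 0 && lk->owner == who && --lk->depth == 0)
		lk->owner = RL_NOBODY;
}
```

In the union FS case, the upper layer would be the holder, so an
operation passed down to 'a' or 'b' that takes the lock again is granted
rather than deadlocking against its own caller.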

The only remaining issues are transferring lock ownership in the
union FS (relatively trivial) and ensuring that multiple entrancy
by a single process (P1 makes an async I/O call and then makes another,
or P1 has two threads on two processors both making calls) blocks
appropriately (FS multithreading via mutex for global and static local
variable access).


Given the amount of work in solving this "correctly" for all cases, I
suggest you integrate the ufs VXLOCK spins for right now.  It should
"solve" your problem and let you go on to other code.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


