Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 09 May 2001 22:38:39 +0100
From:      Ian Dowse <iedowse@maths.tcd.ie>
To:        freebsd-fs@freebsd.org
Cc:        phk@freebsd.org, iedowse@maths.tcd.ie
Subject:   vflush()
Message-ID:   <200105092238.aa48273@salmon.maths.tcd.ie>

next in thread | raw e-mail | index | archive | help

The function vflush() is used by most filesystems in xxx_unmount()
to clean out all vnodes associated with the filesystem. Normally
it returns EBUSY if any vnodes have a non-zero reference count,
but the FORCECLOSE flag overrides this, forcing a vgone() on all
vnodes.

Many filesystems have a natural reference count of 1 for the
filesystem root vnode, so a non-forced vflush() would always fail.
This situation is currently catered for by vflush's `skipvp'
argument; when non-NULL, vflush() skips the specified vnode and
lets the caller deal with it.

A problem with this approach is that when a skipvp argument is
supplied, the caller must re-implement vflush()'s logic for the
root vnode, except that it must compare the reference count against
1 instead of 0. Currently all filesystems in the tree that make
use of `skipvp' do this incorrectly. They check the reference count,
but ignore FORCECLOSE, so if there are any extra references on the
filesystem root directory vnode then the umount will fail even with
FORCECLOSE specified.

All of devfs, smbfs, fdesc, nullfs, portal, umapfs, union, nfs and
nwfs have this problem. e.g in devfs:

	if (rootvp->v_usecount > 2) {
		vrele(rootvp);
		return (EBUSY);
	}
	error = vflush(mp, rootvp, flags);
	if (error)
		return (error);
	devfs_purge(fmp->dm_rootdir);
	vput(rootvp);
	vrele(rootvp);
	vgone(rootvp);

In each and every filesystem in the tree that uses `skipvp':
  (a) it specifies the root vnode;
  (b) there are a fixed number of extra references;
  (c) the caller immediately releases all root vnode references
      when vflush() is successful.
Also, vflush() is used exclusively by filesystem unmount() functions.

All this suggests that vflush() should subsume the logic for dealing
with the reference count on the filesystem root vnode. This would
simplify the xxx_umount() functions and permits the FORCECLOSE
problem to be solved for all 9 filesystems in one place.

One possibility (which may be a bit too restrictive for future
filesystems) is to replace vflush()'s `skipvp' argument with an
integer specifying how many references should be both expected and
removed from the filesystem root vnode. e.g:

	/*
	...
	 * `rootrefs' specifies the base reference count for the root vnode
	 * of this filesystem. The root vnode is considered busy if its
	 * v_usecount exceeds this value. On a successful return, vflush()
	 * will call vrele() on the root vnode exactly rootrefs times.
	 */
	int
	vflush(struct mount *mp, int rootrefs, int flags)
	{

That would simplify devfs_unmount() considerably:

--- /usr/src/sys/fs/devfs/devfs_vfsops.c	Tue May  8 18:59:58 2001
+++ /tmp/devfs_vfsops.c	Wed May  9 21:57:45 2001
@@ -117,26 +117,16 @@
 {
 	int error;
 	int flags = 0;
-	struct vnode *rootvp;
 	struct devfs_mount *fmp;
 
-	error = devfs_root(mp, &rootvp);
-	if (error)
-		return (error);
 	fmp = VFSTODEVFS(mp);
 	if (mntflags & MNT_FORCE)
 		flags |= FORCECLOSE;
-	if (rootvp->v_usecount > 2) {
-		vrele(rootvp);
-		return (EBUSY);
-	}
-	error = vflush(mp, rootvp, flags);
+	/* The root vnode has 1 extra reference to be removed. */
+	error = vflush(mp, 1, flags);
 	if (error)
 		return (error);
 	devfs_purge(fmp->dm_rootdir);
-	vput(rootvp);
-	vrele(rootvp);
-	vgone(rootvp);
 	mp->mnt_data = 0;
 	lockdestroy(&fmp->dm_lock);
 	free(fmp, M_DEVFS);

The changes needed to vflush() are along the lines of the patch
below. Does this approach seem reasonable or any other comments?
e.g. are the semantics of vflush() doing the vrele's itself too
weird and restrictive?

Ian

--- /usr/src/sys/kern/vfs_subr.c	Tue May  8 19:00:17 2001
+++ /tmp/vfs_subr.c	Wed May  9 22:08:47 2001
@@ -1634,15 +1644,24 @@
 #endif
 
 int
-vflush(mp, skipvp, flags)
+vflush(mp, rootrefs, flags)
 	struct mount *mp;
-	struct vnode *skipvp;
+	int rootrefs;
 	int flags;
 {
 	struct proc *p = curproc;	/* XXX */
-	struct vnode *vp, *nvp;
-	int busy = 0;
+	struct vnode *vp, *nvp, *rootvp = NULL;
+	int baserefs, busy = 0, error;
 
+	if (rootrefs > 0) {
+		/*
+		 * Get the filesystem root vnode. We can vput() it
+		 * immediately, since with rootrefs > 0, it won't go away.
+		 */
+		if ((error = VFS_ROOT(mp, &rootvp)) != 0)
+			return (error);
+		vput(rootvp);
+	}
 	mtx_lock(&mntvnode_mtx);
 loop:
 	for (vp = LIST_FIRST(&mp->mnt_vnodelist); vp; vp = nvp) {
@@ -1678,10 +1697,11 @@
 		}
 
 		/*
-		 * With v_usecount == 0, all we need to do is clear out the
-		 * vnode data structures and we are done.
+		 * With v_usecount == baserefs, all we need to do is clear
+		 * out the vnode data structures and we are done.
 		 */
-		if (vp->v_usecount == 0) {
+		baserefs = (vp == rootvp) ? rootrefs : 0;
+		if (vp->v_usecount == baserefs) {
 			mtx_unlock(&mntvnode_mtx);
 			vgonel(vp, p);
 			mtx_lock(&mntvnode_mtx);
@@ -1715,6 +1735,8 @@
 	mtx_unlock(&mntvnode_mtx);
 	if (busy)
 		return (EBUSY);
+	for (; rootrefs > 0; rootrefs--)
+		vrele(rootvp);
 	return (0);
 }
 

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi? <200105092238.aa48273>