From owner-freebsd-current  Fri Mar 13 10:53:27 1998
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id KAA19054
          for freebsd-current-outgoing; Fri, 13 Mar 1998 10:47:12 -0800 (PST)
          (envelope-from owner-freebsd-current@FreeBSD.ORG)
Received: from pat.idi.ntnu.no (0@pat.idi.ntnu.no [129.241.103.5])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id KAA18971
          for <freebsd-current@FreeBSD.ORG>; Fri, 13 Mar 1998 10:46:27 -0800 (PST)
          (envelope-from Tor.Egge@idi.ntnu.no)
Received: from idi.ntnu.no (tegge@presis.idi.ntnu.no [129.241.111.173])
	by pat.idi.ntnu.no (8.8.8/8.8.8) with ESMTP id TAA26656;
	Fri, 13 Mar 1998 19:45:20 +0100 (MET)
Message-Id: <199803131845.TAA26656@pat.idi.ntnu.no>
To: michaelh@cet.co.jp
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: 4 WILLRELE's to bite the dust
In-Reply-To: Your message of "Fri, 13 Mar 1998 13:22:20 +0900 (JST)"
References: <Pine.SV4.3.95.980313130507.15693A-101000@parkplace.cet.co.jp>
X-Mailer: Mew version 1.70 on Emacs 19.34.1
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Fri, 13 Mar 1998 19:45:20 +0100
From: Tor Egge <Tor.Egge@idi.ntnu.no>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> I did some simple testing of nullfs and union.  It seems Kato-san has
> gotten them pretty stable.  What kind of things cause problems with null
> and union?

	- coherence. When the different layers operate on separate
	  vm objects, the vm objects might get out of sync.
	  E.g. the page fault algorithms check the size of the upper
	  vm object and get the wrong value, since a VOP_WRITE
	  only extended the size of the lower vm object.
	
	  Symptom: install -C during make world gets a SIGBUS
	  if /usr/obj is nullfs mounted.
	
	- freeing of resources.  When the file is unlinked, the 
	  lower vnode still has a reference from the upper vnode
	  which is left until the upper vnode is recycled. Currently
	  null_inactive calls VOP_INACTIVE on the lower vnode. 
	  This is wrong (e.g. a process directly accessing the lower
	  vnode will see a truncated file, which is an unintended
	  side effect).

	  Symptom: Removing a file that has been opened via nullfs
	  will not necessarily free the space on the file system
	  before the null vnode is recycled.

	- locking.

		 - Missing vnode locking during vnode recycling,
		   i.e. does not wait for inactive to complete.
		   Symptom: lock manager panic when trying
		   to release the lock in vclean.

		 - During object recycling, the VOP_ISLOCKED method
	 	   can cause a trap 12.  This problem might not
		   occur anymore, since vfs_msync now checks for
		   the VXLOCK vnode flag.

		 - Use of VOP_ISLOCKED() is very often a kludge, since
		   it is assumed that the current process is the owner
		   of an exclusive lock on the vnode, while it might
		   be a different process or a shared lock.

	- vnode leakage in null_node_alloc.

	- During vnode recycling, fsync is called several times on
	  the lower vnode.  This only affects performance.

	- Access to files on a nullfs file system gives spurious
	  EDEADLK errors, due to a bogus deadlock detection in
	  nullfs_root.  That deadlock detection is not needed
	  if the generic mount system call is slightly changed.
	  (most local file systems have the same kind of deadlock
	   problem, thus a generic solution is preferable).
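
The coherence item in the list above can be illustrated with a toy
user-space model.  This is only a sketch under stated assumptions: the
names toy_object, toy_layered_file, toy_write, toy_fault_ok and
toy_sync_sizes are hypothetical and are not kernel interfaces, and the
"repair" shown is just one way to make the model consistent, not the
fix applied in any source tree.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a vm object: only its size matters in this model. */
struct toy_object {
	size_t size;
};

/* Toy stand-in for a stacked file: each layer keeps its own object. */
struct toy_layered_file {
	struct toy_object upper;	/* object seen through the null layer */
	struct toy_object lower;	/* object of the underlying file */
};

/* Model of a write passed down the stack: only the lower object grows. */
void
toy_write(struct toy_layered_file *f, size_t offset, size_t len)
{
	assert(f != NULL);
	if (offset + len > f->lower.size)
		f->lower.size = offset + len;
	/* f->upper.size is deliberately untouched: the modelled bug. */
}

/*
 * Model of a page fault on the upper object.  Returns 1 if the offset
 * lies inside the upper object, 0 if the fault would be refused (the
 * case that shows up as SIGBUS in the symptom described above).
 */
int
toy_fault_ok(const struct toy_layered_file *f, size_t offset)
{
	assert(f != NULL);
	return (offset < f->upper.size);
}

/* Model-only repair: copy the lower size to the upper object. */
void
toy_sync_sizes(struct toy_layered_file *f)
{
	assert(f != NULL);
	f->upper.size = f->lower.size;
}
```

In this model, a toy_write() of 4096 bytes at offset 0 into an empty
file leaves upper.size at 0, so toy_fault_ok() refuses offset 100 until
toy_sync_sizes() has been called.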

	
I've been using nullfs since Nov 7 1997 without any serious problems
after having fixed some of the above mentioned problems in my local
source tree.  The code is not perfect: panics are still possible because
the heuristics built on VOP_ISLOCKED() can be wrong, but I have not yet
experienced any problems.

With the nullfs in -current and no fixes applied, a make world would
very likely not succeed if /usr/obj is nullfs mounted.

By modifying null_mount, I've played around with hierarchical mounts:

        ikke# find /usr/zzz -print
        /usr/zzz
        /usr/zzz/zzz4
        /usr/zzz/zzz4/test4
        ikke# mount -t null /usr/zzz /usr/zzz
        null: Resource deadlock avoided
        ikke# mount -t null /usr/zzz /usr/zzz/zzz4
        ikke# find /usr/zzz -print
        /usr/zzz
        /usr/zzz/zzz4
        /usr/zzz/zzz4/zzz4
        /usr/zzz/zzz4/zzz4/test4
        ikke# umount /usr/zzz/zzz4
        ikke# mount -t null /usr/zzz/zzz4 /usr/zzz
        ikke# find /usr/zzz -print
        /usr/zzz
        /usr/zzz/test4
        ikke# umount /usr/zzz
        ikke# mount -t null /usr/zzz /usr/yyy
        ikke# mount -t null /usr/zzz /usr/yyy
        null: Resource deadlock avoided
        ikke# mount -t null /usr/yyy /usr/zzz
        null: Resource deadlock avoided
        ikke# mount -t null /usr/yyy /usr/yyy
        null: Resource deadlock avoided
        ikke# mount -t null /usr/zzz /usr/yyy
        null: Resource deadlock avoided
        ikke# mount -t null /usr/zzz /usr/zzz
        null: Resource deadlock avoided
        ikke# umount /usr/yyy

I saw the need for a general change in the mount system call when I issued 

       mount /usr/zzz /usr/zzz

instead of 

       mount -t null /usr/zzz /usr/zzz

when testing the nullfs deadlock handling, and got a panic due to bugs in
the ufs mount code.

With a changed mount system call, I get

        ikke# mount /usr/zzz /usr/zzz
        /usr/zzz on /usr/zzz: Block device required
        ikke# mount -t null /usr/zzz /usr/zzz
        null: Device busy
        ikke# mount -t msdos /usr/zzz /usr/zzz
        msdos: /usr/zzz: Block device required
        ikke# mount -t cd9660 /usr/zzz /usr/zzz
        cd9660: Block device required
        ikke# ls -ld /usr/zzz
        drwxr-xr-x  3 root  wheel  512 Feb 10 02:02 /usr/zzz
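
The "Block device required" replies in the transcript above are the
text for ENOTBLK.  As a rough user-space illustration of what that error
signifies, the sketch below rejects a mount source that is not a block
device.  It is an assumption-laden analogue only: the function name
toy_check_mount_source is hypothetical, and nothing here is meant to
describe how the changed mount system call is actually implemented.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical user-space check.  Returns 0 if path names a block
 * device, ENOTBLK if it names anything else, or the errno value set
 * by stat(2) if the path cannot be examined.
 */
int
toy_check_mount_source(const char *path)
{
	struct stat sb;

	assert(path != NULL);
	if (stat(path, &sb) == -1)
		return (errno);
	if (!S_ISBLK(sb.st_mode))
		return (ENOTBLK);
	return (0);
}
```

Called on a directory such as "/", the function returns ENOTBLK, which
strerror(3) renders as the message seen in the transcript on FreeBSD.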


- Tor Egge
