From owner-freebsd-fs  Fri Jun 18 10:42:18 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132])
	by hub.freebsd.org (Postfix) with ESMTP
	id 02D3B14FD1; Fri, 18 Jun 1999 10:42:14 -0700 (PDT)
	(envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
	by smtp02.primenet.com (8.8.8/8.8.8) id KAA00929;
	Fri, 18 Jun 1999 10:42:13 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201)
	via SMTP by smtp02.primenet.com, id smtpd000849;
	Fri Jun 18 10:42:07 1999
Received: (from tlambert@localhost)
	by usr01.primenet.com (8.8.5/8.8.5) id KAA13187;
	Fri, 18 Jun 1999 10:42:07 -0700 (MST)
From: Terry Lambert
Message-Id: <199906181742.KAA13187@usr01.primenet.com>
Subject: Re: nullfs?
To: eivind@FreeBSD.ORG (Eivind Eklund)
Date: Fri, 18 Jun 1999 17:42:06 +0000 (GMT)
Cc: jepace@pobox.com, freebsd-fs@FreeBSD.ORG
In-Reply-To: <19990618190138.F62123@bitbox.follo.net>
	from "Eivind Eklund" at Jun 18, 99 07:01:38 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> > The man page for mount_null(8) warns me:
> >
> >     THIS FILESYSTEM TYPE IS NOT YET FULLY SUPPORTED (READ: IT DOESN'T
> >     WORK) AND USING IT MAY, IN FACT, DESTROY DATA ON YOUR SYSTEM.  USE AT
> >     YOUR OWN RISK.  BEWARE OF DOG.  SLIPPERY WHEN WET.
> >
> > What are the known problems with it?
>
> Incorrect alias handling (read: your data may or may not reach disk)

These are vnode backing object issues.  The "vn" device driver
addresses these issues rather trivially (and successfully) by
implementing copy on write coherency semantics.

The problem boils down to wanting the vnode to be an alias for a
vm_object_t, the existence of non-null default vfsops, and the
lack of a VOP_TERMINAL operation for getting the terminal backing
object that has the vm_object_t that's actually physically backing
the data.
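
To make the missing operation concrete, here is a minimal user-space
sketch of what a VOP_TERMINAL-style lookup might do.  Everything in it
is an assumption made for illustration: the struct layouts, the field
names, the vop_terminal() function, and the "cryptfs" layer are
hypothetical stand-ins, not the FreeBSD kernel definitions.  A pure
pass-through layer holds no pages of its own and resolves to the vnode
beneath it, while a layer that keeps its own cache is its own terminal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
struct vm_object {
	const char *name;
};

struct vnode {
	const char *layer;         /* VFS layer that owns this vnode */
	struct vnode *lower;       /* next vnode down the stack, NULL at bottom */
	struct vm_object *object;  /* non-NULL only where pages actually live */
};

/*
 * Hypothetical VOP_TERMINAL: walk down the stack until we reach the
 * vnode whose vm_object really backs the data.  Returns NULL if no
 * layer in the stack holds pages.
 */
struct vnode *
vop_terminal(struct vnode *vp)
{
	while (vp != NULL && vp->object == NULL)
		vp = vp->lower;
	return (vp);
}

/* An example stack: nullfs and a made-up "cryptfs" both sit on ufs. */
struct vm_object ufs_pages = { "ufs file pages" };
struct vm_object crypt_pages = { "cryptfs cleartext cache" };

struct vnode ufs_vp = { "ufs", NULL, &ufs_pages };
struct vnode null_vp = { "nullfs", &ufs_vp, NULL };
struct vnode crypt_vp = { "cryptfs", &ufs_vp, &crypt_pages };

/* Print which layer physically backs the data seen through vp. */
void
show_terminal(struct vnode *vp)
{
	struct vnode *tvp = vop_terminal(vp);

	assert(tvp != NULL);
	printf("%s -> %s (%s)\n", vp->layer, tvp->layer, tvp->object->name);
}
```

Calling show_terminal(&null_vp) reports the ufs vnode, because the
null layer only passes through; show_terminal(&crypt_vp) reports the
cryptfs vnode itself, because that layer keeps its own pages and has
to manage coherency with the layer below explicitly.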

Several suggestions have been put forth by people wanting to
optimize the MFS case at the expense of everything else (e.g.
compression and cryptographic layers which must implement explicit
cache coherency and therefore cannot use an alias) by creating
explicit aliases using a separate alias object to reference the
actual vm_object_t.  There are a number of good reasons (including
the cited examples, above) why this will break some existing modules
written by John Heidemann's students and others, and will preclude
a number of future research directions that some of us believe are
highly desirable.

> and extreme locking problems, mostly.

He means both VOP_LOCK and VOP_ADVLOCK.

The VOP_LOCK issue has to do with the fact that VFSs do not own
their own vnode free pools (impossible except for TFS style hacks,
due to the lack of a per VFS VFS_VRELE mechanism).

The VOP_ADVLOCK issue has to do with the fact that advisory locks
are coalesced immediately after assertion, such that if you had a
stack that exported an object that looked like:

	,-----------------------------.
	|           my file           |
	,-----------------------------.
	|    vfs #1     |   vfs #2    |
	`---------------'-------------'

an advisory lock that was asserted to span the boundary would have
to be asserted in both underlying objects.  If the lock failed in
the second underlying object, then the lock in the first underlying
object would have to be deasserted.  Because of the coalesce in the
first object, deassertion could incorrectly remove locks (the
coalesce itself could incorrectly promote or demote a partial region
from the start to the spanning boundary).

A secondary problem with the VOP_ADVLOCK that is more germane to
the NULLFS implementation is that locks are hung off the inode object,
instead of the terminal vnode object.  This means that a lock you
assert will not necessarily prevent access if the VFS you are mounting
under the NULLFS is exposed elsewhere in the filesystem namespace (e.g.
I mount "/usr", and then I NULLFS it to "/home/bob/usr").

A tertiary problem with the NULLFS in this case is reciprocity in
the directory entry lookup traversal; specifically, since the
implementation is not via coroutines, instead of a ladder-lock for
absolute paths, there is a stack lock, which can cause a lock to
root.  In this case, under some circumstances you would see "locking
against myself" panics from mounting "/usr" and then NULLFS'ing it
to "/usr/foo/fee".

And don't even get me started on propagation of POSIX namespace
escapes like "//machine-name/" to subsequent path components on
a per component basis...


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.