From owner-freebsd-hackers  Wed Nov 13 10:05:45 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id KAA02520
          for hackers-outgoing; Wed, 13 Nov 1996 10:05:45 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id KAA02495;
          Wed, 13 Nov 1996 10:05:32 -0800 (PST)
Received: (from terry@localhost)
          by phaeton.artisoft.com (8.6.11/8.6.9) id KAA22484;
          Wed, 13 Nov 1996 10:54:58 -0700
From: Terry Lambert
Message-Id: <199611131754.KAA22484@phaeton.artisoft.com>
Subject: Re: NFS bypass op and the utok layer
To: michaelh@cet.co.jp (Michael Hancock)
Date: Wed, 13 Nov 1996 10:54:58 -0700 (MST)
Cc: Hackers@freebsd.org, freebsd-fs@freebsd.org
In-Reply-To: from "Michael Hancock" at Nov 13, 96 11:49:43 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Boy, people keep asking questions for which my work is the answer...
this is more than a little cool.  8-).

> Were these even considered when the FreeBSD vnode stacking implementation
> was done?
>
> The NFS default op is the one returning the NOT SUPPORTED error.  A bypass
> op would allow you to stack on top of an out-of-kernel layer which could
> then be layered on a utok layer to cross the boundary again.
>
> I guess the fs memory allocation architecture is not compatible with this.

You have hit the nail on the head.  There are many places where the FS
is expected to allocate something which it will never deallocate, or
deallocate something which it did not allocate.  Examples include:

o  The vfs_syscalls.c generated namei cn_pnbuf
o  The NFS generated namei cn_pnbuf
o  The vnode

In addition, there are many places where the VOP's are not abstracted
by status return (ie: they are call-down instead of veto interfaces).
Examples include:

o  VOP_LOCK
o  VOP_ADVLOCK
o  VFS_MOUNT
o  NFS export list processing
o  root mount processing
o  remount processing
o  mount point covering
o  namei()
o  CREATE op in EXISTS case with no intention of overwrite in the
   case of collision

Without a clear abstraction, it's impossible to build a utok/ktou layer
(I would prefer a ktou to a bypass op; it's more general, and doesn't
require an NFS loopback).

Particularly problematic are the NFS LEASE VOP's, which are interfaced
by a serious kludge because they are call-down instead of veto, and
therefore cannot be zero-overhead registration based.

If my changes for fcntl() to support server-side NFS locking (as the
subsystem called by rpc.lockd) are ever integrated, this will add
another, identical kludge for FHTOVP for an NFS LKM.

> Debugging in userland would sure be cool, when you're satisfied take away
> the transport layers and you're back in the kernel.

This was discussed in detail in the Heidemann paper, actually... and
yes, it's the way I'd like to do FS debugging as well.

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
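
A minimal sketch of the call-down versus veto distinction drawn above,
under one reading of those terms: in a call-down interface the upper
layer hands the whole operation to the FS, while in a veto interface
the upper layer does the generic work and the FS only returns a status
to allow or refuse it.  Every name here (lock_req, fs_ops,
generic_advlock, lock_calldown, lock_veto, readonly_veto) is
hypothetical and is not a FreeBSD kernel symbol.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* All names below are hypothetical; none are FreeBSD kernel symbols. */
struct lock_req {
    long start;
    long len;
};

struct fs_ops {
    /* Call-down style: the FS performs the whole operation itself. */
    int (*advlock_calldown)(const struct lock_req *req);
    /* Veto style: optional hook; NULL means "no objection". */
    int (*advlock_veto)(const struct lock_req *req);
};

/* Counts how often the shared, FS-independent lock code ran. */
static int generic_locks_taken = 0;

/* Stand-in for the generic lock code shared by every FS. */
static int generic_advlock(const struct lock_req *req)
{
    assert(req != NULL);
    generic_locks_taken++;
    return 0;
}

/* Call-down: an FS with no entry point cannot support the operation. */
static int lock_calldown(const struct fs_ops *ops, const struct lock_req *req)
{
    assert(ops != NULL);
    if (ops->advlock_calldown == NULL)
        return EOPNOTSUPP;
    return ops->advlock_calldown(req);
}

/* Veto: the FS is consulted only if it registered a hook. */
static int lock_veto(const struct fs_ops *ops, const struct lock_req *req)
{
    assert(ops != NULL);
    if (ops->advlock_veto != NULL) {
        int error = ops->advlock_veto(req);
        if (error != 0)
            return error;   /* the FS vetoed; generic code never runs */
    }
    return generic_advlock(req);
}

/* Example hook: a read-only FS refuses every lock request. */
static int readonly_veto(const struct lock_req *req)
{
    (void)req;
    return EROFS;
}
```

With an fs_ops whose hooks are both NULL, lock_calldown() fails with
EOPNOTSUPP while lock_veto() still succeeds through the generic code,
at the cost of a single NULL check.  That is the sense in which a veto
interface can be registration based with close to zero overhead for
filesystems that have nothing to say.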