From: Terry Lambert
Message-Id: <199911241855.LAA21738@usr08.primenet.com>
Subject: Re: namei() and freeing componentnames
To: eivind@FreeBSD.ORG (Eivind Eklund)
Date: Wed, 24 Nov 1999 18:55:04 +0000 (GMT)
Cc: ezk@cs.columbia.edu, fs@FreeBSD.ORG
In-Reply-To: <19991118153220.E45524@bitbox.follo.net> from "Eivind Eklund" at Nov 18, 99 03:32:20 pm

> Yes, this is the intent.
>
> The problem I'm finding with VOP_RELEASEND() is that namei() can
> return two different vps - the dvp (directory vp) and the actual vp
> (inside the directory dvp points at), and that neither of these are
> always available.

What gets returned is based on the flags passed down.  I think that
trying to encapsulate this transparently, so that any namei() operation
that succeeds or fails can be freed in its entirety without resort to
flag-specific code in the caller, is a mistake.  I don't think you can
reasonably do this.

One issue that occurs to me is that namei() itself, and not the
underlying VOP_LOOKUP code, should be the one to reference the path
component name cache.
If the underlying VFS doesn't want the cache hit to occur without
notifying it of the event, then it needs to not enter the data in the
cache.  This would simplify a large amount of code.

The other simplification, which is organizational, and which could,
using inline functions, add effectively no code overhead, is to
separate the lookup operations by request type.  Whether or not
something wants the parent directory back has much to do with whether
it is a create or rename operation, and little to do with anything
else.  Operations which intend to modify the returned directory entry
are very distinct from those merely doing a lookup.

I have often felt that much of the mess that the
create/rename/delete/open variant behaviour causes should be addressed
by moving the complexity to upper level code.

> Progress report: Based on current rate of progress, it looks like I'll
> be able to have patches ready for (my personal) testing sunday (or
> *possibly* saturday, but most likely not).  Depending on how
> testing/debugging works out, the patches will most likely be ready for
> public testing sometime next week.  I'll need help with NFS testing.

Heh.  This is the same stumbling block I hit, needing help with NFS
testing.

I created, and I believe it was Peter who updated it, a testing
framework that can detect kernel memory leaks from user space, and
which exercised the entire branch path for the namei()/nameifree()
cases.  This would probably be a good thing for someone to use, since
it will identify the branch path in which any memory leaks are
occurring.

> Forward view: I'm undecided on the next step.  Possibilities:
>
> (1) Change the way locking is specified to make it feasible to test
>     locking patches properly, and change the assertion generation to
>     generate better assertions.
>     This will probably require changing
>     VOP_ISLOCKED() to be able to take a process parameter, and return
>     different values based on whether an exclusive lock is held by that
>     process or by another process.  The present behaviour will be
>     available by passing NULL for this parameter.
>
>     Presently, running multiple processes does not work properly, as
>     the assertions do not really assert the right things.
>
>     These changes are necessary to properly debug the use of locks,
>     which I again believe is necessary for stacking layers (which I
>     would like to work in 4.0, but I don't know if I will be able to
>     have ready).

This would be nice; I still believe most of the vnode and the advisory
locking code can move to upper layers.  I think it is the
responsibility of the stacking layers to propagate locks, and the only
place that this is really an issue is on fan-in or fan-out.

Please keep an eye towards not precluding Jeremy Allison's work on a
kernel opportunistic locking interface, since it's really needed to do
hosted OS/host OS coherency properly (e.g. Samba clients must obey
UNIX locks, and UNIX applications must obey those of Samba).  This is
similar to what NFS clients and local applications must do to
interoperate, and is the primary purpose of the LEASE interface.

> (2) Change the behaviour of VOP_LOOKUP() to "eat as much as you can,
>     and return how much that was" rather than "Eat a single path
>     component; we have already decided what this is."
>     This allows different types of namespaces, and it allows
>     optimizations in VOP_LOOKUP() when several steps in the traversal
>     are inside a single filesystem (and hey - who mounts a
>     new filesystem on every directory they see, anyway?)

The path component buffer mechanism already specifies this behaviour
as one of its initial design requirements, so I think this is already
taken care of.
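
For illustration only, here is a rough user-space sketch of the "eat as
much as you can, and return how much that was" contract quoted above.
It assumes a toy filesystem that simply knows a fixed set of relative
paths.  The names toy_fs, toy_knows and toy_lookup are hypothetical and
are not part of the FreeBSD VFS interface; the real VOP_LOOKUP() works
on vnodes and a componentname, not on bare strings like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy filesystem: a NULL-terminated list of paths it owns. */
struct toy_fs {
	const char *const *known;
};

/* Return 1 if the first len bytes of path exactly match a known path. */
static int
toy_knows(const struct toy_fs *fs, const char *path, size_t len)
{
	size_t i;

	for (i = 0; fs->known[i] != NULL; i++) {
		if (strlen(fs->known[i]) == len &&
		    strncmp(fs->known[i], path, len) == 0)
			return (1);
	}
	return (0);
}

/*
 * Consume as many leading components of path as this toy filesystem
 * can resolve.  Returns the number of bytes consumed and stores the
 * number of components consumed in *ncomp.  The caller resumes the
 * traversal at path + (return value).
 */
static size_t
toy_lookup(const struct toy_fs *fs, const char *path, size_t *ncomp)
{
	size_t consumed = 0;	/* bytes resolved so far */
	size_t pos = 0;		/* start of the next component */
	size_t end;

	assert(fs != NULL && path != NULL && ncomp != NULL);
	*ncomp = 0;
	while (path[pos] != '\0') {
		/* Find the end of the next component. */
		end = pos;
		while (path[end] != '\0' && path[end] != '/')
			end++;
		/* Stop on an empty component or an unknown prefix. */
		if (end == pos || !toy_knows(fs, path, end))
			break;
		consumed = end;
		(*ncomp)++;
		pos = (path[end] == '/') ? end + 1 : end;
	}
	return (consumed);
}
```

In this sketch the caller plays the part of namei(): whatever is left
after the returned offset is the caller's problem to resolve, one way
or another.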

What does not happen is that lookups which will take place entirely
within a single VFS are held down in that VFS for the whole traversal;
instead, they pop back up to namei().  I don't think you can get rid
of this without destroying the "union" option (not the same as the
"unionfs"), and without damaging the ability to cover mount points, to
chroot, to do symlink expansion, or to deal with POSIX namespace
escape.

The original reason for allowing this behaviour at all, according to
Heidemann's thesis, is to permit an underlying FS to "eat as much as
you want", as opposed to "eat as much as you can".  This was used in
proxy VFS stacking layers, since a proxy layer knows that it owns the
entire tree inferior to the current component.

One "low hanging fruit" optimization that can be made is to _always_
set the fdp->fd_rdir to the process's current root directory; this
avoids the NULL/non-NULL test, so long as it is inherited correctly on
fork, and set for init.  This would be very nice for many other
reasons...  8-).

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.