From owner-freebsd-fs Sun Aug 22 17:26:18 1999 Delivered-To: freebsd-fs@freebsd.org Received: from bastard.40ounce.net (bastard.40ounce.net [206.14.239.252]) by hub.freebsd.org (Postfix) with ESMTP id 95B8314C23 for ; Sun, 22 Aug 1999 17:26:13 -0700 (PDT) (envelope-from jb@syndicate.net) Received: from syndicate.net (adsl-209-233-31-199.dsl.snfc21.pacbell.net [209.233.31.199]) by bastard.40ounce.net (8.8.8/8.8.8) with ESMTP id RAA16079 for ; Sun, 22 Aug 1999 17:26:35 -0700 (PDT) (envelope-from jb@syndicate.net) Message-ID: <37C0951B.837EA807@syndicate.net> Date: Sun, 22 Aug 1999 17:26:03 -0700 From: James Brown Reply-To: jb@syndicate.net Organization: SYNDICATE Consulting Group, Inc. X-Mailer: Mozilla 4.5 [en] (X11; U; FreeBSD 2.2.8-STABLE i386) X-Accept-Language: en MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: ccd striped array slower than single disk! Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org hello, bonnie reports 5636 K/s when using a single disk without ccd. i've been testing ccd performance, using various interleaves. the best performance i've seen striping with ccd has been 5571 K/s using an interleave of 32768. these are identical disks (maxtor 27.2 gb) and are on two separate ultra dma/33 busses (one is bus 1 slave, other is bus 2 master but that shouldn't matter, right?). shouldn't i be seeing something closer to 9000 or 10000 K/s? 
thank you, james To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Sun Aug 22 17:42:50 1999 Delivered-To: freebsd-fs@freebsd.org Received: from allegro.lemis.com (allegro.lemis.com [192.109.197.134]) by hub.freebsd.org (Postfix) with ESMTP id ABC79155FE for ; Sun, 22 Aug 1999 17:42:44 -0700 (PDT) (envelope-from grog@freebie.lemis.com) Received: from freebie.lemis.com (freebie.lemis.com [192.109.197.137]) by allegro.lemis.com (8.9.1/8.9.0) with ESMTP id KAA15417; Mon, 23 Aug 1999 10:12:38 +0930 (CST) Received: (from grog@localhost) by freebie.lemis.com (8.9.3/8.9.0) id KAA83501; Mon, 23 Aug 1999 10:12:37 +0930 (CST) Date: Mon, 23 Aug 1999 10:12:37 +0930 From: Greg Lehey To: James Brown Cc: freebsd-fs@FreeBSD.ORG Subject: Re: ccd striped array slower than single disk! Message-ID: <19990823101237.C83273@freebie.lemis.com> References: <37C0951B.837EA807@syndicate.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.95.4i In-Reply-To: <37C0951B.837EA807@syndicate.net>; from James Brown on Sun, Aug 22, 1999 at 05:26:03PM -0700 WWW-Home-Page: http://www.lemis.com/~grog X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia Phone: +61-8-8388-8286 Fax: +61-8-8388-8725 Mobile: +61-41-739-7062 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Sunday, 22 August 1999 at 17:26:03 -0700, James Brown wrote: > hello, > > bonnie reports 5636 K/s when using a single disk without ccd. Bonnie lies. > i've been testing ccd performance, using various interleaves. the > best performance i've seen striping with ccd has been 5571 K/s using > an interleave of 32768. Do you mean 32768 blocks or bytes? One's rather small, the other is unnecessarily large. I've found stripe sizes in the order of 256 kB to be optimum. 
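The blocks-or-bytes question is easy to check with a little arithmetic. The sketch below assumes the interleave is counted in 512-byte sectors (DEV_BSIZE), which is how I understand ccd to take it; the helper names are made up for illustration and are not part of any tool.

```python
SECTOR_BYTES = 512  # assumed DEV_BSIZE; verify on the system in question


def interleave_bytes(sectors: int) -> int:
    """Stripe size in bytes when the interleave is given in sectors."""
    return sectors * SECTOR_BYTES


def interleave_sectors(stripe_bytes: int) -> int:
    """Interleave value (in sectors) that yields the requested stripe size."""
    if stripe_bytes % SECTOR_BYTES:
        raise ValueError("stripe size must be a whole number of sectors")
    return stripe_bytes // SECTOR_BYTES


if __name__ == "__main__":
    # If 32768 was meant as sectors, each stripe is this many MiB.
    print(interleave_bytes(32768) // (1024 * 1024), "MiB per stripe")
    # If 32768 was meant as bytes, the interleave value would be this.
    print(interleave_sectors(32768), "sectors")
    # Interleave value for a stripe of about 256 kB.
    print(interleave_sectors(256 * 1024), "sectors")
```

Under that assumption, an interleave of 32768 is either a very large stripe (if sectors) or a very small one (if bytes), which is the point of the question above.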
> these are identical disks (maxtor 27.2 gb) and are on two > separate ultra dma/33 busses (one is bus 1 slave, other is bus 2 > master but that shouldn't matter, right?). shouldn't i be seeing > something closer to 9000 or 10000 K/s? That depends on what you're doing. Remember that bonnie uses buffer cache, so the results are usually inaccurate and difficult to repeat. You also don't say which of the bonnie tests you're quoting. Try rawio (in the Ports Collection) instead, and remember that it's usually the random access tests which count. If you're still unhappy, I'd be interested to see how you fare with Vinum instead of ccd. Greg -- See complete headers for address, home page and phone numbers finger grog@lemis.com for PGP public key To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Aug 23 23:28:50 1999 Delivered-To: freebsd-fs@freebsd.org Received: from gilliam.users.flyingcroc.net (gilliam.users.flyingcroc.net [207.246.128.2]) by hub.freebsd.org (Postfix) with ESMTP id C9330150CF; Mon, 23 Aug 1999 23:28:25 -0700 (PDT) (envelope-from ross@gilliam.users.flyingcroc.net) Received: (from ross@localhost) by gilliam.users.flyingcroc.net (8.9.3/8.9.3) id XAA04603; Mon, 23 Aug 1999 23:28:10 -0700 (PDT) Date: Wed, 18 Aug 1999 11:23:56 -0700 (PDT) Message-Id: <199908240628.XAA04603@gilliam.users.flyingcroc.net> To: Julian Elischer Cc: Bill Studenmund , Terry Lambert , Alton Matthew , Hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: BSD XFS Port & BSD VFS Rewrite From: Poul-Henning Kamp Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org In message , Julian Elischer writes: >On Wed, 18 Aug 1999, Poul-Henning Kamp wrote: > >> Matt doesn't represent the FreeBSD project, and even if he rewrites >> the VFS subsystem so he can understand it, his rewrite would face >> considerable resistance on its way into FreeBSD. 
I don't think >> there is reason to rewrite it, but there certainly are areas >> that need fixing. > >You are misinformed as far as I know.. From discussions I saw, th >main architect of a VFS rewrite would be Kirk, and Matt would be acting as >Kirk's right-hand-man. I bet that Matt and Kirk uses "rewrite" for two very different concepts. The resulting reviews will be equally different. >> >> The use of the "vfs_default" to make unimplemented VOP's >> >> fall through to code which implements function, while well >> >> intentioned, is misguided. >> >> I beg to differ. The only difference is that we pass through >> multiple layers before we hit the bottom of the stack. There is >> no loss of functionality but significant gain of clarity and >> modularity. > >Well I believe that Kirk considers them misguided too, but he stated that >he wasn't going to remove them without serious thought about the alternatives. I'll be more than ready to discuss this with Kirk. -- Poul-Henning Kamp FreeBSD coreteam member phk@FreeBSD.ORG "Real hackers run -current on their laptop." FreeBSD -- It will take a long time before progress goes too far! 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Aug 23 23:28:51 1999 Delivered-To: freebsd-fs@freebsd.org Received: from gilliam.users.flyingcroc.net (gilliam.users.flyingcroc.net [207.246.128.2]) by hub.freebsd.org (Postfix) with ESMTP id 024D615134; Mon, 23 Aug 1999 23:28:26 -0700 (PDT) (envelope-from ross@gilliam.users.flyingcroc.net) Received: (from ross@localhost) by gilliam.users.flyingcroc.net (8.9.3/8.9.3) id XAA04552; Mon, 23 Aug 1999 23:28:07 -0700 (PDT) Date: Wed, 18 Aug 1999 11:07:25 -0700 (PDT) Message-Id: <199908240628.XAA04552@gilliam.users.flyingcroc.net> To: Bill Studenmund Cc: Terry Lambert , Hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: BSD XFS Port & BSD VFS Rewrite From: Poul-Henning Kamp Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org In message , Bill Studenmund writes: >> >I doubt we need more than 64 bit times. 2^63 seconds works out to >> >292,279,025,208 years, or 292 (american) billion years. Current theories >> >put the age of the universe at I think 12 to 16 billion years. So 64-bit >> >signed times in seconds will cover from before the big bang to way past >> >any time we'll be caring about. :-) > >I was unclear. I was refering to the seconds side of things. Sub-second >resolution would need other bits. Yes, but we need subsecond in the filesystems. Think about make(1) on a blinding fast machine... -- Poul-Henning Kamp FreeBSD coreteam member phk@FreeBSD.ORG "Real hackers run -current on their laptop." FreeBSD -- It will take a long time before progress goes too far! 
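Both halves of this exchange can be checked with a short sketch: the range of a signed 64-bit seconds counter, and why whole-second timestamps confuse make(1) on a fast machine. The 365.24-day year is an assumption (it appears to be what produces the figure quoted above), and `needs_rebuild` is a simplified, hypothetical stand-in for make's "source newer than target" test.

```python
SECONDS_PER_YEAR = 36524 * 864  # 365.24 days * 86400 s, kept in integers
NS_PER_SECOND = 1_000_000_000


def signed_range_years(bits: int) -> int:
    """Whole years representable on one side of zero by a signed counter."""
    return 2 ** (bits - 1) // SECONDS_PER_YEAR


def needs_rebuild(source_ns: int, target_ns: int, resolution_ns: int) -> bool:
    """Compare mtimes after truncating both to the stored resolution."""
    src = source_ns - source_ns % resolution_ns
    tgt = target_ns - target_ns % resolution_ns
    return src > tgt


if __name__ == "__main__":
    print(signed_range_years(64), "years of range with 64-bit seconds")
    # Source edited 0.3 s after the target was built, within the same second.
    target, source = 10 * NS_PER_SECOND, 10 * NS_PER_SECOND + 300_000_000
    print("1 s stamps :", needs_rebuild(source, target, NS_PER_SECOND))
    print("1 ns stamps:", needs_rebuild(source, target, 1))
```

With one-second stamps the edit is invisible and the target looks up to date; with sub-second stamps the rebuild is triggered. That is the case for keeping sub-second bits even though 64 bits of seconds is plenty of range.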
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Aug 23 23:29: 4 1999 Delivered-To: freebsd-fs@freebsd.org Received: from gilliam.users.flyingcroc.net (gilliam.users.flyingcroc.net [207.246.128.2]) by hub.freebsd.org (Postfix) with ESMTP id BF57C15278; Mon, 23 Aug 1999 23:28:30 -0700 (PDT) (envelope-from ross@gilliam.users.flyingcroc.net) Received: (from ross@localhost) by gilliam.users.flyingcroc.net (8.9.3/8.9.3) id XAA04645; Mon, 23 Aug 1999 23:28:12 -0700 (PDT) Date: Wed, 18 Aug 1999 11:37:03 -0700 (PDT) Message-Id: <199908240628.XAA04645@gilliam.users.flyingcroc.net> From: Matthew Dillon To: Julian Elischer Cc: Poul-Henning Kamp , Bill Studenmund , Terry Lambert , Alton Matthew , Hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: BSD XFS Port & BSD VFS Rewrite Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :On Wed, 18 Aug 1999, Poul-Henning Kamp wrote: : :> Matt doesn't represent the FreeBSD project, and even if he rewrites :> the VFS subsystem so he can understand it, his rewrite would face :> considerable resistance on its way into FreeBSD. I don't think :> there is reason to rewrite it, but there certainly are areas :> that need fixing. : :You are misinformed as far as I know.. From discussions I saw, th :main architect of a VFS rewrite would be Kirk, and Matt would be acting as :Kirk's right-hand-man. Yes, this is correct. Kirk is going to be the main architect. I have been heavily involved and will continue to be. :> >> The use of the "vfs_default" to make unimplemented VOP's : :> I beg to differ. The only difference is that we pass through :> multiple layers before we hit the bottom of the stack. There is :... 
:Well I believe that Kirk considers them misguided too, but he stated that :he wasn't going to remove them without serious thought about the alternatives. The vfs op callout layering has not been on the radar screen. There are much too many other more serious problems. I really doubt that any changes will be made to this piece any time in the next year or even two, if at all. The main items on the radar screen are related to buffer management (struct buf stuff. For example, preventing VM blockages due to pages being wired by write I/O's), VFS locking and reference count issues (for example, namei lookups, blockages in the pager and syncer due to vnode locks held by blocked processes, etc...), and interactions between VFS and VM (for example: moving away from VOP_READ/VOP_WRITE and moving more towards a getpages/putpages model). None of the items have been set in stone yet. We're waiting for Kirk to get back from vacation and get back into the groove. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Aug 23 23:29:18 1999 Delivered-To: freebsd-fs@freebsd.org Received: from gilliam.users.flyingcroc.net (gilliam.users.flyingcroc.net [207.246.128.2]) by hub.freebsd.org (Postfix) with ESMTP id DCB41152B1; Mon, 23 Aug 1999 23:28:30 -0700 (PDT) (envelope-from ross@gilliam.users.flyingcroc.net) Received: (from ross@localhost) by gilliam.users.flyingcroc.net (8.9.3/8.9.3) id XAA04556; Mon, 23 Aug 1999 23:28:07 -0700 (PDT) Date: Wed, 18 Aug 1999 11:09:14 -0700 (PDT) Message-Id: <199908240628.XAA04556@gilliam.users.flyingcroc.net> From: Julian Elischer To: Poul-Henning Kamp Cc: Bill Studenmund , Terry Lambert , Alton Matthew , Hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: BSD XFS Port & BSD VFS Rewrite Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org 
On Wed, 18 Aug 1999, Poul-Henning Kamp wrote:

> Matt doesn't represent the FreeBSD project, and even if he rewrites
> the VFS subsystem so he can understand it, his rewrite would face
> considerable resistance on its way into FreeBSD. I don't think
> there is reason to rewrite it, but there certainly are areas
> that need fixing.

You are misinformed, as far as I know. From the discussions I saw, the
main architect of a VFS rewrite would be Kirk, and Matt would be acting as
Kirk's right-hand man.

> >> The use of the "vfs_default" to make unimplemented VOP's
> >> fall through to code which implements function, while well
> >> intentioned, is misguided.
>
> I beg to differ. The only difference is that we pass through
> multiple layers before we hit the bottom of the stack. There is
> no loss of functionality but significant gain of clarity and
> modularity.

Well, I believe that Kirk considers them misguided too, but he stated that
he wasn't going to remove them without serious thought about the
alternatives.
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Aug 23 23:29:55 1999 Delivered-To: freebsd-fs@freebsd.org Received: from gilliam.users.flyingcroc.net (gilliam.users.flyingcroc.net [207.246.128.2]) by hub.freebsd.org (Postfix) with ESMTP id 1B31215193; Mon, 23 Aug 1999 23:28:27 -0700 (PDT) (envelope-from ross@gilliam.users.flyingcroc.net) Received: (from ross@localhost) by gilliam.users.flyingcroc.net (8.9.3/8.9.3) id XAA04619; Mon, 23 Aug 1999 23:28:11 -0700 (PDT) Date: Wed, 18 Aug 1999 11:33:43 -0700 (PDT) Message-Id: <199908240628.XAA04619@gilliam.users.flyingcroc.net> From: Terry Lambert Subject: Re: BSD XFS Port & BSD VFS Rewrite To: wrstuden@nas.nasa.gov Cc: tlambert@primenet.com, Hackers@FreeBSD.ORG, fs@FreeBSD.ORG Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org > > > > > > 2. Advisory locks are hung off private backing objects. > > > I'm not sure. The struct lock * is only used by layered filesystems, so > > > they can keep track both of the underlying vnode lock, and if needed their > > > own vnode lock. For advisory locks, would we want to keep track both of > > > locks on our layer and the layer below? Don't we want either one or the > > > other? i.e. layers bypass to the one below, or deal with it all > > > themselves. > > > > I think you want the lock on the intermediate layer: basically, on > > every vnode that has data associated with it that is unique to a > > layer. Let's not forget, also, that you can expose a layer into > > the namespace in one place, and expose it covered under another > > layer, at another. If you locked down to the backing object, then > > the only issue you would be left with is one or more intermediate > > backing objects. > > Right. 
That exported struct lock * makes locking down to the lowest-level
> file easy - you just feed it to the lock manager, and you're locking the
> same lock the lowest level fs uses. You then lock all vnodes stacked over
> this one at the same time. Otherwise, you just call VOP_LOCK below and
> then lock yourself.

I think this defeats the purpose of the stacking architecture; I
think that if you look at an unadulterated NULLFS, you'll see what I
mean.

Intermediate FS's should not trap VOP's that are not applicable
to them.

One of the purposes of doing a VOP_LOCK on intermediate vnodes
that aren't backing objects is to deal with the global vnode
pool management. I'd really like FS's to own their vnode pools,
but even without that, you don't need the locking, since you
only need to flush data on vnodes that are backing objects.

If we look at a stack of FS's with intermediate exposure into the
namespace, then it's clear that the issue is really only applicable
to objects that act as a backing store:

----------------------  ----------------------  --------------------
FS                      Exposed in hierarchy    Backing object
----------------------  ----------------------  --------------------
top                     yes                     no
intermediate_1          no                      no
intermediate_2          no                      yes
intermediate_3          yes                     no
bottom                  no                      yes
----------------------  ----------------------  --------------------

So when we lock "top", we only lock in intermediate_2 and in bottom.
Then we attempt to lock in intermediate_3, but it fails: not because
there is a lock on the vnode in intermediate_3, but because there
is a lock in bottom.

It's unnecessary to lock the vnodes in the intermediate path, or
even at the exposure level, unless they are vnodes that have an
associated backing store.

The need to lock in intermediate_2 exists because it is a
translation layer or a namespace escape. It deals with compression,
or it deals with file-as-a-directory folding, or it deals with
file-hiding (perhaps for a quota file), etc.
If it didn't, it wouldn't need backing store (and therefore wouldn't
need to be locked).

> > For a layer with an intermediate backing object, I'm prepared to
> > declare it "special", and proxy the operation down to any inferior
> > backing object (e.g. a union FS that adds files from two FS's
> > together, rather than just directory entry lists). I think such
> > layers are the exception, not the rule.
>
> Actually isn't the only problem when you have vnode fan-in (union FS)?
> i.e. a plain compressing layer should not introduce vnode locking
> problems.

If it's a block compression layer, it will. Also a translation layer;
consider a pure Unicode system that wants to remotely mount an FS
from a legacy system. To do this, it needs to expand the pages from
the legacy system [only it can, since the legacy system doesn't know
about Unicode] in a 2:1 ratio. Now consider doing a byte-range lock
on a file on such a system. To propagate the lock, you have to do
an arithmetic conversion at the translation layer. This gets worse
if the lower end FS is exposed in the namespace as well.

You could make the same arguments for other types of translation or
namespace escapes.

> > I think that export policies are the realm of /etc/exports.
> >
> > The problem with each FS implementing its own policy, is that this
> > is another place that copyinstr() gets called, when it shouldn't.
>
> Well, my thought was that, like with current code, most every fs would
> just call vfs_export() when it's presented an export operation. But by
> retaining the option of having the fs do its own thing, we can support
> different export semantics if desired.

I think this bears down on whether the NFS server VFS consumer is
allowed access to the VFS stack at the particular intermediate layer.
I think this is really an administrative policy decision, and not
an option for the VFS.
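The "arithmetic conversion" for byte-range locks through a 2:1 translation layer might look like the following. This is only a sketch under a strong assumption: every lower-layer byte expands to exactly two upper-layer bytes (for example, a single-byte legacy charset presented as 16-bit characters). The function name and its `ratio` parameter are hypothetical.

```python
def upper_to_lower_range(offset: int, length: int, ratio: int = 2):
    """Map an upper-layer byte range to the lower-layer range that covers it.

    Returns (lower_offset, lower_length). The result is rounded outward so
    that every lower byte touched by the upper range is included.
    """
    if offset < 0 or length <= 0 or ratio <= 0:
        raise ValueError("offset must be >= 0, length and ratio must be > 0")
    start = offset // ratio
    end = -(-(offset + length) // ratio)  # ceiling division
    return start, end - start


if __name__ == "__main__":
    # Two adjacent, non-overlapping one-byte locks in the upper layer...
    first = upper_to_lower_range(0, 1)
    second = upper_to_lower_range(1, 1)
    # ...land on the same lower-layer byte, so they conflict below.
    print(first, second, first == second)
```

The outward rounding is where the trouble starts: locks that do not overlap in the upper layer can collide in the lower one, and it gets harder still if the lower FS is also visible in the namespace and takes locks in its own units.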
I think it would be bad if a given VFS could refuse to participate
in a stacking operation because it didn't like who was stacking.

If we insist on the ability for a VFS to refuse stacking, then
we should generalize the idea, such that an intermediate VFS could
refuse exposure into the filesystem namespace accessible to users.

Consider the case of a VFS without quota support, stacked under a VFS
layer that provided quota support by hiding a file in the top level
directory ("quota") and then folding the directory closed by rerooting
in a subdirectory of the top level directory ("root/").

It's reasonable to assume that most admins that want to enforce
quotas would *not* want the possibility of exposing the VFS without
quota support in the user accessible namespace.

Should the VFS without quotas refuse such exposure? I think the
answer is "no", and that it is an administrative control issue,
not a VFS's preference issue.

Administrators enforce this by protecting the path to exposure
points, or by mounting stacks over top of exposure points, which
results in the exposure being hidden under another mount.

Using the QUOTAFS example, you mount the FS to be quota-enforced
on /home, and then you mount the QUOTAFS over top of it, and have
it cover "/home" itself, hiding the underlying FS from exposure.

> > I would resolve this by passing a standard option to the mount code
> > in user space. For root mounts, a vnode is passed down. For other
> > mounts, the vnode is parsed and passed if the option is specified.
>
> Or maybe add a field to vfsops. This info says what the mount call will
> expect (I want a block device, a regular file, a directory, etc), so it
> fits. :-)

This is actually an elegant solution to the problem. Much of the
time, we don't consider data interfaces when they are appropriate
because of their widespread use in inappropriate ways (e.g. "ps").
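The "field in vfsops" idea could be sketched as plain data that generic mount code checks before it ever calls into the filesystem. Everything named here is hypothetical: `VfsOps`, `MountArg` and `check_mount_arg` are illustrative stand-ins, not FreeBSD or NetBSD interfaces, and the check runs in user space purely to show the shape of the test.

```python
import enum
import os
import stat
from dataclasses import dataclass


class MountArg(enum.Enum):
    """What kind of object a filesystem expects to be mounted from."""
    BLOCK_DEVICE = "block device"
    REGULAR_FILE = "regular file"
    DIRECTORY = "directory"
    NONE = "none"


@dataclass(frozen=True)
class VfsOps:
    name: str
    mount_arg: MountArg  # the extra descriptive field proposed above


_CHECKS = {
    MountArg.BLOCK_DEVICE: stat.S_ISBLK,
    MountArg.REGULAR_FILE: stat.S_ISREG,
    MountArg.DIRECTORY: stat.S_ISDIR,
}


def check_mount_arg(ops: VfsOps, path) -> bool:
    """Return True if path is the kind of object ops says it expects."""
    if ops.mount_arg is MountArg.NONE:
        return path is None
    if path is None:
        return False
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False
    return _CHECKS[ops.mount_arg](mode)


if __name__ == "__main__":
    layered = VfsOps("layered-example", MountArg.DIRECTORY)
    leaf = VfsOps("leaf-example", MountArg.BLOCK_DEVICE)
    print(check_mount_arg(layered, "."), check_mount_arg(leaf, "."))
```

Because the expectation is data rather than code, one central routine can reject a bad argument for every filesystem, which is the appeal of the suggestion.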
> Also, if we leave it to userland, what happens if someone writes a
> program which calls sys_mount with something the fs doesn't expect. :-)

Well, that gets to another grail of mine: when a device containing a
filesystem "arrives", I believe it should trigger a mount into the
list of mounted filesystems. I don't necessarily mean that it should
also be exported into the filesystem hierarchy at that point (but it's
an option, using the "last mounted on" information).

> > I think that you will only be able to find rare examples of FS's
> > that don't take device names as arguments. But for those, you
> > don't specify the option, and it gets "NULL", and whatever local
> > options you specify.
>
> I agree I can't see a leaf fs not taking a device node. But layered
> fs's certainly will want something else. :-)

I think they want a vnode of an already mounted FS. The trick
is to enforce the "already mounted" part of that. I'm comfortable
with doing this by saying "it's not already mounted until you
can look up a vnode on it".

> > The point is that, for FS's that can be both root and sub-root,
> > the mount code doesn't have to make the decision, it can be punted
> > to higher level code, in one place, where the code can be centrally
> > maintained and kept from getting "stale" when things change out
> > from under it.
>
> True.
>
> And with good comments we can catch the times when the centrally located
> code changes & breaks an assumption made by the fs. :-)

8-).

> > > Except for a minor buglet with device nodes, stacking works in NetBSD at
> > > present. :-)
> >
> > Have you tried Heidemann's student's stacking layers? There is one
> > encryption, and one per-file compression with namespace hiding, that
> > I think it would be hard pressed to keep up with. But I'll give it
> > the benefit of the doubt. 8-).
>
> Nope.
The problem is that while stacking (null, umap, and overlay fs's)
> work, we don't have the coherency issues worked out so that upper layers
> can cache data. i.e. so that the lower fs knows it has to ask the upper
> layers to give pages back. :-) But multiple ls -lR's work fine. :-)

With UVM in NetBSD, this is (supposedly) not an issue.

You could actually think of it this way, as well: only FS's that
contain vnodes that provide backing should implement VOP_GETPAGES
and VOP_PUTPAGES, and all I/O should be done through paging.

> > > I agree it's ugly, but it has the advantage that it doesn't grow the
> > > on-disk inode. A lot of folks have designs on the remaining 64 bits free.
> > > :-)
> >
> > Well, so long as we can resolve the issue for a long, long time;
> > I plan on being around to have to put up with the bugs, if I can
> > wrangle it... 8-).
>
> :-)
>
> I bet by then (559447 AD) we won't be using ffs, so the problem will be
> moot. :-)

Unless I'm the curator of a computer museum... 8-).

Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
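The locking rule argued for in this message (lock only where there is a backing object) can be modelled in a few lines. This is a toy model of the layer table earlier in the message, not kernel code: `Layer`, `layers_to_lock`, `try_lock` and the dictionary of held locks are all invented for illustration, and the model ignores the deadlock concerns raised elsewhere in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    exposed: bool  # visible in the filesystem hierarchy
    backing: bool  # has an associated backing object


# The stack from the table, ordered top to bottom.
STACK = [
    Layer("top", True, False),
    Layer("intermediate_1", False, False),
    Layer("intermediate_2", False, True),
    Layer("intermediate_3", True, False),
    Layer("bottom", False, True),
]


def layers_to_lock(entry: str) -> list:
    """Names of layers at or below entry that hold a backing object."""
    index = [layer.name for layer in STACK].index(entry)
    return [layer.name for layer in STACK[index:] if layer.backing]


def try_lock(entry: str, owner: str, held: dict) -> bool:
    """Take every needed lock for owner, or none if any is held by another."""
    wanted = layers_to_lock(entry)
    if any(held.get(name, owner) != owner for name in wanted):
        return False
    for name in wanted:
        held[name] = owner
    return True


if __name__ == "__main__":
    held = {}
    print(layers_to_lock("top"))
    print(try_lock("top", "A", held))
    print(try_lock("intermediate_3", "B", held))
```

In this model the second attempt fails because of the lock already held in `bottom`, not because of anything at `intermediate_3`, which matches the behaviour described next to the table.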
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Aug 23 23:30: 3 1999 Delivered-To: freebsd-fs@freebsd.org Received: from gilliam.users.flyingcroc.net (gilliam.users.flyingcroc.net [207.246.128.2]) by hub.freebsd.org (Postfix) with ESMTP id BD446158A0; Mon, 23 Aug 1999 23:28:58 -0700 (PDT) (envelope-from ross@gilliam.users.flyingcroc.net) Received: (from ross@localhost) by gilliam.users.flyingcroc.net (8.9.3/8.9.3) id XAA04712; Mon, 23 Aug 1999 23:28:19 -0700 (PDT) Date: Wed, 18 Aug 1999 11:58:41 -0700 (PDT) Message-Id: <199908240628.XAA04712@gilliam.users.flyingcroc.net> To: Terry Lambert Cc: michaelh@cet.co.jp, wrstuden@nas.nasa.gov, Matthew.Alton@anheuser-busch.com, Hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: BSD XFS Port & BSD VFS Rewrite From: Poul-Henning Kamp Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org In message <199908181848.LAA14960@usr02.primenet.com>, Terry Lambert writes: >> >You would have to de-collapse several VOP lists that have been >> >pre-collapsed. >> >> You are talking gibberish here. Please show code where this is >> a problem. > >When you write a proxy stacking layer, such as John Heidemann's >network proxy stacking layer (an NFS alternative), VOP's which >would normally be handled by vfs_default have to be handled on >the other end of the proxy, instead, in the same way that they >would be handled by the vfs_default stuff. And what prevents you from taking over the default op ? -- Poul-Henning Kamp FreeBSD coreteam member phk@FreeBSD.ORG "Real hackers run -current on their laptop." FreeBSD -- It will take a long time before progress goes too far! 
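For readers following the vfs_default argument, here is a toy model of the dispatch under discussion: an operation vector whose missing entries fall through to a default table, and a layer that "takes over the default op" by supplying its own catch-all. The names (`DEFAULT_OPS`, `make_vector`, `proxy_default`) are invented for this sketch; it is not the FreeBSD implementation.

```python
def default_advlock(vnode):
    """Stand-in for a non-error default routine."""
    return f"local default advlock on {vnode}"


def eopnotsupp(vnode):
    """Stand-in for an error default routine."""
    return "EOPNOTSUPP"


# Operations that have a functional (non-error) default in this toy model.
DEFAULT_OPS = {"vop_advlock": default_advlock}


def make_vector(ops, default=None):
    """Build a dispatcher; default, if given, replaces the fall-through."""
    def dispatch(op, vnode):
        handler = ops.get(op)
        if handler is not None:
            return handler(vnode)
        if default is not None:
            return default(op, vnode)  # the layer took over the default
        return DEFAULT_OPS.get(op, eopnotsupp)(vnode)
    return dispatch


def proxy_default(op, vnode):
    """Hypothetical proxy layer: forward anything it does not implement."""
    return f"forward {op}({vnode}) to remote"


if __name__ == "__main__":
    plain = make_vector({})
    proxy = make_vector({}, default=proxy_default)
    print(plain("vop_advlock", "v1"))
    print(proxy("vop_advlock", "v1"))
```

The model only covers the simple case. It does not capture operations that need both a local assertion and a remote proxy, which is one of the objections raised in the thread.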
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Aug 23 23:30:39 1999 Delivered-To: freebsd-fs@freebsd.org Received: from gilliam.users.flyingcroc.net (gilliam.users.flyingcroc.net [207.246.128.2]) by hub.freebsd.org (Postfix) with ESMTP id 4D9AE158F0; Mon, 23 Aug 1999 23:29:05 -0700 (PDT) (envelope-from ross@gilliam.users.flyingcroc.net) Received: (from ross@localhost) by gilliam.users.flyingcroc.net (8.9.3/8.9.3) id XAA04749; Mon, 23 Aug 1999 23:28:20 -0700 (PDT) Date: Wed, 18 Aug 1999 12:09:27 -0700 (PDT) Message-Id: <199908240628.XAA04749@gilliam.users.flyingcroc.net> From: Bill Studenmund Reply-To: Bill Studenmund To: Poul-Henning Kamp Cc: Terry Lambert , Hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: BSD XFS Port & BSD VFS Rewrite Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Wed, 18 Aug 1999, Poul-Henning Kamp wrote: > Yes, but we need subsecond in the filesystems. Think about make(1) on > a blinding fast machine... Oh yes, I realize that. :-) It's just that I thought you were at one point suggesting having 128 bits to the left of the decimal point (128 bits worth of seconds). I was trying to say that'd be a bit much. 
:-) Take care, Bill To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Aug 23 23:30:46 1999 Delivered-To: freebsd-fs@freebsd.org Received: from gilliam.users.flyingcroc.net (gilliam.users.flyingcroc.net [207.246.128.2]) by hub.freebsd.org (Postfix) with ESMTP id 6B545158B1; Mon, 23 Aug 1999 23:28:59 -0700 (PDT) (envelope-from ross@gilliam.users.flyingcroc.net) Received: (from ross@localhost) by gilliam.users.flyingcroc.net (8.9.3/8.9.3) id XAA04725; Mon, 23 Aug 1999 23:28:19 -0700 (PDT) Date: Wed, 18 Aug 1999 12:01:18 -0700 (PDT) Message-Id: <199908240628.XAA04725@gilliam.users.flyingcroc.net> From: Bill Studenmund Reply-To: Bill Studenmund To: Terry Lambert Cc: Hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: BSD XFS Port & BSD VFS Rewrite Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Wed, 18 Aug 1999, Terry Lambert wrote: > > Right. That exported struct lock * makes locking down to the lowest-level > > file easy - you just feed it to the lock manager, and you're locking the > > same lock the lowest level fs uses. You then lock all vnodes stacked over > > this one at the same time. Otherwise, you just call VOP_LOCK below and > > then lock yourself. > > I think this defeats the purpose of the stacking architecture; I > think that if you look at an unadulterated NULLFS, you'll see what I > mean. Please be more precise. I have looked at an unadulterated NULLFS, and found it lacking. I don't see how this change breaks stacking. > Intermediate FS's should not trap VOP's that are not applicable > to them. True. But VOP_LOCK is applicable to layered fs's. :-) > One of the purposes of doing a VOP_LOCK on intermediate vnodes > that aren't backing objects is to deal with the global vnode > pool management. 
I'd really like FS's to own their vnode pools,
> but even without that, you don't need the locking, since you
> only need to flush data on vnodes that are backing objects.
>
> If we look at a stack of FS's with intermediate exposure into the
> namespace, then it's clear that the issue is really only applicable
> to objects that act as a backing store:
>
> ----------------------  ----------------------  --------------------
> FS                      Exposed in hierarchy    Backing object
> ----------------------  ----------------------  --------------------
> top                     yes                     no
> intermediate_1          no                      no
> intermediate_2          no                      yes
> intermediate_3          yes                     no
> bottom                  no                      yes
> ----------------------  ----------------------  --------------------
>
> So when we lock "top", we only lock in intermediate_2 and in bottom.

No. One of the things Heidemann notes in his dissertation is that to
prevent deadlock, you have to lock the whole stack of vnodes at once, not
bit by bit.

i.e. there is one lock for the whole thing.

> > Actually isn't the only problem when you have vnode fan-in (union FS)?
> > i.e. a plain compressing layer should not introduce vnode locking
> > problems.
>
> If it's a block compression layer, it will. Also a translation layer;
> consider a pure Unicode system that wants to remotely mount an FS
> from a legacy system. To do this, it needs to expand the pages from
> the legacy system [only it can, since the legacy system doesn't know
> about Unicode] in a 2:1 ratio. Now consider doing a byte-range lock
> on a file on such a system. To propagate the lock, you have to do
> an arithmetic conversion at the translation layer. This gets worse
> if the lower end FS is exposed in the namespace as well.

Wait. Byte-range locking is different from vnode locking. I've been
talking about vnode locking, which is different from the byte-range
locking you're discussing above.

> > Nope.
The problem is that while stacking (null, umap, and overlay fs's) > > work, we don't have the coherency issues worked out so that upper layers > > can cache data. i.e. so that the lower fs knows it has to ask the uper > > layers to give pages back. :-) But multiple ls -lR's work fine. :-) > > With UVM in NetBSD, this is (supposedly) not an issue. UBC. UVM is a new memory manager. UBC unifies the buffer cache with the VM system. > You could actually think of it this way, as well: only FS's that > contain vnodes that provide backing should implement VOP_GETPAGES > and VOP_PUTPAGES, and all I/O should be done through paging. Right. That's part of UBC. :-) Take care, Bill To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Aug 23 23:31: 7 1999 Delivered-To: freebsd-fs@freebsd.org Received: from gilliam.users.flyingcroc.net (gilliam.users.flyingcroc.net [207.246.128.2]) by hub.freebsd.org (Postfix) with ESMTP id CDCC915901; Mon, 23 Aug 1999 23:29:03 -0700 (PDT) (envelope-from ross@gilliam.users.flyingcroc.net) Received: (from ross@localhost) by gilliam.users.flyingcroc.net (8.9.3/8.9.3) id XAA04695; Mon, 23 Aug 1999 23:28:18 -0700 (PDT) Date: Wed, 18 Aug 1999 11:49:25 -0700 (PDT) Message-Id: <199908240628.XAA04695@gilliam.users.flyingcroc.net> From: Terry Lambert Subject: Re: BSD XFS Port & BSD VFS Rewrite To: phk@critter.freebsd.dk (Poul-Henning Kamp) Cc: tlambert@primenet.com, michaelh@cet.co.jp, wrstuden@nas.nasa.gov, Matthew.Alton@anheuser-busch.com, Hackers@FreeBSD.ORG, fs@FreeBSD.ORG Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org > >> > > I'm not familiar with the VFS_default stuff. All the vop_default_desc > >> > > routines in NetBSD point to error routines. 
> >> > > >> > In FreeBSD, they now point to default routines that are *not* error > >> > routines. This is the problem. I admit the change was very well > >> > intentioned, since it made the code a hell of a lot more readable, > >> > but choosing between readable and additional function, I take function > >> > over form (I think the way I would have "fixed" the readability is by > >> > making the operations that result in the descriptor set for a mounted > >> > FS instance be both discrete, and named for their specific function). > >> > >> As I recall most of FBSD's default routines are also error routines, if > >> the exceptions were a problem it would would be trivial to fix. > > > >You would have to de-collapse several VOP lists that have been > >pre-collapsed. > > You are talking gibberish here. Please show code where this is > a problem. When you write a proxy stacking layer, such as John Heidemann's network proxy stacking layer (an NFS alternative), VOP's which would normally be handled by vfs_default have to be handled on the other end of the proxy, instead, in the same way that they would be handled by the vfs_default stuff. Some VOP's, like advisory locking, need both local assertion and remote proxy of the VOP to avoid introducing race windows. The result of this is that, if you rely on the vfs_default stuff, then you can't proxy those VOP's into a different address space, either on another machine, or to a user space VFS stacking layer developement environment. This is the same problem that embedding VM references directly into any FS causes, and that vm_object_t aliases would exacerbate. John has, in the past, sent me a number of stacking layers done by various people, with the requirement that I not redistribute them, as they are not what he would consider to be properly representative of finished work. 
Since John himself did the network proxy, you could perhaps get him to send you a copy, so you could have direct access to code where this was a problem. Make sure that the system you are talking to over the proxy is not assumed to be a FreeBSD system (e.g. don't assume that the vfs_default stuff exists on the other end of the proxy, or that it would be functional). Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Aug 24 17:11:23 1999 Delivered-To: freebsd-fs@freebsd.org Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 0D82314DEF for ; Tue, 24 Aug 1999 17:11:20 -0700 (PDT) (envelope-from bright@wintelcom.net) Received: from localhost (bright@localhost) by fw.wintelcom.net (8.8.8/8.8.8) with ESMTP id RAA21060 for ; Tue, 24 Aug 1999 17:24:42 GMT (envelope-from bright@wintelcom.net) Date: Tue, 24 Aug 1999 17:24:42 +0000 (GMT) From: Alfred Perlstein To: fs@freebsd.org Subject: VFS cleanup and fh*() syscalls almost complete. Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org I have tested this code lightly, it compiles cleanly and the NFS server code still seems to work. http://big.endian.org/~bright/freebsd/in_progress/vfs-fhsyscall.diff Several things have been done, mostly inspired by NetBSD: 1) VFS_FHTOVP has been split into 2 VFS ops: VFS_FHTOVP now only takes a mountpoint, filehandle and **vnode VFS_CHECKEXP now takes the export checking arguments that VFS_FHTOVP used to take. 
2) The casting of VFS ops to eopnotsupp() has been removed and vfs_nop*() functions have been put into kern/vfs_default.c This makes it more clear that certain VFS-ops are giving default behavior, either returning automatic success or returning EOPNOTSUPP. Someone mentioned that EOPNOTSUPP should be replaced by EINVAL, can that person please speak up? Also, some filesystems that actually implemented VFS-ops that did nothing or returned EOPNOTSUPP have had those functions deleted except when they are essential to understanding the VFS, or if it looked like they may eventually be filled in. 3) fh(open|stat|statfs) have been haphazardly implemented, meaning I won't be able to test them until tonight. Testers? Critics? Comments? please? If you're wondering why/what I'm doing, it's the kernel support for a lockd that I'm working on, as well as a cleanup I thought would make it easier to understand our filesystem code. I'm sure some people will be wondering: Why does VFS_CHECKEXP take a vnode and not a mount point? Hopefully in the future a filesystem will be able to more restrictively export its files, this will help facilitate that. Lastly, I'm jumping the gun here and posting a lightly tested patch because I've had to re-merge this stuff 3 times already and would like comments on not only functionality, but style as well. This way I can either commit it, (after more testing) or scrap the whole project. What I have to do next is make the fh*() syscalls solid and investigate the veto locking ideas that have been brought up in the past. 
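A minimal userland sketch of points 1) and 2) above, for illustration only. It is not taken from the patch: the `*_sketch` types, the `std_*` and `ex_*` function names, and the simplified argument lists are all hypothetical stand-ins for the real kernel interfaces. It shows unimplemented slots filled with shared default functions rather than casts of eopnotsupp(), and file-handle lookup kept separate from export checking.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct vnode_sketch { int exported; };
struct mount_sketch { struct vnode_sketch root; };
struct fid_sketch   { int valid; };

/* Ops table with handle lookup and export checking as separate slots. */
struct vfsops_sketch {
    int (*fhtovp)(struct mount_sketch *, struct fid_sketch *,
                  struct vnode_sketch **);
    int (*checkexp)(struct vnode_sketch *);
    int (*sync)(struct mount_sketch *);
};

/* Shared defaults: two report "not supported", one reports success. */
static int std_fhtovp(struct mount_sketch *mp, struct fid_sketch *fhp,
                      struct vnode_sketch **vpp)
{
    (void)mp; (void)fhp; (void)vpp;
    return EOPNOTSUPP;
}

static int std_checkexp(struct vnode_sketch *vp)
{
    (void)vp;
    return EOPNOTSUPP;
}

static int std_sync(struct mount_sketch *mp)
{
    (void)mp;
    return 0;
}

/* An example filesystem that implements the file-handle ops itself. */
static int ex_fhtovp(struct mount_sketch *mp, struct fid_sketch *fhp,
                     struct vnode_sketch **vpp)
{
    if (!fhp->valid)
        return ESTALE;
    *vpp = &mp->root;
    return 0;
}

static int ex_checkexp(struct vnode_sketch *vp)
{
    return vp->exported ? 0 : EACCES;
}

/* Implements lookup and export check, takes the default sync. */
static const struct vfsops_sketch ex_vfsops =
    { ex_fhtovp, ex_checkexp, std_sync };

/* A filesystem that cannot be exported takes every default. */
static const struct vfsops_sketch noexp_vfsops =
    { std_fhtovp, std_checkexp, std_sync };

/* Caller side: translate the handle first, then check the export. */
static int fh_to_exported_vnode(const struct vfsops_sketch *ops,
                                struct mount_sketch *mp,
                                struct fid_sketch *fhp,
                                struct vnode_sketch **vpp)
{
    int error = ops->fhtovp(mp, fhp, vpp);
    if (error != 0)
        return error;
    return ops->checkexp(*vpp);
}
```

In this sketch the export check receives the vnode produced by the lookup, which is one way a filesystem could later restrict exports per file rather than per mount point.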
thanks for your time,
-Alfred Perlstein - [bright@rush.net|alfred@freebsd.org]
Wintelcom systems administrator and programmer
                   - http://www.wintelcom.net/ [bright@wintelcom.net]


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Wed Aug 25 10:34:38 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 8176515351; Wed, 25 Aug 1999 10:34:30 -0700 (PDT) (envelope-from bright@wintelcom.net)
Received: from localhost (bright@localhost) by fw.wintelcom.net (8.8.8/8.8.8) with ESMTP id KAA16565; Wed, 25 Aug 1999 10:49:12 GMT (envelope-from bright@wintelcom.net)
Date: Wed, 25 Aug 1999 10:49:12 +0000 (GMT)
From: Alfred Perlstein
To: hackers@freebsd.org
Cc: fs@freebsd.org
Subject: second round, vfs and fh* calls.
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

I've done a bit more work on the VFS cleanup part of my diffs.
Unfortunately, I had overlooked some of the filesystems (ext2fs), and
they were not compiling cleanly.

http://big.endian.org/~bright/freebsd/in_progress/vfs-fhsyscall.diff

The fh*() syscalls are still being worked on, but I'd really like
to get this into the tree so my patches don't get stale.
thanks,
-Alfred Perlstein - [bright@rush.net|alfred@freebsd.org]
Wintelcom systems administrator and programmer
                   - http://www.wintelcom.net/ [bright@wintelcom.net]


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Thu Aug 26 8:13: 5 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 47FEC14D83; Thu, 26 Aug 1999 08:13:00 -0700 (PDT) (envelope-from bright@wintelcom.net)
Received: from localhost (bright@localhost) by fw.wintelcom.net (8.8.8/8.8.8) with ESMTP id IAA18732; Thu, 26 Aug 1999 08:28:29 GMT (envelope-from bright@wintelcom.net)
Date: Thu, 26 Aug 1999 08:28:29 +0000 (GMT)
From: Alfred Perlstein
To: hackers@freebsd.org
Cc: fs@freebsd.org, Michael Hancock, David Greenman
Subject: HEADS UP Reviewers. VFS changes to be committed.
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

I've posted 2 times asking for someone to review these diffs:

http://big.endian.org/~bright/freebsd/in_progress/vfs-fhsyscall.diff

Am I to take it that silence is acceptance?  I'll be committing this
to -current tonight or tomorrow unless I get feedback.

See attached email for details.

thank you,
-Alfred Perlstein - [bright@rush.net|alfred@freebsd.org]
Wintelcom systems administrator and programmer
                   - http://www.wintelcom.net/ [bright@wintelcom.net]

---------- Forwarded message ----------
Date: Tue, 24 Aug 1999 17:24:42 +0000 (GMT)
From: Alfred Perlstein
To: fs@freebsd.org
Subject: VFS cleanup and fh*() syscalls almost complete.

I have tested this code lightly, it compiles cleanly and the NFS
server code still seems to work.
http://big.endian.org/~bright/freebsd/in_progress/vfs-fhsyscall.diff Several things have been done, mostly inspired by NetBSD: 1) VFS_FHTOVP has been split into 2 VFS ops: VFS_FHTOVP now only takes a mountpoint, filehandle and **vnode VFS_CHECKEXP now takes the export checking arguments that VFS_FHTOVP used to take. 2) The casting of VFS ops to eopnotsupp() has been removed and vfs_nop*() functions have been put into kern/vfs_default.c This makes it more clear that certain VFS-ops are giving default behavior, either returning automatic success or returning EOPNOTSUPP. Someone mentioned that EOPNOTSUPP should be replaced by EINVAL, can that person please speak up? Also, some filesystems that actually implemented VFS-ops that did nothing or returned EOPNOTSUPP have had those functions deleted except when they are essential to understanding the VFS, or if it looked like they may eventually be filled in. 3) fh(open|stat|statfs) have been haphazardly implemented, meaning I won't be able to test them until tonight. Testers? Critics? Comments? please? If you're wondering why/what I'm doing, it's the kernel support for a lockd that I'm working on, as well as a cleanup I thought would make it easier to understand our filesystem code. I'm sure some people will be wondering: Why does VFS_CHECKEXP take a vnode and not a mount point? Hopefully in the future a filesystem will be able to more restrictively export its files, this will help facilitate that. Lastly, I'm jumping the gun here and posting a lightly tested patch because I've had to re-merge this stuff 3 times already and would like comments on not only functionality, but style as well. This way I can either commit it, (after more testing) or scrap the whole project. What I have to do next is make the fh*() syscalls solid and investigate the veto locking ideas that have been brought up in the past. 
thanks for you time, -Alfred Perlstein - [bright@rush.net|alfred@freebsd.org] Wintelcom systems administrator and programmer - http://www.wintelcom.net/ [bright@wintelcom.net] To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Aug 26 8:26:42 1999 Delivered-To: freebsd-fs@freebsd.org Received: from axl.noc.iafrica.com (axl.noc.iafrica.com [196.31.1.175]) by hub.freebsd.org (Postfix) with ESMTP id CEEEA15DD9; Thu, 26 Aug 1999 08:26:24 -0700 (PDT) (envelope-from sheldonh@axl.noc.iafrica.com) Received: from sheldonh (helo=axl.noc.iafrica.com) by axl.noc.iafrica.com with local-esmtp (Exim 3.02 #1) id 11K1Pc-0001ng-00; Thu, 26 Aug 1999 17:26:00 +0200 From: Sheldon Hearn To: Alfred Perlstein Cc: hackers@freebsd.org, fs@freebsd.org, Michael Hancock , David Greenman Subject: Re: HEADS UP Reviewers. VFS changes to be committed. In-reply-to: Your message of "Thu, 26 Aug 1999 08:28:29 GMT." Date: Thu, 26 Aug 1999 17:26:00 +0200 Message-ID: <6923.935681160@axl.noc.iafrica.com> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Thu, 26 Aug 1999 08:28:29 GMT, Alfred Perlstein wrote: > Am I to take it that silence is accpetance? I'll be committing this > to -current tonight or tomorrow unless I get feedback. Recent discussions with bde and eivind indicate that at least some of the code you're about to touch has one or more maintainers. Kirk McKusick is probably one of them. Make sure you contact the maintainers directly before smacking their code. Also, I'd suggest that it's a bad idea to say "if I get no feedback before tonight, I'm committing". I think this applies even if it's not the first time you've asked for review. Basically, timezones and stuff make for a situation where such an e-mail is useless for many of your readers. Ciao, Sheldon. 
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Thu Aug 26 9:27:55 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 1629615D46 for ; Thu, 26 Aug 1999 09:27:38 -0700 (PDT) (envelope-from bright@wintelcom.net)
Received: from localhost (bright@localhost) by fw.wintelcom.net (8.8.8/8.8.8) with ESMTP id CAA20719; Thu, 26 Aug 1999 02:43:21 -0700 (PDT) (envelope-from bright@wintelcom.net)
Date: Thu, 26 Aug 1999 09:43:21 +0000 (GMT)
From: Alfred Perlstein
To: Sheldon Hearn
Cc: fs@FreeBSD.ORG
Subject: Re: HEADS UP Reviewers. VFS changes to be committed.
In-Reply-To: <6923.935681160@axl.noc.iafrica.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Thu, 26 Aug 1999, Sheldon Hearn wrote:

>
>
> On Thu, 26 Aug 1999 08:28:29 GMT, Alfred Perlstein wrote:
>
> > Am I to take it that silence is acceptance?  I'll be committing this
> > to -current tonight or tomorrow unless I get feedback.
>
> Recent discussions with bde and eivind indicate that at least some of
> the code you're about to touch has one or more maintainers. Kirk
> McKusick is probably one of them.

I will attempt to contact him.  Afaik he is on vacation and won't be
available for quite some time.  David is the principal architect, and
Mike Smith referred me to Michael Hancock.  I should probably try phk
too, as it seems he's been messing around with vfs lately as well.

> Make sure you contact the maintainers directly before smacking their
> code.

I have done so in the past (uthread) and will continue to respect the
maintainers.

> Also, I'd suggest that it's a bad idea to say "if I get no feedback
> before tonight, I'm committing". I think this applies even if it's not
> the first time you've asked for review.
Basically, timezones and stuff
> make for a situation where such an e-mail is useless for many of your
> readers.

I agree with you on this, I just wanted to get the ball rolling on these
changes.  I've seen that in the past silence often means acceptance;
however, I wanted to make sure, hence the email.  I especially don't
want to re-merge my code, I'm sure you know how frustrating that gets
after the 3rd or 4th time. :)

Just fyi, the changes are just a cleanup imo with the addition of 3 new
syscalls where suser(p) == 0 must be true.

-Alfred

>
> Ciao,
> Sheldon.
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message
>


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Thu Aug 26 9:44: 8 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from jade.chc-chimes.com (jade.chc-chimes.com [216.28.46.6]) by hub.freebsd.org (Postfix) with ESMTP id D03BB14A2E; Thu, 26 Aug 1999 09:44:04 -0700 (PDT) (envelope-from billf@jade.chc-chimes.com)
Received: by jade.chc-chimes.com (Postfix, from userid 1001) id 871FD1C2E; Thu, 26 Aug 1999 11:45:47 -0400 (EDT)
Received: from localhost (localhost [127.0.0.1]) by jade.chc-chimes.com (Postfix) with ESMTP id 8293C3815; Thu, 26 Aug 1999 11:45:47 -0400 (EDT)
Date: Thu, 26 Aug 1999 11:45:47 -0400 (EDT)
From: Bill Fumerola
To: Sheldon Hearn
Cc: Alfred Perlstein, hackers@freebsd.org, fs@freebsd.org, Michael Hancock, David Greenman
Subject: Re: HEADS UP Reviewers. VFS changes to be committed.
In-Reply-To: <6923.935681160@axl.noc.iafrica.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Thu, 26 Aug 1999, Sheldon Hearn wrote:

> Also, I'd suggest that it's a bad idea to say "if I get no feedback
> before tonight, I'm committing". I think this applies even if it's not
> the first time you've asked for review.
Basically, timezones and stuff
> make for a situation where such an e-mail is useless for many of your
> readers.

This would be post #3 of the same code and changes that no-one has
responded to.

--
- bill fumerola - billf@chc-chimes.com - BF1560 - computer horizons corp -
- ph:(800) 252-2421 - bfumerol@computerhorizons.com - billf@FreeBSD.org  -


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Thu Aug 26 9:44:42 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from arc.hq.cti.ru (arc.hq.cti.ru [195.34.40.3]) by hub.freebsd.org (Postfix) with ESMTP id A973614DAE; Thu, 26 Aug 1999 09:44:36 -0700 (PDT) (envelope-from tejblum@arc.hq.cti.ru)
Received: from arc.hq.cti.ru (localhost [127.0.0.1]) by arc.hq.cti.ru (8.9.3/8.9.3) with ESMTP id UAA07357; Thu, 26 Aug 1999 20:42:17 +0400 (MSD) (envelope-from tejblum@arc.hq.cti.ru)
Message-Id: <199908261642.UAA07357@arc.hq.cti.ru>
X-Mailer: exmh version 2.0zeta 7/24/97
To: Alfred Perlstein
Cc: hackers@FreeBSD.ORG, fs@FreeBSD.ORG
Subject: Re: HEADS UP Reviewers. VFS changes to be committed.
In-reply-to: Your message of "Thu, 26 Aug 1999 08:28:29 -0000."
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 26 Aug 1999 20:42:16 +0400
From: Dmitrij Tejblum
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Just a few comments...

> 2) The casting of VFS ops to eopnotsupp() has been removed and
>    vfs_nop*() functions have been put into kern/vfs_default.c
>
>    This makes it more clear that certain VFS-ops are giving default
>    behavior, either returning automatic success or returning EOPNOTSUPP.

I like the idea. (However, I think that the functions returning failure
should not be called NOPs.)

> Why does VFS_CHECKEXP take a vnode and not a mount point?
> Hopefully in the future a filesystem will be able to more
> restrictively export its files, this will help facilitate that.
IMO, if it take a vnode, it should be VOP_CHECKEXP, not VFS_CHECKEXP. Dima To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Aug 26 9:46:10 1999 Delivered-To: freebsd-fs@freebsd.org Received: from axl.noc.iafrica.com (axl.noc.iafrica.com [196.31.1.175]) by hub.freebsd.org (Postfix) with ESMTP id DB53B15D95; Thu, 26 Aug 1999 09:46:01 -0700 (PDT) (envelope-from sheldonh@axl.noc.iafrica.com) Received: from sheldonh (helo=axl.noc.iafrica.com) by axl.noc.iafrica.com with local-esmtp (Exim 3.02 #1) id 11K2eD-000CbI-00; Thu, 26 Aug 1999 18:45:09 +0200 From: Sheldon Hearn To: Bill Fumerola Cc: Alfred Perlstein , hackers@freebsd.org, fs@freebsd.org, Michael Hancock , David Greenman Subject: Re: HEADS UP Reviewers. VFS changes to be committed. In-reply-to: Your message of "Thu, 26 Aug 1999 11:45:47 -0400." Date: Thu, 26 Aug 1999 18:45:09 +0200 Message-ID: <48439.935685909@axl.noc.iafrica.com> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Thu, 26 Aug 1999 11:45:47 -0400, Bill Fumerola wrote: > This would be post #3 of the same code and changes that no-one has > reponded to. I hear you, and I was aware of that when I made my comments. Basically, it's a waste of time saying such a thing, so either be prepared to wait longer, or don't say it. :-) Feelings, nothing more than feelings... Ciao, Sheldon. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Aug 26 10:30:29 1999 Delivered-To: freebsd-fs@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2]) by hub.freebsd.org (Postfix) with ESMTP id D312B14DAE; Thu, 26 Aug 1999 10:30:23 -0700 (PDT) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.9.3/8.9.1) id KAA23308; Thu, 26 Aug 1999 10:27:47 -0700 (PDT) (envelope-from dillon) Date: Thu, 26 Aug 1999 10:27:47 -0700 (PDT) From: Matthew Dillon Message-Id: <199908261727.KAA23308@apollo.backplane.com> To: Alfred Perlstein Cc: hackers@FreeBSD.ORG, fs@FreeBSD.ORG, Michael Hancock , David Greenman Subject: Re: HEADS UP Reviewers. VFS changes to be committed. References: Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org :I've posted 2 times asking for someone to review these diffs: : :http://big.endian.org/~bright/freebsd/in_progress/vfs-fhsyscall.diff : :Am I to take it that silence is accpetance? I'll be committing this :to -current tonight or tomorrow unless I get feedback. : :See attached email for details. : :thank you, :-Alfred Perlstein - [bright@rush.net|alfred@freebsd.org] :Wintelcom systems administrator and programmer : - http://www.wintelcom.net/ [bright@wintelcom.net] : :3) fh(open|stat|statfs) have been haphazardly implemented, meaning I : won't be able to test them until tonight. : :Testers? Critics? Comments? please? : :If you're wondering why/what I'm doing, it's the kernel support :for a lockd that I'm working on, as well as a cleanup I thought :would make it easier to understand our filesystem code. : :I'm sure some people will be wondering: :Why does VFS_CHECKEXP take a vnode and not a mount point? :.. :-Alfred Perlstein - [bright@rush.net|alfred@freebsd.org] I've done a quick once-over of your patch. 
From the point of view of the work I'm doing and the work Kirk will be
doing later on, I do not think the patch will cause any problems since
you are adding new VOPs for the most part rather than modifying (too
many) existing VOPs.  Most of the work that Kirk and I will be doing
will concentrate on namei, locking, and I/O, which you mostly avoid in
your patch.

In general I like the idea of implementing reasonable defaults.

I would ask two things though:

    * First, please add comprehensive /* */ comments in front of each
      vfsnop_*() procedure explaining what it does, why it returns what
      it returns, locking requirements (if any) on entry, and side effects
      on return.  This is just for readability.

      Do the same for all the procedures you are adding, in fact.

    * I think you can safely commit any elements that are not used by
      existing builds since they are not likely to impact existing
      builds operationally.

      Then see what you have left over.  If it is not significant, commit
      that too.  If it is significant, do more comprehensive testing on
      what you have left over (i.e. that impacts existing builds) and
      ask for another review after testing, before committing it.

A working lock daemon would be totally awesome!  The fh*() routines
you are adding are roughly what you (and we) need to make progress in
this area!
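For illustration only, a header comment of the kind asked for in the first point might look like the sketch below. The function names and the `mount_sketch`, `fid_sketch` and `vnode_sketch` types are hypothetical; nothing here comes from the patch under review, and the locking notes describe this toy code rather than the real kernel.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical opaque stand-ins for the kernel structures. */
struct mount_sketch;
struct fid_sketch;
struct vnode_sketch;

/*
 * vfs_nop_sync_sketch -- default "sync" for a filesystem with no dirty state.
 *
 * What it does:  nothing; there is nothing to write back.
 * Returns:       0 always, so a generic pass over all mounts does not
 *                treat this filesystem as having failed.
 * Locking:       no locks are required on entry; none are taken or released.
 * Side effects:  none on return.
 */
int vfs_nop_sync_sketch(struct mount_sketch *mp)
{
    (void)mp;
    return 0;
}

/*
 * vfs_notsupp_fhtovp_sketch -- default file-handle lookup for a filesystem
 * that cannot translate handles into vnodes.
 *
 * What it does:  refuses the request without looking at the handle.
 * Returns:       EOPNOTSUPP always, so callers can tell "this filesystem
 *                has no such operation" apart from a stale handle.
 * Locking:       no locks are required on entry; none are taken or released.
 * Side effects:  none on return; *vpp is left exactly as the caller set it.
 */
int vfs_notsupp_fhtovp_sketch(struct mount_sketch *mp, struct fid_sketch *fhp,
                              struct vnode_sketch **vpp)
{
    (void)mp;
    (void)fhp;
    (void)vpp;
    return EOPNOTSUPP;
}
```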
					-Matt
					Matthew Dillon


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Fri Aug 27 13:21:57 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20]) by hub.freebsd.org (Postfix) with ESMTP id 836D81557C; Fri, 27 Aug 1999 13:21:35 -0700 (PDT) (envelope-from ezk@shekel.mcl.cs.columbia.edu)
Received: from shekel.mcl.cs.columbia.edu (shekel.mcl.cs.columbia.edu [128.59.18.15]) by cs.columbia.edu (8.9.1/8.9.1) with ESMTP id QAA02879; Fri, 27 Aug 1999 16:18:47 -0400 (EDT)
Received: (from ezk@localhost) by shekel.mcl.cs.columbia.edu (8.9.1/8.9.1) id QAA22373; Fri, 27 Aug 1999 16:18:45 -0400 (EDT)
Date: Fri, 27 Aug 1999 16:18:45 -0400 (EDT)
Message-Id: <199908272018.QAA22373@shekel.mcl.cs.columbia.edu>
X-Authentication-Warning: shekel.mcl.cs.columbia.edu: ezk set sender to ezk@shekel.mcl.cs.columbia.edu using -f
From: Erez Zadok
To: Matthew Dillon
Cc: Alfred Perlstein, hackers@FreeBSD.ORG, fs@FreeBSD.ORG, Michael Hancock, David Greenman
Subject: Re: HEADS UP Reviewers. VFS changes to be committed.
In-reply-to: Your message of "Thu, 26 Aug 1999 10:27:47 PDT." <199908261727.KAA23308@apollo.backplane.com>
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

In message <199908261727.KAA23308@apollo.backplane.com>, Matthew Dillon writes:
[...]
>     I would ask two things though:
>
>     * First, please add comprehensive /* */ comments in front of each
>       vfsnop_*() procedure explaining what it does, why it returns what
>       it returns, locking requirements (if any) on entry, and side effects
>       on return.  This is just for readability.
>
>       Do the same for all the procedures you are adding, in fact.

Moreover, I would strongly recommend explicitly documenting the following:

- which function args are in-args and which are out-args?

- does the function take any allocated memory that it is expected to free?
- is the function expected to allocate any memory objects that have to be freed elsewhere? - does the function increase or decrease any reference counts of any objects? Is it expected to? These and other requirements are essentially the "interface" between the VFS and lower-level file systems. Figuring out this stuff on every OS and OS revision (esp. when the VFS changes so often---linux) was the longest most frustrating task I faced when writing my Wrapfs stackable f/s module for solaris, freebsd, and linux. I wish documentation had been in place. > * I think you can safely commit any elements that are not used by > existing builds since they are not likely to impact existing > builds operationally. > > Then see what you have left over. If it is not significant, commit > that to. If it is significant, do more comprehensive testing on > what you have left over (i.e. that impacts existing builds) and > ask for another review after testing, before committing it. Erez. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Fri Aug 27 13:35: 0 1999 Delivered-To: freebsd-fs@freebsd.org Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.40.131]) by hub.freebsd.org (Postfix) with ESMTP id 7DAB816064; Fri, 27 Aug 1999 13:34:48 -0700 (PDT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.9.3/8.9.2) with ESMTP id WAA06795; Fri, 27 Aug 1999 22:32:27 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: Erez Zadok Cc: Matthew Dillon , Alfred Perlstein , hackers@FreeBSD.ORG, fs@FreeBSD.ORG, Michael Hancock , David Greenman Subject: Re: HEADS UP Reviewers. VFS changes to be committed. In-reply-to: Your message of "Fri, 27 Aug 1999 16:18:45 EDT." 
<199908272018.QAA22373@shekel.mcl.cs.columbia.edu> Date: Fri, 27 Aug 1999 22:32:27 +0200 Message-ID: <6793.935785947@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Uhm, have any of you actually ever looked at src/sys/kern/vnode_if.src ? Poul-Henning In message <199908272018.QAA22373@shekel.mcl.cs.columbia.edu>, Erez Zadok write s: >In message <199908261727.KAA23308@apollo.backplane.com>, Matthew Dillon writes: >[...] >> I would ask two things though: >> >> * First, please add comprehensive /* */ comments in front of each >> vfsnop_*() procedure explaining what it does, why it returns what >> it returns, locking requirements (if any) on entry, and side effects >> on return. This is just for readability. >> >> Do the same for all the procedures you are adding, in fact. > >Moreover, I would strongly recommend xplicitly documenting the following: > >- which function args are in-args and which are out-args? > >- does the function takes any allocated memory that it is expected to free? > >- is the function expected to allocate any memory objects that have to be > freed elsewhere? > >- does the function increase or decrease any reference counts of any objects? > Is it expected to? > >These and other requirements are essentially the "interface" between the VFS >and lower-level file systems. Figuring out this stuff on every OS and OS >revision (esp. when the VFS changes so often---linux) was the longest most >frustrating task I faced when writing my Wrapfs stackable f/s module for >solaris, freebsd, and linux. I wish documentation had been in place. > >> * I think you can safely commit any elements that are not used by >> existing builds since they are not likely to impact existing >> builds operationally. >> >> Then see what you have left over. If it is not significant, commit >> that to. If it is significant, do more comprehensive testing on >> what you have left over (i.e. 
that impacts existing builds) and
>>      ask for another review after testing, before committing it.
>
>Erez.
>
>
>To Unsubscribe: send mail to majordomo@FreeBSD.org
>with "unsubscribe freebsd-fs" in the body of the message
>

--
Poul-Henning Kamp             FreeBSD coreteam member
phk@FreeBSD.ORG               "Real hackers run -current on their laptop."
FreeBSD -- It will take a long time before progress goes too far!


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Sat Aug 28 11:20:43 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from mail.du.gtn.com (mail.du.gtn.com [194.77.9.57]) by hub.freebsd.org (Postfix) with ESMTP id 1867F14EE0 for ; Sat, 28 Aug 1999 11:20:39 -0700 (PDT) (envelope-from ticso@cicely8.cicely.de)
Received: from mail.cicely.de (cicely.de [194.231.9.142]) by mail.du.gtn.com (8.9.3/8.9.3) with ESMTP id UAA20751 for ; Sat, 28 Aug 1999 20:19:30 +0200 (MET DST)
Received: from cicely8.cicely.de (cicely8.cicely.de [10.1.2.10]) by mail.cicely.de (8.9.0/8.9.0) with ESMTP id UAA00825 for ; Sat, 28 Aug 1999 20:16:03 +0200 (CEST)
Received: (from ticso@localhost) by cicely8.cicely.de (8.9.3/8.9.2) id UAA27736 for freebsd-fs@freebsd.org; Sat, 28 Aug 1999 20:17:34 +0200 (CEST) (envelope-from ticso)
Date: Sat, 28 Aug 1999 20:17:23 +0200
From: Bernd Walter
To: freebsd-fs@freebsd.org
Subject: fs-locking and fs memory copies questions
Message-ID: <19990828201723.A27704@cicely8.cicely.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.95.3i
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

I would like to grow a ufs filesystem without the need to umount it before.
The growing itself will be possible for me in the near future without any
problems.

Before doing it while the fs is mounted I need to flush all blocks back to
the partition and lock the fs until the fs-structures are all updated.
All programs should block during the lock.
Is it possible to flush reliably with softupdates?

After the structures are up to date I need to reread the superblock and
all in memory copies of the previously last cg and all derived variables.

In some cases it is needed that I reallocate some frags/blocks.
That means that cg 0 has changed and maybe some other cgs I used
to move the frags to. That also means that at least one inode
or indirection block has been modified and of course that in
memory references to diskblocks of indirection and/or
datablocks need to be updated.

Finally the locks need to be removed and all waiting operations can continue.

I would be happy if someone can explain or point me to good documentation
about internal operations regarding the points I need.

--
B.Walter                  COSMO-Project              http://www.cosmo-project.de
ticso@cicely.de             Usergroup                info@cosmo-project.de


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Sat Aug 28 15:50:16 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from smtp03.primenet.com (smtp03.primenet.com [206.165.6.133]) by hub.freebsd.org (Postfix) with ESMTP id 5A57E14C3C for ; Sat, 28 Aug 1999 15:50:08 -0700 (PDT) (envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost) by smtp03.primenet.com (8.9.3/8.9.3) id PAA10531; Sat, 28 Aug 1999 15:49:48 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201) via SMTP by smtp03.primenet.com, id smtpdAAADEaaJu; Sat Aug 28 15:49:39 1999
Received: (from tlambert@localhost) by usr01.primenet.com (8.8.5/8.8.5) id PAA06168; Sat, 28 Aug 1999 15:49:51 -0700 (MST)
From: Terry Lambert
Message-Id: <199908282249.PAA06168@usr01.primenet.com>
Subject: Re: fs-locking and fs memory copies questions
To: ticso@cicely.de (Bernd Walter)
Date: Sat, 28 Aug 1999 22:49:51 +0000 (GMT)
Cc: freebsd-fs@FreeBSD.ORG
In-Reply-To: <19990828201723.A27704@cicely8.cicely.de> from "Bernd Walter" at Aug 28, 99 08:17:23 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> I would like to grow a ufs filesystem without the need to umount it first.
> The growing itself will be possible for me in the near future without any
> problems.
>
> Before doing it while the fs is mounted I need to flush all blocks back to
> the partition and lock the fs until the fs-structures are all updated.
> All programs should block during the lock.
> Is it possible to flush reliably with softupdates?

Running "mount -u" on the FS in question does this.

> After the structures are up to date I need to reread the superblock and
> all in memory copies of the previously last cg and all derived variables.

You will need to modify code to do this. The "mount -u" code doesn't
cause the in-core data to be re-read, and doesn't sync out data to a
"clean" state. The code you need to modify is in ffs_mount in
ffs_vfsops.c.

> In some cases I need to reallocate some frags/blocks.

This is the problem with growing FS's: the hash fill on the
preexisting cylinder groups will be higher than on the new
cylinder groups, leading to fragmentation.

There are two ways around this, one good, one bad. I expect you'll
want to use the "bad" way:

Good: Defrag the drive by either writing and then running a
      defragger, or by backing up and restoring files. You could
      copy the files in place, if you had enough disk, and used a
      holey copy program (e.g. GNU tar).

Bad:  Give preference to the new CG's for all allocations; the
      problem with this is, you never know when to stop doing
      this. 8-(.

> That means that cg 0 has changed, and maybe some other cgs I used
> to move the frags to. That also means that at least one inode
> or indirection block has been modified, and of course that in
> memory references to diskblocks of indirection and/or
> datablocks need to be updated.
> Finally the locks need to be removed so that all waiting operations can
> continue.

I think you will find an "on the fly" defragger a difficult thing to
write.

> I would be happy if someone could explain, or point me to good
> documentation about, the internal operations regarding the points I need.

The above will get you started.

A generic defragger would be a good thing to have, if you wanted
to allow shrinking partitions, too.

Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Sat Aug 28 16:40:27 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from mail.du.gtn.com (mail.du.gtn.com [194.77.9.57]) by hub.freebsd.org (Postfix) with ESMTP id 033B214D8C for ; Sat, 28 Aug 1999 16:40:24 -0700 (PDT) (envelope-from ticso@cicely8.cicely.de)
Received: from mail.cicely.de (cicely.de [194.231.9.142]) by mail.du.gtn.com (8.9.3/8.9.3) with ESMTP id BAA04443; Sun, 29 Aug 1999 01:37:21 +0200 (MET DST)
Received: from cicely8.cicely.de (cicely8.cicely.de [10.1.2.10]) by mail.cicely.de (8.9.0/8.9.0) with ESMTP id BAA01607; Sun, 29 Aug 1999 01:37:18 +0200 (CEST)
Received: (from ticso@localhost) by cicely8.cicely.de (8.9.3/8.9.2) id BAA28142; Sun, 29 Aug 1999 01:36:57 +0200 (CEST) (envelope-from ticso)
Date: Sun, 29 Aug 1999 01:36:56 +0200
From: Bernd Walter
To: Terry Lambert
Cc: Bernd Walter , freebsd-fs@FreeBSD.ORG
Subject: Re: fs-locking and fs memory copies questions
Message-ID: <19990829013655.E27811@cicely8.cicely.de>
References: <19990828201723.A27704@cicely8.cicely.de> <199908282249.PAA06168@usr01.primenet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.95.3i
In-Reply-To: <199908282249.PAA06168@usr01.primenet.com>; from Terry Lambert on Sat, Aug 28, 1999 at 10:49:51PM +0000
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sat, Aug 28, 1999 at 10:49:51PM +0000, Terry Lambert wrote:
> This is the problem with growing FS's: the hash fill on the
> preexisting cylinder groups will be higher than on the new
> cylinder groups, leading to fragmentation.

Mmmmm - I can't follow you on this.
The existing cg's are prefilled with old files.
The new ones are empty after growing.
I believed ffs would prefer the new ones automatically because of the
superblock's summary information.
Guess I need to look more deeply at the block-searching routines.
At least the problems this creates should lessen during use.

>
> A generic defragger would be a good thing to have, if you wanted
> to allow shrinking partitions, too.
>
I already need to move blocks around in case the superblock summary
information needs another frag.
It was one of the more difficult things to do, and there are still some
errors left in it - the rest was quite easy.
Shrinking is something I don't believe I can get working properly,
because that would mean losing cg's with all their inodes.
Moving files to different inodes is generally a mess for NFS servers.
Another point is that finding the reference for a frag is a really
expensive thing to do :(

--
B.Walter COSMO-Project http://www.cosmo-project.de
ticso@cicely.de Usergroup info@cosmo-project.de

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Sat Aug 28 20: 9:25 1999
Delivered-To: freebsd-fs@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20]) by hub.freebsd.org (Postfix) with ESMTP id 55C4914F7A; Sat, 28 Aug 1999 20:09:21 -0700 (PDT) (envelope-from bright@wintelcom.net)
Received: from localhost (bright@localhost) by fw.wintelcom.net (8.8.8/8.8.8) with ESMTP id NAA17700; Sat, 28 Aug 1999 13:25:34 -0700 (PDT) (envelope-from bright@wintelcom.net)
Date: Sat, 28 Aug 1999 20:25:34 +0000 (GMT)
From: Alfred Perlstein
To: Poul-Henning Kamp
Cc: Erez Zadok , Matthew Dillon , hackers@FreeBSD.ORG, fs@FreeBSD.ORG, Michael Hancock , David Greenman
Subject: Re: HEADS UP Reviewers. VFS changes to be committed.
In-Reply-To: <6793.935785947@critter.freebsd.dk>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Fri, 27 Aug 1999, Poul-Henning Kamp wrote:

>
> Uhm, have any of you actually ever looked at src/sys/kern/vnode_if.src ?

I can't really tell if you are commenting on the diffs I provided
or if you are commenting on the comments I have received, or both.

Either way, could you elaborate a bit? I was hoping for your input
on this issue.

thank you,
-Alfred Perlstein - [bright@rush.net|alfred@freebsd.org]
Wintelcom systems administrator and programmer
 - http://www.wintelcom.net/ [bright@wintelcom.net]

>
> Poul-Henning
>
> In message <199908272018.QAA22373@shekel.mcl.cs.columbia.edu>, Erez Zadok writes:
> >In message <199908261727.KAA23308@apollo.backplane.com>, Matthew Dillon writes:
> >[...]
> >> I would ask two things though:
> >>
> >> * First, please add comprehensive /* */ comments in front of each
> >>   vfsnop_*() procedure explaining what it does, why it returns what
> >>   it returns, locking requirements (if any) on entry, and side effects
> >>   on return. This is just for readability.
> >>
> >>   Do the same for all the procedures you are adding, in fact.
> >
> >Moreover, I would strongly recommend explicitly documenting the following:
> >
> >- which function args are in-args and which are out-args?
> >
> >- does the function take any allocated memory that it is expected to free?
> >
> >- is the function expected to allocate any memory objects that have to be
> >  freed elsewhere?
> >
> >- does the function increase or decrease any reference counts of any objects?
> >  Is it expected to?
> >
> >These and other requirements are essentially the "interface" between the VFS
> >and lower-level file systems. Figuring out this stuff on every OS and OS
> >revision (esp. when the VFS changes so often---linux) was the longest most
> >frustrating task I faced when writing my Wrapfs stackable f/s module for
> >solaris, freebsd, and linux. I wish documentation had been in place.
> >
> >> * I think you can safely commit any elements that are not used by
> >>   existing builds since they are not likely to impact existing
> >>   builds operationally.
> >>
> >>   Then see what you have left over. If it is not significant, commit
> >>   that too. If it is significant, do more comprehensive testing on
> >>   what you have left over (i.e. that impacts existing builds) and
> >>   ask for another review after testing, before committing it.
> >

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message
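
The documentation checklist asked for in the quoted review above (what the
procedure does, what it returns, in-args versus out-args, who frees what,
reference counts, locking, side effects) could be captured in a header
comment along the following lines. This is only a sketch in plain userland
C: struct demo_node, demo_lookup() and demo_release() are hypothetical names
made up for illustration. They are not part of the FreeBSD VFS, and they are
not taken from the diffs under review.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted kernel object. */
struct demo_node {
	int	refcount;
	char	name[32];
};

/*
 * demo_lookup -- hypothetical example, NOT a real VFS operation.
 *
 * What it does:  creates a node describing `name'.
 * Returns:       0 on success; -1 if an argument is NULL or if the
 *                allocation fails.
 * In-args:       name  (borrowed; neither freed nor retained here).
 * Out-args:      *nodep receives the new node on success and is left
 *                untouched on failure.
 * Memory:        allocates *nodep; the caller must give it back with
 *                demo_release().  Frees nothing passed in.
 * Ref counts:    the returned node carries one reference, owned by
 *                the caller.
 * Locking:       none required on entry; no locks held on return.
 * Side effects:  none beyond the allocation.
 */
int
demo_lookup(const char *name, struct demo_node **nodep)
{
	struct demo_node *node;

	if (name == NULL || nodep == NULL)
		return (-1);
	node = malloc(sizeof(*node));
	if (node == NULL)
		return (-1);
	node->refcount = 1;
	/* snprintf truncates over-long names and always NUL-terminates. */
	snprintf(node->name, sizeof(node->name), "%s", name);
	*nodep = node;
	return (0);
}

/*
 * demo_release -- hypothetical example, NOT a real VFS operation.
 *
 * What it does:  drops one reference on `node' and frees the node when
 *                the last reference goes away.
 * In-args:       node (must be non-NULL with refcount > 0).
 * Ref counts:    decrements node->refcount by exactly one.
 * Memory:        frees `node' when the count reaches zero; the caller
 *                must not touch it afterwards.
 */
void
demo_release(struct demo_node *node)
{
	assert(node != NULL && node->refcount > 0);
	if (--node->refcount == 0)
		free(node);
}
```

A caller would pair the two: a successful demo_lookup("etc", &np) hands back
one reference, and demo_release(np) gives it up again. The comment, not the
code, is the point here; it answers each question on the checklist in a fixed
order so a stacking file system author does not have to rediscover the
contract by reading the implementation.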