From owner-freebsd-fs Mon Nov 11 04:59:29 1996 Return-Path: owner-fs Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id EAA12398 for fs-outgoing; Mon, 11 Nov 1996 04:59:29 -0800 (PST) Received: from bbs.mpcs.com (hgoldste@bbs.mpcs.com [204.215.226.2]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id EAA12381; Mon, 11 Nov 1996 04:59:23 -0800 (PST) Received: (from hgoldste@localhost) by bbs.mpcs.com (8.8.2/8.8.2/MPCS) id HAA31462; Mon, 11 Nov 1996 07:59:20 -0500 Date: Mon, 11 Nov 1996 07:59:20 -0500 From: Howard Goldstein Message-Id: <199611111259.HAA31462@bbs.mpcs.com> To: freebsd-isp@freebsd.org, freebsd-fs@freebsd.org Cc: dg@root.com, michaelv@MindBender.serv.net Subject: Re: Best mount options, tunefs for newsserver In-Reply-To: <199611110210.SAA01988@root.com> Reply-To: hgoldste@bbs.mpcs.com Sender: owner-fs@freebsd.org X-Loop: FreeBSD.org Precedence: bulk [for just one swipe going to fbsd-fs too] In article <199611110210.SAA01988@root.com>, dg@root.com wrote: : >I'll have to defer this to someone more knowledgable about the : >internals of FreeBSD... : : We didn't have async support in 2.1.0. We had partial support for it in : 2.1.5, and full support for it in 2.2/-current. In -current, it should take : only a few seconds to delete /usr/src. :-) (I haven't tried, however) The Ahh excellent I guess it's time to grab some code and start following freebsd-current. Thanks! : ultra-fast async in 2.2/-current also means that it has a much higher chance : of trashing your filesystem if the system should go down before the stuff : is written out...so it's a mixed bag. Well if we lose the newsspool, overviews, and history it's not such a big deal, just wait 6 days and it's all back again. In some ways it's a Good Thing as all the smut and spam are gone, albeit temporarily. Note our users would not agree with this. Of course 99% probably wouldn't notice if I dropped everything other than alt.binaries.pictures.erotica, but that's another story. Would be neat to have a new class of filesystem, call it "expfs" (expendable filesystem) for low-value high-volume stuff like news articles. Could have default async writes, noatime, preferred time optimization, anything else that would lend itself to this sort of thing. Regarding current, can one do a surgical strike (make on kernel only) install on it or is a make world needed as it would be for the NOATIME patch? -- Howard Goldstein From owner-freebsd-fs Mon Nov 11 06:25:47 1996 Return-Path: owner-fs Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id GAA16291 for fs-outgoing; Mon, 11 Nov 1996 06:25:47 -0800 (PST) Received: from brasil.moneng.mei.com (brasil.moneng.mei.com [151.186.109.160]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id GAA16285; Mon, 11 Nov 1996 06:25:45 -0800 (PST) Received: (from jgreco@localhost) by brasil.moneng.mei.com (8.7.Beta.1/8.7.Beta.1) id IAA19231; Mon, 11 Nov 1996 08:25:10 -0600 From: Joe Greco Message-Id: <199611111425.IAA19231@brasil.moneng.mei.com> Subject: Re: Best mount options, tunefs for newsserver To: hgoldste@bbs.mpcs.com Date: Mon, 11 Nov 1996 08:25:09 -0600 (CST) Cc: freebsd-isp@freebsd.org, freebsd-fs@freebsd.org, dg@root.com, michaelv@MindBender.serv.net In-Reply-To: <199611111259.HAA31462@bbs.mpcs.com> from "Howard Goldstein" at Nov 11, 96 07:59:20 am X-Mailer: ELM [version 2.4 PL24] Content-Type: text Sender: owner-fs@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > Would be neat to have a new class of filesystem, call it "expfs" > (expendable filesystem) for low-value high-volume stuff like news > articles. Could have default async writes, noatime, preferred time > optimization, anything else that would lend itself to this sort of thing. Oh, boatloads :-) One big win in general would be an extension to FFS to allow some sort of sorted or hashed directory, which would be useful for directories with lots of files (not just news spool, think about large mail spools, etc). News in particular could stand to have its very own type of filesystem, since it has lots of things that can be optimized for. Consider changing the news directory behavior on a news spool. You have two types of data, subdirectories and articles. In general isdigit(element[0]) is a good first guess at which type of directory entry it is, so now consider the following: A portion of the "directory" is reserved for subdirectories. Since it is not often updated with new entries, it is considered to be acceptable to maintain it as a sorted list, or some other method (hashing perhaps) to allow rapid binary tree style lookups. A portion of the "directory" is reserved for news articles. Usenet news articles have fascinating properties: they are written in numerically ascending order, and tend to be erased in the same fashion. Adding a new file need be only as complex as knowing the offset of the last entry written and checking against it to make sure that the new entry is higher (if not, one would need to shuffle the directory to "make it right"). Removal simply zeroes the entry; empty directory blocks can be freed. As a further twist, the "article" entries are stored in native integer format as opposed to ASCII "string" format, allowing comparisons to happen even more rapidly. Now we have mechanisms in place for extremely fast lookup operations and write operations. If only I had the hours in the day to learn how to write my own file systems... FFS is excellent as a general purpose file system, but it lacks the ability to take advantage of these types of things. ... JG From owner-freebsd-fs Mon Nov 11 09:29:59 1996 Return-Path: owner-fs Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id JAA27706 for fs-outgoing; Mon, 11 Nov 1996 09:29:59 -0800 (PST) Received: from databus.databus.com (databus.databus.com [198.186.154.34]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id JAA27689; Mon, 11 Nov 1996 09:29:54 -0800 (PST) From: Barney Wolff To: freebsd-isp@freebsd.org, freebsd-fs@freebsd.org Date: Mon, 11 Nov 1996 12:22 EST Subject: Re: Best mount options, tunefs for newsserver Content-Type: text/plain Message-ID: <3287628d0.4fed@databus.databus.com> Sender: owner-fs@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Did God say to Moses that each article must reside in its own file? This is an application problem, not a file system problem. Or at least it is if you ever want to move the solution to another OS. Barney Wolff From owner-freebsd-fs Mon Nov 11 10:54:47 1996 Return-Path: owner-fs Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id KAA02595 for fs-outgoing; Mon, 11 Nov 1996 10:54:47 -0800 (PST) Received: from ingenieria ([168.176.15.11]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id KAA02575; Mon, 11 Nov 1996 10:54:40 -0800 (PST) Received: from unalslip.usc.unal.edu.co by ingenieria (SMI-8.6/SMI-SVR4) id NAA00900; Mon, 11 Nov 1996 13:54:48 +0600 Message-ID: <32879F01.14@fps.biblos.unal.edu.co> Date: Mon, 11 Nov 1996 13:47:45 -0800 From: "Pedro Giffuni S." Reply-To: m230761@ingenieria.ingsala.unal.edu.co Organization: Universidad Nacional de Colombia X-Mailer: Mozilla 3.0 (Win16; I) MIME-Version: 1.0 To: FreeBSD-fs@FreeBSD.org CC: hackers@FreeBSD.org Subject: Info on SYSV fs available: filesystem expert required Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Sender: owner-fs@FreeBSD.org X-Loop: FreeBSD.org Precedence: bulk Hello: Paul Monday just sent me a postscript file with the information of how he implemented an SCO filesystem for Linux. He is going to send me the tarball also. It doesn´t seem difficult, but someone that knows the inner workings of FreeBSD filesystems can do a really great job. Where should I put this information? Pedro. From owner-freebsd-fs Mon Nov 11 11:05:01 1996 Return-Path: owner-fs Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id LAA03474 for fs-outgoing; Mon, 11 Nov 1996 11:05:01 -0800 (PST) Received: from brasil.moneng.mei.com (brasil.moneng.mei.com [151.186.109.160]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id LAA03459; Mon, 11 Nov 1996 11:04:57 -0800 (PST) Received: (from jgreco@localhost) by brasil.moneng.mei.com (8.7.Beta.1/8.7.Beta.1) id NAA19680; Mon, 11 Nov 1996 13:02:58 -0600 From: Joe Greco Message-Id: <199611111902.NAA19680@brasil.moneng.mei.com> Subject: Re: Best mount options, tunefs for newsserver To: barney@databus.com (Barney Wolff) Date: Mon, 11 Nov 1996 13:02:57 -0600 (CST) Cc: freebsd-isp@freebsd.org, freebsd-fs@freebsd.org In-Reply-To: <3287628d0.4fed@databus.databus.com> from "Barney Wolff" at Nov 11, 96 12:22:00 pm X-Mailer: ELM [version 2.4 PL24] Content-Type: text Sender: owner-fs@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > Did God say to Moses that each article must reside in its own file? > > This is an application problem, not a file system problem. Or at > least it is if you ever want to move the solution to another OS. God said to Moses that unless you want to write your own news server, each article must reside in its own file. Some of us are already "breaking" that paradigm. But practical production use is still a ways in the future. However, to a certain extent, an OS should be considered as something which provides an environment and tools with which to perform tasks. Debating whether God told Moses to use those tools, or if he told Moses to write a good database engine to handle it, is sort of irrelevant... There are some clear deficiencies in FFS. The lack of rapid directory lookups for large directories is one. The lack of a "bat outta hell" mode for data writes is another (since generally I could care less if I lose articles after a crash, maybe I don't even care too much about directories... but I do want it to come back without manual fsck intervention, even having lost some data)... That's just my opinion though... :-) ... JG From owner-freebsd-fs Mon Nov 11 13:40:53 1996 Return-Path: owner-fs Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id NAA11957 for fs-outgoing; Mon, 11 Nov 1996 13:40:53 -0800 (PST) Received: from news1.gtn.com (news1.gtn.com [192.109.159.3]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id NAA11921; Mon, 11 Nov 1996 13:40:37 -0800 (PST) Received: (from uucp@localhost) by news1.gtn.com (8.7.2/8.7.2) with UUCP id WAA02531; Mon, 11 Nov 1996 22:15:33 +0100 (MET) Received: from localhost (localhost [127.0.0.1]) by klemm.gtn.com (8.8.2/8.8.2) with ESMTP id VAA02283; Mon, 11 Nov 1996 21:30:15 +0100 (MET) Date: Mon, 11 Nov 1996 21:30:14 +0100 (MET) From: Andreas Klemm To: "Pedro Giffuni S." cc: FreeBSD-fs@FreeBSD.org, hackers@FreeBSD.org Subject: Re: Info on SYSV fs available: filesystem expert required In-Reply-To: <32879F01.14@fps.biblos.unal.edu.co> Message-ID: X-try-apsfilter: ftp://sunsite.unc.edu/pub/Linux/system/Printing/aps-491.tgz X-Fax: +49 2137 2018 X-Phone: +49 2137 2020 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT Sender: owner-fs@FreeBSD.org X-Loop: FreeBSD.org Precedence: bulk On Mon, 11 Nov 1996, Pedro Giffuni S. wrote: > Hello: > Paul Monday just sent me a postscript file with the information of how > he implemented an SCO filesystem for Linux. He is going to send me the > tarball also. > It doesn´t seem difficult, but someone that knows the inner workings of > FreeBSD filesystems can do a really great job. > Where should I put this information? What about a place in /usr/share/doc ... ? Especially for technical manuals, reports ?! andreas@klemm.gtn.com /\/\___ Wiechers & Partner Datentechnik GmbH Andreas Klemm ___/\/\/ Support Unix -- andreas.klemm@wup.de pgp p-key http://www-swiss.ai.mit.edu/~bal/pks-toplev.html >>> powered by <<< ftp://sunsite.unc.edu/pub/Linux/system/Printing/aps-491.tgz >>> FreeBSD <<< From owner-freebsd-fs Mon Nov 11 15:43:52 1996 Return-Path: owner-fs Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id PAA21489 for fs-outgoing; Mon, 11 Nov 1996 15:43:52 -0800 (PST) Received: from ingenieria ([168.176.15.11]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id PAA21484; Mon, 11 Nov 1996 15:43:44 -0800 (PST) Received: by ingenieria (SMI-8.6/SMI-SVR4) id SAA01317; Mon, 11 Nov 1996 18:43:35 +0600 Date: Mon, 11 Nov 1996 18:43:35 +0600 (GMT) From: Pedro Giffuni To: Andreas Klemm cc: FreeBSD-fs@FreeBSD.org, hackers@FreeBSD.org Subject: Re: Info on SYSV fs available: filesystem expert required In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Content-Transfer-Encoding: QUOTED-PRINTABLE Sender: owner-fs@FreeBSD.org X-Loop: FreeBSD.org Precedence: bulk On Mon, 11 Nov 1996, Andreas Klemm wrote: >=20 > What about a place in /usr/share/doc ... ? Especially for=20 > technical manuals, reports ?! For the time being I left a copy in freefall (/incoming), it seemed to=20 have less junk than ftp.freebsd.org. Its called main2.ps.gz (very short), when more information arrives I will= =20 write an explanatory note and make it available. Pedro. > On Mon, 11 Nov 1996, Pedro Giffuni S. wrote: >=20 > > Hello: > > Paul Monday just sent me a postscript file with the information of how > > he implemented an SCO filesystem for Linux. He is going to send me the > > tarball also. > > It doesn=B4t seem difficult, but someone that knows the inner workings = of > > FreeBSD filesystems can do a really great job. > > Where should I put this information? >=20 > andreas@klemm.gtn.com /\/\___ Wiechers & Partner Datentechni= k GmbH > Andreas Klemm ___/\/\/ Support Unix -- andreas.klemm@= wup.de > pgp p-key http://www-swiss.ai.mit.edu/~bal/pks-toplev.html >>> powered = by <<< > ftp://sunsite.unc.edu/pub/Linux/system/Printing/aps-491.tgz >>> FreeB= SD <<< >=20 >=20 From owner-freebsd-fs Wed Nov 13 06:49:51 1996 Return-Path: owner-fs Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id GAA19939 for fs-outgoing; Wed, 13 Nov 1996 06:49:51 -0800 (PST) Received: from parkplace.cet.co.jp (parkplace.cet.co.jp [202.32.64.1]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id GAA19926; Wed, 13 Nov 1996 06:49:46 -0800 (PST) Received: from localhost (michaelh@localhost) by parkplace.cet.co.jp (8.8.2/CET-v2.1) with SMTP id OAA15158; Wed, 13 Nov 1996 14:49:44 GMT Date: Wed, 13 Nov 1996 23:49:43 +0900 (JST) From: Michael Hancock To: FreeBSD Hackers cc: freebsd-fs@FreeBSD.ORG Subject: NFS bypass op and the utok layer Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-fs@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Were these even considered when the FreeBSD vnode stacking implementation was done? The NFS default op is the one returning the NOT SUPPORTED error. A bypass op would allow you to stack on top of an out-of-kernel layer which could then be layered on a utok layer to cross the boundary again. I guess the fs memory allocation architecture is not compatible with this. Debugging in userland would sure be cool, when you're satisfied take away the transport layers and you're back in the kernel. Regards, Mike Hancock From owner-freebsd-fs Wed Nov 13 10:05:41 1996 Return-Path: owner-fs Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id KAA02505 for fs-outgoing; Wed, 13 Nov 1996 10:05:41 -0800 (PST) Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id KAA02495; Wed, 13 Nov 1996 10:05:32 -0800 (PST) Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id KAA22484; Wed, 13 Nov 1996 10:54:58 -0700 From: Terry Lambert Message-Id: <199611131754.KAA22484@phaeton.artisoft.com> Subject: Re: NFS bypass op and the utok layer To: michaelh@cet.co.jp (Michael Hancock) Date: Wed, 13 Nov 1996 10:54:58 -0700 (MST) Cc: Hackers@freebsd.org, freebsd-fs@freebsd.org In-Reply-To: from "Michael Hancock" at Nov 13, 96 11:49:43 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-fs@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Boy, people keep asking questions for which my work is the answer... this is more than a little cool. 8-). > Were these even considered when the FreeBSD vnode stacking implementation > was done? > > The NFS default op is the one returning the NOT SUPPORTED error. A bypass > op would allow you to stack on top of an out-of-kernel layer which could > then be layered on a utok layer to cross the boundary again. > > I guess the fs memory allocation architecture is not compatible with this. You have hit the nail on the head. There are many places where the FS is expected to allocate something which it will never deallocate, or deallocate something which it did not allocate. Examples include: o The vfs_syscalls.c generated namei cn_pnbuf o The NFS generated namei cn_pnbuf o The vnode In addition, there are many places where the VOP's are not abstracted by status return (ie: they are call-down instead of veto interfaces). Examples include: o VOP_LOCK o VOP_ADVLOCK o VFS_MOUNT o NFS export list porcessing o root mount processing o remount processing o mount point covering o namei() o CREATE op in EXISTS case with no intention of overrwrite in the case of collision Without a clear abstraction, it's impossible to build a utok/ktou layer (I would prefer a ktou to a bypass op; it's more general, and doesn't require an NFS loopback). Particularly problematic are the NFS LEASE VOP's, which are interfaced by a serious kludge because they are call-down instead of veto, and therefore can not be zero-overhead registration based. If my changes for fcntl() to support server-side NFS locking (as the subsystem called by rpc.lockd) are ever integrated, this will add another, identical kludge for FHTOVP for an NFS LKM. > Debugging in userland would sure be cool, when you're satisfied take away > the transport layers and you're back in the kernel. This was discussed in detail in the Heidemann paper, actually... and yes, it's the way I'd like to do FS debuging as well. Regards, Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers.