Message-ID: <3C992B35.37349A85@mindspring.com>
Date: Wed, 20 Mar 2002 16:37:09 -0800
From: Terry Lambert
To: Julian Elischer
Cc: Poul-Henning Kamp, arch@freebsd.org
Subject: Re: UFS2, GEOM & DARPA - don't get all excited, OK ?

Julian Elischer wrote:
> On Wed, 20 Mar 2002, Terry Lambert wrote:
> > Julian Elischer wrote:
> > > One thing to look at:
> > > Inode in dirent allocation..
> > >
> > > there was a Usenix paper on this ..
> > > It made a huge performance difference in some cases..
> >
> > Microsoft has an approach similar to this.
> >
> > They call it "The FAT Filesystem".
> >
> > 8-) 8-).
>
> haha
>
> read the paper.
>
> http://www.pdos.lcs.mit.edu/pubs.html
>
> under "Storage management"

Haha.

Personally, I prefer PDF:

http://citeseer.nj.nec.com/ganger97embedded.html

Also, read the earlier paper about the same thing:

http://www.usenix.org/publications/library/proceedings/sf94/forin.html

It's well known that making the directory entry into the inode
works really well, particularly when you handicap the caching
(as they did in the earlier MACH paper, in an apparent attempt
to make their numbers better).
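To make the cold-cache argument concrete, here is a toy cost model
in C. It is only a sketch under stated assumptions: the struct
layouts, the 512-byte block size, and every name in it are made up
for illustration and are not the FFS on-disk format or anything
from either paper. It counts the block reads needed to stat every
entry in a directory when nothing is cached, first with the inode
number stored in the entry, then with the inode embedded in it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE   512u  /* assumed block size for the toy model */
#define TOY_NAME_LEN 28    /* assumed fixed name field, illustration only */

/* Hypothetical, heavily trimmed inode. */
struct toy_inode {
    uint32_t mode;
    uint32_t size;
};

/* Classic layout: the entry names the inode by number. */
struct classic_dirent {
    uint32_t ino;
    char     name[TOY_NAME_LEN];
};

/* Embedded layout: the entry carries the inode itself. */
struct embedded_dirent {
    struct toy_inode inode;
    char             name[TOY_NAME_LEN];
};

/* Number of blocks needed to hold nbytes. */
static size_t blocks_for(size_t nbytes)
{
    return (nbytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

/*
 * Cold-cache reads to stat every entry in the classic layout:
 * the directory blocks, plus one read per distinct inode-table
 * block that the entries point into.
 */
size_t classic_stat_all_reads(const struct classic_dirent *ents, size_t n,
                              uint32_t inodes_per_block)
{
    size_t reads = blocks_for(n * sizeof(*ents));

    assert(inodes_per_block > 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t blk = ents[i].ino / inodes_per_block;
        int seen = 0;

        /* Has an earlier entry already pulled in this inode block? */
        for (size_t j = 0; j < i; j++) {
            if (ents[j].ino / inodes_per_block == blk) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            reads++;
    }
    return reads;
}

/* Embedded layout: reading the directory blocks is the whole cost. */
size_t embedded_stat_all_reads(size_t n)
{
    return blocks_for(n * sizeof(struct embedded_dirent));
}
```

Note that the gap in this model exists only because the cache is
assumed to be cold. Once the inode blocks are already in core, the
extra reads in the classic case go away.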

You can get the same improvement in FFS by faulting in the
inodes when you do a directory traversal, so that when you
go to open/stat the inodes, they are already in core.  This
was one of the bases of the paper that Ed Lane, Bryan Cardoza,
and I weren't allowed to present in 1994 because of the USL
lawsuit, after the USL acquisition by Novell.

You can also get a big improvement by returning the stat
information each time it is changed, and with "open" and
other calls, particularly when you are implementing a file
system server.

Also, one of the earliest technical discussions we had on FS
issues at Whistle dealt with actually allocating reference
nodes for hard links.  Among other things, this results in a
different vnode per reference path, which has the effect of
making parent pointers reliable (e.g. "fdgetpath()" can be
made to work).

If you look at the MS-DOS MACH FS paper from 1994, their
assumptions are obviously the wrong things to do: they had to
slow down FFS, and cache all the FAT blocks in core, for their
MS-DOS FS to beat it.  It's pretty trivial to get a locality
based cache of directory and inode blocks in the FFS case, as
well, to get similar performance.

Actually, this approach should be obvious from the NetWare
NWFS approach in Native NetWare, which has a RAM requirement
proportional to the disk size for mounts, because the directory
structure is cached in core in its entirety.

Right now, FreeBSD does somewhat the wrong thing: a
non-blocking availability check (e.g. a check on something
that will not block if the data is not available, but that
triggers a prefetch) does not result in the pages being
faulted in.  The FreeBSD non-blocking I/O implementation
actually suffers because of this, since you really want
unavailable disk data to result in a fault and a conversion,
rather than blocking.

-- Terry