Date: Wed, 2 Jan 2008 00:52:03 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Adrian Chadd <adrian@freebsd.org>
Cc: stable@freebsd.org
Subject: Re: SMP on FreeBSD 6.x and 7.0: Worth doing?
Message-ID: <20080102004236.R30578@fledge.watson.org>
In-Reply-To: <d763ac660712260525u7e2c18b8t32904807c549b3c4@mail.gmail.com>
References: <200712220531.WAA09277@lariat.net> <476FBED0.2080400@samsco.org> <200712241549.IAA19650@lariat.net> <476FDA10.4060107@samsco.org> <200712241653.JAA20845@lariat.net> <476FE868.8080704@samsco.org> <200712241756.KAA21950@lariat.net> <d763ac660712241820s38237d99x1243862095780dc6@mail.gmail.com> <4772529D.9010805@samsco.org> <d763ac660712260525u7e2c18b8t32904807c549b3c4@mail.gmail.com>

On Wed, 26 Dec 2007, Adrian Chadd wrote:

> On 26/12/2007, Scott Long <scottl@samsco.org> wrote:
>
>> Yes, Squid is the ideal application for IFS. Do you still have any of your
>> work on this, and would you be able to share it?
>
> It'd be easy to rewrite it from scratch if IFS were recovered. In fact, the
> whole point behind IFS, way back when, is I could layer a user-space
> directory hierarchy on top of a kernel-provided space and then do "stuff" (I
> had a POP3 Maildir-like server written using IFS back then.)
>
> The squid code wasn't difficult at all. The biggest problem back then was
> rebuilding the disk index - didn't I have some code to export the inode
> allocation bitmap via a special file in the filesystem so I didn't have to
> stat() each individual inode, or didn't I end up committing that?
>
> I'm happy to work on that later on next year. I've got enough non-disk Squid
> code to rewrite and optimise over the next few months; the storage side is
> going to have to wait a while.

Do you think the IFS model offers significant benefits from an application perspective to, say, the fh*() model used by Arla? This approach originated, as far as I am aware, with the AFS implementation from CMU, in which new ioctls added by CMU provided give-me-a-free-inode and open-by-inode-number operations, and flagged inodes as "in use by AFS" even though they weren't hooked up to the namespace. fsck then knew to skip them, but the UFS implementation was otherwise largely unmodified.

In the slightly less intrusive Arla view of the world, cache files do appear in the UFS name space, but an independent namespace is maintained by the cache manager. Each cache file has two file system names: a normal path (used to delete the cache file if required), and its NFS file handle, which can be used to open, stat, etc, the file without a normal file system namespace operation.

The user application can allocate a set of inodes in some arbitrary directory tree using normal operations (ideally in advance), but when it does so it also queries the NFS file handles for the files using getfh(2). It later performs all accesses using the file handles (fhopen(2), fhstat(2), etc), unless they are invalidated due to, say, moving the cache to a new file system, in which case the handle database can be rebuilt by re-getfh(2)'ing the files using the actual file system namespace. It also passes the file handles to the kernel for use by the nnpfs synthetic file system for file access...

Last time I looked closely, it seemed like the main downside to this vs. IFS was that you did in fact need real file system names for files with the fh*() approach, even though you never used them except for create/destroy. As long as the application effectively "cached" the inodes for reuse, rather than unlinking/creating frequently, this wasn't a problem. This did, however, mean that a whole new metadata layer didn't have to be created as it would for an IFS, and, unlike the AFS approach, fsck requires no modifications.

So Squid (or whatever) would need to populate a tree and build a DB with file handles as well as real names in case the DB has to be rebuilt. You'd also have to be careful about crash-recovery state to make sure the squid DB agreed with the contents of the files when coming up after a crash, if reusing inodes rather than unlinking/reallocating them.

Robert N M Watson
Computer Laboratory
University of Cambridge
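
A minimal sketch of the handle database described above, under stated assumptions, might look like the following. Every name in it (cache_handle, cache_entry, handle_from_path, cache_entry_create, cache_entry_check, cache_rebuild) is hypothetical and comes from neither Squid nor Arla. So that it compiles and runs unprivileged on any POSIX system, it uses a (st_dev, st_ino) pair from stat(2) as a stand-in for a real fhandle_t. On FreeBSD, handle_from_path would presumably call getfh(2) instead, and day-to-day access would go through fhopen(2) and fhstat(2), which are normally restricted to the superuser.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* ASSUMPTION: stand-in for fhandle_t. A real version would hold the
 * opaque handle filled in by getfh(2). */
struct cache_handle {
    dev_t dev;
    ino_t ino;
};

/* One cache file: its real name plus its handle. */
struct cache_entry {
    char path[256];         /* used only for create/destroy/rebuild */
    struct cache_handle h;  /* used for all ordinary accesses */
    int valid;              /* nonzero once h has been recorded */
};

/* Derive a handle from a real name (the getfh(2) step). 0 on success. */
int handle_from_path(const char *path, struct cache_handle *h)
{
    struct stat sb;

    if (stat(path, &sb) != 0)
        return -1;
    h->dev = sb.st_dev;
    h->ino = sb.st_ino;
    return 0;
}

/* Preallocate a cache file and record both of its names. 0 on success. */
int cache_entry_create(struct cache_entry *e, const char *path)
{
    size_t len = strlen(path);
    int fd;

    if (len >= sizeof(e->path))
        return -1;
    fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    memcpy(e->path, path, len + 1);
    e->valid = (handle_from_path(path, &e->h) == 0);
    return e->valid ? 0 : -1;
}

/* Return 1 if the stored handle still matches the named file, else 0.
 * With real handles, staleness would more likely show up as an error
 * (such as ESTALE) from fhopen(2) rather than via a path lookup. */
int cache_entry_check(const struct cache_entry *e)
{
    struct cache_handle now;

    if (!e->valid || handle_from_path(e->path, &now) != 0)
        return 0;
    return now.dev == e->h.dev && now.ino == e->h.ino;
}

/* Rebuild stale handles from the real names, e.g. after the cache has
 * moved to a new file system. Returns the number of entries refreshed. */
int cache_rebuild(struct cache_entry *tab, int n)
{
    int refreshed = 0;

    assert(tab != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        if (cache_entry_check(&tab[i]))
            continue;
        tab[i].valid = (handle_from_path(tab[i].path, &tab[i].h) == 0);
        if (tab[i].valid)
            refreshed++;
    }
    return refreshed;
}
```

An application along these lines would call cache_entry_create() for each preallocated file at setup time and cache_rebuild() over the whole table at startup or after relocating the cache. The sketch does not address the crash-recovery concern: agreement between the database and the file contents would still need its own mechanism.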