Date: Thu, 12 Feb 1998 19:47:34 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: inf@nyef.res.cmu.edu (Marca Registrada)
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: Coda FS: FBSD port done!, but development favors Linux
Message-ID: <199802121947.MAA19676@usr02.primenet.com>
In-Reply-To: <19980212123955.08290@nyef.res.cmu.edu> from "Marca Registrada" at Feb 12, 98 12:39:55 pm

YES!  REAL DEVELOPMENT AT LAST!

> The current Coda release that I know of for FreeBSD is supposed to be for
> -stable, so my first project may be to port it to -current (although I've
> heard this may be difficult), and it would be easier for me to make light
> contributions from time to time to do whatever is necessary when the
> -stable patches are unworkable for -current.

This may be difficult.  I will provide advice and information on
Poul-Henning Kamp's interface changes, and other issues, as necessary
(the changes somewhat broke Kirk McKusick's intended design, where
the UFS code was to provide directory facilities, and the FFS code
was to provide a linear (but not necessarily externalized) flat
namespace).

The main differences will be in VOPs and locking, and are pretty
trivial (i.e., 4.4-Lite2 didn't do much, and neither has anyone since,
barring minor cleanups).

> > * Development, particularly in the area of scalability, is focused on
> >   Linux.  Why?  His stated reasons:
> >
> >   * Linux's ext2fs filesystem is much faster than *BSD's ffs
> >     (How good is FreeBSD's ext2fs support these days?  Is
> >     it in 2.2.6 or must we wait for 3.0?)
>
> Would anyone think that softupdates may fix this?  I haven't kept close
> enough track of the discussion to know when softupdates may ever come
> around, though.

Linux's ext2fs is apparently faster because it is, by default,
mounted async.  As a real FS hacker, he should be aware that an
fsck can only undo one state transition.  After ext2fs crashes,
the FS left by the fsck is in *a* consistent state, but not *the*
consistent state it would have been in had the crash not taken place.

For each async call that takes place, you have another potential
state.  In general:

    For N outstanding operations, there are 2^(N-1) possible
    ground states following a one state change by fsck.

This means for 11 outstanding operations, you have less than a 1
in 1000 chance of fsck guessing the right one.

Classic implementations have guaranteed ordering using synchronous
writes of metadata.  This is the FFS default mechanism.  Other
approaches to ordering guarantees are:

o   Log structuring (fragmentation is high)
o   Journalling (commits are slow and fragmentation is high)
o   Delayed Ordered Writes ("Banker's Algorithm" for graph
    reduction sacrifices speed for overcautious safety; also
    patent-pending by USL, so not usable)
o   Soft Updates (within 5% of async, faster for some things,
    and with all the safety of synchronous writes).

So the answers are, in order:

A)  There's nothing to fix; ext2fs is being used with a false
    sense of safety.
B)  Yes.  Soft Updates in FreeBSD address the speed issue.

> > * Current work is being done to develop Linux kernel extensions that
> >   will allow access to files via raw inodes.  This development is
> >   seen as key to allowing Coda to support large filespaces with
> >   reasonable performance.  See this URL for Peter's notes on
> >   these extensions:
> >
>
> From the latest I heard on the Coda lists, Linus is very against this
> because he feels it ruins the consistency of the FS interface.

It doesn't, really.  What it *does* do is blow out the inheritance
security model based on directory permissions.

The one way to save it from this is to change the structure of hard
links on disk, and then keep parent pointers in all inodes.  Then
you traverse to root creating a path vector, and then traverse down
the vector applying permissions.

FreeBSD doesn't currently support this (about the only FS which does
is NXFS, the NetWare eXtended File System, which I wrote for the
NetWare for UNIX product while I was at Novell).

Without this, if you can get a path on a filesystem, you can open
any inode whose own permissions allow it, regardless of the
permissions on the intermediate path components.

> This of
> course can change at any moment.  The current proposal is to make a
> filesystem where inodes can be accessed directly as files..
> i.e.:
>
> fopen("/mnt/__inode_#12345#","r");
> or something similar looking to that.  It actually doesn't sound like a
> monster to implement at all.  And as a separate filesystem solves many of
> the fsck problems Coda currently has.

I have implemented this at one time, and I have very recently provided
assistance to Adrian Chadd, who has implemented it in -current.  The
idea is not new.

This is called a "namespace incursion".  It places a "magic" prefix
in the namespace.  My suggested escape, and the one I believe Adrian
used, is the string ^I^N^O^D^E (8th bit set on all 5 characters),
followed by decimal digits for the inode number.  You can use any
path onto the FS to get the dev_t.  This works for the current working
directory, as well.

Probably the correct way to implement this is to use the POSIX
namespace escape, "//".  Unfortunately, the FreeBSD namei() code is
broken, such that an escape cannot be inherited on a per path
component basis, and applied solely to the terminal path component.
I have patches for this which have not been committed.

Practically, for this specific use, the namespace incursion is just
about as good.

You can reach Adrian Chadd at the following email address:

    <adrian@creative.net.au>

> I'm totally with you on wanting to get Coda going strong on FreeBSD, and
> will lend all the free coding time I have.

If you need any architectural information, let me know.  I know the
FreeBSD FS code backwards and forwards.

> As an aside, you also mentioned AFS.  Has that been progressing at all on
> the FreeBSD front?  I haven't heard anything but light rustle about AFS.

AFS was ported to NetBSD.  FreeBSD couldn't use the NetBSD
implementation for two reasons:

1)  The kernel interface differences between FreeBSD and NetBSD;
    namely, the interfaces consumed by FS implementations that
    were terminal (bottom end) implementations.  Things like
    local media filesystems, the NFS client, and the AFS client.

    These differences have recently gotten worse.
    I've been working upstream to try to reduce the number of
    interfaces that are consumed by terminal FS implementations,
    but it is slow going trying to get the code committed.

2)  The VOP interface difference.  Initially, this was just the
    mechanism for use of "cookies" in VOP_READDIR (a particularly
    ugly solution to the search restart problem brought on by
    the underlying FS exposing the wrong view of the struct
    direct, instead of exposing an opaque pointer and a
    translation VOP).

    NetBSD has a slightly different cookie mechanism, but both
    are fundamentally broken by design.  This leads to a lot of
    NFS code working around the breakage, and occasional NFS
    problems.  The same workarounds are required in most VFS
    consumers.

    These differences have also recently gotten worse, and in
    fact, a number of VOPs necessary to the separation of block
    naming from the imposition of directory hierarchy have been
    removed.  Eventually I expect them to come back.

In any case, if it's already working, then you have these workarounds;
if you don't, there are ways around the problems that I can help you
with, if CODA isn't enough of a reason to get it fixed the right way.


                                        Regards,
                                        Terry Lambert
                                        terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe hackers" in the body of the message