Date: Wed, 20 Jul 2005 10:35:46 +0800
From: yf-263 <yfyoufeng@263.net>
To: Eric Anderson <anderson@centtech.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: Cluster Filesystem for FreeBSD - any interest?
Message-ID: <1121826946.2235.6.camel@localhost.localdomain>
In-Reply-To: <42DDB3F2.7020000@centtech.com>
References: <200507020038.j620cO7F071025@gate.bitblocks.com> <42DDB3F2.7020000@centtech.com>

On Tue, 2005-07-19 at 21:16 -0500, Eric Anderson wrote:
> Bakul Shah wrote:
> [..snip..]
> >>:) I understand.  Any nudging in the right direction here would be
> >>appreciated.
> >
> > I'd probably start with modelling a single filesystem and how
> > it maps to a sequence of disk blocks (*without* using any
> > code or worrying about details of formats but capturing the
> > essential elements).  I'd describe various operations in
> > terms of preconditions and postconditions.  Then, I'd extend
> > the model to deal with redundancy and so on.  Then I'd model
> > various failure modes.  etc.  If you are interested _enough_
> > we can take this offline and try to work something out.  You
> > may even be able to use perl to create an `executable'
> > specification:-)
>
> I've done some research, and read some books/articles/white papers since
> I started this thread.
>
> First, porting GFS might be a more universal effort, and might be
> 'easier'.  However, that doesn't get us a clustered filesystem with BSD
> license (something that sounds good to me).

It has been said that this would be a seven man-month effort for an FS
expert.

> Clustering UFS2 would be cool.  Here's what I'm looking for:

That is exactly how Lustre does its work, though it builds on Ext3.
Lustre's targets are described at http://www.lustre.org/docs/SGSRFP.pdf .

> A clustered filesystem (or layer?) that allows all machines in the
> cluster to see the same filesystem as if it were local, with read/write
> access.  The cluster will need cache coherency across all nodes, and
> there will need to be some sort of lock manager on each node to
> communicate with all the other nodes to coordinate file locking.  The
> filesystem will have to support journaling.
>
> I'm wondering if one could make a pseudo filesystem something like
> nullfs that sits on top of a UFS2 partition, and essentially monitors
> all VFS operations to the filesystem, and communicates them over TCP/IP
> to the other nodes in the cluster.  That way, each node would know which
> inodes and blocks are changing, so they can flush those buffers, and
> they would know which blocks (or partial blocks) to view as locked as
> another node locks it.  This could be done via multicast, so all nodes in
> the cluster would have to be running a distributed lock manager daemon
> (dlmd) that would coordinate this.  I think also that the UFS2
> filesystem would have to have a bit set upon mount that tracked its
> mount as a 'clustered' filesystem mount.  The reason for that is so that
> we could modify mount to only mount 'clustered' filesystems (mount -o
> clustered) if the dlmd was running, since that would be a dependency for
> stable coherent file control on a mount point.
>
> Does anyone have any insight as to whether a layer would work?  Or maybe
> I'm way off here and I need to do more reading :)
>
> Eric

-- 
yf-263 <yfyoufeng@263.net>
Unix-driver.org
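
The layer Eric describes (every node announces lock and block-change
events so that peers can mark blocks as locked and drop stale buffers)
can be modelled as a toy, in the spirit of Bakul's "executable
specification". This is only a sketch under loud assumptions: the names
Node, Cluster, lock, unlock and write are hypothetical, the "multicast"
is an in-process loop rather than a network, and nothing here resolves
two nodes requesting the same lock at the same moment, which a real
dlmd would have to arbitrate.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One cluster member: a local buffer cache plus its view of remote locks."""
    name: str
    cache: dict = field(default_factory=dict)         # (inode, block) -> data
    remote_locks: dict = field(default_factory=dict)  # (inode, block) -> owner

    def receive(self, msg):
        kind, owner, key = msg
        if kind == "lock":
            self.remote_locks[key] = owner
        elif kind == "unlock":
            self.remote_locks.pop(key, None)
        elif kind == "dirty":
            # Another node changed this block: drop our stale copy.
            self.cache.pop(key, None)


class Cluster:
    """Stand-in for the multicast group: delivers to every node but the sender."""

    def __init__(self):
        self.nodes = []

    def join(self, name):
        node = Node(name)
        self.nodes.append(node)
        return node

    def broadcast(self, sender, msg):
        for node in self.nodes:
            if node is not sender:
                node.receive(msg)


def lock(cluster, node, key):
    """Refuse if a peer holds the block; otherwise announce the lock."""
    if key in node.remote_locks:
        return False
    cluster.broadcast(node, ("lock", node.name, key))
    return True


def unlock(cluster, node, key):
    cluster.broadcast(node, ("unlock", node.name, key))


def write(cluster, node, key, data):
    """Write a block locally and tell peers to invalidate their copy."""
    if key in node.remote_locks:
        raise PermissionError(f"{key} is locked by {node.remote_locks[key]}")
    node.cache[key] = data
    cluster.broadcast(node, ("dirty", node.name, key))
```

In this model a node that has cached block (12, 0) loses that cached
copy as soon as a peer writes the block, and cannot lock it until the
peer broadcasts the unlock. Journaling, the 'clustered' mount bit and
node failure are left out entirely.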