Date: Tue, 19 Jul 2005 21:16:18 -0500
From: Eric Anderson <anderson@centtech.com>
To: freebsd-fs@freebsd.org
Subject: Re: Cluster Filesystem for FreeBSD - any interest?
Message-ID: <42DDB3F2.7020000@centtech.com>
In-Reply-To: <200507020038.j620cO7F071025@gate.bitblocks.com>
References: <200507020038.j620cO7F071025@gate.bitblocks.com>

Bakul Shah wrote:
[..snip..]
>>:) I understand. Any nudging in the right direction here would be
>>appreciated.
>
> I'd probably start with modelling a single filesystem and how
> it maps to a sequence of disk blocks (*without* using any
> code or worrying about details of formats but capturing the
> essential elements). I'd describe various operations in
> terms of preconditions and postconditions. Then, I'd extend
> the model to deal with redundancy and so on. Then I'd model
> various failure modes. etc. If you are interested _enough_
> we can take this offline and try to work something out. You
> may even be able to use perl to create an `executable'
> specification:-)

I've done some research and read some books, articles, and white papers
since I started this thread. First, porting GFS might be a more universal
effort, and might be 'easier'. However, that doesn't get us a clustered
filesystem with a BSD license (something that sounds good to me).
Clustering UFS2 would be cool.

Here's what I'm looking for:

A clustered filesystem (or layer?) that allows all machines in the cluster
to see the same filesystem as if it were local, with read/write access.
The cluster will need cache coherency across all nodes, and each node will
need some sort of lock manager that communicates with all the other nodes
to coordinate file locking. The filesystem will have to support
journaling.

I'm wondering if one could make a pseudo filesystem, something like
nullfs, that sits on top of a UFS2 partition, essentially monitors all VFS
operations on the filesystem, and communicates them over TCP/IP to the
other nodes in the cluster. That way, each node would know which inodes
and blocks are changing, so it can flush those buffers, and it would know
which blocks (or partial blocks) to treat as locked when another node
locks them.
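
To make the layer idea a little more concrete, here is a rough Python model of what each node might do with the operations it hears about. It is only a sketch under loud assumptions: the names VfsEvent, Node, and connect are made up for illustration, direct method calls stand in for the TCP/IP traffic, and it ignores races (two nodes locking the same block at once), node failure, and journaling entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VfsEvent:
    """One VFS operation announced to the other nodes (hypothetical format)."""
    node: str   # name of the node that performed the operation
    op: str     # "write", "lock" or "unlock"
    inode: int
    block: int


class Node:
    """Stand-in for one cluster member running the layer over its local mount."""

    def __init__(self, name):
        self.name = name
        self.peers = []          # other Node objects; a real layer would use sockets
        self.cache = {}          # (inode, block) -> cached data
        self.remote_locks = {}   # (inode, block) -> name of the node holding the lock

    def _broadcast(self, op, inode, block):
        # Tell every other node what this node just did.
        event = VfsEvent(self.name, op, inode, block)
        for peer in self.peers:
            peer.receive(event)

    def receive(self, event):
        key = (event.inode, event.block)
        if event.op == "write":
            # Another node changed this block: drop our stale buffer.
            self.cache.pop(key, None)
        elif event.op == "lock":
            self.remote_locks[key] = event.node
        elif event.op == "unlock" and self.remote_locks.get(key) == event.node:
            del self.remote_locks[key]

    def write(self, inode, block, data):
        key = (inode, block)
        if key in self.remote_locks:
            raise PermissionError(f"block {key} is locked by {self.remote_locks[key]}")
        self.cache[key] = data
        self._broadcast("write", inode, block)

    def lock(self, inode, block):
        # Refuse if another node is already known to hold the lock.
        if (inode, block) in self.remote_locks:
            return False
        self._broadcast("lock", inode, block)
        return True

    def unlock(self, inode, block):
        self._broadcast("unlock", inode, block)


def connect(nodes):
    """Wire every node to every other node (full mesh)."""
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]
```

With two nodes wired together by connect(), a write on one node drops the matching cached block on the other, and a lock taken on one node makes lock() return False on the other until it is released. A real implementation would need an agreed ordering for competing lock requests, which is the hard part this model skips.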

This could be done via multicast, so all nodes in the cluster would have
to be running a distributed lock manager daemon (dlmd) that would
coordinate this. I also think the UFS2 filesystem would need a bit set at
mount time that records its mount as a 'clustered' filesystem mount. The
reason is so that we could modify mount to only mount 'clustered'
filesystems (mount -o clustered) if the dlmd is running, since the daemon
would be a dependency for stable, coherent file control on a mount point.

Does anyone have any insight as to whether a layer would work? Or maybe
I'm way off here and I need to do more reading :)

Eric

--
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
A lost ounce of gold may be found, a lost moment of time never.
------------------------------------------------------------------------