Date: Tue, 07 Jun 2005 21:19:52 -0400
From: Richard Coleman <rcoleman@criticalmagic.com>
To: Scott Long <scottl@samsco.org>
Cc: Pawel Jakub Dawidek <pjd@FreeBSD.org>, scottl@FreeBSD.org, Ivan Voras <ivoras@fer.hr>, David Malone <dwmalone@maths.tcd.ie>, hackers@FreeBSD.org, phk@FreeBSD.org
Subject: Re: Google SoC idea
Message-ID: <42A647B8.30709@criticalmagic.com>
In-Reply-To: <42A6091C.40409@samsco.org>
References: <42A475AB.6020808@fer.hr> <20050607194005.GG837@darkness.comp.waw.pl> <20050607201642.GA58346@walton.maths.tcd.ie> <42A6091C.40409@samsco.org>

Scott Long wrote:
> /me jumps up and down and waves his hands
>
> The problem with journalling at the block layer is that you pretty much
> become forced to journal metadata and data, since the block layer really
> doesn't know the distinction, and definitely not in a
> filesystem-independent way (yes, UFS does evil things to the buffer
> cache by representing metadata with negative block numbers, but that is
> just UFS).  Full journalling has many drawbacks from the viewpoint of
> speed and complexity, of course.  So you really want to be able to do
> just metadata journalling.
>
> Another hard part of distinguishing between metadata and data is that
> filesystems have a habit of migrating disk blocks from holding metadata
> to holding data, and vice versa (think indirect pointer blocks, not
> inode blocks).  If you are only replaying metadata, you want to make
> sure that you don't smash data blocks with old metadata.
>
> Coming up with a filesystem independent way to represent all of this for
> the block layer is not easy.  Filesystems would have to be able to be
> modified to provide proper metadata vs. data hints to the block layer.
> And if you're going to do that, then why not just make it a library in
> VFS, like what Darwin does?
>
> The UFS Journalling work is already well underway, and I expect it to
> follow the path of being a VFS library.  Note that I'm saying 'library'
> here, not 'layer'.  There really is no way to make journalling work with
> an arbitrary filesystem 'for free', whether as a VFS layer or a GEOM
> transform, since journalling is 100% dependent on the filesystem working
> with the buffer-cache to do sane operations in a defined order.
>
> An alternate SoC project that would be very useful is block-level
> snapshots.  I'm not sure if I'll be able to retain the filesystem
> snapshot functionality in UFS with journalling enabled, so moving to
> doing the snapshots in the block layer would be a good way to make up
> for this.  Beware that while the GEOM transform would be pretty
> straight-forward to write, the real trick comes from being able to make
> the consumer of a block device (a filesystem, maybe) flush itself to a
> consistent state while the snapshot is being taken.  The infrastructure
> for this is the part that is very interesting, but also the most work.
>
> Scott

Scott,

Have you looked at the journaling layer that Matt has been adding to
DragonflyBSD?  What you are talking about appears very similar.  Or am I
misunderstanding something?

Richard Coleman
rcoleman@criticalmagic.com
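
The hazard Scott describes above (a block that once held metadata being reused for file data, then overwritten during a metadata-only replay) can be illustrated with a toy model. This is only a sketch under stated assumptions: the dict-based "disk", the `replay` function, and the `revoke` record are hypothetical names invented for illustration, and none of it is taken from the UFS journalling work or from any GEOM or VFS interface.

```python
# Toy model: a "disk" is a dict of block number -> contents.
# The journal logs metadata writes only, plus a hypothetical "revoke"
# record emitted whenever a block stops holding metadata.

def replay(disk, journal, honor_revokes=True):
    """Replay logged metadata writes onto a copy of disk and return it."""
    disk = dict(disk)
    # Sequence number of the latest revoke seen for each block.
    revoked_at = {}
    if honor_revokes:
        for seq, (kind, blkno, _) in enumerate(journal):
            if kind == "revoke":
                revoked_at[blkno] = seq
    for seq, (kind, blkno, payload) in enumerate(journal):
        if kind != "write":
            continue
        # Skip metadata images that predate the block's reuse as data.
        if seq < revoked_at.get(blkno, -1):
            continue
        disk[blkno] = payload
    return disk


# Block 7 was an indirect block, was freed, and now holds file data.
journal = [
    ("write", 7, "old indirect block"),
    ("revoke", 7, None),
    ("write", 3, "inode update"),
]
on_disk = {3: "stale inode", 7: "file data"}

naive = replay(on_disk, journal, honor_revokes=False)
safe = replay(on_disk, journal, honor_revokes=True)
```

In this sketch `naive` ends up with the stale indirect-block image in block 7, while `safe` leaves the file data alone and still applies the inode update. Note that the revoke record has to come from the filesystem, since only it knows when a block changes role, which is in line with the argument that a block layer cannot do this unaided.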