Date: Tue, 20 Jun 2006 13:29:48 -0700
From: Bakul Shah <bakul@bitblocks.com>
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org, freebsd-geom@FreeBSD.org
Subject: Re: Journaling UFS with gjournal.
Message-ID: <20060620202948.933F2294C1@mail.bitblocks.com>
In-Reply-To: Your message of "Mon, 19 Jun 2006 15:11:01 +0200." <20060619131101.GD1130@garage.freebsd.pl>
This is great!  We have sorely needed this for quite a while, what with
terabyte-size filesystems coming into common use.

> How it works (in short). You may define one or two providers which
> gjournal will use. If one provider is given, it will be used for both -
> data and journal. If two providers are given, one will be used for data
> and one for journal.
> Every few seconds (you may define how many) journal is terminated and
> marked as consistent and gjournal starts to copy data from it to the
> data provider. In the same time new data are stored in new journal.

Some random comments:

Would it make sense to treat the journal as a circular buffer?  Then the
commit to the underlying provider starts when the buffer has $hiwater
blocks or the upper layer wants to sync, and stops when the buffer has
$lowater blocks or, in the case of a sync, when the buffer is empty.
This would allow parallel writes to the provider and the journal,
thereby reducing latency.  (See the sketch in the P.S. below.)

I don't understand why you need FS synchronization.  Once the journal is
written, the data is safe.  A "redo" may be needed after a crash to sync
the filesystem, but that is about it.  Redo should be idempotent.

Each journal block may need some flags.  For instance, mark a block as a
"sync point" -- when this block is on the disk, the FS will be in a
consistent state.  In case of redo after a crash you have to throw away
all the journal blocks after the last sync point.

It seems to me that if you write a serial number with each data block,
in the worst case redo has to do a binary search to find the first block
to write, but normal writes to the journal and reads from the journal
(for committing to the provider) can be completely sequential (see the
second sketch below).  Since redo will be much, much faster than fsck,
you can afford to slow it down a bit if the normal case can be sped up.

Presumably you disallow opening any file in /.deleted.

Can you gjournal the journal disk?  Recursion is good :-)

-- bakul
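
P.S.  A rough sketch of the circular-buffer / water-mark commit I have
in mind.  The names (journal_t, journal_maybe_commit, the constants)
are made up purely to illustrate the idea; they are not gjournal
interfaces.

/*
 * Illustrative only: high/low water-mark commit control for a circular
 * journal.  Writes to the journal and the commit to the data provider
 * can run in parallel; the water marks only decide when the committer
 * should be running.
 */
#include <stdbool.h>
#include <stddef.h>

#define J_NBLKS   8192                  /* blocks in the circular journal */
#define J_HIWATER (J_NBLKS * 3 / 4)     /* start committing here */
#define J_LOWATER (J_NBLKS / 4)         /* stop committing here */

typedef struct journal {
        size_t head;                    /* next block to fill */
        size_t tail;                    /* oldest uncommitted block */
        bool   sync_requested;          /* upper layer asked for a sync */
} journal_t;

static size_t
journal_used(const journal_t *j)
{
        return (j->head + J_NBLKS - j->tail) % J_NBLKS;
}

/* Hypothetical hook: copy one journal block to the data provider and
 * advance j->tail. */
extern void commit_one_block(journal_t *j);

/* Called after each journal append and on sync requests. */
void
journal_maybe_commit(journal_t *j)
{
        if (journal_used(j) < J_HIWATER && !j->sync_requested)
                return;                 /* below the high water mark */

        while (journal_used(j) > (j->sync_requested ? 0 : J_LOWATER))
                commit_one_block(j);

        j->sync_requested = false;
}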
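
P.P.S.  And a sketch of the binary search I am imagining for redo: each
journal block carries a monotonically increasing serial number, so after
a crash the oldest block (where replay starts) can be found in O(log n)
probes instead of scanning the whole journal.  block_serial() is a
hypothetical helper that reads the serial stored with physical block i.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical: return the serial number stored with journal block i. */
extern uint64_t block_serial(size_t i);

/*
 * Serial numbers increase in write order, so in the circular journal
 * they increase monotonically except for one drop at the wrap point.
 * The block at the drop is the oldest entry; redo replays forward from
 * there, discarding anything past the last sync point.
 */
size_t
redo_start(size_t nblks)
{
        size_t lo = 0, hi = nblks - 1;

        /* Classic "find the rotation point" binary search. */
        while (lo < hi) {
                size_t mid = lo + (hi - lo) / 2;

                if (block_serial(mid) > block_serial(hi))
                        lo = mid + 1;   /* oldest block is right of mid */
                else
                        hi = mid;       /* oldest block is at or left of mid */
        }
        return lo;                      /* physical index of the oldest block */
}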