Date:      Tue, 20 Jun 2006 13:29:48 -0700
From:      Bakul Shah <bakul@bitblocks.com>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org, freebsd-geom@FreeBSD.org
Subject:   Re: Journaling UFS with gjournal. 
Message-ID:  <20060620202948.933F2294C1@mail.bitblocks.com>
In-Reply-To: Your message of "Mon, 19 Jun 2006 15:11:01 +0200." <20060619131101.GD1130@garage.freebsd.pl> 

This is great!  We have sorely needed this for quite a while
what with terabyte size filesystems getting into common use.

> How it works (in short): you may define one or two providers for
> gjournal to use. If one provider is given, it is used for both
> data and journal. If two providers are given, one is used for data
> and one for the journal.
> Every few seconds (the interval is configurable) the journal is
> terminated and marked as consistent, and gjournal starts copying
> data from it to the data provider. At the same time, new data are
> stored in a new journal.

Some random comments:

Would it make sense to treat the journal as a circular
buffer?  Then commit to the underlying provider starts when
the buffer has $hiwater blocks or the upper layer wants to
sync.  The commit stops when the buffer has $lowater blocks
or, in the case of a sync, when the buffer is empty.  This
would allow parallel writes to the provider and the journal,
thereby reducing latency.
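To make the watermark idea concrete, here is a toy model in
Python.  The names (hiwater, lowater, commit) and the structure
are my invention for illustration, not gjournal's actual API;
a real implementation would overlap the commit with new writes
rather than run it inline as done here.

```python
# Toy model of a circular journal with high/low watermarks.
# All names and thresholds are illustrative assumptions.

class CircularJournal:
    def __init__(self, capacity, hiwater, lowater):
        self.capacity = capacity
        self.hiwater = hiwater      # start committing at this fill level
        self.lowater = lowater      # stop committing at this fill level
        self.blocks = []            # journaled, not-yet-committed blocks
        self.provider = []          # blocks copied to the data provider

    def write(self, block):
        if len(self.blocks) == self.capacity:
            raise RuntimeError("journal full; writer must wait")
        self.blocks.append(block)
        if len(self.blocks) >= self.hiwater:
            self._commit(stop_at=self.lowater)

    def sync(self):
        # upper layer wants a sync: drain the journal completely
        self._commit(stop_at=0)

    def _commit(self, stop_at):
        # copy oldest blocks to the provider until stop_at remain
        while len(self.blocks) > stop_at:
            self.provider.append(self.blocks.pop(0))

j = CircularJournal(capacity=8, hiwater=6, lowater=2)
for b in range(7):
    j.write(b)
# crossing hiwater (6 blocks) triggered a commit down to lowater (2),
# after which one more block was journaled
print(len(j.blocks), len(j.provider))   # -> 3 4
```

The point of the two thresholds is hysteresis: the commit does
useful batched work between lowater and hiwater instead of
starting and stopping on every block.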

I don't understand why you need FS synchronization.  Once the
journal is written, the data is safe.  A "redo" may be needed
after a crash to sync the filesystem, but that is about it.
Redo should be idempotent.  Each journal write block may need
some flags.  For instance, mark a block as a "sync point" --
once this block is on the disk, the FS will be in a
consistent state.  In the case of a redo after a crash, you
throw away all journal blocks after the last sync point.
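A sketch of that redo pass, again as a toy model: the record
format (block number, data, flags) and the SYNC flag are
assumptions made up for illustration.  Replay is idempotent
because rewriting the same block with the same data is
harmless, so a crash during redo just means redo runs again.

```python
# Toy model of an idempotent redo pass with sync-point markers.
# Record layout (blkno, data, flags) is a hypothetical format.

SYNC = 1  # flag: FS is consistent once this record is on disk

def redo(journal, disk):
    # find the last sync point; records after it belong to an
    # incomplete journal and are discarded
    last_sync = -1
    for i, (blkno, data, flags) in enumerate(journal):
        if flags & SYNC:
            last_sync = i
    # replay everything up to and including the last sync point
    for blkno, data, flags in journal[:last_sync + 1]:
        disk[blkno] = data
    return disk

journal = [
    (10, "a", 0),
    (11, "b", SYNC),   # last consistent point
    (12, "c", 0),      # journaled after the sync point; discarded
]
print(redo(journal, {}))   # -> {10: 'a', 11: 'b'}
```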

It seems to me that if you write a serial number with each data
block, then in the worst case redo has to do a binary search to
find the first block to write, but normal writes to the journal
and reads from the journal (for committing to the provider) can
be completely sequential.  Since redo will be much faster than
fsck, you can afford to slow it down a bit if the normal case
can be sped up.
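Sketching the search: because the journal area is circular and
serials increase monotonically, the on-disk serials form a
rotated ascending sequence, and the oldest record (where redo
must start) sits at the rotation point.  The layout below is
hypothetical; the search itself is a standard binary search for
the minimum of a rotated sorted array.

```python
# Toy model: find where redo must start in a circular journal
# area, given one serial number per slot.  Layout is assumed.

def redo_start(serials):
    """Index of the smallest serial in a rotated ascending list."""
    lo, hi = 0, len(serials) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if serials[mid] > serials[hi]:
            lo = mid + 1    # rotation point is to the right of mid
        else:
            hi = mid        # rotation point is at mid or to its left
    return lo

# serials 104..107 wrapped around and overwrote the start of the area,
# so the oldest surviving record (serial 100) is at index 4
print(redo_start([104, 105, 106, 107, 100, 101, 102, 103]))   # -> 4
```

This is O(log n) probes at recovery time, paid once per crash,
in exchange for keeping every normal-path journal access purely
sequential.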

Presumably you disallow opening any file in /.deleted.

Can you gjournal the journal disk?  Recursion is good:-)

-- bakul


