Date:      Thu, 10 Aug 2006 21:28:41 +0200
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Craig Boston <craig@xfoil.gank.org>, freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org, freebsd-geom@FreeBSD.org
Subject:   Re: GJournal (hopefully) final patches.
Message-ID:  <20060810192841.GA1345@garage.freebsd.pl>
In-Reply-To: <20060810184702.GA8567@nowhere>
References:  <20060808195202.GA1564@garage.freebsd.pl> <20060810184702.GA8567@nowhere>

On Thu, Aug 10, 2006 at 01:47:23PM -0500, Craig Boston wrote:
> Hi,
>
> It's great to see this project so close to completion!  I'm trying it
> out on a couple machines to see how it goes.
>
> A few comments and questions:
>
> * It took me a little by surprise that it carves 1G out of the device
>   for the journal.  Depending on the size of the device that can be a
>   pretty hefty price to pay (and I didn't see any mention of it in the
>   setup notes).  For a couple of my smaller filesystems I reduced it to
>   512MB.  Perhaps some algorithm for auto-sizing the journal based on
>   the size / expected workload of the device would be in order?

This will be pointed out in the documentation once I finally prepare it.
I have no plans for autosizing at the moment.

> * Attached is a quick patch for geom_eli to allow it to pass BIO_FLUSH
>   down to its backing device.  It seems like the right thing to do and
>   fixes the "BIO_FLUSH not supported" warning on my laptop that uses a
>   geli encrypted disk.

I have this already in my perforce tree. I also implemented BIO_FLUSH
pass-through in gmirror and graid3.

I also added a flag for gmirror and graid3 which says "don't
resynchronize components after a power failure - trust they are
consistent". And they are always consistent when placed below gjournal.

> * On a different system, however, it complains about it even on a raw
>   ATA slice:
>
>     atapci1: <Intel ICH4 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
>     ata0: <ATA channel 0> on atapci1
>     ad0: 114473MB <WDC WD1200JB-00CRA1 17.07W17> at ata0-master UDMA100
>     GEOM_JOURNAL: BIO_FLUSH not supported by ad0s1e.
>
>   It seems like a reasonably modern controller and disk, at least it
>   should be capable of issuing a cache flush command.  Not sure why it
>   doesn't like it :/

We would probably need to add some printfs to diagnose this - you can
try adding a few lines to ad_init() to get this:

    if (atadev->param.support.command1 & ATA_SUPPORT_WRITECACHE) {
        if (ata_wc)
            ata_controlcmd(dev, ATA_SETFEATURES, ATA_SF_ENAB_WCACHE, 0, 0);
        else
            ata_controlcmd(dev, ATA_SETFEATURES, ATA_SF_DIS_WCACHE, 0, 0);
    } else {
        printf("ad_init: WRITE CACHE not supported by ad%d.\n",
            device_get_unit(dev));
    }

> * How "close" does the filesystem need to be to the gjournal device in
>   order for the UFS hooks to work?  Directly on it?
>
>   The geom stack on my laptop currently looks something like this:
>=20
>   [geom_disk] ad0 <- [geom_eli] ad0.eli <- [geom_gpt] ad0.elip6 <-
>   [geom_label] gjtest <- [geom_journal] gjtest.journal <- UFS
>
>   I was wondering if an arrangement like this would work:
>
>   [geom_journal] ad0p6.journal <- [geom_eli] ad0p6.journaleli <- UFS
>
>   and if it would be any more efficient (journal the encrypted data
>   rather than encrypt the journal).  Or even gjournal the whole disk at
>   once?

When you mount a file system, it sends BIO_GETATTR "GJOURNAL::provider"
requests. So as long as the classes between the file system and the
gjournal provider pass BIO_GETATTR down, it will work.

On my home machine I have the following configuration:

raid3/DATA1.elid.journal

So it's UFS over gjournal over bsdlabel over geli over raid3 over ata.

I prefer to put gjournal on top, because it provides consistency to the
layers below it. For example, I can use geli with a bigger sector size
(a sector size greater than the disk's sector size in encryption-only
mode can be unreliable across power failures, which is not the case when
gjournal sits above geli), I can turn off resynchronization of
gmirror/graid3 after a power failure, etc.

On the other hand, configuring geli on top of gjournal can be more
efficient for large files - geli will not encrypt the data twice.

Fortunately, with GEOM you can freely mix your puzzles.

> Haven't been brave enough to try gjournal on root yet, but my /usr and
> /compile (src, obj, ports) partitions are already on it so I'm sure I'll
> try it soon ;)

Markus Trippelsdorf reported that it doesn't work out of the box, but he
managed to make it work with some small changes to fsck_ffs(8).

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
