Date:      Sat, 4 Jul 2009 11:15:38 +0200
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Marcel Moolenaar <xcllnt@mac.com>
Cc:        rick-freebsd2008@kiwi-computer.com, freebsd-geom@freebsd.org
Subject:   Re: gmirror gm0 destroyed on shutdown; GPT corrupt
Message-ID:  <20090704091538.GA2891@garage.freebsd.pl>
In-Reply-To: <D4656301-95DD-46B2-A52B-A4E9AE1CE841@mac.com>
References:  <20090625110253.GA31443@mech-cluster238.men.bris.ac.uk> <10FCC74D-6D46-4112-AD89-BBB4C5933957@mac.com> <h24v15$70v$1@ger.gmane.org> <2FFFB36F-EFA3-4D92-98A3-692BA2D6F63E@mac.com> <20090629210003.GA24038@keira.kiwi-computer.com> <704EE47D-F0C4-4C63-AA3C-3ADF92CC8379@mac.com> <20090701135338.GE4372@garage.freebsd.pl> <D4656301-95DD-46B2-A52B-A4E9AE1CE841@mac.com>

On Wed, Jul 01, 2009 at 08:29:23AM -0700, Marcel Moolenaar wrote:
>
> On Jul 1, 2009, at 6:53 AM, Pawel Jakub Dawidek wrote:
>
> >>Answer the following:
> >>
> >>foo0 is a provider with 3 sectors.
> >>bar is a geom class that puts meta-data in the first sector.
> >>baz is a geom class that puts meta-data in the last sector.
> >>
> >>Both bar and baz get to taste foo0. Which one should go first?
> >
> >Marcel, I don't think you expect that the entire world will agree on
> >one place where metadata should be stored?
>
> No, I don't expect it. But we do need to realize that there
> is a race and unless we keep track of the ordering (outside
> of GEOM), we will always run into some scenarios where the
> tasting results in warnings or errors...

This is not really a race, and the ordering is not important.

Let's do the following:

	# gmirror create test da0
	# gpt create /dev/mirror/test

Let's assume GPT will be given providers for tasting before MIRROR on
boot:

	da0 arrives

	GEOM: GPT->taste(da0)
	GPT: Report GPT corrupted (da0 is not the size we expect)

	GEOM: MIRROR->taste(da0)
	MIRROR: g_new_providerf(mirror/test)

	GEOM: GPT->taste(mirror/test)
	GPT: GPT is ok, configure partitions, etc.

Now let's reverse the order: MIRROR goes first, then GPT:

	da0 arrives

	GEOM: MIRROR->taste(da0)
	MIRROR: g_new_providerf(mirror/test)

	GEOM: GPT->taste(da0)
	GPT: Report GPT corrupted (da0 is not the size we expect)

	GEOM: GPT->taste(mirror/test)
	GPT: GPT is ok, configure partitions, etc.

The result is the same, because GEOM still presents da0 to GPT for
tasting even if MIRROR decides to use it.
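
The two sequences above can be replayed with a toy model. This is only
a sketch in Python, not GEOM code: the function name, the sector counts
and the one-sector MIRROR metadata size are assumptions made up for the
illustration.

```python
# Toy model of the tasting sequences above. Not GEOM code: every name and
# number here is a made-up assumption for illustration.
DA0_SECTORS = 1000
MIRROR_META_SECTORS = 1                       # assumed: MIRROR keeps 1 sector
GPT_EXPECTED = DA0_SECTORS - MIRROR_META_SECTORS  # size seen at "gpt create"


def taste_all(class_order):
    """Offer each provider to every class in class_order; return (log, ok)."""
    log, gpt_ok = [], []
    pending = [("da0", DA0_SECTORS)]          # providers still to be tasted
    while pending:
        name, sectors = pending.pop(0)
        for cls in class_order:
            if cls == "MIRROR":
                if name == "da0":             # MIRROR metadata found on da0
                    pending.append(
                        ("mirror/test", sectors - MIRROR_META_SECTORS))
                    log.append("MIRROR: new provider mirror/test")
            elif cls == "GPT":
                if sectors == GPT_EXPECTED:
                    gpt_ok.append(name)
                    log.append(f"GPT: ok on {name}")
                else:
                    log.append(f"GPT: corrupted on {name}")
    return log, gpt_ok


gpt_first = taste_all(["GPT", "MIRROR"])
mirror_first = taste_all(["MIRROR", "GPT"])
```

In both orders the log contains the same three events and GPT ends up
accepting only mirror/test; only the position of the "corrupted"
message in the log differs.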

I do agree that it is hard to cope with, especially for metadata formats
that are given and that we cannot extend.

The real problem here is that in some situations (for some metadata
formats) a class cannot auto-discover its providers reliably.

GPT is not alone here. There is a similar issue with UFS labels. You
have a 500GB disk da0 and a 200GB partition da0a starting at sector 0.
You create a UFS file system on da0a:

	# newfs -L foo /dev/da0a

The LABEL class is given disk da0 for tasting. How can it tell whether
the file system was created on da0 or on da0a? What we do now is look
inside the UFS metadata and get the file system size from there. If the
file system size is equal to the provider's size, this is our provider.
In this case the file system size is 200GB and the da0 size is 500GB,
so we skip da0. This is not perfect, because one can create a UFS file
system that is smaller than the provider:

	# newfs -s 419430400 -L foo /dev/da0

We created a 200GB file system on the 500GB da0. Now the LABEL class
will incorrectly skip da0 during tasting, because of the size mismatch.
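
The arithmetic behind the two cases can be checked with a few lines of
Python. This is a sketch of the heuristic as described above, not the
real LABEL code; the helper name is hypothetical, and 512-byte sectors
and GB meaning 2**30 bytes are assumptions.

```python
# Sketch of the size heuristic described above. Not the real LABEL code;
# 512-byte sectors and GB meaning 2**30 bytes are assumptions.
SECTOR_BYTES = 512
GB = 1024 ** 3


def label_accepts(provider_bytes, fs_sectors):
    """Accept a provider only if the file system fills it exactly."""
    return fs_sectors * SECTOR_BYTES == provider_bytes


DA0_BYTES = 500 * GB
DA0A_BYTES = 200 * GB
FS_SECTORS = 419430400          # the value passed to "newfs -s" above

# Case 1: newfs on da0a. Skipping da0 is the right answer.
case1_da0 = label_accepts(DA0_BYTES, FS_SECTORS)
case1_da0a = label_accepts(DA0A_BYTES, FS_SECTORS)

# Case 2: "newfs -s 419430400" on da0. When tasting da0, LABEL sees the
# same two numbers as in case 1, so it skips da0 again, this time wrongly.
case2_da0 = label_accepts(DA0_BYTES, FS_SECTORS)
```

The check gets identical inputs for da0 in both cases, so no rule based
on these two sizes alone can tell them apart.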

The problem is similar to the GPT one: neither class can work reliably
in auto-discovery mode.

It is also a problem that a provider can have multiple consumers
attached. The solution I use in some classes (really a side effect) is
to open the provider exclusively for writing during tasting. Even if a
MIRROR provider isn't mounted, MIRROR keeps its components open
exclusively for writing all the time (the main reason was to allow
synchronization). Once MIRROR opens a provider for writing, every
consumer attached to that provider gets a spoiled event (at least those
that depend on metadata). Going back to our example: even if GPT
configures partitions on da0, it should remove them on the spoiled event
once MIRROR opens this provider for writing. In the end GPT will
configure partitions on mirror/test. This is of course not perfect, but
it reduces the mess in /dev/ a bit.
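
A toy model of that spoil sequence, again in Python and again only a
sketch: the Provider and Gpt classes and their method names are
hypothetical and do not correspond to real GEOM interfaces.

```python
# Toy model of the spoil mechanism described above. Not GEOM code: the
# classes and method names are hypothetical.
class Provider:
    def __init__(self, name):
        self.name = name
        self.consumers = []               # objects attached to this provider

    def open_exclusive_write(self, opener):
        # Everyone attached, other than the opener, is told that the data
        # underneath may change.
        for consumer in list(self.consumers):
            if consumer is not opener:
                consumer.spoiled(self)


class Gpt:
    def __init__(self):
        self.configured = []              # providers with partitions set up

    def configure(self, provider):
        provider.consumers.append(self)
        self.configured.append(provider.name)

    def spoiled(self, provider):
        # Drop the partitions and detach from the spoiled provider.
        provider.consumers.remove(self)
        self.configured.remove(provider.name)


da0, mirror_test = Provider("da0"), Provider("mirror/test")
gpt, mirror = Gpt(), object()             # MIRROR only needs an identity here

gpt.configure(da0)                        # suppose GPT wrongly accepted da0
da0.open_exclusive_write(mirror)          # MIRROR takes da0; GPT is spoiled
gpt.configure(mirror_test)                # GPT then tastes mirror/test
```

At the end gpt.configured holds only mirror/test and nothing is left
attached to da0, which matches the outcome described above.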

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
