Date:      Wed, 5 Jan 2011 14:46:17 +0100
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        svn-src-projects@freebsd.org, src-committers@freebsd.org, Warner Losh <imp@FreeBSD.org>
Subject:   Re: svn commit: r216984 - projects/graid/head/sys/geom/raid
Message-ID:  <20110105134617.GC1740@garage.freebsd.pl>
In-Reply-To: <4D2466FE.3000307@FreeBSD.org>
References:  <201101050019.p050Je5J059533@svn.freebsd.org> <20110105083906.GB1740@garage.freebsd.pl> <4D2466FE.3000307@FreeBSD.org>

On Wed, Jan 05, 2011 at 02:41:34PM +0200, Alexander Motin wrote:
> On 05.01.2011 10:39, Pawel Jakub Dawidek wrote:
> >On Wed, Jan 05, 2011 at 12:19:40AM +0000, Warner Losh wrote:
> >>Author: imp
> >>Date: Wed Jan  5 00:19:40 2011
> >>New Revision: 216984
> >>URL: http://svn.freebsd.org/changeset/base/216984
> >>
> >>Log:
> >>   First pass at error recovery: if the first disk that we get errors on
> >>   has a problem, try from the second one.  Note info about possible bad
> >>   sector remap attempt through write, and some ideas on when to eject
> >>   the subdisk from the disk.
> >
> >My ideas on what to do on an I/O error mostly match yours:
> >- On read error, read from the other disk, write the data back to the
> >   first disk.  Before you return the data up, you must wait for the write
> >   to complete.  If you don't wait, you can lose a race with a new write
> >   request going into the same area and overwrite the new data with the
> >   old data.
>
> In the design document we have planned a range locking mechanism for use
> here and during synchronization/rebuild.

Range locking is definitely a good idea. It is a must-have for
RAID4/RAID5, but also for RAID1 when you synchronize.
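
To make the range-locking idea concrete, here is a rough user-space
sketch, not graid code. All names (range_lock_table, range_trylock,
range_unlock) and the fixed-size table are assumptions made for
illustration. A real implementation would queue the blocked request
instead of returning a failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LOCKS 16	/* arbitrary limit chosen for this sketch */

struct range_lock {
	uint64_t start;		/* first locked byte offset */
	uint64_t length;	/* number of locked bytes */
	bool	 in_use;
};

struct range_lock_table {
	struct range_lock locks[MAX_LOCKS];
};

/* True if half-open ranges [a, a+alen) and [b, b+blen) overlap. */
static bool
ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
	return (a < b + blen && b < a + alen);
}

/*
 * Try to lock [start, start+length).  Returns a slot index on success,
 * or -1 if the range overlaps a held lock or the table is full.
 */
static int
range_trylock(struct range_lock_table *t, uint64_t start, uint64_t length)
{
	int free_slot = -1;

	assert(length > 0);
	for (int i = 0; i < MAX_LOCKS; i++) {
		const struct range_lock *l = &t->locks[i];

		if (!l->in_use) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (ranges_overlap(start, length, l->start, l->length))
			return (-1);
	}
	if (free_slot >= 0)
		t->locks[free_slot] =
		    (struct range_lock){ start, length, true };
	return (free_slot);
}

/* Release a lock previously returned by range_trylock(). */
static void
range_unlock(struct range_lock_table *t, int slot)
{
	assert(slot >= 0 && slot < MAX_LOCKS && t->locks[slot].in_use);
	t->locks[slot].in_use = false;
}
```

The read-repair path would take the lock before writing the recovered
data back and drop it only after that write completes, so a new write
to the same area cannot slip in between and get overwritten.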

> >- On write error you want to mark disk as broken immediately, as from
> >   now on it has stale data and can't be trusted.
>
> Right. As further steps we have discussed the idea of keeping such disks
> as part of the array, marking them as dirty, and avoiding reads from them.
> If the main disk instantly fails, a partially broken disk is probably
> better than nothing.

I agree that this is more intuitive and makes it easier for the user to
see exactly which disk broke and why.
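
A small sketch of that policy, under my own assumptions: the state
names and both functions below are hypothetical and do not come from
the graid sources. A disk that saw a write error is demoted to dirty,
reads prefer a healthy disk, and a dirty disk is used only when nothing
better is left.

```c
#include <assert.h>
#include <stddef.h>

enum disk_state {
	DISK_ACTIVE,	/* healthy, data is current */
	DISK_DIRTY,	/* saw a write error, may hold stale data */
	DISK_FAILED	/* missing or unusable */
};

/* A write error demotes a healthy disk to dirty instead of ejecting it. */
static void
note_write_error(enum disk_state *state)
{
	if (*state == DISK_ACTIVE)
		*state = DISK_DIRTY;
}

/*
 * Pick a disk to read from: the first active disk if there is one,
 * otherwise the first dirty disk as a last resort, otherwise -1.
 */
static int
pick_read_disk(const enum disk_state *states, size_t ndisks)
{
	int fallback = -1;

	assert(states != NULL);
	for (size_t i = 0; i < ndisks; i++) {
		if (states[i] == DISK_ACTIVE)
			return ((int)i);
		if (states[i] == DISK_DIRTY && fallback < 0)
			fallback = (int)i;
	}
	return (fallback);
}
```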

> >How do you plan to detect if there was unclean shutdown and you need to
> >synchronize the disks?
>
> It depends on the metadata format. Intel metadata, according to Linux
> sources, seems to have some flags related to this case. I have planned to
> implement the logic used by gmirror (dirty on first write and clean on
> close or after timeout) using those flags and metadata sequence numbers.
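
The "dirty on first write, clean on close or after timeout" logic
quoted above could look roughly like the sketch below. This is not
gmirror's code and not the Intel metadata layout. The structures, the
generation counter and the caller-supplied clock are all assumptions
made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct vol_meta {
	bool	 dirty;		/* set while writes may be in flight */
	uint32_t generation;	/* bumped on every metadata update */
};

struct volume {
	struct vol_meta meta;
	uint64_t last_write;	/* time of the last write, in seconds */
	uint64_t idle_timeout;	/* seconds of idleness before marking clean */
};

/* Stand-in for writing updated metadata to all components. */
static void
meta_update(struct volume *v, bool dirty)
{
	v->meta.dirty = dirty;
	v->meta.generation++;
}

/* Mark dirty on the first write only; later writes just refresh the time. */
static void
volume_write(struct volume *v, uint64_t now)
{
	if (!v->meta.dirty)
		meta_update(v, true);
	v->last_write = now;
}

/* Called periodically: mark clean once the volume has been idle long enough. */
static void
volume_idle_tick(struct volume *v, uint64_t now)
{
	assert(now >= v->last_write);
	if (v->meta.dirty && now - v->last_write >= v->idle_timeout)
		meta_update(v, false);
}

/* Mark clean on last close. */
static void
volume_close(struct volume *v)
{
	if (v->meta.dirty)
		meta_update(v, false);
}

/* At startup, a dirty flag in on-disk metadata means an unclean shutdown. */
static bool
volume_needs_resync(const struct vol_meta *on_disk)
{
	return (on_disk->dirty);
}
```

The generation counter stands in for the metadata sequence numbers: a
component whose metadata carries a lower number than its peers missed
an update and cannot be the synchronization source.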

I was also thinking about flash-friendly resync. Currently gmirror
synchronizes the entire thing by reading data from one component and
writing it to the other. Flash-friendly synchronization would read data
from both components and write only if they differ.
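
As a sketch of the compare-before-write idea, with in-memory buffers
standing in for the two components (the function name and the
block-at-a-time granularity are my assumptions, not existing gmirror
behaviour):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Bring dst in line with src, block by block, writing only the blocks
 * that differ.  Returns the number of blocks actually written.
 */
static size_t
resync_compare(const unsigned char *src, unsigned char *dst,
    size_t nblocks, size_t blksz)
{
	size_t written = 0;

	assert(src != NULL && dst != NULL && blksz > 0);
	for (size_t i = 0; i < nblocks; i++) {
		const unsigned char *s = src + i * blksz;
		unsigned char *d = dst + i * blksz;

		if (memcmp(s, d, blksz) != 0) {
			memcpy(d, s, blksz);	/* the only "write" we issue */
			written++;
		}
	}
	return (written);
}
```

The cost is an extra read of the destination for every block. The gain
is that identical blocks generate no writes at all.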

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!



