Date: Wed, 5 Jan 2011 14:46:17 +0100
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: Alexander Motin <mav@FreeBSD.org>
Cc: svn-src-projects@freebsd.org, src-committers@freebsd.org, Warner Losh <imp@FreeBSD.org>
Subject: Re: svn commit: r216984 - projects/graid/head/sys/geom/raid
Message-ID: <20110105134617.GC1740@garage.freebsd.pl>
In-Reply-To: <4D2466FE.3000307@FreeBSD.org>
References: <201101050019.p050Je5J059533@svn.freebsd.org> <20110105083906.GB1740@garage.freebsd.pl> <4D2466FE.3000307@FreeBSD.org>

On Wed, Jan 05, 2011 at 02:41:34PM +0200, Alexander Motin wrote:
> On 05.01.2011 10:39, Pawel Jakub Dawidek wrote:
> >On Wed, Jan 05, 2011 at 12:19:40AM +0000, Warner Losh wrote:
> >>Author: imp
> >>Date: Wed Jan 5 00:19:40 2011
> >>New Revision: 216984
> >>URL: http://svn.freebsd.org/changeset/base/216984
> >>
> >>Log:
> >>  First pass at error recovery: if the first disk that we get errors on
> >>  has a problem, try from the second one. Note info about possible bad
> >>  sector remap attempt through write, and some ideas on when to eject
> >>  the subdisk from the disk.
> >
> >My ideas what to do on I/O error mostly matches yours:
> >- On read error, read from the other disk, write the data back to the
> >  first disk. Before you return the data up, you must wait for write to
> >  complete. If you won't wait, you can lose race with new write request
> >  going into the same area and you will overwrite new data with the old
> >  one.
>
> In design document we have planned range locking mechanism for use here
> and during synchronization/rebuild.

Range locking is definitely a good idea. It is a must-have for RAID4/RAID5,
but also for RAID1 when you synchronize.

> >- On write error you want to mark disk as broken immediately, as from
> >  now on it has stale data and can't be trusted.
>
> Right. As further steps we have discussed idea of keeping such disks as
> part of array, marking them as dirty, avoiding reads from them. If main
> disk instantly fail, partially broken disk is probably better than nothing.

I agree. This is more intuitive and makes it easier for the user to see
exactly which disk broke and why.

> >How do you plan to detect if there was unclean shutdown and you need to
> >synchronize the disks?
>
> It depends from metadata format. Intel metadata, according to Linux
> sources, seem to have some flags related to the case. I have planned to
> implement logic used by gmirror (dirty on first write and clean on close
> or after timeout) using that flags and metadata sequence numbers.

I was also thinking about flash-friendly resync. Currently gmirror
synchronizes the entire thing by reading data from one component and
writing it to the other one. Flash-friendly synchronization will read data
from both components and write only if they differ.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
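
The flash-friendly resync described above (read from both components, write only where they differ) can be sketched in user space as follows. This is a minimal Python illustration, not GEOM or gmirror code: the function name `flash_friendly_resync`, the use of in-memory `bytearray` objects as stand-ins for disk components, and the 4096-byte default block size are all assumptions made for the example.

```python
def flash_friendly_resync(source: bytearray, target: bytearray,
                          block_size: int = 4096) -> int:
    """Bring `target` in sync with `source`, writing only differing blocks.

    Hypothetical user-space sketch: both arguments stand in for mirror
    components of equal size. Returns the number of blocks written.
    """
    if len(source) != len(target):
        raise ValueError("components must be the same size")
    if block_size <= 0:
        raise ValueError("block_size must be positive")

    blocks_written = 0
    for offset in range(0, len(source), block_size):
        end = offset + block_size
        good = source[offset:end]        # read from the up-to-date component
        stale = target[offset:end]       # read from the component being synced
        if stale != good:
            # Only differing blocks cost a write on the target.
            target[offset:end] = good
            blocks_written += 1
    return blocks_written
```

With two 16-byte components that differ in a single byte and a 4-byte block size, one block is rewritten and a second pass writes nothing. The trade-off is an extra read of the target for every block in exchange for fewer writes, which is the point on flash media.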