From: Pawel Jakub Dawidek
To: Alexander Motin
Cc: svn-src-projects@freebsd.org, src-committers@freebsd.org, Warner Losh
Date: Wed, 5 Jan 2011 14:46:17 +0100
Subject: Re: svn commit: r216984 - projects/graid/head/sys/geom/raid
Message-ID: <20110105134617.GC1740@garage.freebsd.pl>
In-Reply-To: <4D2466FE.3000307@FreeBSD.org>

On Wed, Jan 05, 2011 at 02:41:34PM +0200, Alexander Motin wrote:
> On 05.01.2011 10:39, Pawel Jakub Dawidek wrote:
> >On Wed, Jan 05, 2011 at 12:19:40AM +0000, Warner Losh wrote:
> >>Author: imp
> >>Date: Wed Jan 5 00:19:40 2011
> >>New Revision: 216984
> >>URL: http://svn.freebsd.org/changeset/base/216984
> >>
> >>Log:
> >>  First pass at error recovery: if the first disk that we get errors on
> >>  has a problem, try from the second one. Note info about possible bad
> >>  sector remap attempt through write, and some ideas on when to eject
> >>  the subdisk from the disk.
> >
> >My ideas what to do on I/O error mostly matches yours:
> >- On read error, read from the other disk, write the data back to the
> >  first disk. Before you return the data up, you must wait for write to
> >  complete. If you won't wait, you can lose race with new write request
> >  going into the same area and you will overwrite new data with the old
> >  one.
>
> In design document we have planned range locking mechanism for use here
> and during synchronization/rebuild.

Range locking is definitely a good idea. It is a must-have for
RAID4/RAID5, and also for RAID1 during synchronization.

> >- On write error you want to mark disk as broken immediately, as from
> >  now on it has stale data and can't be trusted.
>
> Right. As further steps we have discussed idea of keeping such disks as
> part of array, marking them as dirty, avoiding reads from them. If main
> disk instantly fail, partially broken disk is probably better than nothing.

I agree that this is more intuitive, and it makes it easier for the user
to see exactly which disk broke and why.

> >How do you plan to detect if there was unclean shutdown and you need to
> >synchronize the disks?
>
> It depends from metadata format.
> Intel metadata, according to Linux
> sources, seem to have some flags related to the case. I have planned to
> implement logic used by gmirror (dirty on first write and clean on close
> or after timeout) using that flags and metadata sequence numbers.

I was also thinking about a flash-friendly resync. Currently gmirror
synchronizes everything by reading all data from one component and
writing it to the other. A flash-friendly synchronization would read
data from both components and write only where they differ.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
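
The flash-friendly resync described in the message can be sketched
roughly as below. This is only an illustration under stated assumptions:
both components are modelled as in-memory buffers, and the function name
resync_flash_friendly() and the block-size parameter are hypothetical.
It is not gmirror or graid code, and it ignores the range locking a real
implementation would need against concurrent writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of gmirror or graid: bring `dst` in sync
 * with `src` one block at a time, writing only the blocks that differ.
 * Both components are modelled as in-memory buffers of `len` bytes.
 * Returns the number of blocks that had to be written.
 */
static size_t
resync_flash_friendly(const uint8_t *src, uint8_t *dst, size_t len,
    size_t blksz)
{
	size_t off, written = 0;

	/* Assumption for this sketch: len is a whole number of blocks. */
	assert(blksz > 0 && len % blksz == 0);

	for (off = 0; off < len; off += blksz) {
		/* "Read" the block from both components and compare. */
		if (memcmp(src + off, dst + off, blksz) != 0) {
			/* Only stale blocks cost a write on the target. */
			memcpy(dst + off, src + off, blksz);
			written++;
		}
	}
	return (written);
}
```

The trade-off is an extra read of the target component for every block
in exchange for skipping writes of blocks that are already identical,
which is the point on media where writes wear the device.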