Date: Sun, 24 Apr 2005 19:04:15 +0200
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: Paul Mather <paul@gromit.dlib.vt.edu>
Cc: freebsd-geom@freebsd.org
Subject: Re: Is there a "disconnected" state for geom_mirror providers?
Message-ID: <20050424170415.GC837@darkness.comp.waw.pl>
In-Reply-To: <1114360313.77313.14.camel@zappa.Chelsea-Ct.Org>
References: <1114308801.71938.2.camel@zappa.Chelsea-Ct.Org> <20050424094148.GZ837@darkness.comp.waw.pl> <1114360313.77313.14.camel@zappa.Chelsea-Ct.Org>

On Sun, Apr 24, 2005 at 12:31:53PM -0400, Paul Mather wrote:
+> > If gmirror gets an error for READ or WRITE operation, it assumes provider
+> > is broken. This is very important - if it will be marked only as stale,
+> > it will be connected, resynchronization will start, but because there
+> > was an error on provider, it probably will be disconnected again and we
+> > have endless loop.
+>
+> I guess it depends on what caused the disconnection in the first place.
+> If it was a READ of a bad sector, it could be that subsequent
+> resynchronisation will force a block reallocation of the bad block and
+> the drive will no longer be "broken."

So you want me to count the number of failures for every sector and mark
the component as broken if I have two failures related to the same sector,
or something like that? :)

+> > Stale provider is when it is hot-plug and you remove it; when you use
+> > 'deactivate' command to disconnect it; when it doesn't show up on mirror
+> > start, but later.
+> >
+> > The rule is simple: when an error was returned on I/O operation, provider
+> > is marked as broken.
+>
+> Thanks for the clarification. That makes sense. I just need to
+> remember "gmirror forget" before I attempt to add back in the disk in my
+> "TIMEOUT - WRITE_DMA" not-really-broken broken disk case. :-)

If reallocation happens here, there should be no I/O error visible to
gmirror.

+> The shame about it being deleted from the mirror as opposed to marked as
+> "broken" is you lose info (shown in "gmirror list") about the broken
+> component priority, etc., which is useful for when you add a replacement
+> device (or re-add the same one, as in my case).

You can use 'gmirror dump /dev/<your_component>'.

+> If you marked a component as "broken" (but still listed as part of the
+> mirror), you could add a "-f" option to "gmirror rebuild" to force
+> rebuilding onto it a la RAIDframe. :-)

This is not so simple. I don't store any information on a broken component
saying that it is broken, because, e.g., the bad sector could be the sector
holding the metadata. Other components are informed that something wrong
is going on.

How can one remove such a broken component for good? Let's say you were
able to read the metadata from the component, but you cannot write there
any more. How can you easily replace this component?

This complicates things a lot, and I don't need more complications if I
want gmirror to stay reliable (which I hope it is now).

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
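
The stale-versus-broken rule described in this message can be sketched
roughly as follows. This is only an illustration of the rule as stated in
the thread, not gmirror's actual code (gmirror is written in C); the names
Event, State, classify and may_reconnect_and_resync are hypothetical and
were made up for the example.

```python
from enum import Enum


class Event(Enum):
    """Reasons a component can drop out of a mirror (hypothetical names)."""
    IO_ERROR = "io_error"                # a READ or WRITE returned an error
    HOTPLUG_REMOVED = "hotplug_removed"  # hot-plug device was pulled
    DEACTIVATED = "deactivated"          # admin used the 'deactivate' command
    LATE_ARRIVAL = "late_arrival"        # absent at mirror start, showed up later


class State(Enum):
    STALE = "stale"
    BROKEN = "broken"


def classify(event: Event) -> State:
    """Apply the rule from the thread: an I/O error means broken,
    every other listed disconnect reason means stale."""
    return State.BROKEN if event is Event.IO_ERROR else State.STALE


def may_reconnect_and_resync(state: State) -> bool:
    """Only a stale component is reconnected and resynchronized.
    Treating a broken one the same way risks the endless
    connect/resync/disconnect loop described above."""
    return state is State.STALE


if __name__ == "__main__":
    for event in Event:
        state = classify(event)
        print(event.value, "->", state.value,
              "| resync:", may_reconnect_and_resync(state))
```

Keeping the decision as a single function of the triggering event mirrors
the "rule is simple" wording above: no per-sector failure counting is
involved.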