Date: Sun, 29 Oct 2000 09:39:31 +1030
From: Greg Lehey <grog@lemis.com>
To: Jesse <j@lumiere.net>
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: handling disk failures with vinum
Message-ID: <20001029093931.H22174@wantadilla.lemis.com>
In-Reply-To: <Pine.BSF.4.21.0010280523380.1282-100000@localhost>; from j@lumiere.net on Sat, Oct 28, 2000 at 06:01:34AM -0700
References: <Pine.BSF.4.21.0010280523380.1282-100000@localhost>

[Format recovered--see http://www.lemis.com/email/email-format.html]

Please don't wrap log output.

On Saturday, 28 October 2000 at  6:01:34 -0700, Jesse wrote:
>
> Hi,
>
> I've set up two 30GB IDE drives for RAID 1 mirroring.
> It works during normal conditions, but I'd like to test some failure
> modes.
>
> I tried disconnecting the power to one of the drives.  I got a bunch of
> console messages -- access to the filesystems on the mirror blocked.  Once
> I powered up the second drive again, accesses completed.  Here are the logs:
>
> Oct 28 05:21:40 leaf /kernel: ata1-master: no status, reselecting device
> Oct 28 05:21:40 leaf last message repeated 757 times
> Oct 28 05:21:40 leaf /kernel: ata1-master: timeout waiting to give command=c8 s=ff e=ff
> Oct 28 05:21:40 leaf /kernel: ad2: error executing command - resetting
> Oct 28 05:21:40 leaf /kernel: ata1: resetting devices .. done
> Oct 28 05:21:50 leaf /kernel: ad2: READ command timeout tag=0 serv=0 - resetting
> Oct 28 05:21:50 leaf /kernel: ata1: resetting devices .. done
>
> So.. is vinum capable of continuing to operate when a drive fails,
> or will the system always die block on accesses and require a
> reboot?

I don't understand this comment; it contradicts your previous
statement above.  But this is a disk subsystem issue, not a Vinum
issue.  The only way Vinum can tell if a disk is dead is when the
driver tells it so.  From your output above, you only waited 10
seconds; the drivers should take a reasonable amount of time to retry
before they give up on a drive, but it's possible that ata is waiting
too long.  If so, please enter a PR against the ata driver; I know
from my own experience that the CAM drivers (SCSI) don't have
problems.

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply.
For more information, see http://www.lemis.com/questions.html
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
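
[Archive note: the division of labour described in the reply (the driver retries for a bounded time, then reports the drive dead, and only then can the volume manager fail over to the other copy) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyDrive, ToyMirror, DriveDead and the retry budget of 3 are hypothetical names and values invented for illustration, not code or parameters from Vinum or the ata driver.]

```python
# Toy model only: none of these names or values come from Vinum or the ata driver.


class DriveDead(Exception):
    """Raised by the toy driver once it has given up on a drive."""


class ToyDrive:
    """Stands in for a disk driver with a bounded retry budget."""

    def __init__(self, name, powered=True, max_retries=3):
        self.name = name
        self.powered = powered
        self.max_retries = max_retries  # hypothetical retry budget
        self.attempts = 0               # commands attempted so far

    def read(self, block):
        # Retry a bounded number of times, then report the failure upward.
        for _ in range(self.max_retries):
            self.attempts += 1
            if self.powered:
                return f"{self.name}:block{block}"
        raise DriveDead(self.name)


class ToyMirror:
    """Upper layer: learns that a drive is dead only when the driver says so."""

    def __init__(self, drives):
        self.drives = list(drives)
        self.down = set()  # names of drives the driver has given up on

    def read(self, block):
        for drive in self.drives:
            if drive.name in self.down:
                continue  # already reported dead, do not touch it again
            try:
                return drive.read(block)
            except DriveDead:
                self.down.add(drive.name)  # mark it down, try the next copy
        raise IOError("no working copy of the data left")


if __name__ == "__main__":
    ad2 = ToyDrive("ad2", powered=False)  # the drive that lost power
    ad0 = ToyDrive("ad0")
    mirror = ToyMirror([ad2, ad0])
    print(mirror.read(7), "down:", sorted(mirror.down))
```

[In this model the mirror keeps serving reads from the surviving drive, but only after the toy driver has exhausted its retries. A driver that retries indefinitely would leave the upper layer blocked, which matches the symptom in the quoted logs.]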