Date: Tue, 18 Mar 2003 10:22:36 +1030
From: Greg 'groggy' Lehey <grog@FreeBSD.org>
To: Scott Mitchell <scott+freebsd@fishballoon.org>
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: Strange crash, possibly vinum-related
Message-ID: <20030317235236.GH9422@wantadilla.lemis.com>
In-Reply-To: <20030317105828.GA23237@tuatara.fishballoon.org>
References: <20030310231532.GD522@tuatara.fishballoon.org> <20030317105828.GA23237@tuatara.fishballoon.org>

On Monday, 17 March 2003 at 10:58:28 +0000, Scott Mitchell wrote:
> On Mon, Mar 10, 2003 at 11:15:32PM +0000, Scott Mitchell wrote:
>> Hi all,
>>
>> I wonder if anyone out there can shed any light on this:
>>
>> A drive failed on one of our Vinum-powered RAID-5 arrays over the weekend.
>> This morning, we swapped out the offending drive (hot-swappable SCSI
>> hardware), disklabel-ed it and restarted the offending subdisk.  Everything
>> seemed fine at this point, with vinum happily reviving the stale subdisk.
>>
>> However, twenty minutes later, with the revive 29% complete, I got this in
>> /var/log/messages:
>>
>> Mar 10 11:39:50 kokako vinum[12708]: can't revive raid.p0.s0: Invalid argument
>>
>> 'vinum list' was also showing an error message, which I foolishly didn't
>> capture, something along the lines of 'the revive process died'.  Lacking
>> any better ideas, I started the subdisk again.  The revival seemed to pick
>> up where it left off.
>>
>> Half an hour later, the box rebooted :-(  I wasn't actually watching it at
>> the time, so I don't know if it finished reviving the subdisk or not.
>> There's no indication in the logs as to what happened, but the timing of
>> the reboot is consistent with it happening around the time the subdisk
>> would have come back to life.
>>
>> Once the box came back up, I restarted the subdisk yet again (I had to
>> create the drive again first), with the RAID volume unmounted.  This time
>> the process finished without complaints and things seem to be working as
>> well as ever since then.
> [logs, etc. snipped...]
>
>
> No takers?

I've been intending to do so, but there's not much I can do based on
the information you've supplied.

> Maybe someone who's done this (replacing a failed Vinum drive on
> hot-swap SCSI hardware) before can at least tell me whether:
>
> - I should have done some camcontrol magic before rebuilding
>   the drive?

I can't see anything in particular you would need to do, but then I
haven't seen the details.

> - Rebuilding the drive without unmounting the volume first was
>   just asking for trouble?

There have been reports of this kind of problem, mainly from Vallo
Kallaste, who has also responded.  I haven't seen it myself, and I
haven't heard of panics as a result.  But yes, umounting is a good
precaution.

> - -hackers or even -stable is a better venue for this kind of problem?

-questions will do fine.

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers