From: Scott Mitchell
Date: Mon, 17 Mar 2003 10:58:28 +0000
To: freebsd-questions@FreeBSD.ORG
Subject: Re: Strange crash, possibly vinum-related
Message-ID: <20030317105828.GA23237@tuatara.fishballoon.org>
In-Reply-To: <20030310231532.GD522@tuatara.fishballoon.org>
X-Operating-System: FreeBSD 4.8-PRERELEASE i386

On Mon, Mar 10, 2003 at 11:15:32PM +0000, Scott Mitchell wrote:
> Hi all,
>
> I wonder if anyone out there can shed any light on this:
>
> A drive failed on one of our Vinum-powered RAID-5 arrays over the weekend.
> This morning, we swapped out the offending drive (hot-swappable SCSI
> hardware), disklabel-ed it and restarted the offending subdisk. Everything
> seemed fine at this point, with vinum happily reviving the stale subdisk.
>
> However, twenty minutes later, with the revive 29% complete, I got this in
> /var/log/messages:
>
> Mar 10 11:39:50 kokako vinum[12708]: can't revive raid.p0.s0: Invalid argument
>
> 'vinum list' was also showing an error message, which I foolishly didn't
> capture, something along the lines of 'the revive process died'. Lacking
> any better ideas, I started the subdisk again. The revival seemed to pick
> up where it left off.
>
> Half an hour later, the box rebooted :-( I wasn't actually watching it at
> the time, so I don't know if it finished reviving the subdisk or not.
> There's no indication in the logs as to what happened, but the timing of
> the reboot is consistent with it happening around the time the subdisk
> would have come back to life.
>
> Once the box came back up, I restarted the subdisk yet again (I had to
> create the drive again first), with the RAID volume unmounted. This time
> the process finished without complaints and things seem to be working as
> well as ever since then.

[logs, etc. snipped...]

No takers? Maybe someone who's done this (replacing a failed Vinum drive on
hot-swap SCSI hardware) before can at least tell me whether:

- I should have done some camcontrol magic before rebuilding the drive?
- Rebuilding the drive without unmounting the volume first was just asking
  for trouble?
- -hackers or even -stable is a better venue for this kind of problem?

Many thanks in advance,

	Scott

-- 
===========================================================================
Scott Mitchell           | PGP Key ID | "Eagles may soar, but weasels
Cambridge, England       | 0x54B171B9 |  don't get sucked into jet engines"
scott at fishballoon.org | 0xAA775B8B |              -- Anon