From owner-freebsd-stable@FreeBSD.ORG Mon Jan 2 14:36:51 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@FreeBSD.org
Delivered-To: freebsd-stable@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 46D8D16A41F; Mon, 2 Jan 2006 14:36:51 +0000 (GMT) (envelope-from sty@blosphere.net)
Received: from vanessa.ncm.brain.riken.jp (vanessa.ncm.brain.riken.jp [134.160.174.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id B43FC43D6A; Mon, 2 Jan 2006 14:36:45 +0000 (GMT) (envelope-from sty@blosphere.net)
Received: from [192.168.0.4] (d245.HtokyoFL24.vectant.ne.jp [210.131.223.245]) by vanessa.ncm.brain.riken.jp (Postfix) with ESMTP id 88F416168; Mon, 2 Jan 2006 23:36:42 +0900 (JST)
Message-ID: <43B93A77.5070502@blosphere.net>
Date: Mon, 02 Jan 2006 23:36:39 +0900
From: Tommi Lätti <sty@blosphere.net>
User-Agent: Mozilla Thunderbird 1.0.7 (Windows/20050923)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: freebsd-stable@FreeBSD.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: graid3 lost disk - array still fails
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Mon, 02 Jan 2006 14:36:51 -0000

A few hours ago, my customer's graid3 array crashed after losing one hard drive, and it's unable to recover. The data is easily replaceable, so no lost sleep over that, but I'd really like to hear some ideas about what happened, if possible.

Since this was a 'do-it-cheaply' setup, we got 3x160G Seagates, all old PATA drives, and put them in as primary master, secondary master, and secondary slave. Not the best possible combination, I know, but it worked. The secondary master died a bit earlier and the array started rebuilding, and then somebody rebooted the machine while it was rebuilding ad3...
ad2: FAILURE - READ_DMA status=51 error=40 LBA=286404016
GEOM_RAID3: Request failed. ad2[READ(offset=146638856192, length=8192)]
GEOM_RAID3: Request failed. raid3/gr0[READ(offset=293277712384, length=16384)]
GEOM_RAID3: Device gr0: provider ad2 disconnected.
GEOM_RAID3: Device gr0: provider raid3/gr0 destroyed.
GEOM_RAID3: Device gr0: rebuilding provider ad3 stopped.
GEOM_RAID3: Synchronization request failed (error=6). ad3[WRITE(offset=973602816, length=65536)]
GEOM_RAID3: Device gr0: provider ad3 disconnected.
GEOM_RAID3: Device gr0 destroyed.

So now that ad2 is removed, graid3 still reports that ad3 is broken (GEOM_RAID3: Component ad3 (device gr0) broken, skipping.) and then proceeds to tear down the array, since that was already the second lost disk and there are not enough components left...

Now, the question is: is there any way I could lie to graid3 that ad3 is okay? I'm pretty sure there were no writes to the array during the time ad2 crashed, so maybe some data would still be recoverable?

-- 
br, Tommi