From owner-freebsd-stable@FreeBSD.ORG Mon Jan 2 14:36:51 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@FreeBSD.org
Delivered-To: freebsd-stable@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 46D8D16A41F; Mon, 2 Jan 2006 14:36:51 +0000 (GMT) (envelope-from sty@blosphere.net)
Received: from vanessa.ncm.brain.riken.jp (vanessa.ncm.brain.riken.jp [134.160.174.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id B43FC43D6A; Mon, 2 Jan 2006 14:36:45 +0000 (GMT) (envelope-from sty@blosphere.net)
Received: from [192.168.0.4] (d245.HtokyoFL24.vectant.ne.jp [210.131.223.245]) by vanessa.ncm.brain.riken.jp (Postfix) with ESMTP id 88F416168; Mon, 2 Jan 2006 23:36:42 +0900 (JST)
Message-ID: <43B93A77.5070502@blosphere.net>
Date: Mon, 02 Jan 2006 23:36:39 +0900
From: Tommi Lätti <sty@blosphere.net>
User-Agent: Mozilla Thunderbird 1.0.7 (Windows/20050923)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: freebsd-stable@FreeBSD.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: graid3 lost disk - array still fails
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Mon, 02 Jan 2006 14:36:51 -0000

A few hours ago, my customer's graid3 array crashed after losing one hard drive, and it's unable to recover. The data is easily replaceable, so no lost sleep over that, but I'd really like to hear some ideas about what happened, if possible.

Since this was a 'do-it-cheaply' setup, we got 3x160G Seagates, all old PATA drives, and put them in as primary master, secondary master, and secondary slave. Not the best possible combination, I know, but it worked. The secondary master died a bit earlier and the array started rebuilding, and then somebody rebooted the machine while it was rebuilding ad3...
ad2: FAILURE - READ_DMA status=51 error=40 LBA=286404016
GEOM_RAID3: Request failed. ad2[READ(offset=146638856192, length=8192)]
GEOM_RAID3: Request failed. raid3/gr0[READ(offset=293277712384, length=16384)]
GEOM_RAID3: Device gr0: provider ad2 disconnected.
GEOM_RAID3: Device gr0: provider raid3/gr0 destroyed.
GEOM_RAID3: Device gr0: rebuilding provider ad3 stopped.
GEOM_RAID3: Synchronization request failed (error=6). ad3[WRITE(offset=973602816, length=65536)]
GEOM_RAID3: Device gr0: provider ad3 disconnected.
GEOM_RAID3: Device gr0 destroyed.

So now that ad2 is removed, graid3 still reports that ad3 is broken (GEOM_RAID3: Component ad3 (device gr0) broken, skipping.) and then proceeds to tear down the array, since that was already the second lost disk and there are not enough components left...

Now, the question is: is there any way I could lie to graid3 that ad3 is okay? I'm pretty sure there were no writes to the array during the time ad2 crashed, so maybe some data would still be recoverable?

-- 
br, Tommi