From owner-freebsd-geom@FreeBSD.ORG Mon Aug 28 14:05:41 2006
Message-ID: <44F2F7DE.6080207@pett.com.au>
Date: Mon, 28 Aug 2006 23:34:14 +0930
From: Alastair Watts
To: Eric Anderson
Cc: freebsd-geom@freebsd.org
Subject: Re: gvinum behaviour on disk failure
References: <20060828132846.81472.qmail@web30308.mail.mud.yahoo.com> <44F2F3C5.7090001@pett.com.au> <44F2F586.6060600@centtech.com>
In-Reply-To: <44F2F586.6060600@centtech.com>
List-Id: GEOM-specific discussions and implementations

Eric Anderson wrote:
> I've had a drive that belonged to a mirrored die before, and didn't
> notice it until I logged in to the machine and poked around, so it did
> do its job. I've also had a drive in a gmirror go bad, and it hung the
> whole box, but that isn't gmirrors' fault as far as I know.

gvinum did its job in this instance as well: it took the dead drive's plex down. After that, all subsequent requests were served successfully from ad0. But it still returned a failure for the request that triggered the fault (it probably just passed the error through) instead of treating it as a drive fault and retrying the operation on ad0. That caused a process to die, because the read request was for a swap page.

I'd be interested to hear more about the gmirrored drive hanging the box. Were both drives on the same ATA bus, and did the bus perhaps hang?

Cheers,
Al
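
For illustration only, the difference between the observed behaviour and the retry described above can be sketched in a few lines of Python. This is a toy model, not gvinum code: the Plex class, both read functions, and the drive name "ad1" for the failed disk are hypothetical names invented for the example (only ad0 appears in the report above).

```python
class Plex:
    """Hypothetical stand-in for one mirror plex (one copy of the data)."""

    def __init__(self, name, blocks, healthy=True):
        self.name = name
        self.blocks = blocks      # block number -> data
        self.healthy = healthy    # False simulates a dead drive
        self.up = True            # the volume manager's view of the plex

    def read(self, block):
        if not self.healthy:
            raise IOError(f"{self.name}: read error at block {block}")
        return self.blocks[block]


def read_passthrough(plexes, block):
    """Observed behaviour: the failing plex is taken down, but the error
    for the request that hit the fault is still returned to the caller."""
    for plex in plexes:
        if plex.up:
            try:
                return plex.read(block)
            except IOError:
                plex.up = False   # later requests will skip this plex
                raise             # this request still fails
    raise IOError("no plex is up")


def read_with_retry(plexes, block):
    """Expected behaviour: take the failing plex down, then retry the
    same request on the next plex that is still up."""
    last_error = None
    for plex in plexes:
        if not plex.up:
            continue
        try:
            return plex.read(block)
        except IOError as err:
            plex.up = False
            last_error = err      # remember the error, try the next plex
    raise IOError("all plexes failed") from last_error
```

With one dead plex and one good one, read_passthrough fails the first request and only succeeds on later ones, while read_with_retry returns the data on the first request; for a swap-page read, that is the difference between a killed process and a transparent failover.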