Date: Fri, 12 Sep 2008 06:21:02 -0700
From: Jeremy Chadwick
To: Karl Pielorz
Cc: freebsd-hackers@freebsd.org
Subject: Re: ZFS w/failing drives - any equivalent of Solaris FMA?
Message-ID: <20080912132102.GB56923@icarus.home.lan>

On Fri, Sep 12, 2008 at 10:45:24AM +0100, Karl Pielorz wrote:
> Recently, a ZFS pool on my FreeBSD box started showing lots of errors
> on one drive in a mirrored pair.
>
> The pool consists of around 14 drives (as 7 mirrored pairs), hung off
> of a couple of SuperMicro 8-port SATA controllers (1 drive of each
> pair is on each controller).
>
> One of the drives started picking up a lot of errors (by the end of
> things it was returning errors for pretty much any reads/writes
> issued) - and taking ages to complete the I/Os.
>
> However, ZFS kept trying to use the drive - e.g. as I attached another
> drive to the remaining 'good' drive in the mirrored pair, ZFS was
> still trying to read data off the failed drive (and the remaining good
> one) in order to complete its re-silver to the newly attached drive.
>
> Having posted on the OpenSolaris ZFS list - it appears that, under
> Solaris, there's an 'FMA Engine' which communicates drive failures and
> the like to ZFS - advising ZFS when a drive should be marked as
> 'failed'.
>
> Is there anything similar to this on FreeBSD yet? I.e. does/can
> anything on the system tell ZFS "this drive's experiencing failures",
> rather than ZFS just seeing lots of timed-out I/O 'errors'? (as
> appears to be the case).

As far as I know, there is no such "standard" mechanism in FreeBSD.  If
the drive falls off the bus entirely (e.g. gets detached), I would hope
ZFS would notice that.  It might also depend on whether the disk
subsystem you're using utilises CAM (i.e. disks showing up as daX, not
adX); Scott Long might know if something like this is implemented in
CAM.  I'm fairly certain nothing like this is implemented in ata(4).
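
In the meantime, the closest thing I know of is deciding for yourself
that the drive is bad and telling ZFS to stop using it.  Roughly (an
untested sketch -- "tank" and the daX device names below are
placeholders, substitute your own pool and disks):

  # see which device is racking up read/write/checksum errors
  zpool status -v tank

  # attach a spare disk as an extra mirror of the good half
  zpool attach tank da2 da8

  # take the failing disk out of service (-t = only until next reboot)
  zpool offline -t tank da3

  # or drop it from the mirror entirely
  zpool detach tank da3

That's entirely manual and after-the-fact, of course, which is exactly
the part an FMA-like layer would be automating for you.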
Ideally, it would be the job of the controller and the controller
driver to announce whether underlying I/O operations fail or succeed.
Do you agree?  I hope this "FMA Engine" on Solaris only *tells* the
underlying pieces about I/O errors, rather than acting on them (e.g.
automatically yanking the disk off the bus for you).  I'm in no way
shunning Solaris; I'm simply saying such a mechanism could be as
risky/deadly as it could be useful.

> In the end, the failing drive was timing out on literally every I/O -
> I did recover the situation by detaching it from the pool (which hung
> the machine - probably caused by ZFS having to update the meta-data
> on all drives, including the failed one).  A reboot brought the pool
> back, minus the 'failed' drive, so enough of the 'detach' must have
> completed.
>
> The newly attached drive completed the re-silver in half an hour (as
> opposed to an estimated 755 hours and climbing with the other drive
> still in the pool, limping along).

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |