From: Freddie Cash <fjwcash@gmail.com>
To: freebsd-hackers@freebsd.org
Date: Fri, 12 Sep 2008 08:33:27 -0700
User-Agent: KMail/1.9.9
References: <C984A6E7B1C6657CD8C4F79E@Slim64.dmpriest.net.uk>
In-Reply-To: <C984A6E7B1C6657CD8C4F79E@Slim64.dmpriest.net.uk>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200809120833.28233.fjwcash@gmail.com>
Subject: Re: ZFS w/failing drives - any equivalent of Solaris FMA?

On September 12, 2008 02:45 am Karl Pielorz wrote:
> Recently, a ZFS pool on my FreeBSD box started showing lots of errors
> on one drive in a mirrored pair.
>
> The pool consists of around 14 drives (as 7 mirrored pairs), hung off
> of a couple of SuperMicro 8 port SATA controllers (1 drive of each pair
> is on each controller).
>
> One of the drives started picking up a lot of errors (by the end of
> things it was returning errors pretty much for any reads/writes issued)
> - and taking ages to complete the I/O's.
>
> However, ZFS kept trying to use the drive - e.g. as I attached another
> drive to the remaining 'good' drive in the mirrored pair, ZFS was still
> trying to read data off the failed drive (and remaining good one) in
> order to complete its re-silver to the newly attached drive.

The one time I've had a drive fail, and the three times I've swapped
drives for larger ones, I used this process:

  zpool offline <pool> <old device>
  <remove old device>
  <insert new device>
  zpool replace <pool> <old device> <new device>

On one machine, I had to shut it down after the offline, as it didn't have
hot-swappable drive bays.  On the other machine, I did everything while it
was online and running.

IOW, once it was offlined, the old device never had a chance to interfere
with anything.  It's the same process we've used with hardware RAID setups
in the past.
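
A minimal sketch of the sequence above as a shell helper.  The function
names, the DRY_RUN switch, and the pool/device names in the usage line are
made up for illustration; only the two zpool subcommands come from the
steps listed earlier.  By default it only prints what it would do:

```shell
# Sketch only: run(), replace_disk() and DRY_RUN are hypothetical names.

# Run a command, or just print it when DRY_RUN is 1 (the default here).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# replace_disk <pool> <old device> <new device>
replace_disk() {
    if [ "$#" -ne 3 ]; then
        echo "usage: replace_disk <pool> <old device> <new device>" >&2
        return 2
    fi
    pool=$1
    old=$2
    new=$3

    # Take the old device out of service first.
    run zpool offline "$pool" "$old" || return 1

    # Physical swap.  Without hot-swap bays this is where the machine
    # has to be shut down, so the remaining step is run after reboot.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would wait: swap $old for $new"
    else
        printf 'Swap %s for %s, then press Enter: ' "$old" "$new"
        read -r reply
    fi

    # Resilver onto the new device.
    run zpool replace "$pool" "$old" "$new"
}
```

Calling `replace_disk tank ad4 ad6` (hypothetical pool and device names)
prints the three steps; setting DRY_RUN=0 runs them for real.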

> Is there anything similar to this on FreeBSD yet? - i.e. Does/can
> anything on the system tell ZFS "This drive's experiencing failures"
> rather than ZFS just seeing lots of timed out I/O 'errors'? (as appears
> to be the case).

Beyond the periodic script that checks pool status and e-mails root if
something is wrong, I haven't seen anything.
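
For illustration, a check of that kind could look roughly like the sketch
below.  This is not the actual periodic script; the function names and the
mail subject are assumptions.  It relies on `zpool status -x` printing the
single line "all pools are healthy" when nothing is wrong:

```shell
# Sketch only: pools_healthy() and check_pools() are hypothetical names.

# Reads `zpool status -x` output on stdin.
# Returns 0 for the one-line all-clear message, 1 for anything else.
pools_healthy() {
    status=$(cat)
    case "$status" in
        "all pools are healthy") return 0 ;;
        *) return 1 ;;
    esac
}

# Mail the status output to root when a pool is not healthy.
check_pools() {
    status=$(zpool status -x 2>&1)
    if ! printf '%s\n' "$status" | pools_healthy; then
        printf '%s\n' "$status" |
            mail -s "ZFS pool problem on $(hostname)" root
    fi
}
```

Run from cron, `check_pools` would only report what ZFS itself has already
noticed, so it is a notification, not an FMA-style fault diagnosis.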

-- 
Freddie Cash
fjwcash@gmail.com