From: Ivan Voras
To: freebsd-fs@freebsd.org
Cc: freebsd-stable@freebsd.org
Date: Tue, 09 Mar 2010 11:11:39 +0100
Subject: Re: ZFS hot spares

On 03/08/10 19:06, Steve Polyack wrote:
> ZFS in FreeBSD lacks at least one major feature from the Solaris
> version: hot spares.
> There is a PR open at
> http://www.freebsd.org/cgi/query-pr.cgi?pr=134491, but there hasn't been
> any motion/thoughts posted on it since its creation almost one year ago.
>
> I'm aware that on Solaris, hot spare replacement is handled by a few
> Solaris-specific daemons, zfs-retire and zfs-diagnose, which both plug
> into the Solaris FMA (Fault Management Architecture). Have there been
> any thoughts on porting these over or getting something similar running
> within FreeBSD? With all of the recent SATA/SAS CAM hotplug work now
> committed, it would be nice to have automatic replacement of hot spares
> with a future hot-replacement of the failed drive.
>
> On the other side, I'd be interested in hearing if anyone has had
> success in rolling their own scripted solution: i.e. something which
> polls 'zpool status' looking for failed drives and performing hot-spare
> replacements automatically.

You don't have to exactly poll it. See /etc/devd.conf:

# Sample ZFS problem reports handling.
notify 10 {
    match "system" "ZFS";
    match "type" "zpool";
    action "logger -p kern.err 'ZFS: failed to load zpool $pool'";
};

notify 10 {
    match "system" "ZFS";
    match "type" "vdev";
    action "logger -p kern.err 'ZFS: vdev failure, zpool=$pool type=$type'";
};

notify 10 {
    match "system" "ZFS";
    match "type" "data";
    action "logger -p kern.warn 'ZFS: zpool I/O failure, zpool=$pool error=$zio_err'";
};

notify 10 {
    match "system" "ZFS";
    match "type" "io";
    action "logger -p kern.warn 'ZFS: vdev I/O failure, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size error=$zio_err'";
};

notify 10 {
    match "system" "ZFS";
    match "type" "checksum";
    action "logger -p kern.warn 'ZFS: checksum mismatch, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size'";
};

I don't really know if these notifications actually work since I don't
have hot-plug test machines, but if they do, this looks like a decent
starting point.
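
As a rough illustration of the scripted approach asked about in the quoted message, the sketch below takes the text printed by 'zpool status', picks out failed devices and available spares, and builds the matching 'zpool replace' command lines. The function names, the set of states treated as failed, and the list of group-vdev prefixes are assumptions made for this example, not anything from the original post. It only builds command lines and never runs them.

```python
# Sketch only: names and the choice of "failed" states are illustrative assumptions.

# Device states treated as "needs a spare" in this sketch.
FAILED_STATES = {"FAULTED", "UNAVAIL", "REMOVED"}

# Rows in the config table that are groupings rather than disks.
GROUP_PREFIXES = ("mirror", "raidz", "spare", "replacing")


def plan_replacements(pool, status_text):
    """Return a list of 'zpool replace' argv lists, one per failed device,
    pairing failed devices with AVAIL spares in the order they are listed."""
    failed, spares = [], []
    in_config = False
    in_spares = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "config:":
            in_config = True
            continue
        if fields[0] == "errors:":
            break
        if not in_config or fields[0] == "NAME":
            continue
        if fields[0] == "spares":
            in_spares = True
            continue
        if len(fields) < 2:
            continue  # section headers such as "logs" or "cache"
        name, state = fields[0], fields[1]
        if in_spares:
            if state == "AVAIL":
                spares.append(name)
        elif (state in FAILED_STATES and name != pool
              and not name.startswith(GROUP_PREFIXES)):
            failed.append(name)
    # zip() stops at the shorter list, so extra failures are left alone.
    return [["zpool", "replace", pool, bad, spare]
            for bad, spare in zip(failed, spares)]
```

A devd action like the "vdev" one above could, in principle, call a script built around this function with $pool as its argument instead of only logging. Whether the notifications fire reliably is untested, as noted above, so a periodic run from cron would be a sensible fallback.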