Date: Tue, 09 Mar 2010 11:11:39 +0100
From: Ivan Voras <ivoras@freebsd.org>
To: freebsd-fs@freebsd.org
Cc: freebsd-stable@freebsd.org
Subject: Re: ZFS hot spares
Message-ID: <hn56sl$kor$1@dough.gmane.org>
In-Reply-To: <4B953C92.5080606@comcast.net>
References: <4B953C92.5080606@comcast.net>

On 03/08/10 19:06, Steve Polyack wrote:
> ZFS in FreeBSD lacks at least one major feature from the Solaris
> version: hot spares. There is a PR open at
> http://www.freebsd.org/cgi/query-pr.cgi?pr=134491, but there hasn't been
> any motion/thoughts posted on it since its creation almost one year ago.
>
> I'm aware that on Solaris, hot spare replacement is handled by a few
> Solaris-specific daemons, zfs-retire and zfs-diagnose, which both plug
> into the Solaris FMA (Fault Management Architecture). Have there been
> any thoughts on porting these over or getting something similar running
> within FreeBSD? With all of the recent SATA/SAS CAM hotplug work now
> committed, it would be nice to have automatic replacement of hot spares
> with a future hot-replacement of the failed drive.
>
> On the other side, I'd be interested in hearing if anyone has had
> success in rolling their own scripted solution: i.e. something which
> polls 'zpool status' looking for failed drives and performing hot-spare
> replacements automatically.

You don't have to exactly poll it. See /etc/devd.conf:

# Sample ZFS problem reports handling.
notify 10 {
    match "system" "ZFS";
    match "type" "zpool";
    action "logger -p kern.err 'ZFS: failed to load zpool $pool'";
};

notify 10 {
    match "system" "ZFS";
    match "type" "vdev";
    action "logger -p kern.err 'ZFS: vdev failure, zpool=$pool type=$type'";
};

notify 10 {
    match "system" "ZFS";
    match "type" "data";
    action "logger -p kern.warn 'ZFS: zpool I/O failure, zpool=$pool error=$zio_err'";
};

notify 10 {
    match "system" "ZFS";
    match "type" "io";
    action "logger -p kern.warn 'ZFS: vdev I/O failure, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size error=$zio_err'";
};

notify 10 {
    match "system" "ZFS";
    match "type" "checksum";
    action "logger -p kern.warn 'ZFS: checksum mismatch, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size'";
};

I don't really know if these notifications actually work since I don't
have hot-plug test machines, but if they do, this looks like a decent
starting point.
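
To make the "starting point" a bit more concrete, here is a rough, untested
sketch of a helper that a devd action could call in place of logger. The
function names, the script itself and the spare device name are all
hypothetical; the only real command it relies on is
"zpool replace <pool> <old-device> <new-device>". It assumes devd passes
along $pool and $vdev_path as in the "io" notification above, which (as
noted) has not been verified to fire.

```python
import subprocess

# Assumption: spare devices are listed by hand; this name is a placeholder.
SPARES = ["/dev/da9"]


def build_replace_command(pool, failed_path, spares):
    """Return the 'zpool replace' argv using the first spare, or None."""
    if not pool or not failed_path or not spares:
        return None
    return ["zpool", "replace", pool, failed_path, spares[0]]


def replace_with_spare(pool, failed_path, spares=SPARES, runner=subprocess.run):
    """Run 'zpool replace' and return its exit status (1 if nothing to do).

    'runner' is injectable so the logic can be exercised without a pool.
    """
    cmd = build_replace_command(pool, failed_path, spares)
    if cmd is None:
        return 1
    # check=False: hand back zpool's own exit status instead of raising.
    return runner(cmd, check=False).returncode
```

A devd action would then invoke a small wrapper script around
replace_with_spare() with $pool and $vdev_path as arguments. Reacting to
every single I/O error this way is almost certainly too aggressive, so a
real script would want some threshold or a check of 'zpool status' first.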