Date:      Tue, 09 Mar 2010 11:11:39 +0100
From:      Ivan Voras <ivoras@freebsd.org>
To:        freebsd-fs@freebsd.org
Cc:        freebsd-stable@freebsd.org
Subject:   Re: ZFS hot spares
Message-ID:  <hn56sl$kor$1@dough.gmane.org>
In-Reply-To: <4B953C92.5080606@comcast.net>
References:  <4B953C92.5080606@comcast.net>

On 03/08/10 19:06, Steve Polyack wrote:
> ZFS in FreeBSD lacks at least one major feature from the Solaris
> version: hot spares. There is a PR open at
> http://www.freebsd.org/cgi/query-pr.cgi?pr=134491, but there hasn't been
> any motion/thoughts posted on it since its creation almost one year ago.
>
> I'm aware that on Solaris, hot spare replacement is handled by a few
> Solaris-specific daemons, zfs-retire and zfs-diagnose, which both plug
> into the Solaris FMA (Fault Management Architecture). Have there been
> any thoughts on porting these over or getting something similar running
> within FreeBSD? With all of the recent SATA/SAS CAM hotplug work now
> committed, it would be nice to have automatic replacement of hot spares
> with a future hot-replacement of the failed drive.
>
> On the other side, I'd be interested in hearing if anyone has had
> success in rolling their own scripted solution: i.e. something which
> polls 'zpool status' looking for failed drives and performing hot-spare
> replacements automatically.

You don't exactly have to poll it; devd can react to ZFS events as they 
are reported. See the samples in /etc/devd.conf:

# Sample ZFS problem reports handling.
notify 10 {
         match "system"          "ZFS";
         match "type"            "zpool";
         action "logger -p kern.err 'ZFS: failed to load zpool $pool'";
};

notify 10 {
         match "system"          "ZFS";
         match "type"            "vdev";
         action "logger -p kern.err 'ZFS: vdev failure, zpool=$pool type=$type'";
};

notify 10 {
         match "system"          "ZFS";
         match "type"            "data";
         action "logger -p kern.warn 'ZFS: zpool I/O failure, zpool=$pool error=$zio_err'";
};

notify 10 {
         match "system"          "ZFS";
         match "type"            "io";
         action "logger -p kern.warn 'ZFS: vdev I/O failure, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size error=$zio_err'";
};

notify 10 {
         match "system"          "ZFS";
         match "type"            "checksum";
         action "logger -p kern.warn 'ZFS: checksum mismatch, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size'";
};

I don't really know if these notifications actually work since I don't 
have hot-plug test machines, but if they do, this looks like a decent 
starting point.
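
If the notifications do fire, one way to build on them might be to swap 
the logger action for a small helper that turns the event into a 
"zpool replace" command. What follows is only a rough sketch under 
several assumptions: the helper name and its functions are hypothetical, 
the spare device is picked by the administrator rather than discovered, 
and it is assumed that the event really carries $pool and $vdev_path as 
the "io" sample above suggests. It only builds and prints the command 
(a dry run); nothing is executed.

```python
import shlex


def build_replace_command(pool, failed_dev, spare_dev):
    """Return the argv list for `zpool replace <pool> <failed> <spare>`.

    All three names are expected to come from a devd action (pool and
    failed device) and from local configuration (spare); that wiring is
    an assumption of this sketch, not something devd provides by itself.
    """
    for name, value in (("pool", pool),
                        ("failed_dev", failed_dev),
                        ("spare_dev", spare_dev)):
        # Refuse empty values and anything that could be read as an option.
        if not value or value.startswith("-"):
            raise ValueError("invalid %s: %r" % (name, value))
    return ["zpool", "replace", pool, failed_dev, spare_dev]


def format_command(argv):
    """Quote each argument so the command can be logged or pasted safely."""
    return " ".join(shlex.quote(arg) for arg in argv)


def handle_event(pool, failed_dev, spare_dev):
    """Dry run: return the command line that would replace the failed device."""
    return format_command(build_replace_command(pool, failed_dev, spare_dev))
```

For example, handle_event("tank", "/dev/ad6", "/dev/ad8") returns the 
string "zpool replace tank /dev/ad6 /dev/ad8". A devd action could then 
call a wrapper around this with something like "$pool $vdev_path" as 
arguments; whether to actually run the command automatically, rather 
than just log it, is a policy decision I'd be careful with.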



