From: Steve Polyack
Date: Tue, 09 Mar 2010 13:33:27 -0500
To: Ivan Voras
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: ZFS hot spares
Message-ID: <4B969477.70706@comcast.net>
References: <4B953C92.5080606@comcast.net>

On 03/09/10 05:11, Ivan Voras wrote:
> On 03/08/10 19:06, Steve Polyack wrote:
>> ZFS in FreeBSD lacks at least one major feature from the Solaris
>> version: hot spares. There is a PR open at
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=134491, but there hasn't been
>> any motion/thoughts posted on it since its creation almost one year ago.
>>
>> I'm aware that on Solaris, hot spare replacement is handled by a few
>> Solaris-specific daemons, zfs-retire and zfs-diagnose, which both plug
>> into the Solaris FMA (Fault Management Architecture). Have there been
>> any thoughts on porting these over or getting something similar running
>> within FreeBSD? With all of the recent SATA/SAS CAM hotplug work now
>> committed, it would be nice to have automatic replacement of hot spares
>> with a future hot-replacement of the failed drive.
>>
>> On the other side, I'd be interested in hearing if anyone has had
>> success in rolling their own scripted solution: i.e. something which
>> polls 'zpool status' looking for failed drives and performing hot-spare
>> replacements automatically.
>
> You don't have to exactly poll it. See /etc/devd.conf:
>
> # Sample ZFS problem reports handling.
> notify 10 {
>         match "system" "ZFS";
>         match "type" "zpool";
>         action "logger -p kern.err 'ZFS: failed to load zpool $pool'";
> };
>
> notify 10 {
>         match "system" "ZFS";
>         match "type" "vdev";
>         action "logger -p kern.err 'ZFS: vdev failure, zpool=$pool type=$type'";
> };
>
> notify 10 {
>         match "system" "ZFS";
>         match "type" "data";
>         action "logger -p kern.warn 'ZFS: zpool I/O failure, zpool=$pool error=$zio_err'";
> };
>
> notify 10 {
>         match "system" "ZFS";
>         match "type" "io";
>         action "logger -p kern.warn 'ZFS: vdev I/O failure, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size error=$zio_err'";
> };
>
> notify 10 {
>         match "system" "ZFS";
>         match "type" "checksum";
>         action "logger -p kern.warn 'ZFS: checksum mismatch, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size'";
> };
>
> I don't really know if these notifications actually work since I don't
> have hot-plug test machines, but if they do, this looks like a decent
> starting point.
>

Thanks for the suggestions. I received a similar one from someone else.
If I get time to build a ZFS lab machine, I will certainly try these out
and provide feedback on how they work.
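
For the scripted approach mentioned in the thread (something that looks at 'zpool status' and swaps in a spare), a minimal sketch in Python might look like the following. It is untested against a real pool and rests on several assumptions: the SAMPLE text is an illustrative, hand-written approximation of 'zpool status' output, the device names (da0, da1, da2) and pool name are made up, and the function names parse_zpool_status and plan_replacements are hypothetical helpers, not part of any ZFS tooling. The script only builds the 'zpool replace' command lines; it does not run them.

```python
# Device states treated as "failed" here; this set is an assumption.
FAILED_STATES = {"FAULTED", "UNAVAIL", "REMOVED"}

# Illustrative, hand-written approximation of 'zpool status' output.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

\tNAME        STATE     READ WRITE CKSUM
\ttank        DEGRADED     0     0     0
\t  mirror    DEGRADED     0     0     0
\t    da0     ONLINE       0     0     0
\t    da1     UNAVAIL      0     0     0  cannot open
\tspares
\t  da2       AVAIL

errors: No known data errors
"""


def parse_zpool_status(text):
    """Map each pool to its failed leaf devices and available spares."""
    rows = {}      # pool -> [(indent, name, state)] rows of the device tree
    spares = {}    # pool -> names of spares in state AVAIL
    pool, section = None, None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "pool:" and len(fields) > 1:
            pool, section = fields[1], None
            rows[pool], spares[pool] = [], []
        elif pool is None:
            continue
        elif fields[0] == "config:":
            section = "devices"
        elif fields[0] == "errors:":
            section = None
        elif section is None or fields[0] == "NAME":
            continue
        elif len(fields) == 1:
            # Sub-section header such as "spares", "logs" or "cache".
            section = fields[0]
        elif section == "spares":
            if fields[1] == "AVAIL":
                spares[pool].append(fields[0])
        else:
            indent = len(line) - len(line.lstrip())
            rows[pool].append((indent, fields[0], fields[1]))

    result = {}
    for name, tree in rows.items():
        failed = []
        for i, (indent, dev, state) in enumerate(tree):
            # A row is a leaf device if the next row is not indented deeper.
            is_leaf = i + 1 == len(tree) or tree[i + 1][0] <= indent
            if is_leaf and state in FAILED_STATES:
                failed.append(dev)
        result[name] = {"failed": failed, "spares": spares[name]}
    return result


def plan_replacements(status):
    """Pair each failed device with a free spare as a 'zpool replace' argv."""
    commands = []
    for pool, info in status.items():
        for failed, spare in zip(info["failed"], info["spares"]):
            commands.append(["zpool", "replace", pool, failed, spare])
    return commands


if __name__ == "__main__":
    for argv in plan_replacements(parse_zpool_status(SAMPLE)):
        print(" ".join(argv))
```

The same script could in principle be run periodically from cron, or started from a devd.conf action such as the vdev notification quoted above, with the real output of 'zpool status' fed in place of SAMPLE. Whether those devd events actually fire on a drive failure is still the open question from the thread.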