Date: Tue, 12 May 2009 16:09:34 GMT
From: Michel Bouissou
To: freebsd-gnats-submit@FreeBSD.org
Subject: kern/134491: ZFS: Hot spares are rather cold...
X-Send-Pr-Version: www-3.1

>Number:         134491
>Category:       kern
>Synopsis:       ZFS: Hot spares are rather cold...
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue May 12 16:10:01 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Michel Bouissou
>Release:        7.2
>Organization:
Bioclinica
>Environment:
>Description:
Although ZFS allows devices to be defined as "spares" for MIRROR / RAIDZ / RAIDZ2 storage pools, and FreeBSD will happily accept this, such "spare" devices will *NOT* automagically take over if a RAID pool device fails.

According to http://docs.sun.com/app/docs/doc/819-5461/gcvcw?a=view , I understand that replacing a device with a spare might not be performed by the kernel ZFS module but by an external agent/daemon:

« Automatic replacement – When a fault is received, an FMA agent examines the pool to see if it has any available hot spares. If so, it replaces the faulted device with an available spare. »

I'm unable to find such a tool in FreeBSD; if it exists at all, it isn't active by default.

So in the current state, ZFS "spares" have to be activated / deactivated manually when a disk fails or is replaced. Not only is this suboptimal, but it also presents a data loss risk for people who assume that "spares" will do what they are intended for in all usual RAID implementations, whereas they will just sit there idle if a disk dies, until the admin manually activates them.
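For illustration, the check that such an external agent would have to perform could be sketched roughly as below. This is only a Python sketch, not an existing FreeBSD tool: the function name plan_spare_activation, the set of "bad" device states and the sample output are assumptions made for the example, and the parsing assumes the usual "zpool status" layout for a single pool. It only builds the "zpool replace" command and does not run it.

```python
import re

# Device states treated as "failed" here (an assumption for this sketch).
BAD_STATES = {"FAULTED", "UNAVAIL", "REMOVED", "OFFLINE"}

# Grouping vdevs (mirror, raidz, raidz2, ...) are containers, not disks.
GROUP_VDEV = re.compile(r"^(mirror|raidz\d?|spare|replacing)(-\d+)?$")

# Hypothetical "zpool status syspool" output with one failed mirror member.
SAMPLE_STATUS = """\
  pool: syspool
 state: DEGRADED
config:

        NAME        STATE     READ WRITE CKSUM
        syspool     DEGRADED     0     0     0
          mirror    DEGRADED     0     0     0
            aacd1   ONLINE       0     0     0
            aacd2   UNAVAIL      0     0     0
        spares
          da15      AVAIL

errors: No known data errors
"""


def plan_spare_activation(status_text):
    """Return a 'zpool replace' argv list for the first failed device and
    the first available spare, or None if there is nothing to do."""
    pool = failed = spare = None
    in_config = in_spares = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "pool:" and len(fields) > 1:
            pool = fields[1]
        elif fields[0] == "NAME":
            in_config = True           # device table starts after this header
        elif fields[0] == "errors:":
            in_config = False          # device table is over
        elif in_config and fields[0] == "spares":
            in_spares = True           # following rows are spare devices
        elif in_config and len(fields) >= 2:
            name, state = fields[0], fields[1]
            if in_spares:
                if state == "AVAIL" and spare is None:
                    spare = name
            elif (name != pool and not GROUP_VDEV.match(name)
                  and state in BAD_STATES and failed is None):
                failed = name
    if pool and failed and spare:
        return ["zpool", "replace", pool, failed, spare]
    return None
```

A periodic job could feed the real output of "zpool status" for each pool into this function and, if it returns a command, either run it or just mail it to the admin. Whether manually offlined devices should count as failed is a policy choice; they are included here only because this report mentions that case.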
This preferably deserves a fix, or at least a prominent WARNING note...

Also, although the SUN doc states « Multiple pools can share devices that are designated as hot spares », in the current FreeBSD implementation ZFS will refuse to assign to a pool a "spare" which is already assigned to another pool, stating that the device is "busy", i.e.:

# zpool status
  pool: syspool
 state: ONLINE
(Blah-blah)
        NAME        STATE     READ WRITE CKSUM
        syspool     ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            aacd1   ONLINE       0     0     0
            aacd2   ONLINE       0     0     0
        spares
          da15      AVAIL
(Blah-blah)

# zpool add vol01 spare da15
invalid vdev specification
use '-f' to override the following errors:
da15 is in use (r1w1e1)

# zpool add -f vol01 spare da15
invalid vdev specification
the following errors must be manually repaired:
da15 is in use (r1w1e1)
>How-To-Repeat:
Create any redundant ZFS storage pool with a spare device. Hot-remove (or manually "offline") an active device from the pool. The spare won't take over unless a manual "zpool replace " is issued.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: