From owner-freebsd-stable@FreeBSD.ORG Tue Jul 20 02:07:32 2010
Message-ID: <4C4504DF.30602@langille.org>
Date: Mon, 19 Jul 2010 22:07:27 -0400
From: Dan Langille <dan@langille.org>
Organization: The FreeBSD Diary
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.4) Gecko/20100608 Thunderbird/3.1
To: Freddie Cash
Cc: freebsd-stable
Subject: Re: Problems replacing failing drive in ZFS pool
List-Id: Production branch of FreeBSD source code

On 7/19/2010 12:15 PM, Freddie Cash wrote:
> On Mon, Jul 19, 2010 at 8:56 AM, Garrett Moore wrote:
>> So you think it's because when I switch from the old disk to the new
>> disk, ZFS doesn't realize the disk has changed, and thinks the data
>> is just corrupt now? Even if that happens, shouldn't the pool still
>> be available, since it's RAIDZ1 and only one disk has gone away?
>
> I think it's because you pull the old drive, boot with the new drive,
> the controller re-numbers all the devices (i.e. da3 is now da2, da2 is
> now da1, da1 is now da0, da0 is now da6, etc.), and ZFS thinks that
> all the drives have changed, thus corrupting the pool. I've had this
> happen on our storage servers a couple of times before I started using
> glabel(8) on all our drives (dead drive on RAID controller, remove
> drive, reboot for whatever reason, all device nodes are renumbered,
> everything goes kablooey).

Can you explain a bit about how you use glabel(8) in conjunction with
ZFS? If I can retrofit this into an existing ZFS array, it would make
things easier in the future...

8.0-STABLE #0: Fri Mar 5 00:46:11 EST 2010

]# zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad8     ONLINE       0     0     0
            ad10    ONLINE       0     0     0
            ad12    ONLINE       0     0     0
            ad14    ONLINE       0     0     0
            ad16    ONLINE       0     0     0

> Of course, always have good backups. ;)

In my case, this ZFS array is the backup. ;) But I'm setting up a tape
library, real soon now....

-- 
Dan Langille - http://langille.org/
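[Editor's note: one way the glabel(8) retrofit asked about above is
commonly done is to re-label the pool members one at a time. The
commands below are a sketch only, using the device names from the
zpool status output above; the label names (disk0, ...) are made up,
and the one-disk-at-a-time workflow is an assumption, not something
confirmed in this thread. glabel stores its metadata in the provider's
last sector, so this assumes ZFS is not using that sector.]

```shell
# Re-label one pool member at a time so ZFS tracks the stable
# /dev/label/* name instead of the renumberable /dev/adN node.

zpool offline storage ad8             # take one member offline
glabel label disk0 /dev/ad8           # write the GEOM label "disk0"
zpool replace storage ad8 label/disk0 # resilver onto the labelled device

# Wait for the resilver to complete before touching the next disk:
zpool status storage
```

After every member has been cycled through, a reboot that renumbers
the adN devices should no longer confuse the pool, since ZFS opens
the disks via their labels.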