Date:      Tue, 26 Jan 2010 18:10:12 +0100
From:      Gerrit Kühn <gerrit@pmp.uni-hannover.de>
To:        Chuck Swiger <cswiger@mac.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: ZFS "zpool replace" problems
Message-ID:  <20100126181012.4669d417.gerrit@pmp.uni-hannover.de>
In-Reply-To: <5F20B2B6-D75C-4E27-9CC9-85C6E64D13BD@mac.com>
References:  <20100126143021.GA47535@icarus.home.lan> <20100126160320.6ed67b92.gerrit@pmp.uni-hannover.de> <FA0BAC0D-35A7-4296-B52C-9D4D8A6CC609@mac.com> <20100126172503.927e1bb5.gerrit@pmp.uni-hannover.de> <5F20B2B6-D75C-4E27-9CC9-85C6E64D13BD@mac.com>

On Tue, 26 Jan 2010 08:59:27 -0800 Chuck Swiger <cswiger@mac.com> wrote
about Re: ZFS "zpool replace" problems:

CS> As a general matter of maintaining RAID systems, however, the approach
CS> to upgrading drive firmware on members of a RAID array should be to
CS> take down the entire container and offline the drives, update one
CS> drive, test it (via SMART self-test and read-only checksum comparison
CS> or similar), and then proceed to update all of the drives (preferably
CS> doing the SMART self-test for each, if time allows) before returning
CS> them to the RAID container and onlining them.

Well, I had several spare drives sitting on the shelf. So I updated the
firmware on these spares and now want to replace the drives running the
old firmware with the updated ones, one by one. Taking the system offline
for longer than a few minutes is not really an option; in that case I'd
rather roll in a new machine to take over the job.

CS> Pulling individual drives from a RAID set while live and updating the
CS> firmware one at a time is not an approach I would take-- running with
CS> mixed firmware versions doesn't thrill me, and I know of multiple
CS> cases where someone made a mistake reconnecting a drive with the wrong
CS> SCSI id or something like that, taking out a second drive while the
CS> RAID was not redundant, resulting in massive data corruption or even
CS> total loss of the RAID contents.

This scenario is exactly why I plugged the new drive into an extra slot
and asked ZFS to replace one of the old drives with it. I did not know
what kind of fiasco the controller for this extra slot would turn out
to be; otherwise I would have used the hot-spare slot for this in the
first place.


cu
  Gerrit


