Date:      Mon, 19 Jul 2010 09:15:21 -0700
From:      Freddie Cash <fjwcash@gmail.com>
To:        freebsd-stable <freebsd-stable@freebsd.org>
Subject:   Re: Problems replacing failing drive in ZFS pool
Message-ID:  <AANLkTillT4yA5EJtcFUyhCUtD7b14u1n7svv02Y2IcqL@mail.gmail.com>
In-Reply-To: <AANLkTikPOgIqkm3GhIsEnvuvEHvlc44cnh6GJQ1k7Ja_@mail.gmail.com>
References:  <AANLkTimOrwHe7xiwoap2H2mUtA7vU6TjENkPC4yJ02_z@mail.gmail.com> <AANLkTimOIgCIO4txpPeeoMrRSYAqM25V7cm-h7djmZUC@mail.gmail.com> <AANLkTikPOgIqkm3GhIsEnvuvEHvlc44cnh6GJQ1k7Ja_@mail.gmail.com>

On Mon, Jul 19, 2010 at 8:56 AM, Garrett Moore <garrettmoore@gmail.com> wrote:
> So you think it's because when I switch from the old disk to the new disk,
> ZFS doesn't realize the disk has changed, and thinks the data is just
> corrupt now? Even if that happens, shouldn't the pool still be available,
> since it's RAIDZ1 and only one disk has gone away?

I think it's because you pull the old drive, boot with the new drive,
the controller renumbers all the devices (i.e. da3 is now da2, da2 is
now da1, da1 is now da0, da0 is now da6, etc.), and ZFS thinks that all
the drives have changed, so it reports the pool as corrupt.  I had this
happen on our storage servers a couple of times before I started using
glabel(8) on all our drives (dead drive on RAID controller, remove
drive, reboot for whatever reason, all device nodes are renumbered,
everything goes kablooey).
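
As a rough sketch of the labelling approach (not necessarily the exact
procedure used on the servers mentioned above): give the new disk a
glabel(8) label before it goes into the pool, then hand ZFS the
label/<name> node instead of the raw daN node.  The pool name "tank",
the device names and the label "disk3" are made-up examples, and the
run() helper is a hypothetical wrapper that only prints the commands
unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: label a replacement disk and reference it by label in ZFS.
# Pool, device and label names passed in are hypothetical examples.

# Print the command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# usage: label_and_replace <pool> <old-device> <new-device> <label>
label_and_replace() {
    pool="$1"; old_dev="$2"; new_dev="$3"; label="$4"
    # Write a GEOM label on the new disk; it then appears as /dev/label/<label>.
    run glabel label "$label" "$new_dev" &&
    # Replace the failing device with the labelled node, which keeps its
    # name even if the controller renumbers the daN devices.
    run zpool replace "$pool" "$old_dev" "label/$label"
}
```

Calling "label_and_replace tank da3 da7 disk3" in the default dry-run
mode just prints the two commands so they can be checked first.  Label
a disk before it holds pool data, since glabel keeps its metadata in
the last sector of the device.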

Doing the export and import will force ZFS to re-read the metadata on
the drives (ZFS does its own "labelling" to say which drives belong
to which vdevs), and to pick things up correctly using the new device
nodes.
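
The export/import cycle described above could look roughly like this.
The pool name "tank" is an assumption (substitute your own), and the
run() helper is a hypothetical wrapper that only prints the commands
unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: export and re-import a pool so ZFS rescans the on-disk
# labels and picks up whatever device nodes the drives now have.

# Print the command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# usage: reimport_pool <pool>
reimport_pool() {
    pool="$1"
    run zpool export "$pool" &&   # release the pool and its device nodes
    run zpool import "$pool" &&   # rescan devices and reassemble the vdevs
    run zpool status "$pool"      # check that every member device is found
}
```

Running "reimport_pool tank" as-is prints the three commands; with
DRY_RUN=0 it executes them in order and stops at the first failure.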

> I don't have / on ZFS; I'm only using it as a 'data' partition, so I should
> be able to try your suggestion. My only concern: is there any risk of
> trashing my pool if I try your instructions? Everything I've done so far,
> even when told "insufficient replicas / corrupt data", has not cost me any
> data as long as I switch back to the original (dying) drive. If I mix in
> export/import statements which might 'touch' the pool, is there a chance it
> will choke and trash my data?

Well, there's always a chance things explode.  :)  But an
export/import is safe so long as all drives are connected at the time.
I've recovered "corrupted" pools by doing the above.  (I've now
switched to labelling all my drives to prevent this from happening.)

Of course, always have good backups.  ;)

-- 
Freddie Cash
fjwcash@gmail.com


