Date:      Wed, 19 May 2010 00:05:59 -0500 (CDT)
From:      Wes Morgan <morganw@chemikals.org>
To:        Todd Wasson <tsw5@duke.edu>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: zfs drive replacement issues
Message-ID:  <alpine.BSF.2.00.1005182357280.75234@ibyngvyr>
In-Reply-To: <53F15A8B-77DA-4CEF-A790-2902BEC91002@duke.edu>
References:  <0B97967D-1057-4414-BBD4-4F1AA2659A5D@duke.edu> <4BF0F231.9000706@mapper.nl> <53F15A8B-77DA-4CEF-A790-2902BEC91002@duke.edu>

On Mon, 17 May 2010, Todd Wasson wrote:

> > Hello,
> > You could try exporting and importing the pool with three disks.
> > Then make sure the "new" drive isn't part of any zpool (low-level format?).
> > Then try a "replace" again.
> > Have fun!
> >
>
>
> Hi Mark, I was about to try this, but I just tried putting the "old"
> (damaged) drive back in the pool and detaching the "new" drive from the
> pool, which I've tried before, but for some reason this time it
> succeeded.  I was then able to "zpool offline" the old drive, physically
> replace it with the new one, and "zpool replace" the old one with the
> new one.  It just finished successfully resilvering, and apparently
> everything is working well.  I'm going to initiate a scrub to be sure
> that everything is alright, but I'm fairly sure that the problem is
> solved.  I didn't do anything that I hadn't already tried, so I don't
> know why it worked this time, but I'm not complaining.  Thanks to
> everyone for your help; at the very least, the idea of putting the
> original drive back into the machine and mucking around with it led me
> in the right direction.  Next time I'll be sure to issue an offline
> command before replacing a device!

I'm not certain that you always want to do that. When you offline a
device in a redundant pool, you lose that redundancy. If the drive is
completely dead, offlining it is obviously the right thing to do, but
otherwise perhaps not. Were you to have another failure during the
rebuild, or if there was another error on a different vdev, you wouldn't
be able to recover that data because of the missing device. This is the
same reason why offlining and replacing each device in a raidz1 to "grow"
it isn't as safe as you might think: any error could lead to data loss.
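
As a rough sketch of the alternative, assuming the old drive is still
readable and there is a spare port so both drives can be attached at
once, you can run "zpool replace" against the old device while it stays
online. The pool name (tank) and device names (ada1, ada4) below are
hypothetical placeholders, and the run() wrapper with DRY_RUN is just an
illustration aid that prints the commands instead of executing them.

```sh
#!/bin/sh
# Sketch only: replace a still-readable drive WITHOUT offlining it first,
# so the old device keeps contributing redundancy during the resilver.
# POOL, OLD_DEV and NEW_DEV are hypothetical; substitute your own names.
POOL="${POOL:-tank}"
OLD_DEV="${OLD_DEV:-ada1}"
NEW_DEV="${NEW_DEV:-ada4}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (the default here)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

replace_online() {
    # Check the pool state before touching anything.
    run zpool status "$POOL"
    # Resilver onto the new device while the old one is still attached.
    run zpool replace "$POOL" "$OLD_DEV" "$NEW_DEV"
    # Re-check status; repeat this until the resilver is reported done.
    run zpool status "$POOL"
}

replace_online
```

Once the resilver has completed, a "zpool scrub" of the pool is a
reasonable final check. If the old drive is completely dead or there is
no free port, this does not apply and offlining first is the way to go.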

Just food for thought.


