Date:      Tue, 22 Jan 2013 16:36:52 +0100
From:      Borja Marcos <borjam@sarenet.es>
To:        Warren Block <wblock@wonkity.com>
Cc:        FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject:   Re: RFC: Suggesting ZFS "best practices" in FreeBSD
Message-ID:  <16B2C50C-DD36-4375-A002-F866A612D842@sarenet.es>
In-Reply-To: <alpine.BSF.2.00.1301220759420.61512@wonkity.com>
References:  <314B600D-E8E6-4300-B60F-33D5FA5A39CF@sarenet.es> <alpine.BSF.2.00.1301220759420.61512@wonkity.com>


On Jan 22, 2013, at 4:04 PM, Warren Block wrote:

> I'm a proponent of using various types of labels, but my impression
> after a recent experience was that ZFS metadata was enough to identify
> the drives even if they were moved around.  That is, ZFS bare metadata
> on a drive with no other partitioning or labels.
>
> Is that incorrect?

I'm afraid it's inconvenient unless you enjoy reboots ;)

This is a pathological but likely example I just demonstrated to a friend.

We were testing a new server with 12 hard disks and a proper HBA.

The disks are, unsurprisingly, da0-da11. There is a da12 used (for now)
for the OS, so that there's no problem creating and destroying pools at
leisure.

My friend had created a pool with two raidz vdevs, nothing rocket
science: da0-5 and da6-11.

So, we were doing some tests and I pulled one of the disks. Nothing
special, ZFS recovers nicely.

Now comes the fun part.

I reboot the machine with the missing disk.

What happens now?

I had pulled da4, I think. So, disks with an ID > 4 have been renamed to
N - 1: da5 became da4, da6 became da5, da7 became da6... and,
critically, da12 became da11.
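
The renumbering can be sketched as a tiny simulation. This is only an
illustration in Python, assuming the daN names are handed out
sequentially at boot with no gaps; the function name probe_names and the
slot numbering are made up for the example.

```python
def probe_names(present_slots):
    """Assign da0, da1, ... to the populated slots in ascending slot order.

    Assumption: names are handed out sequentially at boot, with no gaps.
    """
    return {slot: f"da{i}" for i, slot in enumerate(sorted(present_slots))}

# 13 slots: 0-11 hold the pool disks, 12 holds the OS disk.
before = probe_names(range(13))
# Reboot with the disk in slot 4 pulled.
after = probe_names(s for s in range(13) if s != 4)

# Show every disk whose name changed across the reboot.
for slot in sorted(after):
    if before[slot] != after[slot]:
        print(f"slot {slot}: {before[slot]} -> {after[slot]}")
```

Every slot above 4 shifts down by one, including the OS disk.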

The reboot began by failing to mount the root filesystem, but that one
is trivial. Just tell the kernel where it is now (da11) and it boots
happily.

Now we have a degraded pool with a missing disk (da4) and a da4 that
previously was da5. It works, of course, but in a degraded state.

OK, we found a replacement disk and plugged it in. It became, guess!
Yes, da12.

Now: I cannot "online" da4, because it exists. I cannot online da12
because it didn't belong to the pool. I cannot replace da4 with da12,
because a da4 is already there (the former da5).

Now that I think of it, in this case:
15896790606386444480  OFFLINE      0     0     0  was /dev/da4

Is it possible to say zpool replace 15896790606386444480 da12? I haven't
tried it.

Anyway, it seems to be a bit confusing. The logical, albeit cumbersome,
approach is to reboot the machine with the new da4 in place and, after
rebooting, online or replace it.

Using names prevents this kind of confusion.
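
To illustrate the difference, here is a rough sketch in Python. It
assumes each disk carries a label stored on the disk itself, while the
daN name is assigned sequentially at boot. The label names (disk00 ...
disk11, os) and the boot() helper are hypothetical.

```python
def boot(disks):
    """Return (da_name, label) pairs for the disks present, in probe order.

    Assumption: daN names are assigned sequentially at boot, while the
    label is stored on the disk and travels with it.
    """
    return [(f"da{i}", label) for i, label in enumerate(disks)]

# Hypothetical labels: disk00..disk11 for the pool, "os" for the system disk.
disks = [f"disk{n:02d}" for n in range(12)] + ["os"]
pool_by_label = set(disks[:12])

# Pull disk04 and reboot.
remaining = [d for d in disks if d != "disk04"]
names = boot(remaining)

# da4 now points at a different disk, but every label still names the same one.
da4_label = dict(names)["da4"]
missing = pool_by_label - {label for _, label in names}
print(da4_label)   # disk05
print(missing)     # {'disk04'}
```

With labels, the only thing missing is disk04, and nothing else has
changed identity.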






Borja.



