FreeBSD Mail Archives

Date:      Fri, 8 Nov 2013 11:07:27 -0800
From:      Artem Belevich <art@freebsd.org>
To:        Benjamin Lutz <benjamin.lutz@biolab.ch>
Cc:        freebsd-fs <freebsd-fs@freebsd.org>, Andre Seidelt <andre.seidelt@biolab-technology.de>, Dirk Hoefle <dirk.hoefle@biolab.ch>
Subject:   Re: Ghost ZFS pool prevents mounting root fs
Message-ID:  <CAFqOu6gCuL6Hr86ceKkJTMhznCNtMStGNUWyz--U6TURUc1OCw@mail.gmail.com>
In-Reply-To: <OF58789A8B.60B611D5-ONC1257C1D.0039CC23-C1257C1D.003BDD97@biotronik.com>
References:  <OF58789A8B.60B611D5-ONC1257C1D.0039CC23-C1257C1D.003BDD97@biotronik.com>

On Fri, Nov 8, 2013 at 2:53 AM, Benjamin Lutz <benjamin.lutz@biolab.ch> wrote:
> Hello,
>
> I have a server here that after trying to reboot during the 9.2 update
> process refuses to mount the root file system, which is a ZFS (tank).
>
> The error message given is:
>   Trying to mount root from zfs:tank []...
>   Mounting from zfs:tank failed with error 5.
>
> Adding a bit more verbosity by setting vfs.zfs.debug=1 gives one
> additional crucial bit of information that probably explains why: it tries
> to find the disk /dev/label/disk7, but no such disk exists.

I ran into the same issue recently.
http://lists.freebsd.org/pipermail/freebsd-fs/2013-November/018496.html

> Can you tell me how to resolve the situation, i.e. how to make the ghost
> pool go away? I'd rather not recreate the pool or move the data to another
> system, since it's around 16TB and would take forever.

It should be doable, but usual "YMMV", "proceed at your own risk",
"here, there be dragons" warnings apply.

[snip]

> root@:~ # zdb -l /dev/da1
> --------------------------------------------
> LABEL 0
> --------------------------------------------
> failed to unpack label 0
> --------------------------------------------
> LABEL 1
> --------------------------------------------
> failed to unpack label 1
> --------------------------------------------
> LABEL 2
> --------------------------------------------
>     version: 28
>     name: 'tank'
>     state: 2
>     txg: 61
>     pool_guid: 4570073208211798611
>     hostid: 1638041647
>     hostname: 'blackhole'
>     top_guid: 5554077360160676751
>     guid: 11488943812765429059
>     vdev_children: 1
>     vdev_tree:
>         type: 'raidz'
>         id: 0
>         guid: 5554077360160676751
>         nparity: 3
>         metaslab_array: 30
>         metaslab_shift: 37
>         ashift: 12
>         asize: 16003153002496
>         is_log: 0
>         create_txg: 4
>         children[0]:
>             type: 'disk'
>             id: 0
>             guid: 7103686668495146668
>             path: '/dev/label/disk0'
>             phys_path: '/dev/label/disk0'
>             whole_disk: 1
>             create_txg: 4

The ghost labels are at the end of /dev/da1 (and probably of all the other
drives that used to be part of that pool).
In my case I ended up manually zeroing out the first sector of each
offending label.

ZFS keeps four copies of the label on each device: two at the start and
two at 512K and 256K from the end of the pool slice.
See ZFS on-disk specification here:
http://maczfs.googlecode.com/files/ZFSOnDiskFormat.pdf
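
The label positions can be worked out from the device size. Here is a
minimal sketch, assuming the layout from the on-disk specification: four
256K labels per device, two at the start and two at the end. It also
assumes the device size is rounded down to a 256K boundary before the end
labels are placed; that is my understanding of the implementation, so
check the result against what "zdb -l" reports. The function name
zfs_label_offset is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: byte offset of ZFS label N (0..3) on a device
# of SIZE bytes. Assumes 256 KiB labels and a size aligned down to 256 KiB.
LABEL_SIZE=262144

zfs_label_offset() {
    size=$1
    n=$2
    # Round the device size down to a multiple of the label size.
    aligned=$(( size / LABEL_SIZE * LABEL_SIZE ))
    if [ "$n" -lt 2 ]; then
        # Labels 0 and 1 sit at the start of the device.
        echo $(( n * LABEL_SIZE ))
    else
        # Labels 2 and 3 sit 512K and 256K before the aligned end.
        echo $(( aligned - (4 - n) * LABEL_SIZE ))
    fi
}
```

Dividing the result by 512 gives the sector number to use with bs=512,
e.g. $(( $(zfs_label_offset 1000000000 2) / 512 )).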

It's fairly easy to find with:

# dd if=/dev/da1 bs=1m iseek={disk size in mb -1} count=1 | hexdump -C | grep version
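
If the disk size is not a whole number of megabytes, the last full 1m
block may not cover both end labels, so it can be safer to scan by
512-byte sectors. A minimal sketch, assuming the size in bytes is known
(on FreeBSD, "diskinfo da1" should print it as the third field) and is a
multiple of 512. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; sizes are in bytes, sectors are 512 bytes.

# Print "START COUNT": the sector window covering the last 1 MiB of a
# device of $1 bytes.
tail_window() {
    sectors=$(( $1 / 512 ))
    echo "$(( sectors - 2048 )) 2048"
}

# Convert an offset printed by hexdump -C ($2, hex, relative to the
# start of the dd stream) into an absolute sector, given the START
# sector ($1) that was passed to dd as iseek.
stream_offset_to_sector() {
    echo $(( $1 + 0x$2 / 512 ))
}
```

With START and COUNT from tail_window, the scan becomes
"dd if=/dev/da1 bs=512 iseek=START count=COUNT | hexdump -C | grep version",
and stream_offset_to_sector turns the offset in the first hexdump column
into the sector number needed for the steps further down.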

Once you know where exactly it is, deleting it is simple. Watch out
for dd typos or perhaps use some sort of disk editor to make sure
you're not overwriting the wrong data.
It's a fairly risky operation as you have to make sure you don't nuke
anything else by accident.
If the disk portion with the labels is currently unallocated, then
things are relatively safe.
If it's currently used, then you'll need to figure out whether it's
safe to overwrite those labels directly or find another way to do it.
I.e. if the area with the labels is currently used for some other
filesystem, you may be able to get rid of the label by filling up that
filesystem with data which would hopefully overwrite labels with
something else. If the labels are within the area that is part of the
current pool, you are probably safe, as they're either in an unused area or
in one that ZFS has not used yet. In my case the ghost labels were in the
neighbourhood of the labels of the current pool and nuking them
produced zero errors on scrub.

Once you've established that manual label nuking is what you want,
here's the recipe:

* Make sure risking your data is *really* worth it. Consider erasing
drives one-by-one and letting raidz repair the pool if you have any
doubts.

Now that that's out of the way, let's nuke them.

* offline one of the drives with the ghost labels or do the operation
on an unmounted pool (I've booted from MFSBSD CD).

Make sure that it is the right sector you're writing to (i.e. it's the
label that lists the wrong disks):
* dd if=/dev/daN bs=512 iseek=<sector that has 'version' word in the label> count=10 | hexdump -C

Nuke the ghost! Note: you only want to write *one* sector. Make sure
you don't forget to edit count if you use shell history and reuse the
command above.

* dd if=/dev/zero of=/dev/daN bs=512 oseek={sector that has 'version' word in the label} count=1

* make sure "zdb -l /dev/daN" no longer shows the ghost label.

* online the disk

* scrub the pool. In case you made a mistake and wrote to the wrong
place, that may save your pool.
 I ran the scrub only after I had erased the label on the first drive, to
make sure it didn't damage anything vital.

* repeat for all other disks with ghost labels.

* run the scrub after all ghost labels have been erased. Just in case.
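
The per-disk steps above can be collected into a dry-run helper that
only prints the commands, so they can be reviewed before anything is
typed for real. This is a sketch, not a tool: print_nuke_plan is a
made-up name, and the pool name, disk name and sector are placeholders
you have to supply yourself. The disk name must match what the pool
calls the device, which may be a label rather than daN.

```shell
#!/bin/sh
# Dry run only: prints the commands for one disk, executes nothing.
# print_nuke_plan is a hypothetical helper name, not a ZFS tool.
print_nuke_plan() {
    pool=$1
    disk=$2
    sector=$3
    echo "zpool offline $pool $disk"
    # Inspect first: confirm this sector holds the ghost label.
    echo "dd if=/dev/$disk bs=512 iseek=$sector count=10 | hexdump -C"
    # Overwrite exactly one sector.
    echo "dd if=/dev/zero of=/dev/$disk bs=512 oseek=$sector count=1"
    echo "zdb -l /dev/$disk"
    echo "zpool online $pool $disk"
    echo "zpool scrub $pool"
}
```

For example, "print_nuke_plan tank da1 1951776" prints six commands for
da1. Keeping the inspect and overwrite lines side by side makes it
easier to see that both use the same sector and that only the second one
has count=1.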

Good luck.

--Artem

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAFqOu6gCuL6Hr86ceKkJTMhznCNtMStGNUWyz--U6TURUc1OCw>
