Date: Thu, 22 Jun 2023 12:30:28 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 271989] zfs root mount error 6 after upgrade from 11.1-release to 13.2-release
Message-ID: <bug-271989-227-NcU14ceZKL@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-271989-227@https.bugs.freebsd.org/bugzilla/>
References: <bug-271989-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271989

--- Comment #9 from Markus Wild <freebsd-bugs@virtualtec.ch> ---
(In reply to Dan Langille from comment #8)

There you go, the bogus

    pool_guid: 18320603570228782289

is what causes your kernel to fail to load the pool: it shows up in your
console messages as mismatched comparisons against the vdevs the kernel found.
This is most likely - as with my installation - the result of originally
installing the zpool on the entire disk, then later removing that pool,
shrinking the zfs partition, and recreating the pool. From what I reverse
engineered, a zpool puts 2 labels at the beginning of its assigned disk space
and 2 labels at the end, most likely so those labels can be restored should
someone/something accidentally overwrite them.

The stupidity of the whole thing is: the kernel code that mounts the zfs root
filesystem seems to first scan the "entire disk device" for these 4 labels,
and if it finds any, it will insist on using them and NOT consider any valid
labels in partitions from the GPT partition table. "zpool import" doesn't do
this; it's just the mount code in the kernel.

There is a "zpool labelclear" command which is supposed to clear these stale
labels, but I personally didn't trust it not to go ahead and clear the labels
of ALL zfs instances on the disk if you let it loose on the entire disk
device. The man page is not very clear in this respect, and searching for this
shows I was not the only one confused about the exact behavior of that
command.

What I did in my case is:
- use gpart to add a temporary swap partition to fill the disk:
    gpart add -t freebsd-swap nvd0
  this resulted in nvd0p5 in my case
- then zero that temporary partition, and with it the end of the disk holding
  the old zpool label:
    dd if=/dev/zero of=/dev/nvd0p5 bs=1024M
- remove the temporary partition again:
    gpart delete -i 5 nvd0

If you check the device again after this (zdb -l), it shouldn't find any
labels anymore; the whole sequence is condensed into a sketch after this
message.

What I'd expect for the future, and why I didn't ask for this bug report to be
closed after I fixed my problem:
- the kernel mount code should first check all valid zfs partitions for labels
- only if no labels are found in valid partitions should it also consider the
  entire disk device (nvd0, ada0, etc.) to cover the cases where people define
  a zpool like "mirror /dev/ada0 /dev/ada1". I know this works for data pools,
  but I'm not sure you could actually boot from such a pool.

Cheers,
Markus

-- 
You are receiving this mail because:
You are the assignee for the bug.
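For quick reference, the procedure described in the comment above, condensed
into one sequence. This is a sketch only: the device name nvd0 and the
partition index 5 come from the reporter's setup; check what "gpart show" and
"gpart add" actually report on your system before running the dd step.

    # stale labels on the whole-disk device are what confuse the root mount
    zdb -l /dev/nvd0

    # fill the free space at the end of the disk with a temporary partition
    gpart add -t freebsd-swap nvd0     # note the index it reports (p5 here)

    # zero it, wiping the old end-of-disk labels, then remove it again
    dd if=/dev/zero of=/dev/nvd0p5 bs=1024M
    gpart delete -i 5 nvd0

    # the whole-disk device should no longer report any labels
    zdb -l /dev/nvd0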