Date: Sat, 20 Jul 2019 17:41:39 +0700
From: Eugene Grosbein <eugen@grosbein.net>
To: Garrett Wollman <wollman@csail.mit.edu>, freebsd-stable@freebsd.org, Alexander Motin <mav@FreeBSD.org>
Subject: Re: ZFS root mount regression
Message-ID: <73cddcd9-97f0-e73f-da9d-2a454fd3ea1a@grosbein.net>
In-Reply-To: <23858.2573.932364.128957@khavrinen.csail.mit.edu>
References: <23858.2573.932364.128957@khavrinen.csail.mit.edu>

CC'ing Alexander Motin, who committed the change.

20.07.2019 1:21, Garrett Wollman wrote:

> I recently upgraded several file servers from 11.2 to 11.3. All of
> them boot from a ZFS pool called "tank" (the data is in a different
> pool). In a couple of instances (which caused me to have to take a
> late-evening 140-mile drive to the remote data center where they are
> located), the servers crashed at the root mount phase. In one case,
> it bailed out with error 5 (I believe that's [EIO]) to the usual
> mountroot prompt. In the second case, the kernel panicked instead.
>
> The root cause (no pun intended) on both servers was a disk which was
> supplied by the vendor with a label on it that claimed to be part of
> the "tank" pool, and for some reason the 11.3 kernel was trying to
> mount that (faulted) pool rather than the real one. The disks and
> pool configuration were unchanged from 11.2 (and probably 11.1 as
> well) so I am puzzled.
>
> Other than laboriously running "zpool labelclear -f /dev/somedisk" for
> every piece of media that comes into my hands, is there anything else
> I could have done to avoid this?

Both the 11.3-RELEASE announcement and the Release Notes mention this:

> The ZFS filesystem has been updated to implement parallel mounting.

I strongly suggest reading the release documentation when you run into
trouble after an upgrade, at least. Or better, read it *before* updating.

I guess this parallelism created a race in your case.
Unfortunately, the way to fall back to sequential mounting seems to be
undocumented. libzfs checks whether the ZFS_SERIAL_MOUNT environment
variable exists, with any value. I'm not sure how you would set it for
mounting root; maybe it will use kenv, so try adding this to
/boot/loader.conf:

ZFS_SERIAL_MOUNT=1

Alexander should know more about this.

And of course, attaching an unrelated device whose label conflicts with
the root pool is asking for trouble. Re-label it ASAP.
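
To make the "exists, with any value" point concrete, here is a small Python sketch of that kind of presence-only check. It is an illustration of the semantics described above, not the libzfs code itself; the function name serial_mount_requested is made up for this example. The practical consequence, assuming the check really is presence-only, is that setting the variable to 0 or to an empty string would still select serial mounting.

```python
import os


def serial_mount_requested(environ=os.environ):
    """Hypothetical helper: True if ZFS_SERIAL_MOUNT is present at all.

    Only the presence of the variable is tested; its value is ignored,
    mirroring the behaviour described in the message above.
    """
    return "ZFS_SERIAL_MOUNT" in environ


if __name__ == "__main__":
    # Try a few environments, including ones where the value looks "false".
    for env in ({}, {"ZFS_SERIAL_MOUNT": "1"},
                {"ZFS_SERIAL_MOUNT": "0"}, {"ZFS_SERIAL_MOUNT": ""}):
        print(env, "->", serial_mount_requested(env))
```

So the only way to get the default (parallel) behaviour back under such a check is to remove the variable entirely, not to set it to 0.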