Date:      Sat, 20 Jul 2019 17:41:39 +0700
From:      Eugene Grosbein <eugen@grosbein.net>
To:        Garrett Wollman <wollman@csail.mit.edu>, freebsd-stable@freebsd.org, Alexander Motin <mav@FreeBSD.org>
Subject:   Re: ZFS root mount regression
Message-ID:  <73cddcd9-97f0-e73f-da9d-2a454fd3ea1a@grosbein.net>
In-Reply-To: <23858.2573.932364.128957@khavrinen.csail.mit.edu>
References:  <23858.2573.932364.128957@khavrinen.csail.mit.edu>

CC'ing Alexander Motin, who committed the change.

20.07.2019 1:21, Garrett Wollman wrote:

> I recently upgraded several file servers from 11.2 to 11.3.  All of
> them boot from a ZFS pool called "tank" (the data is in a different
> pool).  In a couple of instances (which caused me to have to take a
> late-evening 140-mile drive to the remote data center where they are
> located), the servers crashed at the root mount phase.  In one case,
> it bailed out with error 5 (I believe that's [EIO]) to the usual
> mountroot prompt.  In the second case, the kernel panicked instead.
> 
> The root cause (no pun intended) on both servers was a disk which was
> supplied by the vendor with a label on it that claimed to be part of
> the "tank" pool, and for some reason the 11.3 kernel was trying to
> mount that (faulted) pool rather than the real one.  The disks and
> pool configuration were unchanged from 11.2 (and probably 11.1 as
> well) so I am puzzled.
> 
> Other than laboriously running "zpool labelclear -f /dev/somedisk" for
> every piece of media that comes into my hands, is there anything else
> I could have done to avoid this?

Both the 11.3-RELEASE announcement and the Release Notes mention this:

> The ZFS filesystem has been updated to implement parallel mounting.

I strongly suggest reading the release documentation, at least when you
run into trouble after an upgrade. Better still, read it *before* updating.

I guess this parallelism created a race in your case.

Unfortunately, the way to fall back to sequential mounting seems to be undocumented.
libzfs checks whether the ZFS_SERIAL_MOUNT environment variable exists, with any value.
I'm not sure how to set it for mounting root; maybe it is picked up from kenv,
so try adding this to /boot/loader.conf:

ZFS_SERIAL_MOUNT=1
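
To illustrate what "exists, with any value" means in practice, here is a
minimal shell sketch. It is not the libzfs code (that is C, and I assume it
does the equivalent with getenv()); the function names serial_mount_requested
and mount_mode are made up for this example. Note also that it tests a shell
variable, which only reaches child processes if it is exported.

```sh
#!/bin/sh
# Hypothetical helper, for illustration only: succeeds when
# ZFS_SERIAL_MOUNT is set at all, whatever its value (even empty).
serial_mount_requested() {
    # ${VAR+x} expands to "x" if VAR is set (even to ""), else to nothing.
    [ -n "${ZFS_SERIAL_MOUNT+x}" ]
}

# Hypothetical helper: report which mounting mode such a check would select.
mount_mode() {
    if serial_mount_requested; then
        echo serial
    else
        echo parallel
    fi
}
```

With a check like this, ZFS_SERIAL_MOUNT=1, ZFS_SERIAL_MOUNT=0 and an empty
ZFS_SERIAL_MOUNT= would all select serial mounting; only leaving the variable
unset would keep the parallel behaviour.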

Alexander should know more about this.

And of course, attaching an unrelated device whose label conflicts
with the root pool is asking for trouble. Re-label it ASAP.



