Date:      Sun, 13 Aug 2017 15:41:42 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-fs@FreeBSD.org
Subject:   [Bug 221075] regression: 11.1 is unable to mount ZFS / on boot
Message-ID:  <bug-221075-3630-cPBtWCaYPU@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-221075-3630@https.bugs.freebsd.org/bugzilla/>
References:  <bug-221075-3630@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221075

--- Comment #14 from Marius Strobl <marius@FreeBSD.org> ---
Yeah, I missed that sdhci_pci(4) actually is attaching, probably due to
a typo in the search field. Still, the only explanation I have is that the
sheer presence of the geom_flashmap(4) class is triggering a GEOM-related
race. Especially since you apparently didn't have an SD card inserted, so
neither mmc(4) nor mmcsd(4) attached and, consequently, no additional
disk(9) was present.

I can't find a GEOM-related change not present in stable/11 which looks like
it would fix such a race. Thus, I suspect that the particular race in fact is
also present in head, but due to some differences in timing you don't happen
to hit it there.

Recently it has been mentioned that geom_label(4) is racy, too:
https://lists.freebsd.org/pipermail/svn-src-all/2017-August/149683.html
As mentioned in that e-mail, bsdinstall(8) therefore doesn't use labels,
but apparently you do in your ZFS setup, which might also explain why not
more people are hitting the problem you see. So it could be worthwhile to
try whether using ada[0,1]p1 directly for the zpools instead of DISK-<foo>p1
labels reliably gets rid of the problem.

Apart from that I don't have an idea how to further debug the actual cause.
Part of the problem is that there are several known GEOM-related races, some
even documented in the code of geom(4). So changing something or even fixing
one race just might alter the timing enough so that the real culprit is hidden.
Another part is that I only know how geom(4) debugging output differs when
hitting the race I mentioned earlier, but I don't know how it would differ
for the other races, for example the geom_label(4)-related one.

-- 
You are receiving this mail because:
You are the assignee for the bug.


