Date: Tue, 30 Oct 2018 23:27:39 +0000
From: bugzilla-noreply@freebsd.org
To: geom@FreeBSD.org
Subject: [Bug 232835] [gmirror] gmirror fails to recover from degraded mirror sets in some circumstances (2/n)
Message-ID: <bug-232835-14739@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232835

    Bug ID:     232835
    Summary:    [gmirror] gmirror fails to recover from degraded mirror sets in some circumstances (2/n)
    Product:    Base System
    Version:    CURRENT
    Hardware:   Any
    OS:         Any
    Status:     New
    Severity:   Affects Some People
    Priority:   ---
    Component:  kern
    Assignee:   bugs@FreeBSD.org
    Reporter:   cem@freebsd.org
    CC:         geom@FreeBSD.org, markj@FreeBSD.org
    Depends on: 232671
    Blocks:     232683, 232684

This is related to bug 232671, but not identical.

Here is the example scenario:

1. I start with a GMIRROR with three ACTIVE disks (sc_ndisks = md_all = 3)
2. I essentially disconnect one of the disks
3. The remaining mirrors lower md_all to 2 and the syncid / generation gets bumped
4. I reboot, and the removed disk reappears
5. Geom tastes the stale removed disk first, and populates sc_ndisks from its md_all (3)
6. The two valid mirrors are tasted afterwards and the gmirror rejects both as having invalid metadata (md_all=2), despite their having a higher generation / sync id

The problem is basically that gmirror doesn't "upgrade" its metadata to the newest valid mirrorset it finds -- it just sticks with whatever it found first.

I think the solution is being a bit clever about detecting the latest mirror generation while a gmirror is still in the STARTING state; and also perhaps a little more clever about when we transition from STARTING to RUNNING (at which point a newer generation mirror showing up means we have corruption).
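To make the scenario concrete, here is a minimal userland sketch in C. It is not the gmirror code. struct tasted_meta is a simplified stand-in that carries only the md_all, generation and sync id fields discussed above, and the helpers meta_newer(), newest_meta() and count_consistent() are hypothetical names invented for illustration. The ordering (generation first, then sync id) is also an assumption. The sketch contrasts sizing the mirror from the first disk tasted with sizing it from the newest metadata seen while still STARTING.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for per-disk metadata (illustrative, not the kernel struct). */
struct tasted_meta {
	uint8_t  md_all;	/* number of disks the mirror is believed to have */
	uint32_t md_genid;	/* generation id */
	uint32_t md_syncid;	/* sync id */
};

/* Nonzero if a is strictly newer than b (assumed order: genid, then syncid). */
static int
meta_newer(const struct tasted_meta *a, const struct tasted_meta *b)
{
	if (a->md_genid != b->md_genid)
		return (a->md_genid > b->md_genid);
	return (a->md_syncid > b->md_syncid);
}

/* Index of the newest metadata among n tasted disks; -1 if none were tasted. */
static int
newest_meta(const struct tasted_meta *m, size_t n)
{
	size_t best, i;

	if (n == 0)
		return (-1);
	best = 0;
	for (i = 1; i < n; i++) {
		if (meta_newer(&m[i], &m[best]))
			best = i;
	}
	return ((int)best);
}

/*
 * Count disks consistent with the reference disk: same md_all and not
 * older than the reference. Everything else is treated as stale here.
 */
static size_t
count_consistent(const struct tasted_meta *m, size_t n, size_t ref)
{
	size_t i, ok;

	ok = 0;
	for (i = 0; i < n; i++) {
		if (m[i].md_all == m[ref].md_all &&
		    !meta_newer(&m[ref], &m[i]))
			ok++;
	}
	return (ok);
}
```

With the taste order from the scenario (stale disk with md_all = 3 first, then the two valid disks with md_all = 2 and a higher generation), using index 0 as the reference leaves only the stale disk consistent, while using the index returned by newest_meta() leaves the two valid disks consistent and the stale one excluded. What should then happen to the stale disk (resync, or refusal once the mirror is RUNNING) is outside this sketch.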
Referenced Bugs:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232671
[Bug 232671] [gmirror] gmirror fails to recover from degraded mirror sets in some circumstances

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232683
[Bug 232683] [gmirror] gmirror could provide much better administrative introspection into decision-making processes

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232684
[Bug 232684] [gmirror] gmirror overly aggressive provider destruction

-- 
You are receiving this mail because:
You are on the CC list for the bug.