Date: Mon, 25 Apr 2016 08:28:41 -0400
From: "Michael B. Eichorn" <ike@michaeleichorn.com>
To: Fabian Keil <freebsd-listen@fabiankeil.de>, freebsd-fs@freebsd.org
Subject: Re: GELI + Zpool Scrub Results in GELI Device Destruction (and Later a Corrupt Pool)
Message-ID: <1461587321.22294.85.camel@michaeleichorn.com>
In-Reply-To: <20160425101124.068c8333@fabiankeil.de>
References: <1461560445.22294.53.camel@michaeleichorn.com> <20160425101124.068c8333@fabiankeil.de>
[-- Attachment #1 --]
On Mon, 2016-04-25 at 10:11 +0200, Fabian Keil wrote:
> "Michael B. Eichorn" <ike@michaeleichorn.com> wrote:
>
> > I just ran into something rather unexpected. I have a pool consisting
> > of a mirrored pair of geli encrypted partitions on WD Red 3TB disks.
> >
> > The machine is running 10.3-RELEASE, the root zpool was setup with GELI
> > encryption from the installer, the pool that is acting up was setup per
> > the handbook.
> [...]
> >
> > I had just noticed that I had failed to enable the zpool scrub periodic
> > on this machine. So I began to run zpool scrub by hand. It succeeded
> > for the root pool which is also geli encrypted, but when I ran it
> > against my primary data pool I encountered:
> >
> > Apr 24 23:18:23 terra kernel: GEOM_ELI: Device ada3p1.eli destroyed.
> > Apr 24 23:18:23 terra kernel: GEOM_ELI: Detached ada3p1.eli on last close.
> > Apr 24 23:18:23 terra kernel: GEOM_ELI: Device ada2p1.eli destroyed.
> > Apr 24 23:18:23 terra kernel: GEOM_ELI: Detached ada2p1.eli on last close.
>
> Did you attach the devices using geli's -d (auto-detach) flag?

I am using whatever default setup comes out of the rc.d scripts. My
rc.conf was:

geli_devices="ada2p1 ada3p1"
geli_default_flags="-k /root/encryption.key"
zfs_enable="YES"

I will try adding geli_autodetach="NO" and scrubbing in about 9 hours.

> If yes, this is a known issue:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=117158

Reading that bug in detail, it appears to be *specifically* about the
kernel panic. It treats zfs closing and reopening providers as expected
behavior, and says that geli will detach if autodetach is configured.

It strikes me that even though this is expected behavior, it should not
be. Is there a way we could prevent the detach when zfs closes and
reopens providers? I cannot think of a case where the desired behavior
is for the pool to detach when zfs wants to reopen it.
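For reference, the rc.conf change mentioned above might look like the following sketch. It assumes the stock rc.d geli scripts read geli_autodetach as described in rc.conf(5); the device names and key path are simply the ones quoted in this message, and whether this setting actually avoids the detach during a scrub is untested here.

```shell
# /etc/rc.conf fragment (sketch, not a verified fix)
# Devices and key file are the ones from the original rc.conf.
geli_devices="ada2p1 ada3p1"
geli_default_flags="-k /root/encryption.key"

# Assumption: setting this to NO stops the rc.d scripts from marking
# the .eli providers for detach on last close.
geli_autodetach="NO"

zfs_enable="YES"
```

The change would only take effect the next time the devices are attached by the rc.d scripts, for example after a reboot.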
> > And the scrub failed to initialize (command never returned to the
> > shell).
>
> This could be the result of another known bug:
> https://lists.freebsd.org/pipermail/freebsd-current/2015-October/057988.html
>
> > I immediately rebooted and both disks came back and resilvered, with
> > permanent metadata errors
>
> Did those errors appear while resilvering or could they have been
> already present before?

I do not think they were present before the disks flip-flopped. There
was no error before my attempt to resilver. I would expect metadata
errors, as I effectively had:

disk 1 online
disk 2 offline

then immediately, without a resilver:

disk 1 offline
disk 1 online

> Fabian

[-- Attachment #2 --]
(binary signature data omitted)
