From owner-freebsd-stable@freebsd.org Tue Apr 30 14:15:53 2019
Subject: Re: ZFS...
From: Michelle Sullivan <michelle@sorbs.net>
Date: Wed, 01 May 2019 00:15:47 +1000
To: Alan Somers
Cc: Karl Denninger, FreeBSD
Message-id: <9F250929-BAA1-4A18-9025-06F3EC13CD42@sorbs.net>

This issue is definitely related to sudden unexpected loss of power during resilver... not ECC/non-ECC issues.
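For anyone who hits the same failure mode, the usual first-response commands look something like this - "storage" is a placeholder pool name and these are stock zpool(8)/zdb(8) invocations, so treat it as a sketch rather than a transcript of my exact session:

    # Dry run: ask ZFS whether discarding the last few transaction groups
    # would make the pool importable again (-n means nothing is written).
    zpool import -F -n storage

    # Import read-only so nothing further is written while data is copied off.
    zpool import -o readonly=on storage

    # Walk the metaslabs and dump spacemap detail; a corrupt spacemap
    # typically surfaces here as an error on a specific metaslab.
    zdb -mm storage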
Michelle Sullivan
http://www.mhix.org/
Sent from my iPad

> On 01 May 2019, at 00:12, Alan Somers wrote:
>
>> On Tue, Apr 30, 2019 at 8:05 AM Michelle Sullivan wrote:
>>
>> Michelle Sullivan
>> http://www.mhix.org/
>> Sent from my iPad
>>
>>>> On 01 May 2019, at 00:01, Alan Somers wrote:
>>>>
>>>> On Tue, Apr 30, 2019 at 7:30 AM Michelle Sullivan wrote:
>>>>
>>>> Karl Denninger wrote:
>>>>> On 4/30/2019 05:14, Michelle Sullivan wrote:
>>>>>>>> On 30 Apr 2019, at 19:50, Xin LI wrote:
>>>>>>>> On Tue, Apr 30, 2019 at 5:08 PM Michelle Sullivan wrote:
>>>>>>>> but in my recent experience two issues colliding at the same time results in disaster
>>>>>>> Do we know exactly what kind of corruption happened to your pool? If you see it twice in a row, it might suggest a software bug that should be investigated.
>>>>>>>
>>>>>>> All I know is it's a checksum error on a metaslab (122), and from what I can gather it's the spacemap that is corrupt... but I am no expert. I don't believe it's a software fault as such, because this was caused by a hard outage (damaged UPSes) whilst resilvering a single (but completely failed) drive. ...and after the first outage a second occurred (same as the first but more damaging to the power hardware)... the host itself was not damaged, nor were the drives or controller.
>>>>> .....
>>>>>>> Note that ZFS stores multiple copies of its essential metadata, and in my experience with my old, consumer-grade crappy hardware (non-ECC RAM, with several faults, a single hard drive pool: bad enough to crash almost monthly and damage my data from time to time),
>>>>>> This was a top-end consumer-grade motherboard with non-ECC RAM that had been running for 8+ years without fault (except for hard drive platter failures). Uptime would have been years if it weren't for patching.
>>>>> Yuck.
>>>>>
>>>>> I'm sorry, but that may well be what nailed you.
>>>>>
>>>>> ECC is not just about the random cosmic ray.  It also saves your bacon
>>>>> when there are power glitches.
>>>>
>>>> No. Sorry, no. If the data has only half made it to disk, ECC isn't going to save
>>>> you at all... it's all about power on the drives to complete the write.
>>>
>>> ECC RAM isn't about saving the last few seconds' worth of data from
>>> before a power crash.  It's about not corrupting the data that gets
>>> written long before a crash.  If you have non-ECC RAM, then a cosmic
>>> ray/alpha ray/row hammer attack/bad luck can corrupt data after it's
>>> been checksummed but before it gets DMAed to disk.  Then the disk will
>>> contain corrupt data and you won't know it until you try to read it
>>> back.
>>
>> I know this... unless I misread Karl's message, he implied the ECC would have saved the corruption in the crash... which is patently false... I think you'll agree.
>
> I don't think that's what Karl meant.  I think he meant that the
> non-ECC RAM could've caused latent corruption that was only detected
> when the crash forced a reboot and resilver.
>
>> Michelle
>>
>>> -Alan
>>>
>>>>> Unfortunately, however, there is also cache memory on most modern hard
>>>>> drives; most of the time (unless you explicitly shut it off) it's on for
>>>>> write caching, and it'll nail you too.  Oh, and it's never, in my
>>>>> experience, ECC.
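(On the drive write cache point: on FreeBSD it can at least be inspected and turned off for ATA disks. A rough sketch, assuming ada(4) devices - this is the stock tunable documented in ada(4), nothing exotic:)

    # 1 = enable write caching, 0 = disable, -1 = leave the drive's default
    sysctl kern.cam.ada.write_cache

    # Disable it for all ada disks at boot (set in /boot/loader.conf)
    echo 'kern.cam.ada.write_cache="0"' >> /boot/loader.conf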
>>> Fortunately, ZFS never sends non-checksummed data to the hard drive.
>>> So an error in the hard drive's cache RAM will usually get detected by
>>> the ZFS checksum.
>>>
>>>> No comment on that - you're right in the first part; I can't comment on whether
>>>> there are drives with ECC.
>>>>
>>>>> In addition, however - and this is something I learned a LONG time ago
>>>>> (think Z-80 processors!) - as in so many very important things,
>>>>> "two is one and one is none."
>>>>>
>>>>> In other words, without a backup you WILL lose data eventually, and it
>>>>> WILL be important.
>>>>>
>>>>> Raidz2 is very nice, but as the name implies, you have two
>>>>> redundancies.  If you take three errors, or if, God forbid, you *write*
>>>>> a block that has a bad checksum in it because it got scrambled while in
>>>>> RAM, you're dead if that happens in the wrong place.
>>>>
>>>> Or, in my case, you write partial data, therefore invalidating the checksum...
>>>>
>>>>>> Yeah.. unlike UFS, which has to get really, really hosed before you're restoring from backup with nothing recoverable, it seems ZFS can get hosed where issues occur in just the wrong bit... but mostly it is recoverable (and my experience has been some nasty shit that always ended up being recoverable.)
>>>>>>
>>>>>> Michelle
>>>>> Oh, that is definitely NOT true.... again, from hard experience,
>>>>> including (but not limited to) on FreeBSD.
>>>>>
>>>>> My experience is that ZFS is materially more resilient, but there is no
>>>>> such thing as "can never be corrupted by any set of events."
>>>>
>>>> The latter part is true - and my blog and my current situation are not
>>>> limited to or aimed at FreeBSD specifically; FreeBSD is my experience.
>>>> The former part... it has been very resilient, but I think (based on
>>>> this certain set of events) it is easily corruptible and I have just
>>>> been lucky.  You just have to hit a certain write to activate the issue,
>>>> and whilst that write and issue might be very, very difficult (read: hit
>>>> and miss) to hit in normal everyday scenarios, it can and will
>>>> eventually happen.
>>>>
>>>>> Backup
>>>>> strategies for moderately large (e.g. many terabytes) to very large
>>>>> (e.g. petabytes and beyond) pools get quite complex, but they're also very
>>>>> necessary.
>>>>
>>>> And therein lies the problem.  If you don't have a backup solution costing
>>>> many tens of thousands of dollars, you're either:
>>>>
>>>> 1/ down for a looooong time.
>>>> 2/ losing all data and starting again...
>>>>
>>>> ..and that's the problem... with UFS you can recover most (in most
>>>> situations), and provided the *data* is there, uncorrupted by the fault,
>>>> you can get it all off with various tools even if it is a complete
>>>> mess.... here I am with the data that is apparently OK, but the
>>>> metadata is corrupt (and note: as I had stopped writing to the drive
>>>> when it started resilvering, the data - all of it - should be intact...
>>>> even if a mess.)
>>>>
>>>> Michelle
>>>>
>>>> --
>>>> Michelle Sullivan
>>>> http://www.mhix.org/
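P.S. On the backup point: replication doesn't have to be a tens-of-thousands-of-dollars product; plain send/receive over ssh covers a lot of cases. A bare-bones sketch, with "storage", "backuphost" and "backuppool" all placeholders:

    # Seed the remote copy from a full recursive snapshot
    zfs snapshot -r storage@backup-1
    zfs send -R storage@backup-1 | ssh backuphost zfs receive -du backuppool

    # Thereafter, send only the changes since the previous snapshot
    zfs snapshot -r storage@backup-2
    zfs send -R -i storage@backup-1 storage@backup-2 | ssh backuphost zfs receive -du backuppool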