Date: Tue, 30 Apr 2019 10:15:34 -0500
From: Karl Denninger <karl@denninger.net>
To: freebsd-stable@freebsd.org
Subject: Re: ZFS...
Message-ID: <576857a5-a5ab-eeb8-2391-992159d9c4f2@denninger.net>
In-Reply-To: <CAOtMX2iB7xJszO8nT_KU%2BrFuSkTyiraMHddz1fVooe23bEZguA@mail.gmail.com>
References: <30506b3d-64fb-b327-94ae-d9da522f3a48@sorbs.net> <CAOtMX2gf3AZr1-QOX_6yYQoqE-H%2B8MjOWc=eK1tcwt5M3dCzdw@mail.gmail.com> <56833732-2945-4BD3-95A6-7AF55AB87674@sorbs.net> <3d0f6436-f3d7-6fee-ed81-a24d44223f2f@netfence.it> <17B373DA-4AFC-4D25-B776-0D0DED98B320@sorbs.net> <70fac2fe3f23f85dd442d93ffea368e1@ultra-secure.de> <70C87D93-D1F9-458E-9723-19F9777E6F12@sorbs.net> <CAGMYy3tYqvrKgk2c==WTwrH03uTN1xQifPRNxXccMsRE1spaRA@mail.gmail.com> <5ED8BADE-7B2C-4B73-93BC-70739911C5E3@sorbs.net> <d0118f7e-7cfc-8bf1-308c-823bce088039@denninger.net> <2e4941bf-999a-7f16-f4fe-1a520f2187c0@sorbs.net> <CAOtMX2gOwwZuGft2vPpR-LmTpMVRy6hM_dYy9cNiw%2Bg1kDYpXg@mail.gmail.com> <34539589-162B-4891-A68F-88F879B59650@sorbs.net> <CAOtMX2iB7xJszO8nT_KU%2BrFuSkTyiraMHddz1fVooe23bEZguA@mail.gmail.com>
On 4/30/2019 09:12, Alan Somers wrote:
> On Tue, Apr 30, 2019 at 8:05 AM Michelle Sullivan <michelle@sorbs.net> wrote:
> .
>> I know this... unless I misread Karl’s message he implied the ECC would have saved the corruption in the crash... which is patently false... I think you’ll agree..
> I don't think that's what Karl meant. I think he meant that the
> non-ECC RAM could've caused latent corruption that was only detected
> when the crash forced a reboot and resilver.
Exactly.
Non-ECC memory means you can potentially write data to *all* copies of a
block (and its parity, in the case of a raidz) that does not match its
checksum, and the code has no way to know it happened or to defend
against it. Unfortunately, since the checksum is very small compared to
the data, the odds are that IF that happens it's the *data* and not
the checksum that's bad, and there are *no* good copies.
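To make the failure mode above concrete, here is a minimal simulation sketch. It is not ZFS code. The 128 KiB record size, the 32-byte checksum, the use of SHA-256, and every function name are assumptions chosen purely for illustration. A bit is flipped in the in-memory buffer after the checksum has been computed but before the buffer is written to both sides of a mirror, so every copy on disk carries the same bad data.

```python
import hashlib

RECORD_SIZE = 128 * 1024  # assumed record size in bytes (illustrative only)
CHECKSUM_SIZE = 32        # assumed 256-bit checksum (illustrative only)


def checksum(data: bytes) -> bytes:
    """Stand-in checksum; SHA-256 is used here only for convenience."""
    return hashlib.sha256(data).digest()


def flip_bit(buf: bytearray, bit_index: int) -> None:
    """Simulate a single-bit memory error in a RAM buffer."""
    buf[bit_index // 8] ^= 1 << (bit_index % 8)


def write_mirrored(buf: bytearray, corrupt_bit=None):
    """Checksum the buffer, optionally corrupt it in RAM, then 'write' two copies."""
    stored_sum = checksum(bytes(buf))
    if corrupt_bit is not None:
        flip_bit(buf, corrupt_bit)  # the error strikes between checksum and write
    return stored_sum, [bytes(buf), bytes(buf)]  # both mirror sides get the same buffer


def good_copies(stored_sum: bytes, copies):
    """Return the copies whose contents still match the stored checksum."""
    return [c for c in copies if checksum(c) == stored_sum]


original = bytearray(b"\x5a" * RECORD_SIZE)

# Healthy write: both copies verify.
clean_sum, clean_copies = write_mirrored(bytearray(original))
healthy = good_copies(clean_sum, clean_copies)

# Write with a RAM bit flip: the mismatch is detectable, but no copy is usable.
bad_sum, bad_copies = write_mirrored(bytearray(original), corrupt_bit=12345)
survivors = good_copies(bad_sum, bad_copies)

# Chance that a uniformly random single-bit flip lands in the checksum
# rather than in the data, under the sizes assumed above.
p_hits_checksum = CHECKSUM_SIZE / (RECORD_SIZE + CHECKSUM_SIZE)

print(len(healthy), len(survivors), f"{p_hits_checksum:.4%}")
```

Under these assumed sizes the flip lands in the data well over 99.9% of the time, which is why redundancy does not help here: the mirror faithfully stored two copies of the wrong bytes.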
Contrary to popular belief, the "power good" signal on your PSU and MB
does not provide 100% protection against transient power problems causing
this to occur with non-ECC memory either.
IMHO non-ECC memory systems are ok for personal desktop and laptop
machines where loss of stored data requiring a restore is acceptable
(assuming you have a reasonable backup paradigm for same) but not for
servers and *especially* not for ZFS storage. I don't like the price of
ECC memory, and I really don't like Intel's practice of only enabling
ECC RAM on their "server" class line of CPUs either, but it
is what it is. Pay up for the machines where it matters.
One of the ironies is that there's better data *integrity* with ZFS than
other filesystems in this circumstance; you're much more likely to
*know* you're hosed even if the situation is unrecoverable and requires
a restore. With UFS and other filesystems you can quite easily wind up
with silent corruption that can go undetected; the filesystem "works"
just fine but the data is garbage. From my point of view that's *much*
worse.
In addition IMHO consumer drives are not exactly safe for online ZFS
storage. Ironically they're *safer* for archival use because when not
actively in use they're dismounted and thus not subject to "you're
silently hosed" sort of failures. What sort of "you're hosed"
failures? Oh, for example, claiming to have flushed their cache buffers
before returning "complete" on that request when they really did not!
In combination with write re-ordering that can *really* screw you and
there's nothing that any filesystem can defensively do about it either.
This sort of "cheat" is much more likely to be present in consumer
drives than in ones sold for either enterprise or NAS purposes, and it's
quite difficult to accurately test for this sort of thing on an
individual basis too.
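The interaction between a lying flush and write re-ordering can be sketched with a toy model. This is only an illustration under stated assumptions: the SimulatedDrive class, the LBA numbers, and the two-step "data, then commit record" protocol are all hypothetical and do not describe any real drive firmware or the actual ZFS on-disk format. The filesystem writes a data block, flushes, writes a commit record that points at the data, and flushes again. An honest drive makes the data durable before the commit record is ever issued. A lying drive acknowledges both flushes while holding everything in volatile cache, destages the commit record first, and then loses power.

```python
class SimulatedDrive:
    """Toy drive with a volatile write cache (hypothetical model)."""

    def __init__(self, honors_flush: bool):
        self.honors_flush = honors_flush
        self.media = {}   # persistent storage: LBA -> data
        self.cache = []   # volatile cache: (LBA, data) in arrival order

    def write(self, lba: int, data: bytes) -> None:
        self.cache.append((lba, data))

    def flush(self) -> str:
        # An honest drive empties its cache to media before acknowledging.
        if self.honors_flush:
            self._destage(len(self.cache), newest_first=False)
        # A lying drive reports success without persisting anything.
        return "complete"

    def _destage(self, count: int, newest_first: bool) -> None:
        for _ in range(min(count, len(self.cache))):
            lba, data = self.cache.pop() if newest_first else self.cache.pop(0)
            self.media[lba] = data

    def power_loss_after(self, count: int) -> None:
        """Destage `count` cached writes newest-first (re-ordered), then lose power."""
        self._destage(count, newest_first=True)
        self.cache.clear()  # whatever is still in cache is gone


DATA_LBA, COMMIT_LBA = 10, 0  # hypothetical block addresses


def commit_transaction(drive: SimulatedDrive) -> None:
    drive.write(DATA_LBA, b"new data")
    drive.flush()  # barrier: data must be durable before the commit record
    drive.write(COMMIT_LBA, b"points to LBA 10")
    drive.flush()


def consistent_after_crash(drive: SimulatedDrive) -> bool:
    """A commit record on media is only valid if the data it points to is there too."""
    return COMMIT_LBA not in drive.media or DATA_LBA in drive.media


honest = SimulatedDrive(honors_flush=True)
commit_transaction(honest)
honest.power_loss_after(1)

lying = SimulatedDrive(honors_flush=False)
commit_transaction(lying)
lying.power_loss_after(1)

print(consistent_after_crash(honest), consistent_after_crash(lying))
```

In this model the filesystem did everything right on both drives; the only difference is whether the flush acknowledgement was truthful, which is why no filesystem can defend against it from above.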
--
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/
