From owner-freebsd-stable@freebsd.org Tue Apr 30 15:37:31 2019
Subject: Re: ZFS...
From: Borja Marcos <borjam@sarenet.es>
Date: Tue, 30 Apr 2019 17:37:17 +0200
To: Michelle Sullivan
Cc: Karl Denninger, freebsd-stable@freebsd.org
In-Reply-To: <2e4941bf-999a-7f16-f4fe-1a520f2187c0@sorbs.net>
Message-Id: <75A78DAC-DF85-481B-ABC5-70E5E3960341@sarenet.es>
List-Id: Production branch of FreeBSD source code
> On 30 Apr 2019, at 15:30, Michelle Sullivan wrote:
>
>> I'm sorry, but that may well be what nailed you.
>>
>> ECC is not just about the random cosmic ray. It also saves your bacon
>> when there are power glitches.
>
> No. Sorry no. If the data is only half to disk, ECC isn't going to save you at all... it's all about power on the drives to complete the write.

Not necessarily. Depending on the nature of the power outage, things can get really funny during the power loss event. More than 25 years ago I witnessed a severe 2 second voltage drop, and during that time the hard disk in our SCO Unix server went really crazy. Even the low level format was corrupted; the damage was way beyond mere filesystem corruption.

During the start of a power outage (especially when it's not a clean power cut, but comes preceded by some voltage swings) data corruption can be extensive.
As far as I know, high end systems include power management elements to reduce the impact.

I have other war stories about UPS systems providing an extremely dirty waveform and causing format problems in disks. That happened in 1995 or so.

>>
>> Unfortunately however there is also cache memory on most modern hard
>> drives, most of the time (unless you explicitly shut it off) it's on for
>> write caching, and it'll nail you too. Oh, and it's never, in my
>> experience, ECC.
>
> No comment on that - you're right in the first part; I can't comment on whether there are drives with ECC cache.

Even with cache corruption, ZFS, being transaction oriented, should offer a reasonable guarantee of integrity. You may lose 1 minute or 5 minutes of changes, but there should be stable, committed data on the disk. Unless the electronics went insane for some milliseconds during the outage event (see above).

>> Oh that is definitely NOT true.... again, from hard experience,
>> including (but not limited to) on FreeBSD.
>>
>> My experience is that ZFS is materially more resilient but there is no
>> such thing as "can never be corrupted by any set of events."
>
> The latter part is true - and my blog and my current situation are not limited to or aimed at FreeBSD specifically; FreeBSD is my experience. The former part... it has been very resilient, but I think (based on this certain set of events) it is easily corruptible and I have just been lucky. You just have to hit a certain write to activate the issue, and whilst that write and issue might be very, very difficult (read: hit and miss) to hit in normal everyday scenarios, it can and will eventually happen.
>
>> Backup
>> strategies for moderately large (e.g. many Terabytes) to very large
>> (e.g. Petabytes and beyond) get quite complex but they're also very
>> necessary.
>>
> and therein lies the problem.
> If you don't have a backup solution costing many tens of thousands of dollars, you're either:
>
> 1/ down for a looooong time.
> 2/ losing all data and starting again...
>
> ...and that's the problem... with UFS you can recover most data (in most situations), and provided the *data* is there, uncorrupted by the fault, you can get it all off with various tools even if it is a complete mess.... Here I am with data that is apparently ok, but the metadata is corrupt (and note: as I had stopped writing to the drive when it started resilvering, the data - all of it - should be intact... even if a mess.)

The advantage of ZFS is that it makes it feasible to replicate data. If you keep a mirror storage server, your disaster recovery won't require restoring a full backup (which can take an inordinate amount of time), just reconfiguring the replica server to assume the role of the master.

Again, being transaction based somewhat reduces the likelihood of a software bug on the master propagating to the slave and causing extensive corruption. Rewinding to the previous snapshot should help.

Borja.
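P.S. For list readers unfamiliar with the replication workflow I described, a minimal sketch. Pool, dataset, and host names ("tank", "tank/data", "backup1") are hypothetical; adjust to taste, and consider a tool like zrepl or sanoid/syncoid for anything in production:

```shell
# Take a snapshot on the master; ZFS snapshots are atomic and cheap.
zfs snapshot tank/data@2019-04-30

# Initial full replication to the replica host.
zfs send tank/data@2019-04-30 | ssh backup1 zfs receive -F tank/data

# Later, send only the delta between two snapshots (incremental).
zfs snapshot tank/data@2019-05-01
zfs send -i tank/data@2019-04-30 tank/data@2019-05-01 | \
    ssh backup1 zfs receive tank/data

# If a software bug corrupts recent data, rewind the replica to the
# last known-good snapshot instead of restoring a full backup.
ssh backup1 zfs rollback -r tank/data@2019-04-30
```

Run from cron, this gives you a warm standby that can be promoted in minutes rather than the hours (or days) a full restore of a multi-terabyte pool would take.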