Date: Wed, 01 May 2019 13:17:36 +1000
From: Michelle Sullivan <michelle@sorbs.net>
To: Karl Denninger <karl@denninger.net>
Cc: freebsd-stable@freebsd.org
Subject: Re: ZFS...
Message-ID: <CB86C16D-87D9-4D3F-9291-1E2586246E04@sorbs.net>
In-Reply-To: <bf630074-2e68-2f8f-b69f-adf99ac5d3de@denninger.net>
References: <30506b3d-64fb-b327-94ae-d9da522f3a48@sorbs.net> <CAOtMX2gf3AZr1-QOX_6yYQoqE-H%2B8MjOWc=eK1tcwt5M3dCzdw@mail.gmail.com> <56833732-2945-4BD3-95A6-7AF55AB87674@sorbs.net> <3d0f6436-f3d7-6fee-ed81-a24d44223f2f@netfence.it> <17B373DA-4AFC-4D25-B776-0D0DED98B320@sorbs.net> <70fac2fe3f23f85dd442d93ffea368e1@ultra-secure.de> <70C87D93-D1F9-458E-9723-19F9777E6F12@sorbs.net> <CAGMYy3tYqvrKgk2c==WTwrH03uTN1xQifPRNxXccMsRE1spaRA@mail.gmail.com> <5ED8BADE-7B2C-4B73-93BC-70739911C5E3@sorbs.net> <d0118f7e-7cfc-8bf1-308c-823bce088039@denninger.net> <2e4941bf-999a-7f16-f4fe-1a520f2187c0@sorbs.net> <CAOtMX2gOwwZuGft2vPpR-LmTpMVRy6hM_dYy9cNiw%2Bg1kDYpXg@mail.gmail.com> <34539589-162B-4891-A68F-88F879B59650@sorbs.net> <CAOtMX2iB7xJszO8nT_KU%2BrFuSkTyiraMHddz1fVooe23bEZguA@mail.gmail.com> <576857a5-a5ab-eeb8-2391-992159d9c4f2@denninger.net> <A7928311-8F51-4C72-839C-C9C2BA62C66E@sorbs.net> <b0fa0f8e-dc45-9d66-cc48-c733cbb9645b@denninger.net> <FD9802E0-E2E4-464A-8ABD-83B0A21C08F2@sorbs.net> <bf63007@sorbs.net>
Michelle Sullivan
http://www.mhix.org/
Sent from my iPad

> On 01 May 2019, at 12:37, Karl Denninger <karl@denninger.net> wrote:
>
> On 4/30/2019 20:59, Michelle Sullivan wrote
>>> On 01 May 2019, at 11:33, Karl Denninger <karl@denninger.net> wrote:
>>>
>>>> On 4/30/2019 19:14, Michelle Sullivan wrote:
>>>>
>>>> Michelle Sullivan
>>>> http://www.mhix.org/
>>>> Sent from my iPad
>>>>
>>> Nope. I'd much rather *know* the data is corrupt and be forced to
>>> restore from backups than to have SILENT corruption occur and perhaps
>>> screw me 10 years down the road when the odds are my backups have
>>> long-since been recycled.
>> Ahh yes the be all and end all of ZFS.. stops the silent corruption of data.. but don’t install it on anything unless it’s server grade with backups and ECC RAM, but it’s good on laptops because it protects you from silent corruption of your data when 10 years later the backups have long-since been recycled... umm is that not a circular argument?
>>
>> Don’t get me wrong here.. and I know you (and some others are) zfs in the DC with 10s of thousands in redundant servers and/or backups to keep your critical data corruption free = good thing.
>>
>> ZFS on everything is what some say (because it prevents silent corruption) but then you have default policies to install it everywhere .. including hardware not equipped to function safely with it (in your own arguments) and yet it’s still good because it will still prevent silent corruption even though it relies on hardware that you can trust... umm say what?
>>
>> Anyhow veered way way off (the original) topic...
>>
>> Modest (part consumer grade, part commercial) suffered irreversible data loss because of a (very unusual, but not impossible) double power outage..
>> and no tools to recover the data (or part data) unless you have some form of backup because the file system deems the corruption to be too dangerous to let you access any of it (even the known good bits) ...
>>
>> Michelle
>
> IMHO you're dead wrong Michelle. I respect your opinion but disagree
> vehemently.

I guess we’ll have to agree to disagree then, but I think your attitude to pronounce me “dead wrong” is short sighted, because it strikes of “I’m right because ZFS is the answer to all problems.” .. I’ve been around in the industry long enough to see a variety of issues... some disasters, some not so...

I also should know better than to run without backups but financial constraints precluded me.... as will for many non commercial people.

>
> I run ZFS on both of my laptops under FreeBSD. Both have
> non-power-protected SSDs in them. Neither is mirrored or Raidz-anything.
>
> So why run ZFS instead of UFS?
>
> Because a scrub will detect data corruption that UFS cannot detect *at all.*

I get it, I really do, but that balances out against, if you can’t rebuild it make sure you have (tested and working) backups and be prepared for downtime when such corruption does occur.

>
> It is a balance-of-harms test and you choose. I can make a very clean
> argument that *greater information always wins*; that is, I prefer in
> every case to *know* I'm screwed rather than not. I can defend against
> being screwed with some amount of diligence but in order for that
> diligence to be reasonable I have to know about the screwing in a
> reasonable amount of time after it happens.

Not disagreeing (and have not been.)

>
> You may have never had silent corruption bite you.

I have... but not with data on disks.. most of my silent corruption issues have been with a layer or two above the hardware...
like subversion commits overwriting previous commits without notification (damn I wish I could reliably replicate it!)

> I have had it happen
> several times over my IT career. If that happens to you the odds are
> that it's absolutely unrecoverable and whatever gets corrupted is
> *gone.*

Every drive corruption I have suffered in my career I have been able to recover, all or partial data except where the hardware itself was totally hosed (Ie clean room options only available)... even with btrfs.. yuk.. puck.. yuk.. oh what a mess that was... still get nightmares on that one... but I still managed to get most of the data off... in fact I put it onto this machine I currently have problems with.. so after the nightmare of btrfs looks like zfs eventually nailed me.

> The defensive measures against silent corruption require
> retention of backup data *literally forever* for the entire useful life
> of the information because from the point of corruption forward *the
> backups are typically going to be complete and correct copies of the
> corrupt data and thus equally worthless to what's on the disk itself.*
> With non-ZFS filesystems quite a lot of thought and care has to go into
> defending against that, and said defense usually requires the active
> cooperation of whatever software wrote said file in the first place

Say what?

> (e.g. a database, etc.)

So dbs (any?) talk actively to the file systems (any?) to actively prevent silent corruption? Lol...

I’m guessing you are actually talking about internal checks and balances of data in the DB to ensure that data retrieved from disk is not corrupt/altered... you know like writing sha256 checksums of files you might download from the internet to ensure you got what you asked for and it wasn’t changed/altered in transit.

> If said software has no tools to "walk" said
> data or if it's impractical to have it do so you're at severe risk of
> being hosed.

Umm what?
I’m talking about a userland (libzfs) tool (Ie doesn’t need the pool imported) such as zfs send (which requires the pool to be imported - hence me not calling it a userland tool) to allow a sending of data that can be found to other places where it can be either blindly recovered (corruption might be present) or can be used to locate files/paths etc that are known to be good (checksums match etc).. walk the structures, feed the data elsewhere where it can be examined/recovered... don’t alter it.... it’s a last resort tool when you don’t have working backups..

> Prior to ZFS there really wasn't any comprehensive defense
> against this sort of event. There are a whole host of applications that
> manipulate data that are absolutely reliant on that sort of thing not
> happening (e.g. anything using a btree data structure) and recovery if
> it *does* happen is a five-alarm nightmare if it's possible at all. In
> the worst-case scenario you don't detect the corruption and the data
> that has the pointer to it that gets corrupted is overwritten and
> destroyed.
>
> A ZFS scrub on a volume that has no redundancy cannot *fix* that
> corruption but it can and will detect it.

So you’re advocating restore from backup for every corruption ... ok...

> This puts a boundary on the
> backups that I must keep in order to *not* have that happen. This is of
> very high value to me and is why, even on systems without ECC memory and
> without redundant disks, provided there is enough RAM to make it
> reasonable (e.g. not on embedded systems I do development on which are
> severely RAM-constrained) I run ZFS.
>
> BTW if you've never had a UFS volume unlink all the blocks within a file
> on an fsck and then recover them back into the free list after a crash
> you're a rare bird indeed. If you think a corrupt ZFS volume is fun try
> to get your data back from said file after that happens.
Been there done that though with ext2 rather than UFS.. still got all my data back... even though it was a nightmare..

>
> --
> Karl Denninger
> karl@denninger.net <mailto:karl@denninger.net>
> /The Market Ticker/
> /[S/MIME encrypted email preferred]/
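
The two technical points argued over in this message can be illustrated with a small Python sketch: a scrub on storage without redundancy can flag a bad block but cannot repair it, and a last-resort salvage pass would copy out only the blocks whose checksums still match without altering the source. This is not how ZFS is implemented and it is not an existing tool. The 4096-byte block size and the names `write_blocks`, `scrub` and `salvage` are hypothetical, chosen only for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, for illustration only


def checksum(block: bytes) -> str:
    """SHA-256 digest of one block, stored separately from the data."""
    return hashlib.sha256(block).hexdigest()


def write_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into blocks and record a checksum for each one."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    sums = [checksum(b) for b in blocks]
    return blocks, sums


def scrub(blocks, sums):
    """Return the indices of blocks that no longer match their checksum.

    With a single copy of the data this can only report the damage;
    there is nothing to repair it from.
    """
    return [i for i, (b, s) in enumerate(zip(blocks, sums)) if checksum(b) != s]


def salvage(blocks, sums):
    """Copy out only the blocks that still verify; the source is not modified."""
    return {i: b for i, (b, s) in enumerate(zip(blocks, sums)) if checksum(b) == s}


# Demo: four blocks, then one bit flipped in block 2 to mimic silent corruption.
blocks, sums = write_blocks(bytes(range(256)) * 64)
blocks[2] = bytes([blocks[2][0] ^ 0x01]) + blocks[2][1:]

print("damaged blocks:", scrub(blocks, sums))             # damaged blocks: [2]
print("recoverable blocks:", sorted(salvage(blocks, sums)))  # recoverable blocks: [0, 1, 3]
```

In this toy model the scrub tells you which block is bad, which is the "know you are screwed" half of the argument, while the salvage pass is the "let me have the known good bits" half. Whether a real tool could do the second safely on a damaged pool is exactly what the thread disagrees about.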