Date: Tue, 30 Apr 2019 17:50:40 +0800
From: Xin LI <delphij@gmail.com>
To: Michelle Sullivan <michelle@sorbs.net>
Cc: rainer@ultra-secure.de, owner-freebsd-stable@freebsd.org, freebsd-stable <freebsd-stable@freebsd.org>, Andrea Venturoli <ml@netfence.it>
Subject: Re: ZFS...
Message-ID: <CAGMYy3tYqvrKgk2c==WTwrH03uTN1xQifPRNxXccMsRE1spaRA@mail.gmail.com>
In-Reply-To: <70C87D93-D1F9-458E-9723-19F9777E6F12@sorbs.net>
References: <30506b3d-64fb-b327-94ae-d9da522f3a48@sorbs.net> <CAOtMX2gf3AZr1-QOX_6yYQoqE-H%2B8MjOWc=eK1tcwt5M3dCzdw@mail.gmail.com> <56833732-2945-4BD3-95A6-7AF55AB87674@sorbs.net> <3d0f6436-f3d7-6fee-ed81-a24d44223f2f@netfence.it> <17B373DA-4AFC-4D25-B776-0D0DED98B320@sorbs.net> <70fac2fe3f23f85dd442d93ffea368e1@ultra-secure.de> <70C87D93-D1F9-458E-9723-19F9777E6F12@sorbs.net>

On Tue, Apr 30, 2019 at 5:08 PM Michelle Sullivan <michelle@sorbs.net> wrote:
> but in my recent experience 2 issues colliding at the same time results in
> disaster

Do we know exactly what kind of corruption happened to your pool? If you see it twice in a row, it might suggest a software bug that should be investigated.

Note that ZFS stores multiple copies of its essential metadata, and in my experience with my old, crappy consumer-grade hardware (non-ECC RAM, with several faulty, single-hard-drive pools; bad enough to crash almost monthly and damage my data from time to time), I've never seen corruption this bad, and I was always able to recover the pool. At a previous employer, the only case where a pool was corrupted to the point that it could not be mounted was when two host nodes happened to import the pool at the same time, a situation that can be avoided with SCSI reservations; their hardware was of much better quality, though.

As for a tool like 'fsck': I think I'm mostly convinced that it's *not* necessary, because at the point ZFS says the metadata is corrupted, it means that the metadata really was corrupted beyond repair (all replicas were corrupted; otherwise ZFS would recover by finding the good block and rewriting the bad ones).

An interactive tool may be useful (e.g. "I saw data structure versions 1, 2 and 3 available, all with bad checksums; choose which one you want to try"), but I think it wouldn't be very practical for large data pools -- unlike traditional filesystems, ZFS uses copy-on-write and depends heavily on the metadata to find where the data is, so a regular "scan" is not really useful.

I'd agree that you need a full backup anyway, regardless of what storage system is used, though.
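
To make the "find the right block and rewrite the bad ones" behaviour above concrete, here is a rough Python sketch of a checksum-verified read over several replicas. It is only an illustration, not how ZFS is actually implemented: the names `self_healing_read` and `AllReplicasCorrupt` are made up for this example, SHA-256 stands in for whatever checksum the pool uses, and the replicas are plain in-memory byte strings rather than on-disk blocks.

```python
import hashlib


class AllReplicasCorrupt(Exception):
    """Hypothetical error: no replica matches the expected checksum."""


def checksum(data: bytes) -> str:
    # SHA-256 is an assumption for this sketch, not a claim about ZFS defaults.
    return hashlib.sha256(data).hexdigest()


def self_healing_read(replicas: list, expected: str) -> bytes:
    """Return the first replica whose checksum matches `expected`,
    overwriting every mismatching replica with that good copy."""
    good = next((r for r in replicas if checksum(r) == expected), None)
    if good is None:
        # Every copy is bad: there is nothing left to repair from.
        raise AllReplicasCorrupt("all replicas fail checksum verification")
    for i, r in enumerate(replicas):
        if checksum(r) != expected:
            replicas[i] = good  # "rewrite the bad ones"
    return good
```

With replicas `[b"garbage", b"metadata", b"junk"]` and the checksum of `b"metadata"` as the expected value, the call returns the good copy and leaves all three replicas identical. If every replica is bad, the function can only raise, which is the "corrupted beyond repair" case: there is no good copy for any repair tool to work from.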