Date: Tue, 30 Apr 2019 20:14:19 +1000
From: Michelle Sullivan <michelle@sorbs.net>
To: Xin LI <delphij@gmail.com>
Cc: rainer@ultra-secure.de, owner-freebsd-stable@freebsd.org, freebsd-stable <freebsd-stable@freebsd.org>, Andrea Venturoli <ml@netfence.it>
Subject: Re: ZFS...
Message-ID: <5ED8BADE-7B2C-4B73-93BC-70739911C5E3@sorbs.net>
In-Reply-To: <CAGMYy3tYqvrKgk2c==WTwrH03uTN1xQifPRNxXccMsRE1spaRA@mail.gmail.com>
References: <30506b3d-64fb-b327-94ae-d9da522f3a48@sorbs.net>
 <CAOtMX2gf3AZr1-QOX_6yYQoqE-H+8MjOWc=eK1tcwt5M3dCzdw@mail.gmail.com>
 <56833732-2945-4BD3-95A6-7AF55AB87674@sorbs.net>
 <3d0f6436-f3d7-6fee-ed81-a24d44223f2f@netfence.it>
 <17B373DA-4AFC-4D25-B776-0D0DED98B320@sorbs.net>
 <70fac2fe3f23f85dd442d93ffea368e1@ultra-secure.de>
 <70C87D93-D1F9-458E-9723-19F9777E6F12@sorbs.net>
 <CAGMYy3tYqvrKgk2c==WTwrH03uTN1xQifPRNxXccMsRE1spaRA@mail.gmail.com>
Michelle Sullivan
http://www.mhix.org/

Sent from my iPad

> On 30 Apr 2019, at 19:50, Xin LI <delphij@gmail.com> wrote:
>
>> On Tue, Apr 30, 2019 at 5:08 PM Michelle Sullivan <michelle@sorbs.net> wrote:
>> but in my recent experience 2 issues colliding at the same time results in disaster
>
> Do we know exactly what kind of corruption happened to your pool? If you see it twice in a row, it might suggest a software bug that should be investigated.

All I know is that it's a checksum error on a metaslab (122), and from what I can gather it's the spacemap that is corrupt... but I am no expert. I don't believe it's a software fault as such, because this was caused by a hard outage (damaged UPSes) whilst resilvering a single (but completely failed) drive... and after the first outage a second occurred (same as the first but more damaging to the power hardware)... the host itself was not damaged, nor were the drives or the controller.

> Note that ZFS stores multiple copies of its essential metadata, and in my experience with my old, consumer-grade crappy hardware (non-ECC RAM, several faulty single-hard-drive pools: bad enough to crash almost monthly and damage my data from time to time),

This was a top-end consumer-grade motherboard with non-ECC RAM that had been running for 8+ years without fault (except for hard drive platter failures). Uptime would have been years if it weren't for patching.

> I've never seen a corruption this bad and I was always able to recover the pool.

So far, same.

> At a previous employer, the only case where we had a pool corrupted to the point that mounting was not allowed was when two host nodes happened to import the pool at the same time, which is a situation that can be avoided with SCSI reservation; their hardware was of much better quality, though.
>
> Speaking of a tool like 'fsck': I think I'm mostly convinced that it's not necessary, because at the point ZFS says the metadata is corrupted, it means that the metadata was corrupted beyond repair (all replicas were corrupted; otherwise it would recover by finding the right block and rewriting the bad ones).

I see this message all the time and mostly agree... actually I do agree, with possibly a minor exception, but it's so minor it's probably not worth it. However, as I suggested in my original post: the pool says the files are there, so a tool that would send them (a la zfs send) while ignoring errors in the spacemaps etc. would be really useful (to me).

> An interactive tool may be useful (e.g. "I saw data structure versions 1, 2, 3 available, all with bad checksums; choose which one you want to try"), but I think it wouldn't be very practical for use with large data pools -- unlike traditional filesystems, ZFS uses copy-on-write and heavily depends on the metadata to find where the data is, and a regular "scan" is not really useful.

zdb -AAA showed (shows) 36m files, which suggests the data is intact, but the mount aborts with an I/O error because it says the metadata has three errors: two 'metadata' and one '<storage:0x0>' (storage being the pool name). The pool does import, and it attempts to resilver, but it reports the resilver finishing at some 780M (ish); export/import and it does it all again... zdb without -AAA aborts loading metaslab 122.
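Coming back to the "zfs send that ignores errors" idea: to give a rough picture of the salvage path I mean, it would look something like the below. This is an untested sketch typed from memory, not something I'm claiming gets past the bad spacemap on this pool; "backuphost", "rescue/storage" and the snapshot name are placeholders.

  # poke at the suspect metaslabs/spacemaps while the pool is still exported
  # (-AAA ignores assertion failures, -e works on an exported pool, -m dumps metaslab info)
  zdb -AAA -e -m storage

  # try a forced, read-only, rewinding import so nothing gets written back
  # (-F rolls back to the last importable txg)
  zpool import -f -F -o readonly=on storage

  # a read-only import can't take new snapshots, so either send an existing one...
  zfs send -R storage@lastgood | ssh backuphost zfs receive -u rescue/storage
  # ...or just copy the mounted filesystems off file-by-file and skip whatever errors out
  rsync -a /storage/ backuphost:/rescue/storage/

The gap is that the mount (and hence anything like zfs send or rsync) never gets that far once the damaged metadata is hit, which is exactly where a "best effort, read-only, ignore the broken bits" mode would help (me).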
> I'd agree that you need a full backup anyway, regardless of what storage system is used, though.

Yeah... unlike UFS, which has to get really, really hosed before you're restoring from backup with nothing recoverable, it seems ZFS can get hosed when issues occur in just the wrong bit... but mostly it is recoverable (and my experience has been some nasty shit that always ended up being recoverable).

Michelle