Date: Mon, 19 Feb 2018 06:13:08 +1100
From: Michelle Sullivan <michelle@sorbs.net>
To: Ben RUBSON <ben.rubson@gmail.com>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS pool faulted (corrupt metadata) but the disk data appears ok...
Message-ID: <5eb35692-37ab-33bf-aea1-9f4aa61bb7f7@sorbs.net>
In-Reply-To: <42C31457-1A84-4CCA-BF14-357F1F3177DA@gmail.com>
References: <54D3E9F6.20702@sorbs.net> <54D41608.50306@delphij.net>
 <54D41AAA.6070303@sorbs.net> <54D41C52.1020003@delphij.net>
 <54D424F0.9080301@sorbs.net> <54D47F94.9020404@freebsd.org>
 <54D4A552.7050502@sorbs.net> <54D4BB5A.30409@freebsd.org>
 <54D8B3D8.6000804@sorbs.net> <54D8CECE.60909@freebsd.org>
 <54D8D4A1.9090106@sorbs.net> <54D8D5DE.4040906@sentex.net>
 <54D8D92C.6030705@sorbs.net> <54D8E189.40201@sorbs.net>
 <54D924DD.4000205@sorbs.net> <54DCAC29.8000301@sorbs.net>
 <9c995251-45f1-cf27-c4c8-30a4bd0f163c@sorbs.net>
 <8282375D-5DDC-4294-A69C-03E9450D9575@gmail.com>
 <73dd7026-534e-7212-a037-0cbf62a61acd@sorbs.net>
 <FAB7C3BA-057F-4AB4-96E1-5C3208BABBA7@gmail.com>
 <027070fb-f7b5-3862-3a52-c0f280ab46d1@sorbs.net>
 <42C31457-1A84-4CCA-BF14-357F1F3177DA@gmail.com>
Ben RUBSON wrote:
> On 02 Feb 2018 21:48, Michelle Sullivan wrote:
>
>> Ben RUBSON wrote:
>>
>>> So disks died because of the carrier, as I assume the second
>>> unscathed server was OK...
>>
>> Pretty much.
>>
>>> Heads must have scratched the platters, but they should have been
>>> parked, so... Really strange.
>>
>> You'd have thought... though 2 of the drives look like it was wear
>> and tear issues (the 2 not showing red lights), just not picked up by
>> the periodic scrub... Could be that the recovery showed that one
>> up... you know - how you can have an array working fine, but one disk
>> dies, then others fail during the rebuild because of the extra workload.
>
> Yes... To try to mitigate this, when I add a new vdev to a pool, I
> spread the new disks I have among the existing vdevs, and construct
> the new vdev with the remaining new disk(s) + other disks retrieved
> from the other vdevs. Thus, when possible, avoiding vdevs with all
> disks at the same runtime.
> However I only use mirrors; applying this with raidz could be a
> little bit more tricky...

Believe it or not...

# zpool status -v
  pool: VirtualDisks
 state: ONLINE
status: One or more devices are configured to use a non-native block size.
        Expect reduced performance.
action: Replace affected devices with devices that support the configured
        block size, or migrate data to a properly configured pool.
  scan: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        VirtualDisks               ONLINE       0     0     0
          zvol/sorbs/VirtualDisks  ONLINE       0     0     0  block size: 512B configured, 8192B native

errors: No known data errors

  pool: sorbs
 state: ONLINE
  scan: resilvered 2.38T in 307445734561816429h29m with 0 errors on Sat Aug 26 09:26:53 2017
config:

        NAME                    STATE     READ WRITE CKSUM
        sorbs                   ONLINE       0     0     0
          raidz2-0              ONLINE       0     0     0
            mfid0               ONLINE       0     0     0
            mfid1               ONLINE       0     0     0
            mfid7               ONLINE       0     0     0
            mfid8               ONLINE       0     0     0
            mfid12              ONLINE       0     0     0
            mfid10              ONLINE       0     0     0
            mfid14              ONLINE       0     0     0
            mfid11              ONLINE       0     0     0
            mfid6               ONLINE       0     0     0
            mfid15              ONLINE       0     0     0
            mfid2               ONLINE       0     0     0
            mfid3               ONLINE       0     0     0
            spare-12            ONLINE       0     0     3
              mfid13            ONLINE       0     0     0
              mfid9             ONLINE       0     0     0
            mfid4               ONLINE       0     0     0
            mfid5               ONLINE       0     0     0
        spares
          185579620420611382    INUSE     was /dev/mfid9

errors: No known data errors

It would appear that when I replaced the damaged drives, it picked one of
them up as still being rebuilt from back in August (before it was packed
up to go), which is why it saw the pool as having 'corrupted metadata' and
spent the last 3 weeks importing it; it rebuilt the pool as it was
importing. No data loss that I can determine. (It literally just finished
in the middle of the night here.)

-- 
Michelle Sullivan
http://www.mhix.org/
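
[A minimal command sketch of the disk-spreading approach Ben describes
above, assuming a hypothetical pool "tank" built from two-way mirrors
(da0+da1 and da2+da3) plus two brand-new disks da4 and da5; none of these
names come from this thread, and the exact steps will vary with the setup:

# zpool replace tank da1 da4
      (the new da4 resilvers into mirror-0, freeing the older da1)
# zpool labelclear -f /dev/da1
      (clear the old ZFS label so the freed disk can be reused; this
      step may not be needed on every system)
# zpool add tank mirror da1 da5
      (the new vdev is then one older disk plus one new disk, rather
      than two disks with identical runtime)

This way each vdev mixes disk ages, so a batch of same-age drives is less
likely to fail together under the extra load of a rebuild.]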