Date: Sun, 9 Aug 2009 13:14:32 +0200
From: Jason Edwards <sub.mesa@gmail.com>
To: freebsd-fs@freebsd.org
Subject: ZFS corruption on 8-CURRENT
Message-ID: <883b2dc50908090414o71bc5fc2q5aef64c2b5da653e@mail.gmail.com>

Hi guys,

I'm investigating a weird corruption issue. After filling up my 8-disk RAID-Z pool with data and using it for a few weeks, it started to show me this:

# zpool status sub
  pool: sub
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid.  There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        sub         UNAVAIL      0     0     0  insufficient replicas
          raidz1    UNAVAIL      0     0     0  insufficient replicas
            ad14a   FAULTED      0     0     0  corrupted data
            ad8a    ONLINE       0     0     0
            ad10a   ONLINE       0     0     0
            ad10a   FAULTED      0     0     0  corrupted data
            ad18a   FAULTED      0     0     0  corrupted data
            ad12a   FAULTED      0     0     0  corrupted data
            ad16a   FAULTED      0     0     0  corrupted data
            ad8a    FAULTED      0     0     0  corrupted data

Oops? What happened here? Besides the "corrupted data", ad10a is listed twice, once online and once failed.

After rebooting, the output looks a little cleaner, but it found a problem with the ZIL:

# zpool status sub
  pool: sub
 state: FAULTED
status: An intent log record could not be read.
        Waiting for adminstrator intervention to fix the faulted pool.
action: Either restore the affected device(s) and run 'zpool online',
        or ignore the intent log records by running 'zpool clear'.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        sub         FAULTED      0     0     0  bad intent log
          raidz1    ONLINE       0     0     0
            ad14a   ONLINE       0     0     0
            ad4a    ONLINE       0     0     0
            ad6a    ONLINE       0     0     0
            ad10a   ONLINE       0     0     0
            ad18a   ONLINE       6     0     0
            ad12a   ONLINE       0     0     0
            ad16a   ONLINE       0     0     0
            ad8a    ONLINE       0     0     0

Additionally, I got some read errors on ad18. But since this is a RAID-Z, I guess one disk alone cannot corrupt or fail the entire array.

Before I do anything that might be destructive, does anybody have a clue what happened here and how I can prevent this in the future?

The box is a quad-core X4 9350e with 6GB RAM, and it's running 8-CURRENT as of July 21st 2009 (after 8.0-BETA2). It did work correctly before upgrading CURRENT to a newer date. Maybe some bug slipped in?

Kind regards,
sub