Date: Tue, 4 Oct 2011 10:31:22 -0400
From: Paul Mather <paul@gromit.dlib.vt.edu>
To: Artem Belevich <art@freebsd.org>
Cc: freebsd-current@freebsd.org
Subject: Re: Strange ZFS filesystem corruption
Message-ID: <0AD3BA75-58D7-4359-B61F-B5F4815D3843@gromit.dlib.vt.edu>
In-Reply-To: <CAFqOu6hAGGS-B2=knPEzkiEoQhtNHjGpbqFQBtt2__DqAZvsUg@mail.gmail.com>
References: <8B59D754-9062-4499-9873-7C2167622032@gromit.dlib.vt.edu> <CAFqOu6hAGGS-B2=knPEzkiEoQhtNHjGpbqFQBtt2__DqAZvsUg@mail.gmail.com>

On Oct 3, 2011, at 6:19 PM, Artem Belevich wrote:

> On Mon, Oct 3, 2011 at 11:21 AM, Paul Mather <paul@gromit.dlib.vt.edu> wrote:
>> =====
>>
>> The pool itself reports no errors. I performed a scrub on the pool yet this bizarre filesystem corruption persists:
>>
>> =====
>> tape# zpool status backups
>> pool: backups
>> state: ONLINE
>> scan: scrub repaired 15K in 7h33m with 0 errors on Sat Oct 1 19:22:35 2011
>
> The pool *did* report 15K errors that it was able to repair.
>
> I'd start with testing your RAM with memtest86 or memtest86+. ZFS
> errors without reported checksum errors may be the sign of bad memory.
> I.e. data gets corrupted before ZFS gets to calculate checksum and
> later invalid data with valid checksum gets written to disk.

Because this machine has ECC RAM, I checked the BIOS logs for ECC errors (the BIOS is set to log them) and there are no ECC errors logged. If the RAM were going bad, I would expect it to leave some kind of trace in the BIOS log. Do uncorrectable ECC errors get logged as MCEs under FreeBSD 9?

I've never noticed any problems when doing a "make -j8 buildworld" on this machine, either.

Cheers,

Paul.