Date:      Tue, 4 Oct 2011 10:31:22 -0400
From:      Paul Mather <paul@gromit.dlib.vt.edu>
To:        Artem Belevich <art@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: Strange ZFS filesystem corruption
Message-ID:  <0AD3BA75-58D7-4359-B61F-B5F4815D3843@gromit.dlib.vt.edu>
In-Reply-To: <CAFqOu6hAGGS-B2=knPEzkiEoQhtNHjGpbqFQBtt2__DqAZvsUg@mail.gmail.com>
References:  <8B59D754-9062-4499-9873-7C2167622032@gromit.dlib.vt.edu> <CAFqOu6hAGGS-B2=knPEzkiEoQhtNHjGpbqFQBtt2__DqAZvsUg@mail.gmail.com>

On Oct 3, 2011, at 6:19 PM, Artem Belevich wrote:

> On Mon, Oct 3, 2011 at 11:21 AM, Paul Mather <paul@gromit.dlib.vt.edu> wrote:
>> =====
>>
>> The pool itself reports no errors.  I performed a scrub on the pool yet this bizarre filesystem corruption persists:
>>
>> =====
>> tape# zpool status backups
>>  pool: backups
>>  state: ONLINE
>>  scan: scrub repaired 15K in 7h33m with 0 errors on Sat Oct  1 19:22:35 2011
>
> The pool *did* report 15K errors that it was able to repair.
>
> I'd start with testing your RAM with memtest86 or memtest86+. ZFS
> errors without reported checksum errors may be the sign of bad memory.
> I.e. data gets corrupted before ZFS gets to calculate checksum and
> later invalid data with valid checksum gets written to disk.
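
To make the failure mode described above concrete, here is a minimal sketch of why a checksum cannot catch corruption that happens before the checksum is calculated. It is only an illustration: SHA-256 from Python's hashlib stands in for whatever checksum the pool actually uses, and the helper names (checksum, flip_bit) and the sample data are hypothetical, not anything taken from ZFS.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Stand-in checksum (SHA-256); an assumption, not ZFS's own code."""
    return hashlib.sha256(block).hexdigest()


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Return a copy of block with one bit inverted, simulating a bit flip."""
    corrupted = bytearray(block)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


original = b"important backup data" * 4  # made-up sample data

# Case 1: the bit flips in RAM *before* the checksum is calculated.
# The checksum is computed over the already-corrupted buffer, so the
# stored block and stored checksum agree and verification passes.
in_ram = flip_bit(original, 10)
stored_block, stored_sum = in_ram, checksum(in_ram)
undetected = checksum(stored_block) == stored_sum and stored_block != original

# Case 2: the bit flips on disk *after* the checksum was calculated.
# The stored checksum still describes the good data, so the mismatch
# is caught when the block is read back.
good_sum = checksum(original)
on_disk = flip_bit(original, 10)
detected = checksum(on_disk) != good_sum

print("corrupted before checksum, verifies anyway:", undetected)
print("corrupted after checksum, mismatch caught:", detected)
```

In the first case the data is wrong but self-consistent, which is the situation a scrub would not flag as a checksum error.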


Because this machine has ECC RAM, I checked the BIOS logs for ECC errors (the BIOS is set to log them) and there are no ECC errors logged.  If the RAM were going bad, I would expect it to leave some kind of trace in the BIOS log.

Do uncorrectable ECC errors get logged as MCEs under FreeBSD 9?

I've never noticed any problems when doing a "make -j8 buildworld" on this machine, either.

Cheers,

Paul.


