Date:      Fri, 20 Jul 2012 15:22:44 -0700
From:      James Snow <snow@teardrop.org>
To:        Dr Josef Karthauser <joe@tao.org.uk>
Cc:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   Re: Checksum errors across ZFS array
Message-ID:  <20120720222244.GA18627@teardrop.org>
In-Reply-To: <BC2AD7AE-4D82-4989-9D51-F1F2329C00EB@tao.org.uk>
References:  <20120719152909.GL32960@teardrop.org> <002D6A20-D2A4-4909-B2EA-3DB562326050@tao.org.uk> <20120719171548.GM32960@teardrop.org> <BC2AD7AE-4D82-4989-9D51-F1F2329C00EB@tao.org.uk>

On Fri, Jul 20, 2012 at 04:09:28PM +0100, Dr Josef Karthauser wrote:

> Take care though, my system which had been working fine for about
> a year when I noticed the ZFS rot (which all appears to be recent
> in time). I ran memcheck+ on it for 8 hours or so, and it showed no
> errors at all. However, when I replaced the memory with a different
> vendor the problems went away. (Reboots and power off/on restarts
> hadn't fixed the problem before!).
>
> So, take care if the memory doesn't report any failures, it might
> still be faulty.

I've run memtest for about 20 hours now (13 hours in one pass, 7 and
counting on the second) and seen no errors. Hrm.

> p.s. It was my fault that I wasn't running ECC memory on the system!

I am running ECC memory though. If you'd had ECC memory to start with, do
you think you might have seen a different result?

In my case, replacing all the RAM and getting a second controller cost
almost the same. Since a second controller will give me the best
visibility - or long-term expandability if it turns out not to be the
controller - I've gone ahead and ordered one.

If I move half the disks to the new controller and continue to see the
problems only on the old controller, I'll know it's the controller or the
slot on the motherboard. If the problem continues without any change, I
can replace RAM, and then the motherboard.
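
The plan above amounts to a small decision table, which can be sketched
roughly as follows. This is only an illustration in Python, not something
from the thread: the device names (da0 to da3), the controller labels
"old" and "new", and the error counts are all made up, and the counts are
assumed to have been copied by hand from the CKSUM column of
`zpool status`. The last branch (errors only on the new controller) is not
covered by the plan and is an added assumption.

```python
from collections import defaultdict


def cksum_by_controller(cksum_errors, controller_of):
    """Sum per-disk checksum error counts for each controller.

    cksum_errors:  dict of disk name -> checksum error count
    controller_of: dict of disk name -> controller label (hypothetical labels)
    """
    totals = defaultdict(int)
    for disk, errors in cksum_errors.items():
        totals[controller_of[disk]] += errors
    return dict(totals)


def next_suspect(totals, old="old", new="new"):
    """Map per-controller totals to the next component to suspect."""
    old_err = totals.get(old, 0)
    new_err = totals.get(new, 0)
    if old_err > 0 and new_err == 0:
        # Errors follow the old controller only.
        return "old controller or its motherboard slot"
    if old_err > 0 and new_err > 0:
        # No change after the split: something common to both paths.
        return "RAM first, then motherboard"
    if old_err == 0 and new_err == 0:
        return "nothing yet, keep scrubbing and watching"
    # Not part of the original plan; treated as inconclusive here.
    return "inconclusive, recheck the disks and cabling that were moved"


# Made-up example data: two disks left on the old controller, two moved.
sample_controller_of = {"da0": "old", "da1": "old", "da2": "new", "da3": "new"}
sample_cksum_errors = {"da0": 12, "da1": 5, "da2": 0, "da3": 0}
sample_totals = cksum_by_controller(sample_cksum_errors, sample_controller_of)
```

With the sample data, all errors land on the old controller, so
`next_suspect(sample_totals)` points at the controller or its slot. In
practice the counters would need to be cleared (for example with
`zpool clear`) after moving the disks so that old errors are not counted.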


-Snow



