Date: Mon, 20 Feb 2012 13:27:24 +1000
From: Stephen McKay <mckay@freebsd.org>
To: freebsd-fs@freebsd.org
Cc: Stephen McKay <mckay@freebsd.org>
Subject: Re: Constant minor ZFS corruption, probably solved
Message-ID: <201202200327.q1K3ROrt009042@dungeon.home>
In-Reply-To: <201107052241.p65MfqVA002215@dungeon.home> from Stephen McKay at "Wed, 06 Jul 2011 08:41:52 +1000"
References: <201103081425.p28EPQtM002115@dungeon.home> <201107052241.p65MfqVA002215@dungeon.home>

On Wednesday, 6th July 2011, Stephen McKay wrote:

>Perhaps you remember me struggling with a small but continuous amount
>of corruption on ZFS volumes with a new server we had built at work.
>... I've now done enough tests so that I'm 90%
>certain what the problem is: Seagate's caching firmware.
>... I'm certain that disabling write caching
>has given us a stable machine. And I'm 90% certain that it's because
>of bugs in Seagate's cache firmware. I hope someone else can replicate
>this and settle the issue.

I'm following up on an old post of mine to confirm that my write cache
disabling workaround is well and truly successful. Eight months later
we've seen no further corruption when using Seagate ST2000DL003 disks.
The machine (now running 9.0-RELEASE) sees constant moderate to low
activity as a file server (about 6TB in use).

I did receive a message from one other person suffering from the same
problem. It was solved by disabling write caching, so that's two data
points. And two data points is a trend, right? :-)

His system was running 8.2-stable on an AMD Phenom CPU in a MSI 870-G45
motherboard (AMD SB710 southbridge) so there's very little overlap with
our system: just zfs and Seagate green disks. His disks were ST1500DL003
(1.5TB) with firmware CC32 so that more or less means the common points
are simply zfs and Seagate CC32 firmware. You already know which one I
think is to blame.

But then again no avalanche of complaints has been seen either, so it's
still somewhat mysterious. Is there some other problem that is just
being masked by disabling the cache? Unless there's a sudden surge in
reports, we'll never know for certain.

So, if you've seen this problem and cured it by disabling the write
cache, I'd like to know about it.

How's your data? Run a scrub lately? Perhaps now is a good time. ;-)

Cheers,

Stephen.