Date: Mon, 20 Aug 2007 14:20:33 +0200
From: Kenneth Vestergaard Schmidt <kvs@pil.dk>
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS: 'checksum mismatch' all over the place
Message-ID: <m1ps1iz9bi.fsf@binarysolutions.dk>
In-Reply-To: <20070820112946.GC16977@garage.freebsd.pl> (Pawel Jakub Dawidek's message of "Mon, 20 Aug 2007 13:29:46 +0200")
References: <m1wsvtkviw.fsf@binarysolutions.dk> <20070820112946.GC16977@garage.freebsd.pl>

Pawel Jakub Dawidek <pjd@FreeBSD.org> writes:

>> The drive-cage was previously used to expose a RAID-5 array, composed of
>> the 12 disks. This worked just fine, connecting to the same machine and
>> controller (i386 IBM xSeries X335, mpt(4) controller).
>
> How do you know it was fine? Did you have something that did
> checksumming? You could try geli with integrity verification feature
> turned on, fill the disks with some random data and then read it back,
> if your controller corrupts the data, geli should tell you this.

I may have to do this. The previous drive was almost filled to the brim
with data, which rsync looked at each day, and we didn't have a lot of
re-transfer, but that doesn't necessarily mean anything.

The same controller is used in 50+ other machines, but only connected to
two internal drives. There are no problems in those machines.

Still, the really weird thing is that we're seeing checksum-errors in
the same block across many drives. This does smell like either an issue
with the driver, the controller, or the drivecage, and not ZFS or GEOM.

The machine should have been in production, but the array just failed,
and if I can't salvage it, I'll have to start over. I might just as well
try geli with integrity verification before recreating the ZFS array,
then.

-- 
Kenneth Schmidt
pil.dk
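
For illustration only: the "fill with random data, then read it back" test
suggested above can be approximated in userland. The sketch below is not
geli and does not use its integrity feature. It is a rough stand-in that
writes random blocks, remembers a SHA-256 digest per block, and reports
which blocks differ on read-back. The function names, the 4096-byte block
size, and the idea of pointing it at a scratch file are all assumptions
made for this example.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumed block size, chosen only for this illustration


def write_pattern(path, num_blocks, block_size=BLOCK_SIZE):
    """Write random blocks to path and return the SHA-256 digest of each block."""
    digests = []
    with open(path, "wb") as f:
        for _ in range(num_blocks):
            block = os.urandom(block_size)
            digests.append(hashlib.sha256(block).digest())
            f.write(block)
        # Push the data out of Python's and the kernel's write buffers.
        f.flush()
        os.fsync(f.fileno())
    return digests


def verify_pattern(path, digests, block_size=BLOCK_SIZE):
    """Read the blocks back and return the indices whose digest no longer matches."""
    bad = []
    with open(path, "rb") as f:
        for index, expected in enumerate(digests):
            block = f.read(block_size)
            if hashlib.sha256(block).digest() != expected:
                bad.append(index)
    return bad
```

A read that follows the write immediately may be served from the
operating system's cache rather than from the disks, so a real test would
have to write more data than fits in RAM, or remount or reboot between the
two passes. If the same block indices came back bad on several drives,
that would point the same way as the observation above: at the driver,
controller, or cage rather than at a single disk.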