Date: Mon, 20 Aug 2007 10:53:47 -0700
From: Bakul Shah <bakul@bitblocks.com>
To: Kenneth Vestergaard Schmidt <kvs@pil.dk>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS: 'checksum mismatch' all over the place
Message-ID: <20070820175347.861295B30@mail.bitblocks.com>
In-Reply-To: Your message of "Mon, 20 Aug 2007 09:28:00 +0200." <m17inq1x8f.fsf@binarysolutions.dk>
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da0 offset=58350080 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da1 offset=58350080 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da2 offset=58350080 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da3 offset=58350080 size=512
>
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da2 offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da3 offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da4 offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da5 offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da6 offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da7 offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da8 offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da9 offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da10 offset=38010880 size=512
>
> Can anybody offer anything to help me with this? I'm pretty much at a
> loss as to how I can find the cause of this.

This probably means that more than two blocks in the raidz2 were bad, so
zpool can't correct the error. Just speculating here, but maybe the
controller or a disk writes there "behind your back" (assuming the
reported offset is correct; you can check the ZFS logic for that)?

Can you map the offset to a disk block number? You can try writing and
reading that block (after disabling ZFS) and see if it changes in an
unexpected way.
This test may not show any error if the problem is caused by some
complex interaction.

If the disks are all the same model and new, check the vendor's website
to see if there is a firmware upgrade. See if replacing one disk with a
different type of disk changes the error.

38010880 is 0x2440000. I don't know if that is magic in any way, but
sometimes a hex value can reveal a pattern. Always look at the binary or
hex representation of any number reported in an error message!
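
To make the offset-to-block mapping and the hex check concrete, here is a
minimal Python sketch. It assumes the logged offset is a byte offset from
the start of the device and that the sector size equals the size=512 in
the log lines; neither is confirmed by the messages themselves, and the
helper name describe_offset is made up for illustration.

```python
# Assumption: sector size matches "size=512" from the log lines.
SECTOR_SIZE = 512


def describe_offset(offset, sector_size=SECTOR_SIZE):
    """Return several views of a byte offset taken from an error message.

    Assumption: 'offset' is in bytes from the start of the device. If ZFS
    reports it relative to something else, the block number will be off.
    """
    block, remainder = divmod(offset, sector_size)
    return {
        "offset": offset,
        "hex": hex(offset),
        "binary": bin(offset),
        "block": block,             # sector number at the assumed sector size
        "aligned": remainder == 0,  # True if the offset sits on a sector boundary
    }


if __name__ == "__main__":
    # The two offsets that appear in the quoted log.
    for off in (58350080, 38010880):
        info = describe_offset(off)
        print("offset=%(offset)d hex=%(hex)s block=%(block)d aligned=%(aligned)s" % info)
        print("  binary=%(binary)s" % info)
```

With these inputs, both offsets fall on 512-byte boundaries, at blocks
113965 and 74240 respectively. The block number is what you would pass as
the skip count to a tool such as dd with a 512-byte block size, again
only if the byte-offset assumption holds.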