Date: Wed, 2 Jan 2008 08:01:46 +0100
From: Bernd Walter <ticso@cicely12.cicely.de>
To: Eric Anderson <anderson@freebsd.org>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS i/o errors - which disk is the problem?
Message-ID: <20080102070146.GH49874@cicely12.cicely.de>
In-Reply-To: <477B16BB.8070104@freebsd.org>
References: <477B16BB.8070104@freebsd.org>

On Tue, Jan 01, 2008 at 10:44:43PM -0600, Eric Anderson wrote:
> I created a zpool with two new identical (500GB) SATA disks.  I rsync'ed
> a bunch of data over to the new ZFS file systems, and started seeing i/o
> errors.
>
> Here's how I created the file systems:
>
> zpool create tank mirror ad6 ad8
> zfs create tank/media
> zfs create tank/documents
> zfs set sharenfs=on tank/media
> zfs set sharenfs=on tank/documents
> zfs set atime=off tank
> zfs set mountpoint=/media tank/media
> zfs set mountpoint=/documents tank/documents
>
>
> Here's what zpool status says:
>
> # zpool status
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed with 731 errors on Tue Jan  1 15:17:08 2008
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0 1.47K
>           mirror    ONLINE       0     0 1.47K
>             ad6     ONLINE       0     0 5.12K
>             ad8     ONLINE       0     0 4.66K
>
> How can I tell which drive gave the problems, or where the problem came
> from?  I see several errors in /var/log/messages, like:
>
> ZFS: zpool I/O failure, zpool=tank error=86

zpool status -v should give you more details, but it is not required,
since the messages below are enough.

> and many many of these:
>
> ZFS: checksum mismatch, zpool=tank path=/dev/ad6 offset=31970426880
> size=131072
>
> for both the ad6 and ad8 devices.

So you have checksum errors on both drives.

> I'm happy to swap the drive out, but I don't know which is the problem.
> I was also wondering if it was a saturated I/O issue on the system
> (it's a fairly slow and poky old box).

The errors mean that data was silently corrupted: what was read back
from disk is not what was written.
I doubt that the drives are at fault, although since they are identical
it is possible, of course: a firmware bug would affect both.
More likely you have a problematic ATA controller or maybe defective RAM.

-- 
B.Walter                http://www.bwct.de      http://www.fizon.de
bernd@bwct.de           info@bwct.de            support@fizon.de
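
To see how the checksum mismatches are spread across the mirror members, the log messages quoted above can be tallied per device. The following is only a rough sketch in Python, not part of any ZFS tooling: the function name count_mismatches is made up for illustration, and the regular expression assumes each message sits on a single log line in exactly the form quoted in this thread.

```python
import re
from collections import Counter

# Assumed log format, taken from the message quoted in this thread:
#   ZFS: checksum mismatch, zpool=tank path=/dev/ad6 offset=31970426880 size=131072
# (the line break before "size=" in the mail is assumed to be mail wrapping).
PATTERN = re.compile(
    r"ZFS: checksum mismatch, zpool=(?P<pool>\S+) path=(?P<path>\S+) "
    r"offset=(?P<offset>\d+) size=(?P<size>\d+)"
)


def count_mismatches(lines):
    """Count checksum-mismatch messages per (pool, device path).

    Hypothetical helper: lines is any iterable of log lines, for
    example an open /var/log/messages file object.
    """
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match["pool"], match["path"])] += 1
    return counts


if __name__ == "__main__":
    # Path taken from the original report; adjust as needed.
    with open("/var/log/messages", errors="replace") as log:
        for (pool, path), n in sorted(count_mismatches(log).items()):
            print(f"{pool} {path}: {n} checksum mismatches")
```

Roughly equal counts on ad6 and ad8 would fit the picture above of a shared component (controller, RAM) rather than a single bad drive, though the counts alone do not prove it.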