From: Bernd Walter
Date: Wed, 2 Jan 2008 08:01:46 +0100
To: Eric Anderson
Cc: "freebsd-fs@freebsd.org"
Reply-To: ticso@cicely.de
Message-ID: <20080102070146.GH49874@cicely12.cicely.de>
In-Reply-To: <477B16BB.8070104@freebsd.org>
Subject: Re: ZFS i/o errors - which disk is the problem?

On Tue, Jan 01, 2008 at 10:44:43PM -0600, Eric Anderson wrote:
> I created a zpool with two new identical (500GB) SATA disks.  I rsync'ed
> a bunch of data over to the new ZFS file systems, and started seeing i/o
> errors.
>
> Here's how I created the file systems:
>
> zpool create tank mirror ad6 ad8
> zfs create tank/media
> zfs create tank/documents
> zfs set sharenfs=on tank/media
> zfs set sharenfs=on tank/documents
> zfs set atime=off tank
> zfs set mountpoint=/media tank/media
> zfs set mountpoint=/documents tank/documents
>
>
> Here's what zpool status says:
>
> # zpool status
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed with 731 errors on Tue Jan  1 15:17:08 2008
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0 1.47K
>           mirror    ONLINE       0     0 1.47K
>             ad6     ONLINE       0     0 5.12K
>             ad8     ONLINE       0     0 4.66K
>
> How can I tell which drive gave the problems, or where the problem came
> from?  I see several errors in /var/log/messages, like:
>
> ZFS: zpool I/O failure, zpool=tank error=86

zpool status -v should give you more details, but it is not required
here, since the messages below are enough.

> and many many of these:
>
> ZFS: checksum mismatch, zpool=tank path=/dev/ad6 offset=31970426880
> size=131072
>
> for both the ad6 and ad8 devices.

So you have checksum errors on both drives.

> I'm happy to swap the drive out, but I don't know which is the problem.
> I was also wondering if it was a saturated I/O issue on the system
> (it's a fairly slow and poky old box).

The errors mean that data written to disk silently came back different
when it was read.
I doubt that the drives are at fault, although with two identical drives
it is possible, of course, since a firmware bug would affect both.
More likely you have a problematic ATA controller or maybe defective RAM.

-- 
B.Walter                http://www.bwct.de      http://www.fizon.de
bernd@bwct.de           info@bwct.de            support@fizon.de
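
As a footnote to the checksum mismatch messages quoted above: if you want
to see how the errors are spread across the devices, something like the
following could tally them per device. It is only a rough sketch. The
regular expression is modelled on the single log line quoted in this
thread, so the exact format on other systems is an assumption, and the
function name count_checksum_errors is made up for illustration.

```python
import re
import sys
from collections import Counter

# Pattern modelled on the one log line quoted in this thread; whether other
# FreeBSD/ZFS versions print the same format is an assumption.
CHECKSUM_RE = re.compile(
    r"ZFS: checksum mismatch, zpool=(?P<pool>\S+) path=(?P<path>\S+)"
    r"\s+offset=(?P<offset>\d+)\s+size=(?P<size>\d+)"
)


def count_checksum_errors(log_text):
    """Return a Counter mapping device path to number of mismatch messages."""
    counts = Counter()
    for match in CHECKSUM_RE.finditer(log_text):
        counts[match.group("path")] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (hypothetical script name): python zfs_cksum_tally.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log_file:
        tally = count_checksum_errors(log_file.read())
    for path, count in tally.most_common():
        print(path, count)
```

If both devices show up with similar counts, as in this case, that points
away from a single bad disk and towards something they share.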