From owner-freebsd-fs@FreeBSD.ORG Mon Jan 7 13:59:47 2008
Date: Mon, 7 Jan 2008 14:59:26 +0100
From: Bernd Walter
To: Tz-Huan Huang
Cc: freebsd-fs@freebsd.org, Brooks Davis
Subject: Re: ZFS i/o errors - which disk is the problem?
Message-ID: <20080107135925.GF65134@cicely12.cicely.de>
In-Reply-To: <6a7033710801061844m59f8c62dvdd3eea80f6c239c1@mail.gmail.com>
References: <477B16BB.8070104@freebsd.org> <20080102070146.GH49874@cicely12.cicely.de> <477B8440.1020501@freebsd.org> <200801031750.31035.peter.schuller@infidyne.com> <477D16EE.6070804@freebsd.org> <20080103171825.GA28361@lor.one-eyed-alien.net> <6a7033710801061844m59f8c62dvdd3eea80f6c239c1@mail.gmail.com>
Reply-To: ticso@cicely.de
X-Operating-System: FreeBSD cicely12.cicely.de 5.4-STABLE alpha
User-Agent: Mutt/1.5.9i
List-Id: Filesystems

On Mon, Jan 07, 2008 at 10:44:13AM +0800, Tz-Huan Huang wrote:
> 2008/1/4, Brooks Davis:
> >
> > We've definitely seen cases where hardware changes fixed ZFS checksum
> > errors. In one case, a firmware upgrade on the raid controller fixed
> > it. In another case, we'd been connecting to an external array with a
> > SCSI card that didn't have a PCI bracket, and the errors went away when
> > the replacement arrived and was installed. The fact that there were
> > significant errors caught by ZFS was quite disturbing, since we
> > wouldn't have found them with UFS.
>
> Hi,
>
> We have an NFS server using ZFS with a similar problem.
> The box is i386 7.0-PRERELEASE with 3G RAM:
>
> # uname -a
> FreeBSD cml3 7.0-PRERELEASE FreeBSD 7.0-PRERELEASE #2:
> Sat Jan 5 14:42:41 CST 2008 root@cml3:/usr/obj/usr/src/sys/CML2 i386
>
> The zfs pool now contains 3 raids:
>
> 2007-11-20.11:49:17 zpool create pool /dev/label/proware263
> 2007-11-20.11:53:31 zfs create pool/project
> ... (zfs create other filesystems) ...
> 2007-11-20.11:54:32 zfs set atime=off pool
> 2007-12-08.22:59:15 zpool add pool /dev/da0
> 2008-01-05.21:20:03 zpool add pool /dev/label/proware262
>
> After a power loss yesterday, zpool status shows
>
> # zpool status -v
>   pool: pool
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed with 231 errors on Mon Jan 7 08:05:35 2008
> config:
>
>         NAME                STATE     READ WRITE CKSUM
>         pool                ONLINE       0     0   516
>           label/proware263  ONLINE       0     0   231
>           da0               ONLINE       0     0   285
>           label/proware262  ONLINE       0     0     0
>
> errors: Permanent errors have been detected in the following files:
>
>         /system/database/mysql/flickr_geo/flickr_raw_tag.MYI
>         pool/project:<0x0>
>         pool/home/master/96:<0xbf36>
>
> The main problem is that we cannot mount pool/project any more:
>
> # zfs mount pool/project
> cannot mount 'pool/project': Input/output error
> # grep ZFS /var/log/messages
> Jan 7 10:08:35 cml3 root: ZFS: zpool I/O failure, zpool=pool error=86
> (repeated many times)
>
> There is a lot of data in pool/project, probably 3.24T.  zdb shows
>
> # zdb pool
> ...
> Dataset pool/project [ZPL], ID 33, cr_txg 57, 3.24T, 22267231 objects
> ...
>
> (zdb is still running; we can provide the output if helpful)
>
> Is there any way to recover any data from pool/project?

The data was corrupted by the controller and/or the disk subsystem.
You have no redundant source for the broken data, so it is lost.
The only guaranteed way to get it back is from backup.
Maybe older snapshots/clones are still readable - I don't know.
Nevertheless, the data is corrupted, and that is exactly why you want
alternative data sources such as raidz/mirror and, as a last resort,
backup.
You shouldn't have ignored those errors in the first place, because you
are running on faulty hardware.
Without ZFS checksumming the system would just have processed the broken
data with unpredictable results.
If all those errors are fresh, then you likely used a broken RAID
controller below ZFS, which silently lost sync between its disks and
then blew up when the disk state changed.
Unfortunately, many RAID controllers are broken and therefore useless.

-- 
B.Walter                http://www.bwct.de      http://www.fizon.de
bernd@bwct.de           info@bwct.de            support@fizon.de
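
[Archive editor's note: the pool above concatenates three single-device
top-level vdevs, so ZFS can detect corruption via checksums but has no
redundant copy to repair it from - exactly the situation the reply
describes. A minimal sketch of how the same three devices could instead
be arranged with redundancy (device names taken from the thread; this is
illustrative, not a recovery procedure for the damaged pool):

```shell
# With a raidz vdev, ZFS can reconstruct a block that fails its
# checksum from the parity on the other devices (at the cost of
# roughly one device's worth of capacity):
zpool create pool raidz /dev/label/proware263 /dev/da0 /dev/label/proware262

# Alternatively, a two-way mirror trades half the capacity for a
# full second copy of every block:
#   zpool create pool mirror /dev/label/proware263 /dev/label/proware262

# After replacing suspect hardware, a scrub walks all data and
# repairs anything a redundant vdev can reconstruct:
zpool scrub pool
zpool status -v pool
```

With redundancy in place, the CKSUM counters in zpool status would show
errors that ZFS detected *and* silently repaired, instead of permanent
data loss.]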