From owner-freebsd-fs@FreeBSD.ORG  Tue Feb  5 17:39:42 2008
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Message-ID: <47A89F0F.1030505@skyrush.com>
Date: Tue, 05 Feb 2008 10:38:23 -0700
From: Joe Peterson <joe@skyrush.com>
User-Agent: Thunderbird 2.0.0.9 (Windows/20071031)
MIME-Version: 1.0
To: =?UTF-8?B?RGFnLUVybGluZyBTbcO4cmdyYXY=?= <des@des.no>
References: <47A73C8D.3000107@skyrush.com>
	<86prvby5o1.fsf@ds4.des.no>	<47A864D9.4060504@skyrush.com>
	<864pcnxz8f.fsf@ds4.des.no>	<47A88ADE.7050503@skyrush.com>
	<86abmfwc6h.fsf@ds4.des.no>
In-Reply-To: <86abmfwc6h.fsf@ds4.des.no>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc: freebsd-fs@freebsd.org
Subject: Re: Forcing full file read in ZFS even when checksum error
	encountered

Dag-Erling Smørgrav wrote:
> A checksum error results from a read error.  Check your drive's SMART
> error log if it has one.  It might not be detectable in a surface scan,
> as the damaged sector will be automatically reassigned if it's written
> to (which ZFS may very well have done)

I've checked SMART - no [unrecoverable] errors and no additional sector
reallocations, and I've done a SeaTools long test - no problems found.

But I do not understand: zpool status reports read errors separately from
checksum errors.  If I understand correctly, a read error means the system or
hardware reported an error on read, whereas the whole idea of the checksums in
ZFS is to catch errors that are *not* reported as read errors (i.e. silent bit
changes that normal filesystems would never catch).  What I seem to be seeing
is a case in which ZFS says the checksum is wrong.  There are only counts in
the CKSUM column, not in the other columns of the status, so I do not think
this is a "read error" - it is ZFS's last line of defense (the checksum)
reporting a mismatch.

In other words, I assume the read would complete if ZFS did not catch the
checksum mismatch.  What I'd like to do is let it complete so I can see for
myself where these bit errors are by comparing the file as read to a known
good copy (which I have).  If there are no mismatches, it would mean there is
a metadata error or a ZFS bug.
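
For illustration, the comparison against the known good copy could be
sketched roughly as below.  This is only a sketch under assumptions: the
function name differing_ranges and the 128 KiB chunk size are made up for the
example, and it presumes the suspect file can be read to the end in the first
place, which is exactly what the checksum error currently prevents.

```python
def differing_ranges(path_a, path_b, chunk_size=128 * 1024):
    """Return (start, end) byte ranges, end exclusive, where two files differ.

    Hypothetical helper for illustration; a length mismatch is reported as
    one differing range covering the extra tail of the longer file.
    """
    ranges = []
    start = None  # start offset of the differing run currently open, if any
    offset = 0    # absolute offset of the first byte of the current chunk

    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break

            if a == b:
                # Fast path: identical chunk closes any run left open.
                if start is not None:
                    ranges.append((start, offset))
                    start = None
                offset += len(a)
                continue

            longest = max(len(a), len(b))
            for i in range(longest):
                same = i < len(a) and i < len(b) and a[i] == b[i]
                if not same and start is None:
                    start = offset + i
                elif same and start is not None:
                    ranges.append((start, offset + i))
                    start = None
            offset += longest

    if start is not None:
        ranges.append((start, offset))
    return ranges
```

Calling differing_ranges("suspect_copy", "known_good_copy") with two
placeholder paths returns a list of byte ranges.  An empty list would mean the
data itself matches, which would point toward the metadata or bug explanation
rather than flipped bits in the file contents.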

						-Joe