From: Joe Peterson
Date: Tue, 05 Feb 2008 10:38:23 -0700
To: Dag-Erling Smørgrav
Cc: freebsd-fs@freebsd.org
Subject: Re: Forcing full file read in ZFS even when checksum error encountered

Dag-Erling Smørgrav wrote:
> A checksum error results from a read error. Check your drive's SMART
> error log if it has one.
> It might not be detectable in a surface scan,
> as the damaged sector will be automatically reassigned if it's written
> to (which ZFS may very well have done)

I've checked SMART - no [unrecoverable] errors and no additional sector
reallocations - and I've run a SeaTools long test, which found no
problems.

But I do not understand: zpool status reports read errors separately
from checksum errors. If I understand correctly, a read error means the
system/HW reported an error on the read, whereas the whole idea of ZFS
checksums is to catch errors that are *not* reported as read errors
(i.e. silent bit changes that normal filesystems would never catch).

What I seem to be seeing is a case in which ZFS says the checksum is
wrong. There are counts only in the CKSUM column of the status output,
not in the other columns, so I do not think this is a "read error" - it
is ZFS's last line of defense (the checksum) reporting a mismatch.

In other words, I assume the read would complete if ZFS did not catch
the checksum mismatch, and what I'd like to do is let it complete so I
can see for myself where these bit errors are by comparing the file as
read to a known good copy (which I have). If there are no mismatches, it
would mean there is a metadata error or a ZFS bug.

-Joe
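
The comparison described above could be sketched roughly as follows. This
is only an illustration, not anything ZFS provides: it assumes the
damaged file can be read to the end at all (which is exactly what the
checksum failure currently prevents), and the function name and the
128 KiB block size (chosen to match the default ZFS recordsize) are
assumptions made for the example.

```python
import sys


def mismatched_blocks(suspect, good, block_size=128 * 1024):
    """Yield (offset, length, differing_bits) for each block that differs.

    `suspect` and `good` are binary file objects. `block_size` is an
    assumption: 128 KiB matches the default ZFS recordsize, so adjust it
    if the dataset uses a different value.
    """
    offset = 0
    while True:
        a = suspect.read(block_size)
        b = good.read(block_size)
        if not a and not b:
            break  # both files exhausted
        length = max(len(a), len(b))
        if a != b:
            # Count flipped bits in the overlapping part of the block.
            bits = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
            # Treat any length difference as fully mismatched bytes.
            bits += 8 * abs(len(a) - len(b))
            yield offset, length, bits
        offset += length


def main(suspect_path, good_path):
    """Print one line per mismatched block of the two files."""
    with open(suspect_path, "rb") as s, open(good_path, "rb") as g:
        for offset, length, bits in mismatched_blocks(s, g):
            print(f"offset {offset}: {length}-byte block, {bits} bit(s) differ")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Run as a script with the suspect file and the known good copy as its two
arguments. No output would mean the file data is identical, which is the
case that would point at metadata or a ZFS bug rather than at the file
contents.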