From owner-freebsd-fs@FreeBSD.ORG  Fri Jun 22 07:54:42 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 57A781065673
	for <freebsd-fs@freebsd.org>; Fri, 22 Jun 2012 07:54:42 +0000 (UTC)
	(envelope-from daniel@digsys.bg)
Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.3.230])
	by mx1.freebsd.org (Postfix) with ESMTP id D1D908FC12
	for <freebsd-fs@freebsd.org>; Fri, 22 Jun 2012 07:54:41 +0000 (UTC)
Received: from dcave.digsys.bg (dcave.digsys.bg [192.92.129.5])
	(authenticated bits=0)
	by smtp-sofia.digsys.bg (8.14.5/8.14.5) with ESMTP id q5M7sb1I087013
	(version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO)
	for <freebsd-fs@freebsd.org>; Fri, 22 Jun 2012 10:54:37 +0300 (EEST)
	(envelope-from daniel@digsys.bg)
Message-ID: <4FE424BC.5090000@digsys.bg>
Date: Fri, 22 Jun 2012 10:54:36 +0300
From: Daniel Kalchev <daniel@digsys.bg>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
	rv:10.0.5) Gecko/20120607 Thunderbird/10.0.5
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
References: <467652020.30738.1340325033684.JavaMail.root@sz0192a.westchester.pa.mail.comcast.net>
In-Reply-To: <467652020.30738.1340325033684.JavaMail.root@sz0192a.westchester.pa.mail.comcast.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: ZFS Checksum errors
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 22 Jun 2012 07:54:42 -0000


On 22.06.12 03:30, rondzierwa@comcast.net wrote:
> the problem was created by a disk error, that is no longer happening, but now I have this corrupted file. how do i clean up the mess? the scrub takes hours, and there are folks that are watching. i'm working on the third iteration of clear and scrub, how many times should it take? I can be patient, but it would be nice if i had an answer for the folks that keep asking "are we there yet?".

The easiest fix for your problem is to:

- back up all data
- destroy the ZFS pool
- destroy the RAID volume
- create single-disk volumes for each disk or just export disks as JBOD
- create your ZFS pool using the individual drives (*)
- restore all data
- run your tests again

You will then be able to identify which disk is having problems, because
ZFS checksums every block and reports errors per device. Problems like
the ones you describe are sometimes caused by a faulty disk. Re-seating
the cables (or unplugging and re-plugging the hot-swap disk) may seem to
fix it, but only temporarily. Such disks rarely show up as 'bad' to
"hardware RAID" controllers, but ZFS always detects them.

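To see which device is collecting the errors, a small filter like the
one below may help. It is only a sketch: it assumes the usual
NAME/STATE/READ/WRITE/CKSUM column layout of zpool status, and the
function name cksum_errors is made up for illustration.

```sh
#!/bin/sh
# Sketch: read "zpool status" text on stdin and print every row whose
# CKSUM column is not 0. cksum_errors is a hypothetical helper name.
cksum_errors() {
    awk '
        # Device rows have five columns: NAME STATE READ WRITE CKSUM.
        # The header row is skipped because its second column is "STATE".
        NF >= 5 &&
        $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
        $5 != "0" {
            print $1, $5
        }
    '
}
```

Use it as "zpool status yourpool | cksum_errors". Rows for the pool and
for the raidz/mirror vdevs are printed as well if their counters are
non-zero, so look at the leaf devices.
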
Another "fix" is to stop using ZFS altogether and use some other file
system. You will not see any errors anymore; your data will simply be
corrupted silently. It is your data, your choice. I wouldn't do that.

(*) If you have a large number of disks, you may wish to label them and
use the labels instead of the 'raw' drive names. You can use either
glabel(8) or gpart(8) to create the labels, then use them to build the
zpool. If, for example, you label the disks by their position in the
chassis, you can easily tell which disk to replace from the zpool
status output.
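
As a rough sketch of the labelling idea, the script below only prints
the commands it would run, so you can check them first. The device names
(da0 to da3), the pool name "tank", the bayN label scheme and the raidz
layout are all assumptions for illustration; substitute your own.

```sh
#!/bin/sh
# Dry run: prints the glabel(8) and zpool(8) commands, does not run them.
# POOL, DISKS and the bayN labels are hypothetical examples.
POOL="tank"
DISKS="da0 da1 da2 da3"

plan() {
    bay=0
    vdevs=""
    for disk in $DISKS; do
        # A glabel label shows up as /dev/label/<name>.
        echo "glabel label bay${bay} /dev/${disk}"
        vdevs="${vdevs} label/bay${bay}"
        bay=$((bay + 1))
    done
    # Build the pool from the labels rather than the raw device names.
    echo "zpool create ${POOL} raidz${vdevs}"
}

plan
```

With labels like these, zpool status lists label/bay2 rather than da2,
so the failing disk maps straight to a slot in the chassis.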

Daniel