From: Daniel Kalchev
Date: Fri, 22 Jun 2012 10:54:36 +0300
To: freebsd-fs@freebsd.org
Subject: Re: ZFS Checksum errors
Message-ID: <4FE424BC.5090000@digsys.bg>
In-Reply-To: <467652020.30738.1340325033684.JavaMail.root@sz0192a.westchester.pa.mail.comcast.net>

On 22.06.12 03:30, rondzierwa@comcast.net wrote:
> the problem was created by a disk error, that is no longer happening, but now I have this corrupted file. how do i clean up the mess? the scrub takes hours, and there are folks that are watching. i'm working on the third iteration of clear and scrub, how many times should it take?
> I can be patient, but it would be nice if i had an answer for the folks that keep asking "are we there yet?".

The easiest fix to your problem is to:

- back up all data
- destroy the ZFS pool
- destroy the RAID volume
- create single-disk volumes for each disk, or just export the disks as JBOD
- create your ZFS pool using the individual drives (*)
- restore all data
- run your tests again

You will then be able to identify which disk is having problems, because ZFS reports errors per device rather than for the RAID volume as a whole.

Sometimes problems like the ones you describe are caused by a faulty disk. Re-seating the cables (or unplugging and re-plugging the hot-swap disk) seems to fix it, but only temporarily. Such disks rarely show up as 'bad' to "hardware RAID" controllers, but ZFS always detects them.

Another "fix" is to stop using ZFS altogether and use some other file system. You will not see any errors anymore, just silently corrupted data. It is your data, your choice. I wouldn't do that.

(*) If you have a large number of disks, you may wish to label them and use the labels instead of 'raw' drive names. You can use either glabel(8) or gpart(8) to create the labels, then use these to build the zpool. If, for example, you label the disks by their position in the chassis, you can easily find out which disk to replace from the zpool output.

Daniel
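
The labelling idea in the footnote can be sketched as a dry run that only prints the commands it would issue. Everything specific here is an assumption for illustration and not taken from the thread: the pool name "tank", the raidz layout, the device names da0 to da3, and the "bayN" label scheme. It uses glabel(8); with GPT partitions, gpart(8) labels (gpt/<name>) would serve the same purpose. The destroy and restore steps are deliberately left out.

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# POOL, LAYOUT and DISKS are hypothetical placeholders; adjust to your system.
POOL="tank"
LAYOUT="raidz"
DISKS="da0 da1 da2 da3"

plan() {
    vdevs=""
    bay=0
    for disk in $DISKS; do
        # glabel(8) writes a label to the disk and exposes it as
        # /dev/label/<name>, so the name follows the disk, not the port.
        echo "glabel label bay${bay} ${disk}"
        vdevs="${vdevs} label/bay${bay}"
        bay=$((bay + 1))
    done
    # Build the pool from the labels rather than the raw device names.
    echo "zpool create ${POOL} ${LAYOUT}${vdevs}"
    # zpool status lists READ/WRITE/CKSUM error counters per device.
    echo "zpool status ${POOL}"
}

plan
```

Once the printed commands look right for your chassis, they can be run by hand as root; a disk that keeps accumulating CKSUM errors in the zpool status output will then show up under its bay label.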