From: Nathanael Hoyle
To: Dmitry Marakasov, freebsd-fs@freebsd.org
Date: Fri, 29 May 2009 14:49:07 -0400
Subject: Re: ZFS scrub/selfheal not really working

Dmitry Marakasov wrote:
> * Andrew Snow (andrew@modulus.org) wrote:
>
>> Because your disk subsystem is broken and keeps returning new sets of
>> bad sectors.
>
> Still running my utility fixed them. Running scrub don't.
>

scrub isn't trying to repair bad sectors on your hard drive. scrub is trying to restore the health of all copies of the data based on available parity or mirrors. These are different goals.

As was suggested, I expect that if you have either power supply or controller errors, you are having new, unique, device-layer read errors on every scrub pass (note that your counts differ widely between passes). You are not having zpool-layer failures: ZFS is successfully using parity or mirror information to ensure that no bad data is returned to the application layer.

In the course of a scrub, a read error should result in a new sector being written with the appropriate data, reconstructed from parity or mirror data. This does not prevent the original sector from failing again if you attempt to read it on a block-by-block basis.

The diagram you referenced is too high-level to be meaningful. For a fuller treatment, I suggest you refer to http://dlc.sun.com/pdf/819-5461/819-5461.pdf, specifically "Checking ZFS Data Integrity", beginning on page 251 and continuing through about page 261.

Best of luck,
-Nathanael
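
P.S. To make the distinction concrete, here is a toy Python model of the behaviour described above. It is only a sketch under stated assumptions, not how ZFS is implemented: the ToyDisk and ToyMirror classes, the never-reuse allocator, and the SHA-256 checksum are all hypothetical, chosen for illustration. It shows a scrub restoring a bad copy by writing the good data to a new sector while a raw read of the original sector keeps failing.

```python
import hashlib


class ToyDisk:
    """Hypothetical disk: fixed array of sectors, some of which fail on read."""

    def __init__(self, n_sectors):
        self.sectors = [None] * n_sectors
        self.bad = set()      # sector numbers that return a device-layer read error
        self.next_free = 0    # naive allocator (assumption): never reuses a sector

    def alloc_write(self, data):
        """Write data to a fresh sector and return its number."""
        if self.next_free >= len(self.sectors):
            raise IOError("disk full")
        sector = self.next_free
        self.next_free += 1
        self.sectors[sector] = data
        return sector

    def read_raw(self, sector):
        """Block-by-block read, as a raw-device utility would do."""
        if sector in self.bad:
            raise IOError(f"read error at sector {sector}")
        return self.sectors[sector]


class ToyMirror:
    """Hypothetical two-way mirror with one checksum per logical block."""

    def __init__(self, n_sectors):
        self.disks = [ToyDisk(n_sectors), ToyDisk(n_sectors)]
        self.table = {}   # block id -> (checksum, [sector on disk 0, sector on disk 1])

    def write(self, block_id, data):
        checksum = hashlib.sha256(data).hexdigest()
        self.table[block_id] = (checksum, [d.alloc_write(data) for d in self.disks])

    def _read_copy(self, block_id, i):
        """Return copy i of a block, or None if it is unreadable or corrupt."""
        checksum, sectors = self.table[block_id]
        try:
            data = self.disks[i].read_raw(sectors[i])
        except IOError:
            return None
        return data if hashlib.sha256(data).hexdigest() == checksum else None

    def read(self, block_id):
        """Application-layer read: any one good copy is enough."""
        for i in range(len(self.disks)):
            data = self._read_copy(block_id, i)
            if data is not None:
                return data
        raise IOError(f"no good copy of block {block_id!r}")

    def scrub(self):
        """Check every copy of every block; rewrite bad copies to new sectors."""
        repaired = 0
        for block_id, (_, sectors) in self.table.items():
            for i, disk in enumerate(self.disks):
                if self._read_copy(block_id, i) is None:
                    good = self.read(block_id)
                    sectors[i] = disk.alloc_write(good)  # the old sector is left untouched
                    repaired += 1
        return repaired


if __name__ == "__main__":
    pool = ToyMirror(n_sectors=8)
    pool.write("a", b"hello")
    old = pool.table["a"][1][0]       # where disk 0 keeps its copy
    pool.disks[0].bad.add(old)        # simulate a device-layer read error
    print("copies repaired by scrub:", pool.scrub())
    print("pool-level read:", pool.read("a"))
    try:
        pool.disks[0].read_raw(old)
    except IOError as err:
        print("raw read of the old sector still fails:", err)
```

In this model the pool is healthy after the scrub because every block again has two good copies, yet a utility that reads the raw device sector by sector would still trip over the old location. If the hardware keeps producing fresh read errors, each scrub pass will find a different set of copies to repair.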