From owner-freebsd-current@FreeBSD.ORG  Mon May 25 00:50:37 2009
Message-ID: <4A19EB5B.1000806@jrv.org>
Date: Sun, 24 May 2009 19:50:35 -0500
From: "James R. Van Artsdalen" <james-freebsd-current@jrv.org>
User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302)
MIME-Version: 1.0
References: <4E6E325D-BB18-4478-BCFD-633D6F4CFD88@exscape.org>	<4FE794E9-075D-4563-B395-BD5E459937DF@exscape.org>
	<gvckuv$u9l$1@ger.gmane.org>
In-Reply-To: <gvckuv$u9l$1@ger.gmane.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-current@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: ZFS panic under extreme circumstances (2/3 disks corrupted)

Ivan Voras wrote:
> Thomas Backman wrote:
>   
>> On May 24, 2009, at 09:02 PM, Thomas Backman wrote:
>>
>>     
>>> 5) Check if the md5 of file: everything OK, zpool status shows a
>>> degraded pool.
>>> 6) Repeat step #4, but with disk 3.
>>> 7) zpool scrub test
>>> 8) Panic!
>>>
>>>       
> Did you account for the time factor? Between your steps 5 and 6,
> wouldn't ZFS automatically begin data repair?
>   


ZFS probably repairs only the errors it actually encounters in step 5:
if the md5(1) run reads a corrupted sector, that sector might be fixed,
but ZFS does not start a scrub to look for other corruption.

His test probably clobbered pool metadata or something similar:
something not read by the md5(1) in step 5.  That damage might not have
been seen until step 7, by which point step 6 had left the pool
unrepairable.

The original test might need to read the disk blocks before overwriting
them, to make sure they hold file data and not something else.
Otherwise it probably isn't a valid test of automatic self-repair.
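
A rough sketch of that "read before overwrite" check, in Python.  None
of this comes from the original test script: the function names, the
512-byte sector size, and the use of a plain file as the backing disk
are all assumptions for illustration.  It also assumes the test file is
stored uncompressed and that its contents are random enough that no
sector of it looks like unallocated (all-zero) space.  The idea is to
hash every sector-sized chunk of the known test file, and refuse to
clobber any region of the backing disk whose sectors do not all match
one of those hashes.

```python
import hashlib
import os

SECTOR = 512  # assumed sector size; adjust to match the pool's ashift


def sector_digests(data, sector=SECTOR):
    """Return the SHA-256 digests of every full sector-sized chunk of data."""
    return {
        hashlib.sha256(data[i:i + sector]).digest()
        for i in range(0, len(data) - sector + 1, sector)
    }


def is_file_data(image_path, offset, length, digests, sector=SECTOR):
    """True only if every sector in [offset, offset + length) matches the file."""
    if offset < 0 or offset % sector or length <= 0 or length % sector:
        raise ValueError("offset and length must be sector-aligned and positive")
    with open(image_path, "rb") as img:
        img.seek(offset)
        for _ in range(length // sector):
            chunk = img.read(sector)
            if len(chunk) < sector:
                return False  # ran off the end of the backing disk
            if hashlib.sha256(chunk).digest() not in digests:
                return False  # metadata, parity, or something else unknown
    return True


def clobber_if_file_data(image_path, offset, length, digests, sector=SECTOR):
    """Overwrite the region with random bytes, but only if it holds file data.

    Returns True if the region was overwritten, False if it was left alone.
    """
    if not is_file_data(image_path, offset, length, digests, sector):
        return False
    with open(image_path, "r+b") as img:
        img.seek(offset)
        img.write(os.urandom(length))
    return True
```

With a check like this the test damages only regions known to hold the
file's own data, so a later md5(1) of the file should touch every
damaged sector.  On a parity layout the parity sectors would not match
any file chunk, so this sketch would simply skip them.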