From owner-freebsd-current@FreeBSD.ORG Tue Aug 28 20:48:35 2007
To: Pawel Jakub Dawidek
In-reply-to: Your message of "Tue, 28 Aug 2007 20:02:28 +0200." <20070828180228.GD39562@garage.freebsd.pl>
Date: Tue, 28 Aug 2007 13:48:34 -0700
From: Bakul Shah
Message-Id: <20070828204834.9A7F85B3B@mail.bitblocks.com>
Cc: current@FreeBSD.org, Pascal Hofstee
Subject: Re: ZFS kernel panic
List-Id: Discussions about the use of FreeBSD-current

Pawel Jakub Dawidek wrote:
> On Tue, Aug 28, 2007 at 10:02:42AM -0700, Bakul Shah wrote:
> > > When you don't use redundant configuration (no mirror, no raidz, no
> > > copies>1) then ZFS is going to panic on a write failure. It looks like
> > > ZFS found a bad block on your disk.
> >
> > Does SUN really say this about ZFS? Is this acceptable in a
> > production environment? What if one of your mirrored disk
> > fails and in the "degraded" environment (before you have had
> > a chance to replace the bad disk) ZFS discovers that a write
> > fails? Why can't it find an alternative block to write to?
>
> There were many complains on zfs-discuss@, you may want to look into
> archive.
> The short version is that many users doesn't like that, and it
> should change in the future - because of COW model it should be quite
> easy to just mark block as bad and take next one, but it's not currently
> implemented. It's much less of a problem when one uses redundancy.

Good to know others are complaining too :-)

My real concern is the panic. This situation may be rare with
redundancy plus regular scrubbing, but it can definitely occur.
And as long as non-redundant ZFS is *allowed*, it has to handle
a write error without panicking.

Originally panic() was used to indicate that some *system
invariant* had been violated. That meant either a hardware error
or an unknown software error, but in either case some data
structure was likely corrupted and continuing could make matters
worse. That is not the case here (in general): ZFS does not
have the information it needs to decide whether a write error
is fatal.

The simplest thing to do on a write error is to ignore it.
You *will* catch the problem when you try to read the block.
One step better is to do what you suggest.

What happens now when you do use redundancy and there is a write
error while writing one of the copies? Does the system panic, or
is the error ignored?
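
To make the "mark block as bad and take next one" idea concrete,
here is a rough toy sketch in Python. It is only an illustration of
the general approach under a copy-on-write model, not how ZFS is or
would be implemented; every name in it (ToyDisk, CowAllocator,
cow_write, WriteError) is made up for this example.

```python
class WriteError(Exception):
    """Raised by the toy disk when a block cannot be written."""


class ToyDisk:
    """Hypothetical in-memory disk; blocks listed in `failing` reject writes."""

    def __init__(self, nblocks, failing=()):
        self.blocks = [None] * nblocks
        self.failing = set(failing)

    def write(self, blkno, data):
        if blkno in self.failing:
            raise WriteError(blkno)
        self.blocks[blkno] = data


class CowAllocator:
    """Toy copy-on-write allocator: every write goes to a fresh block."""

    def __init__(self, disk):
        self.disk = disk
        self.free = list(range(len(disk.blocks)))  # block numbers not yet used
        self.bad = set()                           # blocks that failed a write

    def cow_write(self, data):
        """Write data to a fresh block, skipping blocks that fail.

        Returns the block number that was used. If no usable block is
        left, raises WriteError so the caller sees an ordinary error
        instead of the whole system going down.
        """
        while self.free:
            blkno = self.free.pop(0)
            try:
                self.disk.write(blkno, data)
            except WriteError:
                self.bad.add(blkno)  # remember it and never hand it out again
                continue
            return blkno
        raise WriteError("no usable free blocks left")
```

With a four-block disk whose first two blocks are bad,
CowAllocator(ToyDisk(4, failing={0, 1})).cow_write(b"data") lands on
block 2 and records blocks 0 and 1 as bad. Only when the free list is
exhausted does the caller get an error, and even then it is an error
return rather than a panic.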