Date: Tue, 28 Aug 2007 22:55:55 +0200
From: Pawel Jakub Dawidek
To: Bakul Shah
Cc: current@FreeBSD.org, Pascal Hofstee
Subject: Re: ZFS kernel panic
Message-ID: <20070828205554.GI39562@garage.freebsd.pl>
In-Reply-To: <20070828204834.9A7F85B3B@mail.bitblocks.com>
List-Id: Discussions about the use of FreeBSD-current

On Tue, Aug 28, 2007 at 01:48:34PM -0700, Bakul Shah wrote:
> Pawel Jakub Dawidek wrote:
> > On Tue, Aug 28, 2007 at 10:02:42AM -0700, Bakul Shah wrote:
> > > > When you don't use a redundant configuration (no mirror, no raidz, no
> > > > copies>1) then ZFS is going to panic on a write failure. It looks like
> > > > ZFS found a bad block on your disk.
> > >
> > > Does SUN really say this about ZFS? Is this acceptable in a
> > > production environment? What if one of your mirrored disks
> > > fails and in the "degraded" environment (before you have had
> > > a chance to replace the bad disk) ZFS discovers that a write
> > > fails? Why can't it find an alternative block to write to?
> >
> > There were many complaints on zfs-discuss@, you may want to look into
> > the archive. The short version is that many users don't like that, and it
> > should change in the future - because of the COW model it should be quite
> > easy to just mark a block as bad and take the next one, but it's not
> > currently implemented. It's much less of a problem when one uses
> > redundancy.
>
> Good to know others are complaining too :-)
>
> My real concern is the panic. This situation may be rare if
> using redundancy + regular scrubbing, but it can definitely
> occur. And as long as non-redundant ZFS is *allowed*, you
> pretty much have to deal with it without any panicking.
>
> Originally panic() was used to indicate that some *system
> invariant* has been violated. That either meant a hardware
> error or an unknown software error but in any case some data
> structure was likely corrupted and continuing can make
> matters worse. But that is not the case here (in general).
> zfs does not have the appropriate information to be able to
> decide whether the write error is fatal.
>
> The simplest thing to do in case of a write error is to
> simply ignore it.
> You *will* catch this problem when you try
> to read this block. One step better is to do what you
> suggest.

You can't ignore a write error, because the application has already
assumed the write succeeded, which can lead to misbehaviour later. ZFS
cannot yet handle write errors, so it panics to preserve data
consistency. This is the right reaction on the ZFS side until skipping
bad blocks is implemented.

> What happens now when you do use redundancy and there is a
> write error while writing one of the copies? Does the system
> panic or is this error ignored?

I don't remember offhand, but the component is probably marked as bad
and the vdev group goes to a degraded state. You can simulate this
easily with gnop(8).

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
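
The three write-error behaviours discussed in this thread can be
sketched as a toy Python model. This is not ZFS code and does not use
any ZFS or GEOM interface: the names Disk, Vdev, Panic and
write_skipping_bad are hypothetical, and the mirror behaviour follows
the "probably marked as bad" guess above rather than a confirmed
description of the implementation.

```python
from dataclasses import dataclass, field


class Panic(Exception):
    """Stands in for a kernel panic in this toy model (hypothetical)."""


@dataclass
class Disk:
    """Hypothetical disk: writes to blocks listed in bad_blocks fail."""
    name: str
    bad_blocks: set = field(default_factory=set)
    healthy: bool = True
    data: dict = field(default_factory=dict)

    def write(self, block, payload):
        if block in self.bad_blocks:
            raise OSError(f"{self.name}: write error at block {block}")
        self.data[block] = payload


@dataclass
class Vdev:
    """Hypothetical vdev: one disk means no redundancy, several a mirror."""
    disks: list
    state: str = "ONLINE"

    def write(self, block, payload):
        copies_written = 0
        for disk in self.disks:
            if not disk.healthy:
                continue
            try:
                disk.write(block, payload)
                copies_written += 1
            except OSError:
                # Assumed behaviour: mark the component bad, degrade the vdev.
                disk.healthy = False
                self.state = "DEGRADED"
        if copies_written == 0:
            # No copy reached stable storage, so the caller's assumption
            # that the write succeeded cannot be honoured.
            raise Panic(f"unrecoverable write error at block {block}")


def write_skipping_bad(disk, candidate_blocks, payload):
    """The not-yet-implemented idea: skip a bad block and take the next one."""
    for block in candidate_blocks:
        try:
            disk.write(block, payload)
            return block
        except OSError:
            disk.bad_blocks.add(block)  # remember it so it is never reused
    raise Panic("no writable block left")
```

With a single disk the failed write raises Panic; with two disks the
same failure leaves the data on the surviving disk and the vdev state
set to "DEGRADED"; write_skipping_bad returns the first candidate block
that accepts the write.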