Date: Wed, 14 Mar 2012 08:34:26 -0400
From: Mark Murawski <markm-lists@intellasoft.net>
To: freebsd-fs@freebsd.org
Subject: Re: ZFS file corruption problem

On 03/14/2012 05:50 AM, Alexander Leidinger wrote:
> Quoting Mark Murawski (from Wed, 14 Mar 2012 01:08:03 -0400):
>
>> On 03/14/12 01:02, Mark Murawski wrote:
>>
>>> Why would the whole pool now become unavailable upon access to a bad
>>> file?
>
> Because you configured it like this (or rather, didn't configure a
> different behavior).
>
>> Also... isn't this pretty terrible behavior that the process accessing
>> the bad file is unkillable?
>
> If you are in an environment where the disks are not local (ZFS is
> designed with corporate environments in mind), you do not want to fail
> at the application level or panic because of a small hiccup in the
> network.
>
> man zpool:
> ---snip---
> failmode=wait | continue | panic
>     Controls the system behavior in the event of catastrophic pool
>     failure. This condition is typically a result of a loss of
>     connectivity to the underlying storage device(s) or a failure of
>     all devices within the pool. The behavior of such an event is
>     determined as follows:
>
>     wait      Blocks all I/O access until the device connectivity is
>               recovered and the errors are cleared. This is the
>               default behavior.
>
>     continue  Returns EIO to any new write I/O requests but allows
>               reads to any of the remaining healthy devices. Any
>               write requests that have yet to be committed to disk
>               would be blocked.
>
>     panic     Prints out a message to the console and generates a
>               system crash dump.
> ---snip---
>
> It is up to you to switch to 'continue' or 'panic' for local disks.
>
> Bye,
> Alexander.

Oh... wow. It's not that I configured it that way in particular; it's
more that the default settings came like that. But anyway, thanks a
ton. I had no idea that was configurable. I was even thinking, "you
know, it would be nice if that behavior were configurable".
Once I started running into these problems, I started losing faith in
ZFS and its design. I've been dealing with this corrupted-file problem
for about a week now. Finding out that the fix was as simple as
deleting the file and setting a config option has reaffirmed my belief
in ZFS.
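In case it helps anyone searching the archives later, the property can
be inspected and changed at runtime with zpool(8); 'tank' below is just
a placeholder, substitute your own pool name:

  # show the current setting (pool name 'tank' is a placeholder)
  zpool get failmode tank
  # return EIO on new writes instead of blocking the whole pool
  zpool set failmode=continue tank

After removing the bad file, 'zpool clear tank' resets the pool's error
counters so it reports healthy again.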