From: Kris Kennaway
Date: Sun, 09 Sep 2007 22:57:56 +0200
To: Peter Schuller
Cc: freebsd-fs@freebsd.org, Johannes Totz
Subject: Re: UFS not handling errors correctly
Message-ID: <46E45E54.6040207@FreeBSD.org>
In-Reply-To: <20070909200933.GA98161@hyperion.scode.org>

Peter Schuller wrote:
>> bg fsck cannot fix arbitrary filesystem corruption. Nor is it intended to.
>
> But is not the proposed problem that UFS caused the corruption to
> begin with?

No, "something unexpected happened".

> Given that updates are already done in such a way as to cause
> predictable inconsistency in the event of a crash, should not
> appropriate course of action in a case such as this be to panic,
> unmount the fs, rollback and error out, or otherwise abort the
> operation in a way the filesystem can be fsck:ed and re-mounted
> (assuming the device is alive), rather than cause corruption?

Soft updates isn't journalling, so you can't "roll back" an error.
It works by tracking the on-disk state of data and issuing writes only
in an order that should keep the on-disk state consistent.
Unfortunately there are many ways in which this can fail, mostly
involving external factors violating the assumptions upon which soft
updates relies. For example, the data written on disk may not
correspond to the data dispatched by soft updates, due to things like
write caching in the hardware, write reordering, data corruption,
unpredictable disk behaviour during power loss, hardware failure, etc.

Similarly, background fsck assumes that the only filesystem errors it
will encounter are those permitted by the soft updates model, which are
"benign", i.e. non-fatal and correctable at runtime. When the state of
your disk departs from the realm of these assumptions, bg fsck may not
be able to repair the damage.

Kris
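
For illustration only, the write-ordering idea described above can be
sketched as a toy model in Python. This is not how the FreeBSD kernel
implements soft updates; the write names ("inode_init", "dir_entry")
and both helper functions are hypothetical, and the single dependency
shown (an inode is initialised on disk before a directory entry points
at it) is a deliberately simplified example.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical toy model: each pending write maps to the set of writes
# that must reach the disk before it.
deps = {
    "dir_entry": {"inode_init"},  # entry may only point at an initialised inode
    "inode_init": set(),
}

def ordered_flush(deps):
    """Return an order for the pending writes that honours every dependency."""
    return list(TopologicalSorter(deps).static_order())

def violations(order, deps):
    """List (write, prerequisite) pairs where the prerequisite landed later."""
    pos = {w: i for i, w in enumerate(order)}
    return [(w, d) for w, ds in deps.items() for d in ds if pos[d] > pos[w]]

# The order the filesystem dispatches: prerequisites first.
dispatched = ordered_flush(deps)

# A device that reorders writes (modelled here by simply reversing them)
# can leave a state on disk that the ordering model never permits.
reordered = list(reversed(dispatched))

print(violations(dispatched, deps))
print(violations(reordered, deps))
```

In this sketch the dispatched order produces no violations, while the
reordered one leaves a directory entry on disk ahead of its inode. That
is the kind of state outside the model which background fsck is not
designed to expect.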