From: Alexander Leidinger
To: Jeremy Chadwick
Cc: freebsd-fs@freebsd.org
Date: Thu, 17 Mar 2011 09:37:15 +0100
Subject: Re: ZFS vfs.zfs.cache_flush_disable and ZIL reliability
Message-ID: <20110317093715.300351qg801prjgo@webmail.leidinger.net>
In-Reply-To: <20110317074558.GA2248@icarus.home.lan>
References: <20110317071618.GB49199@blazingdot.com> <20110317074558.GA2248@icarus.home.lan>

Quoting Jeremy Chadwick (from Thu, 17 Mar 2011 00:45:58 -0700):

> Whenever this topic comes up, I always ask people the same 2 questions:
>
> 1) What *absolute guarantee* do you have that data *actually gets
>    written to the platters* when BIO_FLUSH is called?  You can
>    sync/sync/sync all you want -- there's no guarantee that the hard
>    disk itself (that is to say, the cache that lives on the hard disk)
>    has fully written all of its data to its platters.

Obvious answer: None, if the disk lies to you.

> 2) What do you think will happen when the hard disk abruptly loses
>    power?  Could be the system PSU dying, could be the power circuitry
>    on the drive failing, could be a "quirk" that causes the drive to
>    power-cycle itself, etc...

Obvious answer: You lose everything written since the last sync (if the
FS DTRT, like UFS+softupdates/journal or ZFS).

> General question to users and/or developers:
>
> Can someone please explain to me why people are so horribly focused (I
> would go as far to say OCD) on this topic?
>
> Won't there *always* be some degree of potential loss of data in the
> above two circumstances?  Shouldn't the concern be less about "how much
> data just got lost" and more about "is the filesystem actually usable
> and clean/correct?"  (ZFS implements the latter two assuming you're
> using mirror or raidz).

You always want a consistent FS, that's for sure. Part of the
consistency guarantee depends on certain data being on disk for sure
before other changes are made.
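To make that ordering dependency concrete, here is a minimal sketch of
the "data first, flush, then the change that depends on it" pattern.
The function and file names are hypothetical and only illustrate the
idea; this is not how ZFS or UFS implement it. Note that os.fsync()
only asks the OS to push the data to the device, so the guarantee is
exactly as strong as the disk's honesty about its cache flush.

```python
import os
import tempfile


def write_then_commit(data_path, commit_path, payload):
    """Write payload, flush it, and only then write the commit marker."""
    # Step 1: write the data that the commit marker depends on.
    with open(data_path, "wb") as f:
        f.write(payload)
        f.flush()             # push Python's buffer to the kernel
        os.fsync(f.fileno())  # flush point: ask the OS to push it to the device

    # Step 2: the marker is written strictly after the flush point.
    # If it were reordered before the data, a crash could leave a
    # marker that points at data which never reached the disk.
    with open(commit_path, "wb") as f:
        f.write(b"committed %d bytes\n" % len(payload))
        f.flush()
        os.fsync(f.fileno())


def is_committed(data_path, commit_path):
    """After a crash, trust the data only if the marker exists as well."""
    return os.path.exists(commit_path) and os.path.exists(data_path)
```

If the disk acknowledges the first fsync without really storing the
data, step 2 can land on stable storage before step 1, which is the
reordering problem discussed here.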
You do not want data (FS meta-data) written before a flush point to be
reordered in the cache behind data (FS meta-data) written after the
flush point.

You also want to lose as little data as possible: think about your bank
account while doing transactions. If the disk lies, it could be
(attention, huge simplification here) that your transfer to someone
was made but the bank "forgets" to remove the money from your account.
This is surely something none of us would mind, but the bank does. The
other way around, where someone transfers money to you and it is
removed from his account but not added to yours, is the more unpleasant
case, and one you would surely object to.

I'm sure you know the "only acknowledge to the remote side once the
data is really stored" way of handling transfers (mail, DB, ...). If
the disk lies to you, you cannot do anything (maybe you got what you
paid for), but if you have disks which actually DTRT, you do not lose
mail (the sender retries) or money (the transaction processing can
restart from the last ACKed point).

Bye,
Alexander.

-- 
Even if you do learn to speak correct English, whom are you going
to speak it to?
                -- Clarence Darrow

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org       netchild @ FreeBSD.org  : PGP ID = 72077137