From owner-freebsd-hackers@FreeBSD.ORG  Fri Mar 14 19:18:55 2014
Sender: Edward Tomasz Napierała <etnapierala@gmail.com>
Subject: Re: GSoC proposition: multiplatform UFS2 driver
Mime-Version: 1.0 (Apple Message framework v1283)
Content-Type: text/plain; charset=iso-8859-2
From: Edward Tomasz Napierała <trasz@FreeBSD.org>
In-Reply-To: <53235014.1040003@gentoo.org>
Date: Fri, 14 Mar 2014 20:18:50 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <9DA009CD-0629-4402-A2A0-0A6BDE1E86FD@FreeBSD.org>
References: <CAA3ZYrCPJ1AydSS9n4dDBMFjHh5Ug6WDvTzncTtTw4eYrmcywg@mail.gmail.com>
 <20140314152732.0f6fdb02@gumby.homeunix.com>
 <1394811577.1149.543.camel@revolution.hippie.lan>
 <0405D29C-D74B-4343-82C7-57EA8BEEF370@FreeBSD.org>
 <53235014.1040003@gentoo.org>
To: Richard Yao <ryao@gentoo.org>
X-Mailer: Apple Mail (2.1283)
Cc: freebsd-hackers@FreeBSD.org, RW <rwmaillists@googlemail.com>,
 Ian Lepore <ian@FreeBSD.org>

On 14 Mar 2014, at 19:53, Richard Yao wrote:
> On 03/14/2014 02:36 PM, Edward Tomasz Napierała wrote:
>> On 14 Mar 2014, at 16:39, Ian Lepore wrote:
>>> On Fri, 2014-03-14 at 15:27 +0000, RW wrote:
>>>> On Thu, 13 Mar 2014 18:22:10 -0800
>>>> Dieter BSD wrote:
>>>>
>>>>> Julio writes,
>>>>>> That being said, I do not like the idea of using NetBSD's UFS2
>>>>>> code. It lacks Soft-Updates, which I consider to make FreeBSD UFS2
>>>>>> second only to ZFS in desirability.
>>>>>
>>>>> FFS has been in production use for decades.  ZFS is still wet behind
>>>>> the ears. Older versions of NetBSD have soft updates, and they work
>>>>> fine for me. I believe that NetBSD 6.0 is the first release without
>>>>> soft updates.  They claimed that soft updates was "too difficult" to
>>>>> maintain.  I find that soft updates are *essential* for data
>>>>> integrity (I don't know *why*, I'm not a FFS guru).
>>>>
>>>> NetBSD didn't simply drop soft-updates, they replaced it with
>>>> journalling, which is the approach used by practically all modern
>>>> filesystems.
>>>>
>>>> A number of people on the questions list have said that they find
>>>> UFS+SU to be considerably less robust than the journalled filesystems
>>>> of other OS's.
>>
>> Let me remind you that some other OSes had problems such as truncation
>> of files which were _not_ written (XFS), silently corrupting metadata when
>> there were too many files in a single directory (ext3), and panicking
>> instead of returning ENOSPC (btrfs).  ;->
>
> Let's be clear that such problems live between the VFS and block layer
> and therefore are isolated to specific filesystems. Such problems
> disappear when using ZFS.

Such problems disappear after fixing the bugs that caused them.  Just like
with ZFS - some people _have_ lost zpools in the past.

>>> What I've seen claimed is that UFS+SUJ is less robust.  That's a very
>>> different thing than UFS+SU.  Journaling was nailed onto the side of
>>> UFS+SU as an afterthought, and it shows.
>>
>> Not really - it was developed rather recently, and with filesystems it
>> usually shows, but it's not "nailed onto the side": it complements SU
>> operation by journalling the few things which SU doesn't really handle
>> and which used to require background fsck.
>>
>> One problem with SU is that it depends on hardware not lying about
>> write completion.  Journalling filesystems usually just issue flushes
>> instead.
>
> This point about write completion being done on unflushed data and no
> flushes being done could explain the disconnect between RW's statements
> and what Soft Updates should accomplish. However, it does not change my
> assertion that placing UFS SU on a ZFS zvol will avoid such failure
> modes.

Assuming everything between UFS and ZFS below behaves correctly.

> In ZFS, we have a two stage transaction commit that issues a
> flush at each stage to ensure that data goes to disk, no matter what the
> drive reported. Unless the hardware disobeys flushes, the second stage
> cannot happen if the first stage does not complete and if the second
> stage does not complete, all changes are ignored.
>
> What keeps soft updates from issuing a flush following write completion?
> If there are no pending writes, it is a noop. If the hardware lies, then
> this will force the write. The internal dependency tracking mechanisms
> in Soft Updates should make figuring out when a flush needs to be issued
> should hardware have lied about completion rather simple. At a high
> level, what needs to be done is to batch the things that can be done
> simultaneously and separate those that cannot by flushes. If such
> behavior is implemented, it should have a mount option for toggling it.
> It simply is not needed on well behaved devices, such as ZFS zvols.
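
To make sure we are talking about the same thing, here is how I read the
two-stage commit described above, as a rough Python sketch.  It is not
how ZFS is implemented: the file layout, the two header slots and the
names commit() and recover() are made up for illustration, and
os.fsync() on a regular file merely stands in for a cache flush sent to
the device.

```python
import os
import struct
import zlib

SLOT_SIZE = 512                   # hypothetical: one header slot per sector
DATA_START = 2 * SLOT_SIZE        # data region starts after the two slots
HEADER = struct.Struct("<QQII")   # txg, data offset, data length, data CRC32
RECORD_SIZE = HEADER.size + 4     # header plus a CRC32 of the header itself


def commit(fd, txg, payload):
    """Two-stage commit: data first, then the header that makes it visible."""
    # Stage 1: append the new data (never overwrite live data), then flush.
    offset = max(os.fstat(fd).st_size, DATA_START)
    os.pwrite(fd, payload, offset)
    os.fsync(fd)
    # Stage 2: write the header pointing at the data, then flush again.
    # Slots alternate, so the previous commit's header stays intact.
    body = HEADER.pack(txg, offset, len(payload), zlib.crc32(payload))
    record = body + struct.pack("<I", zlib.crc32(body))
    os.pwrite(fd, record, (txg % 2) * SLOT_SIZE)
    os.fsync(fd)


def recover(fd):
    """Return (txg, payload) of the newest fully committed transaction."""
    best = None
    for slot in (0, 1):
        raw = os.pread(fd, RECORD_SIZE, slot * SLOT_SIZE)
        if len(raw) < RECORD_SIZE:
            continue                      # slot was never written
        body = raw[:HEADER.size]
        (header_crc,) = struct.unpack("<I", raw[HEADER.size:])
        if zlib.crc32(body) != header_crc:
            continue                      # stage 2 did not complete
        txg, offset, length, data_crc = HEADER.unpack(body)
        payload = os.pread(fd, length, offset)
        if len(payload) != length or zlib.crc32(payload) != data_crc:
            continue                      # data never reached the disk
        if best is None or txg > best[0]:
            best = (txg, payload)
    return best
```

If stage 2 never completes, recover() simply keeps returning the
previous transaction, which is the "all changes are ignored" part.  None
of this helps, of course, if the flush itself is ignored.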

As you say, it's not needed on well-behaved devices.  While it could
help with crappy hardware, I think it would be either very complicated
(batching, as described), or would perform very poorly.
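
To illustrate what I mean by batching, here is a toy Python sketch.  It
assumes the dependencies are already known as a plain "X must be on disk
before Y" map, which ignores pretty much everything that makes the real
thing hard.  The function name flush_batches() and the example writes
are hypothetical and not taken from the soft updates code.

```python
def flush_batches(writes, depends_on):
    """Group writes into batches that must be separated by flushes.

    writes     -- iterable of write identifiers
    depends_on -- dict mapping a write to the writes that must be on
                  stable storage before it may be issued
    Writes within one batch can be issued together; a flush is needed
    between consecutive batches.
    """
    pending = set(writes)
    # Dependencies on anything not in this set are assumed to be on disk.
    remaining = {w: set(depends_on.get(w, ())) & pending for w in pending}
    done = set()
    batches = []
    while remaining:
        ready = sorted(w for w, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle among pending writes")
        batches.append(ready)
        done.update(ready)
        for w in ready:
            del remaining[w]
    return batches


# Hypothetical example: the bitmap goes first, then the inode, then the
# directory entry pointing at it; the data block has no ordering needs.
example = flush_batches(
    ["inode", "dirent", "bitmap", "data"],
    {"dirent": ["inode"], "inode": ["bitmap"]},
)
```

Here example comes out as three batches, so two flushes in between, for
what is a trivial operation.  The other option is a flush after every
single write, which is where the poor performance comes from.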

To be honest, I wonder how many problems could be avoided by
disabling write cache by default.  With NCQ it shouldn't cause
performance problems, right?