From owner-freebsd-current@FreeBSD.ORG  Wed Dec 28 19:14:20 2011
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 48FE2106566B
	for <freebsd-current@freebsd.org>; Wed, 28 Dec 2011 19:14:20 +0000 (UTC)
	(envelope-from mdf356@gmail.com)
Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com
	[209.85.160.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 1ECAF8FC0C
	for <freebsd-current@freebsd.org>; Wed, 28 Dec 2011 19:14:19 +0000 (UTC)
Received: by pbcc3 with SMTP id c3so10720368pbc.13
	for <freebsd-current@freebsd.org>; Wed, 28 Dec 2011 11:14:19 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=BoG+w7yM6Vm5MLyp2tV9jIOBUoxen4jp/RqrESwKGGE=;
	b=Cq7daCLSXhxD3tdEGlgKDfNE6IYywbyE89/+CgssLqZPCM9lqVzqmTLQATljpHjrop
	vy81XrQBBNYSxfbXZxZY4g1nMGC9ixQbkjTRBcvU/h/kQihtffRZeISM6j1WDnn5tR/W
	y+FeoS5f9t32Nkq5DDeYkv9+TvZJNys+bPnYk=
MIME-Version: 1.0
Received: by 10.68.209.68 with SMTP id mk4mr48947620pbc.88.1325099659358; Wed,
	28 Dec 2011 11:14:19 -0800 (PST)
Sender: mdf356@gmail.com
Received: by 10.68.208.167 with HTTP; Wed, 28 Dec 2011 11:14:19 -0800 (PST)
In-Reply-To: <CAJcQMWeDeCC-pvz4tfwZKAUqxZnpPBOpa9r12o0_w4=YyGf4zA@mail.gmail.com>
References: <20111227215330.GI45484@redundancy.redundancy.org>
	<4EFB470D.3070309@gmx.de>
	<CAJcQMWeDeCC-pvz4tfwZKAUqxZnpPBOpa9r12o0_w4=YyGf4zA@mail.gmail.com>
Date: Wed, 28 Dec 2011 11:14:19 -0800
X-Google-Sender-Auth: 13iD09_1mvTC8vIEKrs8MI_Gaxo
Message-ID: <CAMBSHm-+JGQiXHRh0ona3qZN2O2s=7dgnQ_3q_NoLe-CWM2yww@mail.gmail.com>
From: mdf@FreeBSD.org
To: Maxim Khitrov <max@mxcrypt.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: Matthias Andree <matthias.andree@gmx.de>, freebsd-current@freebsd.org
Subject: Re: SU+J systems do not fsck themselves
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 28 Dec 2011 19:14:20 -0000

On Wed, Dec 28, 2011 at 8:54 AM, Maxim Khitrov <max@mxcrypt.com> wrote:
> On Wed, Dec 28, 2011 at 11:42 AM, Matthias Andree
> <matthias.andree@gmx.de> wrote:
>> On 27.12.2011 22:53, David Thiel wrote:
>>> I've had multiple machines now (9.0-RC3, amd64, i386 and earlier
>>> 9-CURRENT on ppc) running SU+J that have had unexplained panics and
>>> crashes start happening relating to disk I/O. When I end up running a
>>> full fsck, it keeps turning out that the disk is dirty and corrupted,
>>> but no mechanism is in place with SU+J to detect and fix this. A bgfsck
>>> never happens, but a manual fsck in single-user does indeed fix the
>>> crashing and weird behavior. Others have tested their SU+J volumes and
>>> found them to have errors as well. This makes me super nervous.
>>
>> The one thing I figured is that in the light of power outages, or
>> crashing virtualization hosts, you really really really need to disable
>> disk write caches, and this affects softupdates, journalling, asynch
>> file systems, just about everything.
>>
>> The fact that makes matters worse is that journalling or softupdates
>> allow you to mount a silently-corrupted file system, whereas the
>> traditional UFS/UFS2 sync/asynch mounts will fsck themselves in the
>> foreground, so they get fixed before the FS panics.
>>
>> So can you be sure that:
>>
>> - your driver, chip set and hard disk execute ordered writes in order,

If they don't, it's a bug.  There is certainly buggy firmware out
there, but each layer of software has to rely on the layer below
actually doing what it promises.

>> - your driver, chip set and hard disk actually write data to permanent
>> storage BEFORE acknowledging a successful write?

SU does not require this: it issues an explicit BIO_FLUSH, which the
driver should handle.

>> Whenever I fixed these issues, I had no more corruptions.
>>
>> For ata and sata, there are loader tunables you will want to set,
>> hw.ata.wc=0 and kern.cam.ada.write_cache=0.

This should not be necessary if the driver and firmware are not buggy.
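
For anyone who wants to try them anyway, here is a sketch of how the
two tunables quoted above would look in /boot/loader.conf.  The names
are taken from the quoted text; whether both apply depends on which
driver your disks attach to, so treat this as an assumption to check
against your own system rather than a recommendation.

```
# /boot/loader.conf (sketch; tunable names as quoted above)
hw.ata.wc="0"                   # write cache off for disks under the legacy ata driver
kern.cam.ada.write_cache="0"    # write cache off for disks attached as ada
```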

>> If your drives are under ada, ad, or ahci related control, try these
>> settings.  For SCSI, use camcontrol to turn the write cache off.
>> softupdates is supposed to rectify most of the performance penalties
>> incurred.
>>
>> Note also that you needed to set ahci_load=YES and atapicam_load=YES in
>> 8.X, I've never bothered to check 7.X or 9.X WRT these settings.
>
> This is a bit off-topic, but I'm curious what the effect of NCQ is on
> softupdates? Since that too has the ability to reorder writes to disk,
> should it be disabled in addition to cache?

SU doesn't care about write ordering, as long as everything before a
BIO_FLUSH is really flushed by the time the BIO_FLUSH is acknowledged.
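
To make that concrete, here is a toy model of the guarantee, not
FreeBSD code.  The CachedDisk class and its methods are hypothetical
names invented for illustration; the only thing borrowed from the
real system is the rule that a flush may not be acknowledged until
every earlier write is on stable storage.  The device is free to
persist cached writes in any order in between.

```python
import random


class CachedDisk:
    """Toy disk with a volatile write cache (hypothetical, not a FreeBSD API)."""

    def __init__(self, seed=0):
        self.stable = {}                 # blocks that survive a power loss
        self.cache = {}                  # pending writes, lost on power loss
        self.rng = random.Random(seed)   # picks the order the device drains in

    def write(self, block, data):
        # Acknowledged at once; the data only sits in the volatile cache.
        self.cache[block] = data

    def drain_one(self):
        # The device persists one pending write of its own choosing,
        # which models reordering inside the drive.
        if self.cache:
            block = self.rng.choice(sorted(self.cache))
            self.stable[block] = self.cache.pop(block)

    def flush(self):
        # Models the flush barrier: return only once the cache is empty.
        while self.cache:
            self.drain_one()

    def power_loss(self):
        # Anything still in the cache is gone.
        self.cache.clear()


def run(seed):
    disk = CachedDisk(seed)
    disk.write(1, "inode")
    disk.write(2, "dirent")
    disk.drain_one()               # one of the two lands first, order unknown
    disk.flush()                   # barrier: both are stable once this returns
    disk.write(3, "after-flush")   # cached only, no flush follows
    disk.power_loss()
    return disk.stable


if __name__ == "__main__":
    for seed in range(5):
        print(seed, sorted(run(seed).items()))
```

Whatever order the toy device picks, blocks 1 and 2 are both stable
after the flush returns, and block 3 is lost because nothing flushed
it.  A device that acknowledges the flush while the cache is still
non-empty breaks exactly this property.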

Cheers,
matthew

> Also, I would say that if you are using a hardware raid controller
> with a BBU, then allowing the use of controller's cache and write-back
> policy should be safe for use with softupdates. Any caching mechanism,
> for that matter, that has a separate power supply source should be ok.
> For example, the Intel 320 SSDs have a few on-board capacitors that
> are used to flush the cache in the event of a power loss.
>
> - Max