From owner-freebsd-questions@FreeBSD.ORG  Thu Jun 21 14:44:33 2012
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 641751065675
	for <freebsd-questions@freebsd.org>;
	Thu, 21 Jun 2012 14:44:33 +0000 (UTC)
	(envelope-from wojtek@wojtek.tensor.gdynia.pl)
Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [89.206.35.99])
	by mx1.freebsd.org (Postfix) with ESMTP id A46B58FC1E
	for <freebsd-questions@freebsd.org>;
	Thu, 21 Jun 2012 14:44:32 +0000 (UTC)
Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1])
	by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id q5LEiToq003267;
	Thu, 21 Jun 2012 16:44:29 +0200 (CEST)
	(envelope-from wojtek@wojtek.tensor.gdynia.pl)
Received: from localhost (wojtek@localhost)
	by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id
	q5LEiTCZ003264; Thu, 21 Jun 2012 16:44:29 +0200 (CEST)
	(envelope-from wojtek@wojtek.tensor.gdynia.pl)
Date: Thu, 21 Jun 2012 16:44:29 +0200 (CEST)
From: Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl>
To: Julien Cigar <jcigar@ulb.ac.be>
In-Reply-To: <4FE32FF5.60603@ulb.ac.be>
Message-ID: <alpine.BSF.2.00.1206211635400.3170@wojtek.tensor.gdynia.pl>
References: <4FE2CE38.9000100@gmail.com>
	<CAPj0R5Kmi-+dJ7mPvTrTAoS8O983svOyR2WyK2_v1Cr07dSS_A@mail.gmail.com>
	<alpine.BSF.2.00.1206211413140.2263@wojtek.tensor.gdynia.pl>
	<CA+D9QhuQ+bxKW9+dX+zS9mErwz8JSkV2G7qL0KfB8BH_LGJAgA@mail.gmail.com>
	<alpine.BSF.2.00.1206211539230.2903@wojtek.tensor.gdynia.pl>
	<CA+D9QhvR_eKtVxdKcaMyOS7tLw_AOHKgUy3o7mJn2b=chMA0Xw@mail.gmail.com>
	<4FE32FF5.60603@ulb.ac.be>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7
	(wojtek.tensor.gdynia.pl [127.0.0.1]);
	Thu, 21 Jun 2012 16:44:29 +0200 (CEST)
Cc: freebsd-questions@freebsd.org
Subject: Re: Is ZFS production ready?
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 21 Jun 2012 14:44:33 -0000

> One interesting feature of ZFS is its block checksum: all reads and writes 
> include block checksum, so it can easily detect situations where, for 
> example, data is quietly corrupted by RAM.

You may be shocked, but you are sometimes wrong. I have already demonstrated 
it: checksumming reports no errors, and ZFS writes wrong data with correct 
checksums :)

It's quite easy to explain if one understands the hardware details.

Checksumming will protect you from:

- a failed SATA/SAS port or on-disk controller that returns bad data as 
good. This is actually a really rare case. I have never seen it, but maybe 
it happens.

- some types of DRAM failure, but not all. Actually just a small 
fraction, because a DRAM failure like that would crash your system so 
quickly that you are unlikely to get much data corruption.

The common case with DRAM is that after you write to it, it keeps the right 
data for some time and RARELY flips some bit later in spite of refresh.

With this type of failure you may run your machine for hours, even days or 
longer. And ZFS will calculate a proper checksum of the wrong data and write 
it to disk.
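
A rough sketch of that failure mode, not ZFS code: CRC32 from Python's zlib 
stands in for the real block checksum, and write_block, verify_block and 
flip_bit are hypothetical helpers made up for this illustration. The point is 
only the ordering: a bit that flips in RAM before the checksum is computed 
gets a valid checksum, while a flip after the checksum is computed is caught.

```python
import zlib


def write_block(buf):
    """Checksum whatever is in memory at write time (hypothetical helper)."""
    data = bytes(buf)
    return data, zlib.crc32(data)


def verify_block(data, checksum):
    """Re-compute the checksum on read and compare (hypothetical helper)."""
    return zlib.crc32(data) == checksum


def flip_bit(buf, byte_index, bit):
    """Simulate a single flipped bit in a mutable buffer."""
    buf[byte_index] ^= 1 << bit


original = b"important database record"

# Case 1: the bit flips while the data sits in RAM, before checksumming.
ram = bytearray(original)
flip_bit(ram, 3, 0)
stored, checksum = write_block(ram)
checksum_ok = verify_block(stored, checksum)   # checksum matches the bad data
data_intact = stored == original               # but the data is wrong

# Case 2: the bit flips after checksumming (bad port, controller, medium).
on_disk = bytearray(stored)
flip_bit(on_disk, 5, 2)
detected = not verify_block(bytes(on_disk), checksum)

print(checksum_ok, data_intact, detected)
```

In case 1 the checksum verifies although the stored data differs from what 
the application meant to write; only case 2 is the kind of corruption a 
block checksum can see.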


This is the reason I keep a few failed DIMMs: for testing how 
different software behaves on a broken machine.

UFS ended up with a few corrupted files after half a day of heavy work and 4 
crashes. fsck always recovered things well (of course with "unexpected 
softupdate inconsistency....")

ZFS survived 2 crashes. After the third it panicked on startup.

Of course - no zfs_fsck.
And no possibility of making a really good zfs_fsck because of the data 
layout, at least not easily.


> This feature is very important for databases.
is data integrity not important for the rest? :)

Still - the disks themselves perform quite heavy ECC, and both SATA and SAS 
ports do their own error checking.