From: Chris Forgeron <cforgeron@acsi.ca>
To: Stephen McKay, Mark Felder
Cc: freebsd-fs@freebsd.org
Date: Thu, 10 Mar 2011 16:43:43 -0400
Subject: RE: Constant minor ZFS corruption
In-reply-to: <201103091241.p29CfUM1003302@dungeon.home>
You know, I've had better luck with v28 and FreeBSD 9-CURRENT. Make a very minimal build, test it well, and you should be fine. I just upgraded my last 8.2 v14 ZFS FreeBSD system earlier this week, so I'm now on 9-CURRENT with v28 across the board. The only issue I've found so far is a small oddity with displaying files across ZFS, but pjd has already patched that in r219404. (I'm about to test it now.)

Oh - and you're on amd64, correct, not i386? I think we (royal we) should remove i386 support from ZFS; it has never been stable for me, and I see a lot of grief about it on the boards.

I also think you need 8 GB of RAM to play seriously. I've had reasonable success with 4 GB and a light load, but any serious file traffic needs 8 GB of breathing room, as ZFS gobbles up RAM very aggressively.

Lastly, check what Mike Tancsa said about his hardware - all of my gear is quality: 1000 W dual redundant power supplies, LSI SAS controllers, ECC registered RAM, no overclocking, etc. You may have a software issue, but it's more likely that ZFS is just exposing some instability in your system. Has your RAM checked out in an overnight Memtest run? We're talking small, intermittent errors here, not big red flags that are obvious to spot.

-----Original Message-----
From: Stephen McKay [mailto:smckay@internode.on.net]
Sent: Wednesday, March 09, 2011 8:42 AM
To: Chris Forgeron; Mark Felder
Cc: Stephen McKay; freebsd-fs@freebsd.org
Subject: Re: Constant minor ZFS corruption

On Tuesday, 8th March 2011, Chris Forgeron wrote:
>Have you made sure it's not always the same drives with the checksum
>errors? It may take a few days to know for sure.

Of the 12 disks, only 1 has been error-free.
I've been doing this for about 10 days now, and there is no pattern that I can see in the errors.

>It shouldn't matter which NFS client you use; if you're seeing ZFS
>checksum errors in zpool status, they won't be from whatever program
>is writing the data.

I'm mounting another server over NFS to get my 1 TB of test data, so the problem box is the NFS client, not the server. Sorry if there was any confusion.

I had a theory that there was a race in the NFS client, or perhaps in the code that steals memory from ZFS's ARC when NFS needs it. However, that seems less likely now, as I have done the same 1 TB test copy again today, this time using ssh as the transport. I saw the same ZFS checksum errors as before. :-(

>..oh, and don't forget about fun with expanders. I assume you're using one?

No. This board has 14 ports: 6 native and 8 from the LSI2008 chip on the PIKE card. Each is cabled directly to a drive.

>I've got 2 LSI2008-based controllers in my 9-CURRENT machine without
>any fuss. That's running a 24-disk mirror right now.

That's encouraging news. Maybe I can win eventually.

On Tuesday, 8th March 2011, Mark Felder wrote:
>Highly interested in what FreeBSD version, ZFS version, and zpool
>version you're running.

I was using 8.2-RELEASE plus the mps driver from 8.2-STABLE. Hence the filesystem version is 4 and the pool version is 15. But I installed -CURRENT a few days ago while keeping the same pool, and found that the errors still occurred. The v28 code has extra locking in interesting places, but it made no difference to the checksum errors.

As of today, I've destroyed the pool and built a version 28 pool (fs version 5) on a subset of disks (those attached to the onboard controller). I'll know by tomorrow how that went.
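[A small sketch of the kind of per-disk tally being described: pulling the non-zero CKSUM counters out of `zpool status` output with awk. The sample output in the here-document is made up for illustration (pool name and gpt bay labels are assumptions); on a real box you would pipe `zpool status <pool>` into the same awk instead.]

```shell
#!/bin/sh
# List each leaf vdev (gpt-labelled here) with a non-zero CKSUM column.
# Fields in `zpool status` config lines: NAME STATE READ WRITE CKSUM,
# so $1 is the vdev name and $5 is the checksum error counter.
awk '$1 ~ /^gpt\// && $5 > 0 { printf "%s: %d checksum errors\n", $1, $5 }' <<'EOF'
  pool: dread
 state: ONLINE
config:
        NAME          STATE     READ WRITE CKSUM
        dread         ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            gpt/bay1  ONLINE       0     0     2
            gpt/bay2  ONLINE       0     0     0
            gpt/bay3  ONLINE       0     0     5
EOF
```

Run daily and diffed, something like this makes it easy to see whether the same drives keep accumulating errors or whether they wander across the pool.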
BTW, with my code in place to decipher "type 19" entries and a kernel printf that bypasses the need to get devd.conf right, I see something like this for each checksum error:

log_sysevent: ESC_dev_dle: class=ereport.fs.zfs.checksum
  ena=4822220020083854337
  detector={version=0 scheme=zfs pool=44947180927799912 vdev=6194846651369573567}
  pool=dread pool_guid=44947180927799912 pool_context=0 pool_failmode=wait
  vdev_guid=6194846651369573567 vdev_type=disk vdev_path=/dev/gpt/bay7
  parent_guid=18008078209829074821 parent_type=raidz
  zio_err=0 zio_offset=194516353024 zio_size=4096
  zio_objset=276 zio_object=0 zio_level=0 zio_blkid=132419
  bad_ranges=0000000000001000 bad_ranges_min_gap=8
  bad_range_sets=00000445 bad_range_clears=00002924
  bad_set_histogram=001b001a001e002b0021001600120018001a001600210018001500150016001c001c0019001200190022001b0019001b0017000f0014000e0013001a001c001f000c000c000c0007000b000d0010001f00060009000800080007000c0010000f00070007000500070008000600080008000a0002000100060004000300070004
  bad_cleared_histogram=00820089009700ac00a700b900b2009000730084009500af00a300ad00a900ac0082009300ad00c200ac00d200a8008f0078008b008e00b700bf00b9009f00a600830095 00a400c100c200b700cd009900780090009b00be00af00c100a700980083008a00a200c900bc00d400b200a3007e0089009400c400c700d400b8009b

That's a hideous blob of awful, and I don't really know what to do with it.

Cheers,

Stephen.
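[For what it's worth, the flat key=value part of a blob like that splits apart with standard tools. A minimal sketch: the event string below is abbreviated from the log above, and the nested detector={...} field, which contains spaces, would need extra handling before a split like this.]

```shell
#!/bin/sh
# Turn a one-line ZFS ereport into key=value-per-line form, keeping only
# the fields that locate the bad block on disk.
event='class=ereport.fs.zfs.checksum pool=dread vdev_path=/dev/gpt/bay7 zio_err=0 zio_offset=194516353024 zio_size=4096 zio_blkid=132419'

printf '%s\n' "$event" |
  tr ' ' '\n' |
  grep -E '^(pool|vdev_path|zio_offset|zio_size)='
```

With vdev_path, zio_offset, and zio_size in hand, the damaged region can at least be cross-checked against smartctl output or re-read with dd to see if the error is repeatable.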