From: Paul Mather
Date: Tue, 4 Oct 2011 10:31:22 -0400
To: Artem Belevich
Cc: freebsd-current@freebsd.org
Subject: Re: Strange ZFS filesystem corruption
Message-Id: <0AD3BA75-58D7-4359-B61F-B5F4815D3843@gromit.dlib.vt.edu>
References: <8B59D754-9062-4499-9873-7C2167622032@gromit.dlib.vt.edu>
List-Id: Discussions about the use of FreeBSD-current

On Oct 3, 2011, at 6:19 PM, Artem Belevich wrote:

> On Mon, Oct 3, 2011 at 11:21 AM, Paul Mather wrote:
>> =====
>>
>> The pool itself reports no errors. I performed a scrub on the pool yet this bizarre filesystem corruption persists:
>>
>> =====
>> tape# zpool status backups
>>   pool: backups
>>  state: ONLINE
>>   scan: scrub repaired 15K in 7h33m with 0 errors on Sat Oct 1 19:22:35 2011
>
> The pool *did* report 15K errors that it was able to repair.
>
> I'd start with testing your RAM with memtest86 or memtest86+. ZFS
> errors without reported checksum errors may be the sign of bad memory.
> I.e. data gets corrupted before ZFS gets to calculate checksum and
> later invalid data with valid checksum gets written to disk.

Because this machine has ECC RAM, I checked the BIOS logs for ECC errors (the BIOS is set to log them) and there are no ECC errors logged. If the RAM were going bad, I would expect it to leave some kind of trace in the BIOS log. Do uncorrectable ECC errors get logged as MCEs under FreeBSD 9?

I've never noticed any problems when doing a "make -j8 buildworld" on this machine, either.

Cheers,

Paul.
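
For readers following the thread, the failure mode Artem describes (data corrupted in RAM before the checksum is computed, so that invalid data is stored with a valid checksum) can be illustrated with a toy model. This is only a sketch under stated assumptions: SHA-256 stands in for whatever checksum the pool actually uses, and write_block(), verify(), and flip_bit() are hypothetical helpers, not ZFS interfaces.

```python
import hashlib
from typing import Tuple


def checksum(block: bytes) -> bytes:
    # Stand-in checksum (assumption): SHA-256, chosen only for illustration.
    return hashlib.sha256(block).digest()


def flip_bit(block: bytes, bit_index: int) -> bytes:
    # Simulate a single-bit error in a copy of the data.
    buf = bytearray(block)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def write_block(block: bytes) -> Tuple[bytes, bytes]:
    # Hypothetical write path: the checksum covers whatever is in RAM
    # at the moment of the write, corrupted or not.
    return block, checksum(block)


def verify(stored: Tuple[bytes, bytes]) -> bool:
    # Hypothetical scrub/read path: recompute and compare.
    data, cksum = stored
    return checksum(data) == cksum


original = b"backup data " * 8

# Case 1: the bit flips in RAM *before* the checksum is computed.
stored_bad = write_block(flip_bit(original, 5))

# Case 2: the bit flips on disk *after* the checksum was computed.
data, cksum = write_block(original)
stored_rotted = (flip_bit(data, 5), cksum)

print("case 1 verifies:", verify(stored_bad))
print("case 1 matches original:", stored_bad[0] == original)
print("case 2 verifies:", verify(stored_rotted))
```

In this model, case 1 passes verification even though the stored data no longer matches the original, while case 2 is detected. That is the sense in which a scrub with no checksum errors does not by itself rule out corruption that happened before the write.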