From: Mark Day
To: Joe Peterson
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Date: Fri, 8 Feb 2008 14:49:54 -0800
Subject: Re: Analysis of disk file block with ZFS checksum error

On Feb 8, 2008, at 2:29 PM, Joe Peterson wrote:

> For one thing (as I mentioned), only 65536 bytes are bad (and it's
> exactly this many, with a few "good" bytes thrown in, but not far from
> what matches random chance would produce. Also, all bad bytes have a
> zero in the high bit - interesting? Also, near the end of the block,
> the bad bytes all go to zero, strangely coincident with the first
> "good" zero in that bad block - not sure if that's coincidence or not.
> Also, I calculated the number of "Bits same" (matching bits) in the
> good vs. bad bytes, and it appears fairly random, so it appears that
> the bad bytes are very random in nature and not correlated much at all
> with the good bytes.
>
> So except for the fact that the 2nd half (65536 bytes) of the ZFS
> block are good, the bad block seems to consist of random data, except
> for the string of zero bytes near the end and the zero high-bit. It's
> not as if one bit on the disk flipped - it affects the whole (1/2)
> block. Does this seem like a disk error, controller error/bug, cable
> problem (I recently put a new cable on, so I doubt this). It seems to
> me something more systemic rather than a random bit error - opinions
> are more than welcome.

Based on the subset of data you posted, the bad data looks like ASCII
text.

The bad data from offset a0000 to a000f is:

    ${138AFE{@ @$$}1

The bad data from offset af6c1 to af6c8 is:

    392A9}@

I don't recognize the content beyond that, but I'd guess that somehow
the contents of some other file managed to overwrite that portion of
the bad file. As for how that happened, I don't know. But if someone
recognizes where the bad content came from, that might be a clue.

-Mark
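
The two observations in this thread (every bad byte has a zero high bit, and the bytes decode as readable characters) can be checked mechanically. What follows is only a minimal Python sketch, assuming the bad region has already been read into a bytes object. The helper name ascii_profile is hypothetical and not part of any ZFS tool, and the sample is the 16 bytes quoted above for offsets a0000 to a000f, copied as they appear in this message.

```python
def ascii_profile(data: bytes) -> dict:
    """Summarize how text-like a byte buffer is.

    Hypothetical helper for illustration only. It reports the fraction
    of bytes with the high bit clear and the fraction that are printable
    ASCII (0x20-0x7E) or common whitespace (tab, LF, CR).
    """
    total = len(data)
    if total == 0:
        return {"total": 0, "high_bit_clear": 0.0, "printable": 0.0}

    high_bit_clear = sum(1 for b in data if b < 0x80)
    printable = sum(
        1 for b in data
        if 0x20 <= b <= 0x7E or b in (0x09, 0x0A, 0x0D)
    )
    return {
        "total": total,
        "high_bit_clear": high_bit_clear / total,
        "printable": printable / total,
    }


# Sample: the 16 bytes quoted for offsets a0000 to a000f.
sample = b"${138AFE{@ @$$}1"
profile = ascii_profile(sample)

# Comparison buffer: every byte value once, standing in for data with
# no bias toward ASCII.
uniform = ascii_profile(bytes(range(256)))

print(profile)
print(uniform)
```

Both fractions close to 1.0 over the whole bad region would be consistent with text from some other file. Uniformly random data would be expected to show the high bit clear in only about half of its bytes.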