Date:      Wed, 06 Feb 2008 09:42:51 -0700
From:      Joe Peterson <joe@skyrush.com>
To:        Bakul Shah <bakul@bitblocks.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Forcing full file read in ZFS even when checksum error encountered
Message-ID:  <47A9E38B.6040100@skyrush.com>
In-Reply-To: <20080205190946.3D69C5B59@mail.bitblocks.com>
References:  <20080205190946.3D69C5B59@mail.bitblocks.com>

Bakul Shah wrote:
> It could also be a memory error of some sort.  Does your
> system have ECC memory?

Yes, I always insist on ECC.

> Also note that standalone tests do
> not seem to catch all sorts of errors that heavy use of Unix
> can sometimes trigger on a marginal system.

I do plan to do a few more HW checks (cables, etc.), just to make sure.
 I had been avoiding touching my HW config to preserve the current state
of this issue.  However, given the coincidental experience Jeremy talked
about and the fact that I have seen DMA errors using ZFS on FreeBSD that
I do not see using ZFS-Fuse on the same disk/pool in Linux, I have a gut
feeling something funny is going on.

> But I agree with you that it would be useful to have a debug
> mode where you can get at the data even if it is bad (and a
> test mode where you can write bad data on purpose:-). [A
> long rant on writing testable code deleted]

Yes, the danger of course is if someone forgets that the debug mode is
engaged, but I think care could be taken to make sure this cannot easily
be done accidentally, or massive warnings can be issued to make sure the
user knows.

> You have access to the zfs sources! At the very least you can
> add code to report the bad checksum & offset and see if it
> matches the checksum of the same block(s) in your known good
> copy.

Yep, this is my next planned step.
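
As a userland complement to patching the ZFS sources, here is a rough
sketch of the block-by-block comparison against the known good copy.
It is only an illustration under some assumptions: the function name
compare_blocks and both paths are hypothetical, the 128 KiB block size
assumes the default ZFS recordsize, and SHA-256 is used purely to
compare the data in the two files, not to reproduce the checksum ZFS
stores for the block.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumption: default ZFS recordsize of 128 KiB


def compare_blocks(good_path, suspect_path, block_size=BLOCK_SIZE):
    """Return (offset, good_digest, suspect_digest) for each differing block.

    suspect_digest is None when the block is missing or could not be read
    (for example, an I/O error caused by a checksum failure).
    Only the length of the known good copy is examined.
    """
    mismatches = []
    with open(good_path, "rb") as good, open(suspect_path, "rb") as suspect:
        offset = 0
        while True:
            good_block = good.read(block_size)
            if not good_block:
                break  # end of the known good copy
            try:
                # Re-seek every time so one failed read does not
                # leave the file position in an unknown state.
                suspect.seek(offset)
                suspect_block = suspect.read(block_size)
            except OSError:
                suspect_block = None
            good_digest = hashlib.sha256(good_block).hexdigest()
            suspect_digest = (
                hashlib.sha256(suspect_block).hexdigest()
                if suspect_block
                else None
            )
            if good_digest != suspect_digest:
                mismatches.append((offset, good_digest, suspect_digest))
            offset += len(good_block)
    return mismatches
```

Calling compare_blocks("good.copy", "suspect.file") with two such
(hypothetical) paths would give the offsets worth checking against
whatever bad checksum and offset the patched ZFS code reports.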

			Thanks, Joe




