Date: Thu, 28 Aug 2014 01:36:05 -0500
From: Scott Bennett <bennett@sdf.org>
To: paul@kraus-haus.org
Cc: freebsd-questions@freebsd.org, freebsd@qeng-ho.org, Trond.Endrestol@fagskolen.gjovik.no
Subject: Re: gvinum raid5 vs. ZFS raidz
Message-ID: <201408280636.s7S6a5OZ022667@sdf.org>
In-Reply-To: <9588077E-1198-45AF-8C4A-606C46C6E4F8@kraus-haus.org>
References: <201408020621.s726LsiA024208@sdf.org>
  <alpine.BSF.2.11.1408020356250.1128@wonkity.com>
  <53DCDBE8.8060704@qeng-ho.org>
  <201408060556.s765uKJA026937@sdf.org>
  <53E1FF5F.1050500@qeng-ho.org>
  <201408070831.s778VhJc015365@sdf.org>
  <alpine.BSF.2.11.1408071034510.64214@mail.fig.ol.no>
  <201408070936.s779akMv017524@sdf.org>
  <alpine.BSF.2.11.1408071226020.64214@mail.fig.ol.no>
  <201408071106.s77B6JCI005742@sdf.org>
  <5B99AAB4-C8CB-45A9-A6F0-1F8B08221917@kraus-haus.org>
  <201408220940.s7M9e6pZ008296@sdf.org>
  <7971D6CA-AEE3-447D-8D09-8AC0B9CC6DBE@kraus-haus.org>
  <201408260641.s7Q6feBc004970@sdf.org>
  <9588077E-1198-45AF-8C4A-606C46C6E4F8@kraus-haus.org>

Paul Kraus <paul@kraus-haus.org> wrote:

> On Aug 26, 2014, at 2:41, Scott Bennett <bennett@sdf.org> wrote:
> > Paul Kraus <paul@kraus-haus.org> wrote:
> >> On Aug 22, 2014, at 5:40, Scott Bennett <bennett@sdf.org> wrote:
> >>> What I'm seeing here is ~2 KB of errors out
> >>> of ~1.1TB, which is an error rate (in bytes, not bits) of ~1.82e+09, and the

     As I caught and corrected before, the above should have said, "~1.82e-09".

> >>> majority of the erroneous bytes I looked at had multibit errors.  I consider
> >>> that to be a huge change in the actual device error rates, specs be damned.
> >>
> >> That seems like a very high error rate. Is the drive reporting those errors or are they getting past the drive's error correction and showing up as checksum errors in ZFS? A drive that is throwing that many errors is clearly defective or dying.
> >
> >      I'm not using ZFS yet.  Once I get a couple more 2 TB drives, I'll give
> > it a shot.
> >      The numbers are from running direct comparisons between the source file
> > and the copy of it using cmp(1).  In one case, I ran the cmp twice and got
> > identical results, which I interpret as an indication that the errors are
> > occurring during the writes to the target disk during the copying.
>
> Wow. That implies you are hitting a drive with a very high uncorrectable error rate since the drive did not report any errors and the data is corrupt. I have yet to run into one of those.

     How would an uncorrectable error be detected by the drive without any
parity checking or hardware-implemented write-with-verify?
     Are you using any drives larger than 1 TB?  If so, try copying a 1.1 TB
file to one of them, and then try comparing the copy against the original.
Out of the three drives I could test that way, I got that kind of result on
two every time I tried it.  One of the two was a new Samsung (i.e., a Seagate),
and the other was a refurbished Seagate supplied as a replacement under
warranty.  The third got a clean copy the first time and two bytes with
single-bit errors on the second try.  That one was also a refurbished Seagate
provided under warranty.


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org  *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************
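
For anyone wanting to repeat the test described in the message, here is a rough Python sketch of the same idea as running cmp(1) on the source file and its copy, then dividing the number of differing bytes by the number of bytes compared. It is not the procedure actually used in the thread, which was cmp(1) itself. The function names `compare_files` and `byte_error_rate` are made up for illustration, and the figures in the final lines assume "~2 KB" means about 2000 bytes and "~1.1TB" means about 1.1e12 bytes.

```python
def compare_files(path_a, path_b, chunk_size=1 << 20):
    """Compare two regular files byte by byte.

    Returns (bytes_compared, differing_bytes, multibit_bytes), where
    multibit_bytes counts differing bytes with more than one flipped bit.
    Only the common prefix is compared if the file lengths differ.
    """
    compared = differing = multibit = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a or not b:
                break
            n = min(len(a), len(b))
            compared += n
            # Fast path: skip the per-byte loop when the chunks match.
            if a[:n] == b[:n]:
                continue
            for x, y in zip(a[:n], b[:n]):
                flipped = bin(x ^ y).count("1")  # number of differing bits
                if flipped:
                    differing += 1
                    if flipped > 1:
                        multibit += 1
    return compared, differing, multibit


def byte_error_rate(differing_bytes, total_bytes):
    """Error rate in bytes (not bits): erroneous bytes per byte compared."""
    return differing_bytes / total_bytes


if __name__ == "__main__":
    # Assumed figures: ~2 KB of errors out of ~1.1 TB copied.
    print(f"{byte_error_rate(2000, 1.1e12):.2e}")  # prints 1.82e-09
```

Running the comparison twice on the same pair of files, as was done with cmp(1), helps separate errors stored on the target disk (identical results both times) from errors that occur while reading (results that vary between runs).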