Date: Tue, 8 Jan 2008 12:58:57 -0600
From: Brooks Davis <brooks@freebsd.org>
To: Scott Long <scottl@samsco.org>
Cc: freebsd-fs@freebsd.org, ticso@cicely.de, Tz-Huan Huang <tzhuan@csie.org>
Subject: Re: ZFS i/o errors - which disk is the problem?
Message-ID: <20080108185857.GA5601@lor.one-eyed-alien.net>
In-Reply-To: <47839386.8020203@samsco.org>
References: <20080102070146.GH49874@cicely12.cicely.de>
 <477B8440.1020501@freebsd.org>
 <200801031750.31035.peter.schuller@infidyne.com>
 <477D16EE.6070804@freebsd.org>
 <20080103171825.GA28361@lor.one-eyed-alien.net>
 <6a7033710801061844m59f8c62dvdd3eea80f6c239c1@mail.gmail.com>
 <20080107135925.GF65134@cicely12.cicely.de>
 <47830BC0.5060100@samsco.org>
 <20080108083822.GL76422@cicely12.cicely.de>
 <47839386.8020203@samsco.org>
On Tue, Jan 08, 2008 at 08:15:18AM -0700, Scott Long wrote:
> Bernd Walter wrote:
>> On Mon, Jan 07, 2008 at 10:36:00PM -0700, Scott Long wrote:
>>> Bernd Walter wrote:
>>>> On Mon, Jan 07, 2008 at 10:44:13AM +0800, Tz-Huan Huang wrote:
>>>>> 2008/1/4, Brooks Davis <brooks@freebsd.org>:
>>>> The data is corrupted by controller and/or disk subsystem.
>>>> You have no other data sources for the broken data, so it is lost.
>>>> The only guaranteed way is to get it back from backup.
>>>> Maybe older snapshots/clones are still readable - I don't know.
>>>> Nevertheless data is corrupted and that's the purpose for alternative
>>>> data sources such as raidz/mirror and at last backup.
>>>> You shouldn't have ignored those errors at first, because you are
>>>> running with faulty hardware.
>>>> Without ZFS checksumming the system would just process the broken
>>>> data with unpredictable results.
>>>> If all those errors are fresh then you likely used a broken RAID
>>>> controller below ZFS, which silently corrupted synchronicity and then
>>>> blew when disk state changed.
>>>> Unfortunately many RAID controllers are broken and therefore useless.
>>>>
>>> Huh? Could you be any more vague? Which controllers are broken? Have
>>> you contacted anyone about the breakage? Can you describe the breakage?
>>> I call bullshit, pure and simple.
>> Just go back a few mails in the same thread where someone fixed CRC
>> errors by updating the RAID controller firmware.
>> I'm amazed how often I read something like this lately.
>> And if you read the whole thread then you will notice that we are
>> currently talking about another person who has corrupted data on
>> a RAID disk - not sure if this is the controller, a drive or the
>> drivers, but something is faulty here and I wouldn't be surprised
>> if it is the controller.
>> And then there are so many RAID controllers without backed memory or
>> other mechanism to guarantee synchronicity for the disks, which I call
>> broken by design.
>> You know yourself how important synchronicity is for RAID, especially
>> when it comes to parity based RAID and you know how fragile it is
>> when it comes to power failure.
>
> Your argument is complete hearsay and poorly formed opinion. That's
> fine, just be honest about it and don't mislead others into thinking
> that you know what you're talking about when it comes to RAID.

We saw ZFS CRC errors on one system running Solaris x86 with a 16-port
Areca controller (I don't have the model number handy) until we did a
firmware upgrade after contacting Areca. The controller was running in
JBOD mode.

-- Brooks