Date: Wed, 25 May 2011 10:55:26 -0700
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: "Vladislav V. Prodan" <universite@ukr.net>
Cc: fs@freebsd.org
Subject: Re: how to import raidz2, if only one disk is missing?
Message-ID: <20110525175526.GA45398@icarus.home.lan>
In-Reply-To: <4DDD0516.4060000@ukr.net>
References: <4DDC0D13.3030401@ukr.net> <20110524201118.GF2415@garage.freebsd.pl> <4DDC128F.80203@ukr.net> <20110525025831.GA2363@DataIX.net> <4DDD0516.4060000@ukr.net>

On Wed, May 25, 2011 at 04:33:10PM +0300, Vladislav V. Prodan wrote:
> 25.05.2011 5:58, Jason Hellenthal wrote:
> >
> >Vladislav,
> >
> >Hi. Just a heads up on this instead of waiting for the MFC to happen you
> >may want to boot mfsBSD from Martin Matuska [1][2] to check your disks
> >ahead of time and diagnose whether it is worthwhile waiting for the MFC.
>
> Thank you for reminding me about this rescue-CD.
>
> I received a broken pool raidz2.
>
> 1) Run smartctl on all 6 disks. He showed that the three discs
> problem with counters:
>
> 184 End-to-End_Error 0x0033 001 001 099 Pre-fail Always FAILING_NOW 149

Without knowing the exact device model of the disk ("Device Model:"),
and whether or not the disk is within smartmontools' internal drive
database ("Device is:"), this attribute may or may not actually be
End-to-End_Error. It would be helpful if you could provide that. Are
these HP drives, perchance?

Assuming these are HP drives: end-to-end error indicates, more or less,
bad cache on the drive itself. HP implemented a parity check on every
512 bytes read/written from/to the drive's cache. There's no error
correction used (to my knowledge), and failures are reported back to the
(host) controller in some manner. HP does document that "in some
situations" (reads) the drive can attempt re-reads and re-write that
block of data in the cache, in hopes that a subsequent read will work.
In that situation I imagine the attribute would be incremented but a
hard failure (ATA error, etc.) not shown.

Are you absolutely certain you haven't seen a single error on your
FreeBSD console (or in /var/log/messages, etc.) since these drives were
put into use? Were these brand new drives or previously used?
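
For anyone who wants to check a row like the quoted one mechanically,
here is a minimal sketch in Python. It assumes the row follows smartctl's
usual attribute table layout (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST,
THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE) and that an attribute
counts as failing when its normalized VALUE is at or below a non-zero
THRESH; that rule is my understanding, not something stated in this
thread. The names SmartAttribute, parse_attribute_row and is_failing_now
are hypothetical helpers, not part of smartmontools.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    """One row of a smartctl attribute table (assumed column layout)."""
    attr_id: int
    name: str
    flag: int
    value: int
    worst: int
    thresh: int
    attr_type: str
    updated: str
    when_failed: str
    raw: str


def parse_attribute_row(row: str) -> SmartAttribute:
    """Split one attribute row into its ten whitespace-separated columns."""
    # Split at most 9 times so a raw value containing spaces stays intact.
    fields = row.split(None, 9)
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}: {row!r}")
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        flag=int(fields[2], 16),   # FLAG is printed in hex, e.g. 0x0033
        value=int(fields[3]),      # normalized values are zero-padded decimals
        worst=int(fields[4]),
        thresh=int(fields[5]),
        attr_type=fields[6],
        updated=fields[7],
        when_failed=fields[8],
        raw=fields[9].strip(),
    )


def is_failing_now(attr: SmartAttribute) -> bool:
    """Assumed rule: failing when VALUE <= THRESH and THRESH is non-zero."""
    return attr.thresh > 0 and attr.value <= attr.thresh


# The row quoted earlier in this message.
row = "184 End-to-End_Error 0x0033 001 001 099 Pre-fail Always FAILING_NOW 149"
attr = parse_attribute_row(row)
```

For the quoted row the normalized value (001) is far below the threshold
(099), which agrees with the FAILING_NOW that smartctl printed. The raw
value (149) is vendor-specific, so the sketch keeps it as a string and
does not try to interpret it.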

(Footnote for readers: this SMART attribute shouldn't be confused with
attribute 199 (CRC errors), which indicates communication failures
between both controllers (the controller in the host, and the controller
on the drive PCB) and is often an indicator of bad cabling, a bad
hot-swap backplane, a dusty/dirty SATA port, etc...)

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |