Date: Fri, 08 Aug 2008 08:23:41 +0400
From: "Andrey V. Elsukov" <bu7cher@yandex.ru>
To: Jeremy Chadwick <koitsu@FreeBSD.org>
Cc: freebsd-fs@freebsd.org, Scott Long <scottl@FreeBSD.org>
Subject: Re: zpool degraded - 'UNAVAIL cannot open' functioning drive
Message-ID: <489BCA4D.3050704@yandex.ru>
In-Reply-To: <20080807121245.GA26629@eos.sc1.parodius.com>
References: <6c3c36d00808062109y6ae176a0ha055129392b00542@mail.gmail.com>
 <20080807044759.GA7505@eos.sc1.parodius.com>
 <6c3c36d00808062212y4e9a1464i48e146e84725a36e@mail.gmail.com>
 <6c3c36d00808062235v5cbb4470v990b76d569f85614@mail.gmail.com>
 <20080807055841.GB9735@eos.sc1.parodius.com>
 <489A9739.20707@yandex.ru>
 <20080807071434.GA15465@eos.sc1.parodius.com>
 <489ADD89.8070809@mawer.org>
 <20080807121245.GA26629@eos.sc1.parodius.com>

Jeremy Chadwick wrote:
> In almost every case I've looked at so far, the individuals' chipsets,
> disks, and overall setup are different.  SMART statistics on the drives
> show absolutely no sign of errors, or anything that indicates a hardware
> failure.  Many of the users are using AHCI as well (myself included, and
> I have seen the DMA error issue myself), which is more reliable than
> classic IDE.

I have done some work on the AHCI part of the ATA driver and I am looking
for testers...
http://perforce.freebsd.org/changeList.cgi?CMD=changes&FSPC=//depot/user/butcher/src/...

> It would be beneficial if there was some form of sysctl to increase the
> verbosity from the ATA subsystem when an error happens.  The existing
> data we get back is terse, and barely useful.  I know for a fact there's
> more debug information that could be output in such scenarios.  And
> please do not reply with "good idea, send patches" unless you're wanting
> to be chewed out.  :-)

OK, I'll try to add some verbose printfs to my branch in Perforce :)

>> I'm going to do some analysis and find out whether I can find any of our
>> systems that may be experiencing ATA errors that don't correlate with
>> what their SMART data is saying.  To date I haven't caught any, but
>> that's not to say they may not be happening... just that all of the ones
>> I have caught to date do appear to have been hardware-related issues...

IMHO, today there are many hardware versions and revisions, and some of
them are buggy. But other OSes (Windows, Linux) work with buggy hardware
without big problems. Yes, some of their developers have docs and can make
workarounds. Still, I think our ATA driver needs a new error-handling
subsystem that can handle errors correctly.

-- 
WBR, Andrey V. Elsukov