Date: Thu, 7 Aug 2008 00:14:34 -0700
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: "Andrey V. Elsukov" <bu7cher@yandex.ru>
Cc: freebsd-fs@freebsd.org
Subject: Re: zpool degraded - 'UNAVAIL cannot open' functioning drive
Message-ID: <20080807071434.GA15465@eos.sc1.parodius.com>
In-Reply-To: <489A9739.20707@yandex.ru>
References: <6c3c36d00808062109y6ae176a0ha055129392b00542@mail.gmail.com>
 <20080807044759.GA7505@eos.sc1.parodius.com>
 <6c3c36d00808062212y4e9a1464i48e146e84725a36e@mail.gmail.com>
 <6c3c36d00808062235v5cbb4470v990b76d569f85614@mail.gmail.com>
 <20080807055841.GB9735@eos.sc1.parodius.com>
 <489A9739.20707@yandex.ru>

On Thu, Aug 07, 2008 at 10:33:29AM +0400, Andrey V. Elsukov wrote:
> Jeremy Chadwick wrote:
>> Correct, it's a FreeBSD ATA subsystem/driver problem.
>
> I tried 8.0-CURRENT on marvell's, nvida's and intel's controllers.
> Hot plug and attach/detach works on any of these controllers without
> any problems.. What i should to do to get similar problems? :)

I haven't tried CURRENT; I don't track HEAD. I will work on setting up
another testbed environment at home and repeating my tests on HEAD.
That will take me some time, however.

My test method is very simple, at least in regards to disk removal.
Here's the step-by-step I've used to hit the bugs in question:

http://lists.freebsd.org/pipermail/freebsd-stable/2008-February/040534.html

>> My advice at this point in time, because as of today I have officially
>> lost faith in it: avoid ata(4) at all costs.
>
> I tried to contact you some time ago, but didn't receive any
> answers.. Do you still want to resolve your problems with ATA?

Yes, I did receive your mails, but you just wanted to know "if I was
still having problems". I should have replied, but I did not. That
is my fault, and for that I apologise.

The issues aren't problems specific to me -- they are affecting a
significant userbase, specifically folks who use servers in production
environments. But maybe I've misunderstood what you meant by "your
problems" -- my apologies if I have.

But have you looked at my Wiki page, documenting most (but not all) of
the issues?

http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting

We still don't have an answer to the famous "DMA timeout issue", which
continues to haunt many. I provided a small analysis in my Wiki, but
the technical justification is over my head -- it needs review from
someone who is familiar with the ATA protocol. I interpret the
NID_NOT_FOUND error to mean FreeBSD is asking the disk to r/w to/from
an invalid LBA.
I received one mail from a user (I forget if a mailing list was
CC'd or not -- I need to dig up the mail) who said that in some cases
NID_NOT_FOUND is normal.

The FreeNAS folks reported that increasing the internal ATA command
timeout from 5 seconds to 10 or 15 has helped FreeNAS users, but those
on FreeBSD who suffer from said timeouts and have tried the patches
said they have made no difference.

That said, I have some questions:

1) Are you trying to tell me that individuals running commercial
   services in production environments should run CURRENT? I don't
   think many are willing to do this; I know I'm not, and I can
   probably speak for Randy Bush. ;-)

2) If the issues above were fixed in HEAD, why were none of the PRs
   listed in my Wiki updated to reflect that?

3) If the above issues were fixed in HEAD, can you point me to the
   CVS commits for them? Any time I see ATA commits happen in
   RELENG_7, I immediately use cvsweb to look at the changes and
   commit message -- that means I look at HEAD, RELENG_7, and any
   other branchpoint. I haven't seen anything committed for these
   issues.

4) If the above issues were actually fixed in HEAD, are there
   scheduled plans to MFC the fixes?

I appreciate you taking the time to help track these down and
investigate them, but I feel like you, myself, Scott Long, and the users
are the only ones who care about these issues. The maintainer is alive
and active, but hasn't said a word, and some of those PRs go untouched
for 2+ years...

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |