Date:      Sat, 13 Apr 2013 16:59:51 -0400
From:      Quartz <quartz@sneakertech.com>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: A failed drive causes system to hang
Message-ID:  <5169C747.8030806@sneakertech.com>
In-Reply-To: <20130413154130.GA877@icarus.home.lan>
References:  <mailman.11.1365681601.78138.freebsd-fs@freebsd.org> <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan> <516917CA.5040607@sneakertech.com> <20130413154130.GA877@icarus.home.lan>

> This is what happens when end-users start to try and "correlate" issues
> to one another's without actually taking the time to fully read the
> thread and follow along actively.

He was experiencing a system hang, which appeared to be related to ZFS
and/or CAM. I'm experiencing a system hang, which appears to be related
to ZFS and/or CAM. I am in fact following along with this thread.


> Your issue: "on my raidz2 pool, when I lose more than 2 disks, I/O to
> the pool stalls indefinitely,

Close, but not quite: yes, I/O to the pool stalls, but I/O in general
also stalls. The problem doesn't appear to start until there's I/O
traffic to the pool, though.


>but I can still use the system barring
> ZFS-related things;

No. I've responded to this misconception on your part more than once: I
*CANNOT* use the system in any reliable way. Random commands fail. I've
had it hang trying to cd from one dir on the boot volume to another dir
on the boot volume. The only thing I can *reliably* do is log in. Past
that point all bets are off.


>I don't know how to get the system back into a
> usable state from this situation"

"...short of having to hard reset", yes.


> Else, all you've provided so far is a general explanation. You have
> still not provided concise step-by-step information like I've asked.

*WHAT* info? You have YET TO TELL ME WHAT THE CRAP YOU ACTUALLY NEED 
from me. I've said many times I'm perfectly willing to give you logs or 
run tests, but I'm not about to post a tarball of my entire drive and 
output of every possible command I could ever run.

For all the harping you do about "not enough info", you're just as bad
yourself.


> I've gone so far as to give you an example of what to provide:
>
> http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html

The only thing there you ask for is a dmesg, which I subsequently 
provided. Nowhere in that thread do you ask me to give you *anything* 
else, besides your generic mantra of "more info". And yes, I did read it 
again just now three times over to make sure. The closest you come is:

"This is why hard data/logs/etc. are necessary, and why
every single step of the way needs to be provided, including physical
tasks performed."

... but you still never told me WHICH logs or WHAT data you need. I've
already given you the steps I took re: removing drives, steps which *you
yourself* confirmed reproduce the problem.


> I will again point to the 2nd-to-last paragraph of my above referenced
> mail.

The "2nd-to-last paragraph" is:

"So in summary: there seem to be multiple issues shown above, but I can
confirm that failmode=continue **does** pass EIO to *running* processes
that are doing I/O.  Subsequent I/O, however, is questionable at this
time."

Unless you're typing in a language other than English, that isn't asking
me jack shit.
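
For anyone following along, here is a rough sketch of what "passes EIO to
running processes" looks like from the process side. It is only an
illustration, not a diagnostic anyone in this thread asked for: the
function name write_status and the probe path are made up, and on a
system that is actually hung the call may simply block instead of
returning anything.

```python
import errno
import os
import tempfile


def write_status(path, data):
    """Try one synchronous write; return 'ok' or 'eio', re-raise anything else."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            os.write(fd, data)
            os.fsync(fd)  # push the write to the device so an error surfaces here
        finally:
            os.close(fd)
    except OSError as exc:
        if exc.errno == errno.EIO:
            return "eio"  # the I/O error was handed back to this process
        raise
    return "ok"


if __name__ == "__main__":
    # Hypothetical probe location: a temp dir stands in for a file on the pool.
    with tempfile.TemporaryDirectory() as d:
        print(write_status(os.path.join(d, "probe"), b"x"))
```

Pointing the path at a file on the affected pool would distinguish "the
write came back with EIO" from "the write never came back at all", which
is the difference being argued about here.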


> Once concise details are given and (highly preferable!) a step-by-step
> way to reproduce the issue 100% of the time

*YOU'VE ALREADY REPRODUCED THIS ON YOUR OWN MACHINE.*

Seriously, wtf?

______________________________________
it has a certain smooth-brained appeal


