Date:      Thu, 21 Mar 2013 01:58:42 -0400
From:      Quartz <quartz@sneakertech.com>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: ZFS question
Message-ID:  <514AA192.2090006@sneakertech.com>
In-Reply-To: <20130321044557.GA15977@icarus.home.lan>
References:  <20130321044557.GA15977@icarus.home.lan>


> 1. freebsd-fs is the proper list for filesystem-oriented questions of
> this sort, especially for ZFS.

Ok, I'm assuming I should subscribe to that list and post there then?


> 2. The issue you've described is experienced by some, and **not**
> experienced by even more/just as many, so please keep that in mind.

Well, that's a given. Presumably if zfs were flat-out broken, 9.x
wouldn't have been released, or I would've already found a million pages
about this via Google. I'm assuming my problem is a corner case, possibly
a bug or regression, or else I fundamentally don't understand how this
works.


> 3. You haven't provided any useful details, even in your follow-up post
> here:

I got the impression that there wasn't a lot of overlap between the 
mailing lists and the forums, so I wanted to post in both simultaneously.


> - Contents of /boot/loader.conf
> - Contents of /etc/sysctl.conf
> - Output from "zpool get all"
> - Output from "zfs get all"
> - Output from "sysctl vfs.zfs kstat.zfs"

I'm running a *virgin* 9.1 with no installed software or modifications 
of any kind (past setting up a non-root user). All of these will be at 
their install defaults (with the possible exception of the "failmode" 
setting, but that didn't help when I tried it the first time, so I 
didn't bother during later re-installs).
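
For anyone gathering the same information, here is a minimal sketch of a collector script. It runs the commands quoted above, plus "zpool status" and a failmode query for the pool named "array" (the name used in the create command in this message). The output path /tmp/zfs-diag.txt and the function names run and collect are made up for illustration. Since zfs/zpool commands may hang once the pool is faulted, this is probably only useful before any drives are pulled.

```sh
#!/bin/sh
# Hypothetical output file; override by setting OUT in the environment.
OUT="${OUT:-/tmp/zfs-diag.txt}"

# Run one command, appending a header line and its stdout+stderr to $OUT.
# A failing or missing command is recorded instead of aborting the script.
run() {
    printf '==== %s ====\n' "$*" >> "$OUT"
    "$@" >> "$OUT" 2>&1 || printf '(exit status %s)\n' "$?" >> "$OUT"
}

# Gather everything requested in the quoted list above.
collect() {
    : > "$OUT"                      # truncate any previous report
    run cat /boot/loader.conf
    run cat /etc/sysctl.conf
    run zpool get all
    run zfs get all
    run sysctl vfs.zfs kstat.zfs
    run zpool status
    run zpool get failmode array    # "array" is the pool name from this thread
}

# collect    # uncomment (or call by hand) to write the report
```

Calling collect as root should leave a single text file that can be attached to a list post.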


> - Output from "zpool status"

There isn't a lot of detail to be had here. After I pull the third
drive, zfs/zpool commands almost always cause the system to hang, so I'm
not sure I can get anything out of them. Prior to the hang it just
reports a six-drive raidz2 with two of the drives "removed", so I'm not
sure how useful that will be.

I can tell you though that I'm creating the array with the following 
command:
zpool create -f array raidz2 ada{2,3,4,5,6,7}
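
One caveat on the command above: the ada{2,3,4,5,6,7} brace expansion is done by csh/tcsh or bash, not by POSIX sh, so it will not expand inside a /bin/sh script. A small sketch of a portable way to build the same device list follows. The function name devices is invented here, and the script only prints the zpool command rather than running it.

```sh
#!/bin/sh
# Print "adaN" names for every N from $1 to $2 inclusive, space-separated.
devices() {
    i=$1
    list=""
    while [ "$i" -le "$2" ]; do
        list="$list ada$i"
        i=$((i + 1))
    done
    echo "${list# }"    # drop the leading space
}

# Dry run: show the command that the brace form expands to.
echo "zpool create -f array raidz2 $(devices 2 7)"
```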

There are eight drives in the machine at the moment, and I'm not messing
with partitions yet because I don't want to complicate things. (I will
eventually be going that route though, as the controller tends to
renumber drives in a first-come-first-served order that makes some
things difficult.)
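
A hedged sketch of what that partition route could look like: give each disk a GPT label and build the pool from the labels, so the pool no longer depends on adaN numbering. The label names disk2 through disk7 and the function name label_cmds are invented for illustration. The script only prints the commands, since the real ones modify the disks.

```sh
#!/bin/sh
# Print the gpart commands that would put one labelled freebsd-zfs
# partition on each listed disk number (adaN -> label diskN).
label_cmds() {
    for n in "$@"; do
        echo "gpart create -s gpt ada$n"
        echo "gpart add -t freebsd-zfs -l disk$n ada$n"
    done
}

label_cmds 2 3 4 5 6 7

# Build the pool from the labels (visible under /dev/gpt/) instead of adaN.
members=""
for n in 2 3 4 5 6 7; do
    members="$members gpt/disk$n"
done
echo "zpool create array raidz2$members"
```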


> - Output from "dmesg" (probably the most important)

When? I.e., right after boot, or after I've hot-plugged a few drives, or
yanked them, or created a pool, or what?


> I particularly tend to assist with disk-level problems,

This machine is using a pile of spare Seagate 250GB drives, if that
makes any difference.


> By rolling back, if there is an issue, you're
> effectively ensuring it'll never get investigated or fixed,

That's why I asked for clarification, to see if it was a known 
regression in 9.1 or something similar.



> or don't have the
> time/cycles/interest to help track it down,

I have plenty of all that, for better or worse :)


> that's perfectly okay too:
> my recommendation is to go back to UFS (there's no shame in that).

At the risk of being flamed off the list, I'll switch to Debian if it
comes to that. I use FreeBSD exclusively for zfs.


> Else, as always, I strongly recommend running stable/9 (keep reading).

My problem with tracking -stable is the relative volatility. If I'm 
trying to debug a problem it's not always easy or possible to keep 
consistent/known versions of things. With -release I know exactly what 
I'm getting and it cuts out a lot of variables.


> just recently (~5 days ago)
> MFC'd an Illumos ZFS feature solely to help debug/troubleshoot this
> exact type of situation: introduction of the ZFS deadman thread.

Yes, I already discovered this from various Solaris threads I encountered.


> The purpose of this feature (enabled by default) is to induce a kernel
> panic when ZFS I/O stalls/hangs

This doesn't really help my situation though. If I wanted a panic I'd 
just set failmode=panic.


> All that's assuming that the issue truly is ZFS waiting for I/O and not
> something else

Well, everything I've read so far indicates that zfs has issues when 
dealing with un-writable pools, so I assume that's what's going on here.

______________________________________
it has a certain smooth-brained appeal


