From: Quartz <quartz@sneakertech.com>
To: Tom Evans
Cc: FreeBSD FS <freebsd-fs@freebsd.org>
Date: Tue, 09 Apr 2013 12:19:29 -0400
Subject: Re: ZFS: Failed pool causes system to hang
Message-ID: <51643F91.30704@sneakertech.com>
List-Id: Filesystems <freebsd-fs@freebsd.org>

> Sorry, but you've not tested this. Your root is hanging off a
> different controller to the others, but it is still using the same
> ahci/cam stack. Is ahci/cam getting wedged, causing your root to get
> wedged - irrespective of running on a different controller - or is ZFS
> causing a deadlock.

If I simulate failures by yanking the SATA cable to various drives in the pool, I can disconnect any two (raidz2) at random and everything hums along just fine.
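(For what it's worth, the same degraded state can be provoked without pulling cables; a rough sketch below, where the pool name "tank" and the daN device names are just placeholders for whatever your setup uses:)

```
# Take two devices offline -- a raidz2 pool should go DEGRADED but stay usable.
zpool offline tank da1
zpool offline tank da2
zpool status tank

# Bring them back; the pool resilvers.
zpool online tank da1 da2

# Offlining a third device simultaneously exceeds raidz2's redundancy,
# which is the point where the hang shows up.
```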
Status tells me the pool is degraded, and if I reconnect them I can resilver and whatnot with no problems. However, it's when I have three drives yanked simultaneously that everything goes to shit. I don't know the ahci/cam stack from a hole in the wall, but it seems to me that if it can gracefully handle two drives dropping out and coming back at random, it ought to be able to handle three. I suppose it's possible that ZFS itself is not the root cause of the problem, but one way or another there's some kind of interaction here, as I only see the hang when the pool is no longer solvent.

______________________________________
it has a certain smooth-brained appeal