Date: Fri, 12 Sep 2008 20:31:18 +0100
From: Karl Pielorz <kpielorz_lst@tdx.co.uk>
To: Jeremy Chadwick
Cc: freebsd-hackers@freebsd.org
Subject: Re: ZFS w/failing drives - any equivalent of Solaris FMA?

--On 12 September 2008 09:04 -0700 Jeremy Chadwick wrote:

> I know ATA will notice a detached channel, because I myself have done
> it: administratively, that is -- atacontrol detach ataX. But the only
> time that can happen "automatically" is if the actual controller does
> so itself, or if FreeBSD is told to do it administratively.

I think the problem at the moment is that ZFS "doesn't care" - it's
deliberately kept remote from things like drivers and drives - and at
the moment there's no 'middle layer', or any way for at least the ATA
drivers to communicate to ZFS that a drive 'has failed'. (For starters,
you've got the problem of "what's a failed drive?" - presumably a drive
that's operating outside a set of limits, the first limit probably
being 'is it still attached?' :)

There was a thread recently on the OpenSolaris ZFS forum where this was
discussed at length...
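(For anyone following along, the administrative detach Jeremy mentions
is just atacontrol(8) - the channel number below is only an example:

  # atacontrol list           # show ATA channels and attached devices
  # atacontrol detach ata2    # drop the devices on channel ata2
  # atacontrol attach ata2    # re-probe the channel later

Nothing in ZFS ever asks for this to happen - it's purely something the
admin, or the controller itself, does.)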
> I am also very curious to know the exact brand/model of 8-port SATA
> controller from Supermicro you are using, *especially* if it uses
> ata(4) rather than CAM and da(4).

The controllers ID as:

  Marvell 88SX6081 SATA300 controller

They're Supermicro 8-port PCI-X SATA controllers ('AOC-SAT2-MV8') - and
they definitely show up as 'adX', i.e. the ata(4) driver.

> Such Supermicro controllers were recently discussed on freebsd-stable
> (or was it -hardware?), and no one was able to come to a concise
> decision as to whether or not they were decent or even remotely
> trusted. Supermicro provides a few different SATA HBAs.

Well, I've been testing these cards for a number of months now, and
they seem fine here - at least with the WD drives I'm currently running
(I'm not saying they're 'perfect', but for my setup I've not seen any
issues). I didn't notice any 'bad behaviour' when testing them under
UFS, and running under ZFS they've picked up no checksum errors (or
console messages) for as long as the box has been up.

> I can see the usefulness in Solaris's FMA thing. My big concern is
> whether or not FMA actually pulls the disk off the channel, or if it
> just leaves the disk/channel connected and simply informs kernel
> pieces not to use it. If it pulls the disk off the channel, I have
> serious qualms with it.

I don't think it pulls it - I think it looks at its policies and does
what they say, which by default would seem to be the equivalent of
'zpool offline dev' (which, again, doesn't pull the disk off any
busses - it just tells ZFS not to send any more I/O to that device).

I'll have to do a test using da / CAM-driven disks (or ask someone who
worked on the port ;) - but I'd guess that unless something has been
added to CAM to tell ZFS to offline the disk, it'll behave the same,
i.e. ZFS will continue to issue I/O requests to disks as it needs to.
At least in OpenSolaris, it's deemed *not* to be ZFS's job to detect
failed disks, or to do anything about them other than what it's told.

ZFS under FreeBSD still works despite this (and works wonderfully
well) - it just means that if any of your drives 'go out to lunch',
unless they fail in such a way that I/O requests are returned
immediately as 'failed' (i.e. I guess if the device node has gone), ZFS
will keep issuing I/O requests to the failed drives - and potentially
pausing while waiting on them - because it doesn't know, doesn't care,
and hasn't been told to do otherwise.

-Kp
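P.S. If anyone wants to see the 'zpool offline' behaviour I described,
a quick sketch (the pool and device names here are just examples):

  # zpool offline tank ad4    # tell ZFS to stop issuing I/O to ad4
  # zpool status tank         # ad4 shows OFFLINE, pool shows DEGRADED
  # zpool online tank ad4     # bring it back; ZFS resilvers it

Note the disk stays attached and visible to the OS the whole time -
only ZFS stops using it.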