Date:      Mon, 4 Jun 2012 17:54:03 -0500
From:      Dustin Wenz <dustinwenz@ebureau.com>
To:        freebsd-fs@freebsd.org
Subject:   Can mps drop a failing device from bus?
Message-ID:  <5532CFFB-F943-4D9E-9722-7FB9C8A9F82A@ebureau.com>

I asked this question back in April on the stable list with no response
(http://lists.freebsd.org/pipermail/freebsd-stable/2012-April/067305.html).
I've now been seeing the same behavior on 9.0-release, and I thought it
would be good to ask again here.

There is a failure mode for SATA disks (Seagate Barracuda ST3000DM001
disks, in this case) that the mps driver doesn't handle very well. If a
disk is slow to respond, or is unresponsive altogether, I'd like it to
be removed from the bus so that the zpool it belongs to becomes degraded
rather than blocked.

The way things are now, mps just reports a lot of "SCSI command timeout
on device" messages. Any I/O on the affected zpools hangs for an
excessive amount of time (sometimes forever). We typically configure our
storage volumes as a pool of mirrors, with the expectation that
availability will be maintained if any redundant disk(s) should fail.
Unfortunately, availability is actually made *worse* on highly-redundant
mirrors when mps won't give up on an unresponsive device.

It's possible that I'm overlooking an obvious solution, or some relevant
configuration options for the driver. Can anyone offer some insight on
this?

Thanks,

	- .Dustin



