From owner-freebsd-scsi@FreeBSD.ORG  Tue Nov  1 20:32:04 2011
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 76867106566B
	for <freebsd-scsi@freebsd.org>; Tue,  1 Nov 2011 20:32:04 +0000 (UTC)
	(envelope-from nitroboost@gmail.com)
Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com
	[209.85.214.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 004EC8FC16
	for <freebsd-scsi@freebsd.org>; Tue,  1 Nov 2011 20:32:03 +0000 (UTC)
Received: by bkbzs2 with SMTP id zs2so5225899bkb.13
	for <freebsd-scsi@freebsd.org>; Tue, 01 Nov 2011 13:32:02 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type;
	bh=+Fn/8sZCdbJlV92LzMXlrggIA4HOQJixyp4tB0xttk8=;
	b=YJSajKwityjlkE+LWXxvjnx90odRVY0+VAOF5/nzH8MS2SIe6pqAzX4xcAhEm/xpOB
	hEXVsJJHVCgv/4CV+Vsrs/jH7EB9OHmHwjdcdur9tt0Y8HTH9Yzx34Bed6sAAHYSgI8v
	OhKwBlFyocOQIW6nDVF3D4PaatHOgMuEesiTc=
MIME-Version: 1.0
Received: by 10.182.74.41 with SMTP id q9mr257137obv.28.1320179522178; Tue, 01
	Nov 2011 13:32:02 -0700 (PDT)
Received: by 10.182.35.193 with HTTP; Tue, 1 Nov 2011 13:32:01 -0700 (PDT)
In-Reply-To: <4EAEF431.7090108@brockmann-consult.de>
References: <CAAAm0r2-pXLEZVoG7g_dkym6MzLJXggjOQh3a8t5QO90vPJvfw@mail.gmail.com>
	<4EAEF431.7090108@brockmann-consult.de>
Date: Tue, 1 Nov 2011 13:32:01 -0700
Message-ID: <CAAAm0r1T1ifTQt5A5O+jwUoKoGjzcbho606wCt4SpM3AQ-WM3Q@mail.gmail.com>
From: Jason Wolfe <nitroboost@gmail.com>
To: Peter Maloney <peter.maloney@brockmann-consult.de>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: freebsd-scsi@freebsd.org
Subject: Re: mps/LSI SAS2008 controller crashes when smartctl is run with
 upped disk tags
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
	<mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
	<mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 01 Nov 2011 20:32:04 -0000

On Mon, Oct 31, 2011 at 12:17 PM, Peter Maloney <
peter.maloney@brockmann-consult.de> wrote:

> Dear Jason,
>
> I get a similar problem on a system with an LSI 9211-8i with 20 SATA
> disks attached (2 SSDs and 18 spinning disks). My system doesn't hang,
> panic, or reset though. I just lose access to one disk, which is then
> considered FAULTED in my zpool status (with the ZFS file system). If I
> physically remove the FAULTED disk and run "gpart recover da0", I get a
> panic. Otherwise, the system keeps running in a degraded state.  When I
> reboot and resilver, some data is found damaged and repaired, not just
> refreshed with the latest state. The server has 1 HBA and 2 backplanes,
> and I have the 2 mirrored root disks on different backplanes. Maybe that
> is why mine runs degraded and yours hangs.
>
> This happened twice so far (in around a month or two), and both times it
> was one of the mirrored root disks (SSDs) that faulted.
>
> My tags are set to 255. I will try reproducing it as you said, and then
> if it fails, rebooting and trying again setting tags to 2 as you suggested.
>
> And *thank you very much for this information*. This is the last
> outstanding issue with this server. I hope this workaround helps.
>
> # camcontrol tags /dev/da0
> (pass0:mps0:0:7:0): device openings: 255
>

Peter,

Does this happen 'randomly' for you, or do you have some automated process
running smartctl that occasionally trips the drives up? My current
workaround is to move /usr/local/sbin/smartctl elsewhere and replace it
with a wrapper that drops the tags to 1, runs the relocated smartctl with
the options it was passed, then sets the tags back to whatever you prefer.
Running with 1 tag will obviously cost some performance while smartctl is
active, but the run should be fairly quick and hopefully not even
noticeable in your case.
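
A minimal sketch of such a wrapper, under stated assumptions rather than
the exact script in use: the relocated binary path
(/usr/local/libexec/smartctl.real), the variable names, and the fallback
of 255 tags are all hypothetical. It also assumes the device is the last
argument to smartctl and that "camcontrol tags <dev>" prints a
"device openings: N" line like the one quoted above.

```sh
#!/bin/sh
# Hypothetical wrapper installed in place of /usr/local/sbin/smartctl.
# All three values below are assumptions; adjust them to your system.
REAL_SMARTCTL="${REAL_SMARTCTL:-/usr/local/libexec/smartctl.real}"
CAMCONTROL="${CAMCONTROL:-/sbin/camcontrol}"
FALLBACK_TAGS="${FALLBACK_TAGS:-255}"

# Print the last argument; smartctl expects the device there.
last_arg() {
    last=""
    for arg in "$@"; do last=$arg; done
    printf '%s\n' "$last"
}

# Extract N from a "device openings: N" line printed by camcontrol.
current_tags() {
    "$CAMCONTROL" tags "$1" |
        sed -n 's/.*device openings: *\([0-9][0-9]*\).*/\1/p'
}

# Drop the device to 1 tag, run the real smartctl, restore the old value.
smartctl_low_tags() {
    dev=$(last_arg "$@")
    rc=0
    case "$dev" in
        /dev/*) ;;
        *)  # No device argument (e.g. --version): pass straight through.
            "$REAL_SMARTCTL" "$@" || rc=$?
            return "$rc" ;;
    esac
    old=$(current_tags "$dev")
    [ -n "$old" ] || old=$FALLBACK_TAGS
    "$CAMCONTROL" tags "$dev" -N 1 >/dev/null
    "$REAL_SMARTCTL" "$@" || rc=$?
    "$CAMCONTROL" tags "$dev" -N "$old" >/dev/null
    return "$rc"
}

# Run only when given arguments, so the functions can also be sourced.
if [ "$#" -gt 0 ]; then
    smartctl_low_tags "$@"
    exit $?
fi
```

Invoked as "smartctl -a /dev/da0", this would set da0 to 1 tag, run the
relocated binary with the same options, and then put the previous tag
count back while preserving smartctl's exit status. It does not guard
against being interrupted between the two camcontrol calls.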

If smartctl is not triggering these events for you, any idea what is?

Jason