Date:      Tue, 1 Nov 2011 13:32:01 -0700
From:      Jason Wolfe <nitroboost@gmail.com>
To:        Peter Maloney <peter.maloney@brockmann-consult.de>
Cc:        freebsd-scsi@freebsd.org
Subject:   Re: mps/LSI SAS2008 controller crashes when smartctl is run with upped disk tags
Message-ID:  <CAAAm0r1T1ifTQt5A5O%2BjwUoKoGjzcbho606wCt4SpM3AQ-WM3Q@mail.gmail.com>
In-Reply-To: <4EAEF431.7090108@brockmann-consult.de>
References:  <CAAAm0r2-pXLEZVoG7g_dkym6MzLJXggjOQh3a8t5QO90vPJvfw@mail.gmail.com> <4EAEF431.7090108@brockmann-consult.de>

On Mon, Oct 31, 2011 at 12:17 PM, Peter Maloney <
peter.maloney@brockmann-consult.de> wrote:

> Dear Jason,
>
> I get a similar problem on a system with an LSI 9211-8i with 20 SATA
> disks attached (2 SSDs and 18 spinning disks). My system doesn't hang,
> panic, or reset though. I just lose access to one disk, which is then
> considered FAULTED in my zpool status (with the ZFS file system). If I
> physically remove the FAULTED disk and run "gpart recover da0", I get a
> panic. Otherwise, the system keeps running in a degraded state.  When I
> reboot and resilver, some data is found damaged and repaired, not just
> refreshed with the latest state. The server has 1 HBA and 2 backplanes,
> and I have the 2 mirrored root disks on different backplanes. Maybe that
> is why mine runs degraded and yours hangs.
>
> This happened twice so far (in around a month or two), and both times it
> was one of the mirrored root disks (SSDs) that faulted.
>
> My tags are set to 255. I will try reproducing it as you said, and then
> if it fails, rebooting and trying again setting tags to 2 as you suggested.
>
> And *thank you very much for this information*. This is the last
> outstanding issue with this server. I hope this workaround helps.
>
> # camcontrol tags /dev/da0
> (pass0:mps0:0:7:0): device openings: 255
>

Peter,

Does this happen 'randomly' for you, or do you have some automated process
running smartctl that occasionally trips the drives up? My current workaround
is to move /usr/local/sbin/smartctl elsewhere and replace it with a wrapper
that drops the tags to 1, runs the relocated smartctl with the options it was
passed, and then sets the tags back to whatever you prefer. There will
obviously be a small performance hit while the tags are lowered, but it
should be brief and hopefully not even noticeable in your case.
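
A minimal sketch of such a wrapper, in case it helps. It is not my exact
script; the relocated path in REAL_SMARTCTL, the function names, and the
assumption that the device is always the last argument (as in
"smartctl -a /dev/da0") are placeholders to adjust. It parses the
"device openings: N" line that camcontrol prints, as in your output above.

```shell
#!/bin/sh
# Hypothetical smartctl wrapper: lower the tags, run the real smartctl, restore.
# REAL_SMARTCTL is an assumed location for the relocated binary; adjust it.
REAL_SMARTCTL="${REAL_SMARTCTL:-/usr/local/libexec/smartctl.real}"
CAMCONTROL="${CAMCONTROL:-camcontrol}"
LOW_TAGS="${LOW_TAGS:-1}"

# Print the current tag count, parsed from the "device openings: N" line.
current_tags() {
    "$CAMCONTROL" tags "$1" |
        sed -n 's/.*device openings: *\([0-9][0-9]*\).*/\1/p'
}

smartctl_low_tags() {
    # Assumption: the device node is the last argument.
    dev=""
    for arg in "$@"; do dev="$arg"; done

    # No device argument (e.g. "smartctl --scan"): nothing to lower.
    case "$dev" in
        /dev/*) ;;
        *) "$REAL_SMARTCTL" "$@"; return $? ;;
    esac

    saved=$(current_tags "$dev")
    if [ -z "$saved" ]; then
        # Could not read the tag count; run smartctl unchanged.
        "$REAL_SMARTCTL" "$@"
        return $?
    fi

    # Refuse to run smartctl if the tags could not be lowered.
    "$CAMCONTROL" tags "$dev" -N "$LOW_TAGS" >/dev/null || return 1

    "$REAL_SMARTCTL" "$@"
    smart_rc=$?

    # Restore the previous tag count and pass on smartctl's exit status.
    "$CAMCONTROL" tags "$dev" -N "$saved" >/dev/null
    return "$smart_rc"
}
```

To use it as a drop-in replacement, the installed script would end with a
line calling smartctl_low_tags "$@". Restoring the saved value rather than a
hard-coded one means the wrapper does not need to know your preferred tag
count.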

If smartctl is not triggering these events for you, any idea what is?

Jason


