Date:      Mon, 28 Sep 2015 11:44:06 -0400
From:      Rich <rincebrain@gmail.com>
To:        Graham Allan <allan@physics.umn.edu>
Cc:        Karli Sjöberg <karli.sjoberg@slu.se>,  "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: Cannot replace broken hard drive with LSI HBA
Message-ID:  <CAOeNLur0qu=Qku8O47MYcQkenxa9Xv385cTF5vgAoYfLXHUFog@mail.gmail.com>
In-Reply-To: <5609578E.1050606@physics.umn.edu>
References:  <1443447383.5271.66.camel@data-b104.adm.slu.se> <5609578E.1050606@physics.umn.edu>

On Mon, Sep 28, 2015 at 11:06 AM, Graham Allan <allan@physics.umn.edu> wrote:
> I have seen this and keep experiencing it. I posted a question about it a
> while back but I don't think there was much response.
>
> https://lists.freebsd.org/pipermail/freebsd-fs/2014-July/019715.html
>
> My original question was with 9.1, and at the time we discovered that if you
> ran the LSI utility "sas2ircu", for example simply "sas2ircu 0 DISPLAY", it
> would seem to hang for a while, then issue a bus reset, and the replaced drives
> would be detected.
>
> Now that I also see the same issue on 9.3, running sas2ircu in this
> situation usually seems to cause a panic, so it's not exactly progress.
>
> https://lists.freebsd.org/pipermail/freebsd-scsi/2015-August/006794.html

Neat. In theory, the driver architecture is supposed to try resetting
increasingly large parts of the tree, I believe (e.g. a drive reset,
then the parent of that drive in the topology, then the parent of
that, and so on, until it ultimately should try resetting the
controller...)
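
To make the escalation idea concrete, here is a minimal sketch in Python. It is not the actual driver or CAM code; the `Node` class, the `escalate_reset` function, and the three-level topology are all hypothetical names made up for illustration. It only shows the "reset the device, then its parent, then that parent's parent" walk described above.

```python
from typing import Callable, List, Optional


class Node:
    """One level of a hypothetical SAS topology (drive, expander, controller)."""

    def __init__(self, name: str, parent: Optional["Node"] = None) -> None:
        self.name = name
        self.parent = parent


def escalate_reset(start: Node, try_reset: Callable[[Node], bool]) -> List[str]:
    """Reset `start`, then each ancestor in turn, stopping at the first success.

    `try_reset` is a stand-in for whatever a real driver would do at each
    level; it returns True when the reset cleared the fault.
    Returns the names of the nodes that were attempted, in order.
    """
    attempted: List[str] = []
    node: Optional[Node] = start
    while node is not None:
        attempted.append(node.name)
        if try_reset(node):
            break
        node = node.parent  # widen the reset to the next level up
    return attempted


# Hypothetical topology: drive -> expander -> controller.
controller = Node("controller")
expander = Node("expander", parent=controller)
drive = Node("drive", parent=expander)

# Pretend only a controller-level reset clears the fault.
print(escalate_reset(drive, lambda n: n.name == "controller"))
```

If the walk worked as intended in practice, the bus reset that sas2ircu triggers would presumably be reached without outside help, which is why the behaviour reported here looks like the escalation stalling somewhere below the controller.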

Does using sas2ircu to induce a bus reset still work with P19 firmware
on 9.1, or older firmware on 9.3?

Does anything in the atacontrol/camcontrol family of commands do
anything useful for you?
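
As a rough sketch of what I mean, the following Python wrapper builds the `camcontrol rescan` and `camcontrol reset` invocations and, by default, only prints them. The helper names `camcontrol_cmd` and `run` and the dry-run default are my own assumptions for illustration, and whether either subcommand actually brings the replaced drives back with this HBA and firmware is exactly the open question.

```python
import subprocess
from typing import List


def camcontrol_cmd(action: str, target: str = "all") -> List[str]:
    """Build a camcontrol command line for a rescan or a reset.

    `target` is either "all" or a bus[:target:lun] string such as "0".
    """
    if action not in ("rescan", "reset"):
        raise ValueError(f"unsupported action: {action}")
    return ["camcontrol", action, target]


def run(cmd: List[str], dry_run: bool = True) -> int:
    """Print the command; only execute it when dry_run is False."""
    print(" ".join(cmd))
    if dry_run:
        return 0
    # Needs root on a FreeBSD host; the exit status is returned unchanged.
    return subprocess.run(cmd, check=False).returncode


# Dry run: show what would be tried, gentlest first.
run(camcontrol_cmd("rescan"))      # rescan every bus
run(camcontrol_cmd("reset", "0"))  # reset bus 0 only
run(camcontrol_cmd("rescan"))      # rescan again after the reset
```

Running `camcontrol devlist -v` before and after would show whether the replaced drive appeared.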

> I am using Dell servers, generally R710 and R720, with LSI 9207-8e
> controllers, Supermicro JBOD chassis, and mostly WD drives. I got the above
> problems using firmware 16 (probably) with both 9.1 and 9.3.
>
> Regarding your experience with firmware 20, I believe it is "known bad",
> though some seem to disagree. Certainly when building my recent-ish large
> 9.3 servers I specifically tested it and got consistent data corruption.
> There is now a newer release of firmware 20, "20.00.04.00", which seems to
> be fixed - see this thread:
>
> https://lists.freebsd.org/pipermail/freebsd-scsi/2015-August/006793.html
>
> This is kind of painful as the new firmware was posted by LSI with no
> comment and no release notes, yet if you follow all the references there are
> hints that it was known internally to be problematic. It's bad if selecting
> the HBA firmware for FreeBSD has degenerated to a "black art" but that seems
> to be where it is right now.

It's kind of impressive; I have all of the P20 firmware files
downloaded for the old and new releases, and none of the "new"
firmware files include a changelog newer than the one in the
"original" P20 releases.

> I don't know that there are any other viable choices for SAS HBA besides LSI
> - I've never heard of any.

Only vendors I ever knew of were Areca and Adaptec - the former uses
LSI chips with some additional sauce in their SAS cards these days,
and the latter got bought by PMC and seems to be badly playing
catch-up with LSI (e.g. they have no 12 Gb/s HBAs at all,
currently...)

Marvell made SATA-only cards, but I haven't seen any large-scale
versions of those since PCI-X, and that doesn't help if you have
anything but a pure SATA topology.

- Rich


