Date:      Mon, 28 Sep 2015 10:27:08 -0400
From:      Rich <rincebrain@gmail.com>
To:        Karli Sjöberg <karli.sjoberg@slu.se>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: Cannot replace broken hard drive with LSI HBA
Message-ID:  <CAOeNLuqejHVf5-m4J3zB5NpeKzQt6EdE=BV5hVFWzQKOcj_VZQ@mail.gmail.com>
In-Reply-To: <1443447383.5271.66.camel@data-b104.adm.slu.se>
References:  <1443447383.5271.66.camel@data-b104.adm.slu.se>

Hi Karli,
Which mps-supported HBA? Your firmware version indicates it's
something in the 92xx family, but there are a number of variants of
that flavor.

Have you played with any of the drive timeout settings in the HBA
firmware/OS/drives themselves (the dark vendor-specific magic known
variously as TLER, CCTL, ERC...)?

What models are the servers?

There are a number of possible complicating factors here - whether the
drives are SAS or SATA (and any "quirks" of the drives), whether the
backplanes are passive or have SAS expanders, what version of SAS/SATA
these backplanes are capable of handling, any firmware strangeness on
the backplane, passive or otherwise...

How does the machine misbehave once you re-insert the drive?

Does the machine misbehave if you keep the drive removed?

One final quirk I'll mention is that a number of SAS expander
backplanes I've encountered sometimes will not notice a drive has been
physically pulled until a new drive is inserted, and sometimes the
best way to convince them to see a drive after pulling one that was
misbehaving is:
- seat a "new" (not otherwise in the machine) drive
- unseat said drive after a few seconds
- seat whatever drive you intended to seat in the first place, be it
"new" or the original drive
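
To check whether the OS actually noticed each step of the sequence above, one option is to compare `camcontrol devlist` output before and after a reseat. The following is only a rough sketch: it assumes the usual devlist line layout (`<vendor model rev>  at scbusN target N lun N (passN,daN)`), and the function names `parse_devlist`, `diff_devlists` and `snapshot` are hypothetical helpers invented for this illustration, not part of any FreeBSD tool.

```python
import re
import subprocess

# Matches the trailing "(pass3,da3)" or "(da3,pass3)" group on a devlist line.
# Assumption: every device line ends with such a parenthesised group.
_PERIPHS = re.compile(r"\(([^()]*)\)\s*$")


def parse_devlist(text):
    """Return {disk name: description} for da/ada disks found in devlist text."""
    disks = {}
    for line in text.splitlines():
        match = _PERIPHS.search(line)
        if not match:
            continue
        # The description is whatever sits between "<" and the first ">".
        desc = line.split(">", 1)[0].lstrip(" <")
        for name in match.group(1).split(","):
            name = name.strip()
            if re.fullmatch(r"a?da\d+", name):
                disks[name] = desc
    return disks


def diff_devlists(before, after):
    """Return (appeared, vanished) disk names between two devlist snapshots."""
    old, new = parse_devlist(before), parse_devlist(after)
    return sorted(set(new) - set(old)), sorted(set(old) - set(new))


def snapshot():
    """Capture the current device list (FreeBSD only, typically needs root)."""
    result = subprocess.run(
        ["camcontrol", "devlist"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Take one `snapshot()` before pulling the drive and another after seating the replacement, then pass both to `diff_devlists()`. If nothing appears, `camcontrol rescan all` may be worth a try before resorting to a reboot, though with the symptoms described it may not help.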

Good luck,

- Rich

On Mon, Sep 28, 2015 at 9:36 AM, Karli Sjöberg <karli.sjoberg@slu.se> wrote:
> Hey all!
>
> I'm just giving a shout out here to see if anyone else has had similar
> experiences working with LSI/Avago HBAs in FreeBSD.
>
> For some time now, about a year or so, we've had several times where hard
> drives have dropped out, you pull it out, pop a new one back in, but it
> never shows up in the OS. When inserted, nothing prints in the logs, and
> physically, it just blinks for a half a second, then nothing. The entire
> server then needs to be rebooted to get the drive back.
>
> As for the hardware, we have several SuperMicro servers, an HP, and an
> old SUN server that all have this problem. It's happened with both old
> and new drives from different manufacturers and sizes. The only thing in
> common has been the LSI/Avago HBA.
>
> The software is FreeBSD-10.1-STABLE as per this[*] bug, very close to
> 10.2-RELEASE, mps driver version 20 and the firmware has been flashed to
> 19. Also tried firmware version 20 but ZFS went nuts, displaying
> checksum errors on just about every disk in the pool.
>
> It's gotten to the point I'm fed up and have to ask if someone else
> could think of a fix, since neither software nor firmware upgrade seems
> to make a difference. Or to suggest another HBA instead?
>
> Thanks in advance!
>
> /K
>
> [*]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191348
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"


