Date:      Sat, 27 Oct 2012 13:10:36 +0900
From:      Stephane LAPIE <stephane.lapie@darkbsd.org>
To:        freebsd-scsi@freebsd.org
Subject:   LSI mpt(4) driver problem : can't SMART poll, controller freezes
Message-ID:  <508B5EBC.8070509@darkbsd.org>


Hello list,

I have two PCI-X controller cards of the following make:
Oct 24 09:26:00 eirei-no-za kernel: mpt0: <LSILogic SAS/SATA Adapter> port 0x2000-0x20ff mem 0xdfa20000-0xdfa23fff,0xdfa00000-0xdfa0ffff irq 24 at device 1.0 on pci6
Oct 24 09:26:00 eirei-no-za kernel: mpt0: MPI Version=1.5.12.0
Oct 24 09:26:00 eirei-no-za kernel: mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
Oct 24 09:26:00 eirei-no-za kernel: mpt0: 0 Active Volumes (2 Max)
Oct 24 09:26:00 eirei-no-za kernel: mpt0: 0 Hidden Drive Members (10 Max)

Oct 24 09:26:00 eirei-no-za kernel: mpt1: <LSILogic SAS/SATA Adapter> port 0x2400-0x24ff mem 0xdfa24000-0xdfa27fff,0xdfa10000-0xdfa1ffff irq 28 at device 7.0 on pci6
Oct 24 09:26:00 eirei-no-za kernel: mpt1: MPI Version=1.5.12.0
Oct 24 09:26:00 eirei-no-za kernel: mpt1: Capabilities: ( RAID-0 RAID-1E RAID-1 )
Oct 24 09:26:00 eirei-no-za kernel: mpt1: 0 Active Volumes (2 Max)
Oct 24 09:26:00 eirei-no-za kernel: mpt1: 0 Hidden Drive Members (10 Max)

Each of them has 8 ports, used as follows:
<ATA ST32000641AS CC13>            at scbus0 target 0 lun 0 (pass0,da0)
<ATA ST32000542AS CC37>            at scbus0 target 1 lun 0 (pass1,da1)
<ATA ST32000641AS CC13>            at scbus0 target 3 lun 0 (pass2,da2)
<ATA ST32000641AS CC13>            at scbus0 target 4 lun 0 (pass3,da3)
<ATA ST32000542AS CC34>            at scbus0 target 5 lun 0 (pass4,da4)
<ATA ST32000641AS CC13>            at scbus0 target 6 lun 0 (pass5,da5)
<ATA ST32000542AS CC37>            at scbus0 target 7 lun 0 (pass6,da6)

<ATA ST32000641AS CC13>            at scbus2 target 0 lun 0 (pass7,da7)
<ATA ST32000542AS CC34>            at scbus2 target 1 lun 0 (pass8,da8)
<ATA ST32000542AS CC37>            at scbus2 target 2 lun 0 (pass9,da9)
<ATA ST32000542AS CC34>            at scbus2 target 3 lun 0 (pass10,da10)
<ATA ST32000542AS CC34>            at scbus2 target 4 lun 0 (pass11,da11)
<ATA ST32000542AS CC37>            at scbus2 target 5 lun 0 (pass12,da12)
<ATA ST32000542AS CC34>            at scbus2 target 6 lun 0 (pass13,da13)
<ATA ST32000641AS CC13>            at scbus2 target 7 lun 0 (da14,pass14)
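
For anyone who wants to feed a listing like the one above into a
monitoring script, here is a minimal Python sketch that turns each line
into a record. It is only an illustration: the names CamDevice and
parse_devlist are made up for this example, and the regular expression
assumes the exact line layout shown above. Note that the last line
lists its device names in the opposite order (da14,pass14), so the
sketch picks names by prefix rather than by position.

```python
import re
from typing import List, NamedTuple, Optional


class CamDevice(NamedTuple):
    """One line of the device listing (hypothetical record type)."""
    model: str               # text between the angle brackets
    bus: int                 # N in "scbusN"
    target: int
    lun: int
    da: Optional[str]        # disk device name, e.g. "da3"
    passdev: Optional[str]   # passthrough device name, e.g. "pass3"


# Assumes the layout "<model> at scbusN target T lun L (name,name)".
_LINE = re.compile(
    r"<(?P<model>[^>]*)>\s+at\s+scbus(?P<bus>\d+)\s+"
    r"target\s+(?P<target>\d+)\s+lun\s+(?P<lun>\d+)\s+"
    r"\((?P<names>[^)]*)\)"
)


def parse_devlist(text: str) -> List[CamDevice]:
    """Parse a device listing; lines that do not match are skipped."""
    devices = []
    for line in text.splitlines():
        m = _LINE.search(line)
        if m is None:
            continue
        names = [n.strip() for n in m.group("names").split(",")]
        # The two names can appear in either order, so select by prefix.
        da = next((n for n in names if n.startswith("da")), None)
        passdev = next((n for n in names if n.startswith("pass")), None)
        devices.append(CamDevice(
            model=m.group("model").strip(),
            bus=int(m.group("bus")),
            target=int(m.group("target")),
            lun=int(m.group("lun")),
            da=da,
            passdev=passdev,
        ))
    return devices
```

Calling parse_devlist() on the listing text returns one CamDevice per
disk, whose da field can then be prefixed with /dev/ for polling.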


It should also be noted that I have to override the default SCSI delay
to ensure that all devices are properly detected at boot, by putting
the following in /boot/loader.conf:
kern.cam.scsi_delay=15000

I would like to know whether anyone has experienced the following
problems and found a way around them:



1) I can't run any detailed and meaningful SMART polls on disks
attached to these controllers. (Execution logs are in separate files.)

As can be seen, I am running the latest available version of smartctl
from ports:
http://www.yomi.darkbsd.org/~darksoul/eirei-no-za-broken-disk-smart-log.txt

(Using the pass devices gives the same result.)

Only "-d scsi" polling returns any meaningful information at all
(disk serial number etc.), but even that is misleading, as the disk
was actually nearing death.
Here is the full SMART log recovered by running the disk through a
USB->SATA device:
http://www.yomi.darkbsd.org/~darksoul/eirei-no-za-broken-disk-smart-log2.txt

I actually have scripts to monitor this, but they obviously rely on
smartctl being able to do its job, which it can't here...
(Also, this worked perfectly fine under 8-STABLE with "-d sat"...)
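
To make the monitoring dependency concrete, here is a rough Python
sketch of the kind of check such a script might perform. It is not the
actual script referred to above, which is not shown; the function names
(decode_exit_status, smartctl_command, poll) are invented for this
example. It assumes smartctl is in PATH, uses only the "-d", "-H" and
"-A" options, and interprets the exit status as the bitmask documented
in smartctl(8). The fallback order ("sat" first, then "scsi") is an
assumption based on the two device types mentioned above.

```python
import subprocess
from typing import List, Optional, Sequence, Tuple

# Exit status bits as documented in the smartctl(8) manual page.
EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed",
    2: "a SMART command failed or a checksum error occurred",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "attributes were at or below threshold in the past",
    6: "device error log contains errors",
    7: "self-test log contains errors",
}


def decode_exit_status(status: int) -> List[str]:
    """Return the messages for every bit set in a smartctl exit status."""
    return [msg for bit, msg in sorted(EXIT_BITS.items())
            if status & (1 << bit)]


def smartctl_command(device: str, dev_type: str) -> List[str]:
    """Build the argument list for a health (-H) and attribute (-A) poll."""
    return ["smartctl", "-d", dev_type, "-H", "-A", device]


def poll(device: str,
         dev_types: Sequence[str] = ("sat", "scsi"),
         ) -> Optional[Tuple[str, int, str]]:
    """Try each device type in turn.

    Returns (dev_type, exit_status, output) for the first attempt whose
    low three bits are clear, i.e. the command parsed, the device opened
    and no SMART command failed. Returns None if every type fails.
    Raises FileNotFoundError if smartctl is not installed.
    """
    for dev_type in dev_types:
        proc = subprocess.run(smartctl_command(device, dev_type),
                              capture_output=True, text=True)
        if proc.returncode & 0b111 == 0:
            return dev_type, proc.returncode, proc.stdout
    return None
```

With a wrapper like this, a result of None for poll("/dev/da3") would
flag exactly the situation described in 1): no device type produces a
usable answer, so the health of the disk is unknown rather than good.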



2) Less annoying, but still something of a show-stopper for any serious
work requiring high availability:
Any disk I/O freeze ends up locking the whole controller (and the whole
ZFS pool...) until either the server crashes or the disk bails out,
whichever comes first. (The kernel log is in a separate file.)

http://www.yomi.darkbsd.org/~darksoul/eirei-no-za-mpt-timeout.txt


Thanks for your time.

-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo




