Date:      Sat, 12 Dec 2015 08:41:00 -0700
From:      Alan Somers <asomers@freebsd.org>
To:        Mike Geiger <mykel@mware.ca>
Cc:        FreeBSD-scsi <freebsd-scsi@freebsd.org>
Subject:   Re: Informal(?) sesX messages
Message-ID:  <CAOtMX2ik4Owisg=Bmx6nGAbWj0_6GQJmGcejix7jW5zzx%2B-Xeg@mail.gmail.com>
In-Reply-To: <566BC34D.2020404@mware.ca>
References:  <566B4F68.2040807@mWare.ca> <CAOtMX2ibBUkS58EXfTc=Aznf_oc%2B_y4fC1xNAo=1F-yNSmTwSA@mail.gmail.com> <566B8E2A.8070404@mWare.ca> <CAOtMX2jQUQqDuW21grACVvYzdNcREdtMB55=2YR8TZ9V22FGqg@mail.gmail.com> <566BC34D.2020404@mware.ca>

On Fri, Dec 11, 2015 at 11:48 PM,  <mykel@mware.ca> wrote:
> On 2015/12/11 22:55, Alan Somers wrote:
>>
>> On Fri, Dec 11, 2015 at 8:02 PM,  <Mykel@mware.ca> wrote:
>>>
>>> On 15-12-11 17:44, Alan Somers wrote:
>>>>
>>>> On Fri, Dec 11, 2015 at 3:34 PM,  <Mykel@mware.ca> wrote:
>>>>>
>>>>> Hi all, please CC me on reply as I'm not subscribed to this list.
>>>>>
>>>>> I've got one of those Supermicro 72-drive monster machines, all ZFS'd up.
>>>>>
>>>>> https://www.supermicro.com/products/system/4u/6048/SSG-6048R-E1CR72L.cfm
>>>>>
>>>>> And before & after replacing a faulty SAS Expander and a pair of cables
>>>>> (gobs of WRITE/ABORT errors), I'm still occasionally seeing these kernel
>>>>> messages (in groups), and I'm not sure if they're benign, or pointing to
>>>>> a SAS expander event... or what. I admit, this is my first time dealing
>>>>> with a machine with SAS expanders, so I'm a bit out of my depth in
>>>>> diagnosis thereof.
>>>>>
>>>>> Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: Element descriptor: 'Slot00'
>>>>> Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: SAS Device Slot Element: 1 Phys at Slot 0
>>>>> Dec 11 16:06:54 ZFS-AF kernel: ses5:  phy 0: SAS device type 1 id 0
>>>>> Dec 11 16:06:54 ZFS-AF kernel: ses5:  phy 0: protocols: Initiator( None ) Target( SSP )
>>>>> Dec 11 16:06:54 ZFS-AF kernel: ses5:  phy 0: parent 500304801ea2df3f addr 5000c500844bd449
>>>>>
>>>> These look like device arrival notifications.  If you scroll up, do
>>>> you see any departure notifications?  They should look like this:
>>>>
>>>> mps0: mpssas_prepare_remove: Sending reset for target ID 10
>>>> da0 at mps0 bus 0 scbus0 target 10 lun 0
>>>> da0: <ATA Hitachi HUA72201 A39C> s/n       JPW930HQ15H26H detached
>>>> mps0: Unfreezing devq for target ID 10
>>>> xpt_release_devq(): requested 1 > present 0
>>>> (da0:mps0:0:10:0): Periph destroyed
>>>>
>>>> Also, could you post your HBA and expander firmware versions?
>>>>
>>>> -Alan
>>>
>>>
>>> I can say, without doubt, that I do NOT have any preceding detachments...
>>> which is why I'm so baffled by the messages. If the devices aren't
>>> de/reattaching, what's the point of these informal/benign ones? I am
>>> familiar with them from other hot-swap and disk failure scenarios in
>>> other machines.
>>>
>>> Could this be a driver bug not logging the disconnection? But when I
>>> hot-unplugged them, I do see that in dmesg.
>>> Or does SAS do something where it might renegotiate or reconfigure the
>>> lanes, and I'm just seeing it do that?
>>>
>>> Thanks,
>>>
>>> Myke
>>>
>>>
>>> dev.mpr.0.driver_version: 09.255.01.00-fbsd
>>> dev.mpr.0.firmware_version: 06.00.00.00
>>> dev.mpr.1.driver_version: 09.255.01.00-fbsd
>>> dev.mpr.1.firmware_version: 08.00.00.00
>>> dev.mpr.2.driver_version: 09.255.01.00-fbsd
>>> dev.mpr.2.firmware_version: 08.00.00.00
>>>
>>> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses0
>>>   00     0d 00 05 02 33 00 40 02  4c 53 49 20 20 20 20 20 ....3.@.LSI
>>>   10     53 41 53 33 78 34 38 20  20 20 20 20 20 20 20 20 SAS3x48
>>>   20     30 37 30 31 78 34 38 2d  36 36 2e 37 2e 31 2e 31 0701x48-66.7.1.1
>>>   30     37 00 20 20 20 20 20 20 7.
>>> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses1
>>>   00     0d 00 05 02 33 00 40 02  4c 53 49 20 20 20 20 20 ....3.@.LSI
>>>   10     53 41 53 33 78 33 36 20  20 20 20 20 20 20 20 20 SAS3x36
>>>   20     30 37 30 31 78 33 36 2d  36 36 2e 37 2e 31 2e 31 0701x36-66.7.1.1
>>>   30     37 00 20 20 20 20 20 20 7.
>>> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses2
>>> SCSI INQUIRY failed on ses2, res=-1
>>> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses3
>>> SCSI INQUIRY failed on ses3, res=-1
>>> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses4
>>>   00     0d 00 05 02 33 00 40 02  4c 53 49 20 20 20 20 20 ....3.@.LSI
>>>   10     53 41 53 33 78 32 38 20  20 20 20 20 20 20 20 20 SAS3x28
>>>   20     30 37 30 31 78 32 38 2d  36 36 2e 37 2e 31 2e 31 0701x28-66.7.1.1
>>>   30     37 00 20 20 20 20 20 20 7.
>>> [root@ZFS-AF ~]# sg_inq --hex --len=64 ses5
>>>   00     0d 00 05 02 33 00 40 02  4c 53 49 20 20 20 20 20 ....3.@.LSI
>>>   10     53 41 53 33 78 34 38 20  20 20 20 20 20 20 20 20 SAS3x48
>>>   20     30 37 30 31 78 34 38 2d  36 36 2e 37 2e 31 2e 31 0701x48-66.7.1.1
>>>   30     37 00 20 20 20 20 20 20 7.
>>> [root@ZFS-AF ~]#
>>>
>>>
>>> And here's dmesg after fresh reboot:
>>
>> Well, that's weird.  Your firmware versions look OK, though you might
>> want to upgrade mpr0 just to be consistent.  The next thing I would
>> check, if I were you, would be devctl messages.  Edit /etc/syslog.conf
>> and change devd's loglevel to INFO, then HUP syslogd.  Now every
>> devctl message should get logged in /var/log/devd.log.  That will tell
>> you more precisely than dmesg whether there are any arrival or
>> departure events.
>>
>> -Alan
>
> Huh, I never noticed the 6 vs. 8; curiously, mpr0 and mpr1 are the two
> connected to the front expander... and where I've never seen an issue. Tho
> perhaps I scrambled which cards are serving which in my testing - I also
> moved mpr2 to sit on the other CPU's PCI bus.
>
> I've added the devd log, although I haven't been able to trigger the event
> yet anyway.
> Tried to assert hw.mpr.2.debug_level, however it seems like hw.mpr doesn't
> exist.

hw.mpr.debug_level is a tunable which, if set at boot time, will
affect all mpr cards.  What you want is dev.mpr.2.debug_level, a
runtime-controllable sysctl.
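
For illustration, here is a minimal sketch of the difference.  It
assumes the debug bit values listed in mpr(4) (Info 0x1, Fault 0x2,
Event 0x4); those values and the choice of mask are my assumptions, so
check the man page on your release before relying on them.

```shell
# Assumed debug bits from mpr(4); verify against your release's man page.
MPR_INFO=0x1
MPR_FAULT=0x2
MPR_EVENT=0x4

# Combine the bits into a single mask: info + fault + event.
level=$(printf '0x%x' $(( $MPR_INFO | $MPR_FAULT | $MPR_EVENT )))

# Runtime, per-controller sysctl (only attempted on FreeBSD; needs root).
if [ "$(uname -s)" = "FreeBSD" ]; then
    sysctl dev.mpr.2.debug_level             # show the current value
    sysctl dev.mpr.2.debug_level="$level"    # change it for mpr2 only
fi

# Boot-time tunable affecting all mpr cards: print the line that would
# go in /boot/loader.conf rather than editing the file here.
echo "hw.mpr.debug_level=\"$level\""
```

The sysctl takes effect immediately and only on the one controller; the
loader.conf line takes effect at the next boot and applies to every mpr
card.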

>
> Finally, I haven't the slightest clue how to update the firmware; the Avago
> site only has a product brochure for the 3008 anyway :(

It's fairly annoying.  First, you must figure out which card you have.
3008 is the name of your chip.  Your card is probably a 9300-8i.  If
so, go to the URL below and click on firmware.  If you download
"Installer_P10_for_UEFI" then you can install it through the EFI
shell.  But they also have an installer that runs in FreeBSD.  To use
that, download both "Installer_P10_for_FreeBSD" AND
"Installer_P10_for_MSDOS_and_Windows".  Unzip the latter and extract
the .bin file.  Then unzip the former and run the executable contained
within, providing the path to the .bin file obtained from the latter.
You'll need to reboot afterwards.

http://www.avagotech.com/products/server-storage/host-bus-adapters/sas-9300-8i#downloads

-Alan


