Date:      Mon, 19 Mar 2012 11:52:17 -0700
From:      Jason Wolfe <nitroboost@gmail.com>
To:        dgilbert@interlog.com, "Desai, Kashyap" <Kashyap.Desai@lsi.com>,  "McConnell, Stephen" <Stephen.McConnell@lsi.com>
Cc:        "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Subject:   Re: LSI2008 controller clobbers first disk with new LSI mps driver
Message-ID:  <CAAAm0r1JMVvhjiD=u%2BqQvgvu7810DHEqan8hwJmSfmpqRqLcuA@mail.gmail.com>
In-Reply-To: <4F58F9DC.2040606@interlog.com>
References:  <CAAAm0r2NFhF=eh2bOPMnVN8E6e2o0KfaST0N-M_gWoJHpFOLmQ@mail.gmail.com> <CAAAm0r1pWN-F=madGdk7N%2BoRuZmSD5_MAYwLh6By126L0CTGuw@mail.gmail.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D34558@inbmail01.lsi.com> <CAAAm0r1x15_ho2MD0tX7Y7A6mnU2N6zihNOz_Qz=jpsyBkDCWQ@mail.gmail.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D3455B@inbmail01.lsi.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D34626@inbmail01.lsi.com> <CAAAm0r3_S2jTG=Te4UhLqHPqiXq7_aAOHNp=W3jb4KLJx9PTRg@mail.gmail.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D34748@inbmail01.lsi.com> <CAAAm0r3oRTcfipyVcp9nE1CL3dcK7cft8AUSf%2BfGYVK90b2A0w@mail.gmail.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D347BB@inbmail01.lsi.com> <4F450814.4020100@interlog.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D34840@inbmail01.lsi.com> <CAAAm0r2x4wAkPzpJkCJ8FYnFkxb3RrXnENRZi-Kgfh=y3ZuqEA@mail.gmail.com> <4F4C14A8.3050105@interlog.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96E62A30@inbmail01.lsi.com> <4F58F9DC.2040606@interlog.com>

Kashyap/LSI,

Any movement here?  I'm also trying to test some bug fixes to the em
driver in 8.3-PRERELEASE, but am still locked into the old mps driver
from last year because of this issue.  If this is something we need to
take offline I can start up an internal thread if that works.

Thanks again,

Jason

On Thu, Mar 8, 2012 at 11:26 AM, Douglas Gilbert <dgilbert@interlog.com> wrote:
> Kashyap,
> Backing up ... I thought this thread was about the mps
> driver failing to do the SAS discover process properly.
> Hence a disk did not appear because it was "hidden" by a
> SES device which had the same (device?) slot number.
>
> The SAS discover process is described in section 4.2 of
> spl2r04b.pdf (at t10.org). That is the latest draft. Please
> note that section never mentions the word "slot". I would
> hazard a guess that no SAS standard or draft has ever
> mentioned slots in the context of the discover process.
>
> The concept of device "slot" *** numbers comes from the
> SCSI Enclosure Services (SES) standards of which ses3r04.pdf
> is the latest draft. SAS provides the slot number _optionally_
> in the long form SMP DISCOVER response and does _not_ provide
> the device slot number in the SMP DISCOVER LIST short form
> response.
>
> So IMO the device slot number is just a bit of helpful
> information that SAS might provide and that slot
> number should not interfere with the SAS discover
> process.
>
>
> *** the term "slot" is used in the SAS port layer state
>    machine in a different context. It is also possible
>    that "slot" is a term used in LSI firmware.
>
>
> A few data points: I have an Intel RES2SV240 which contains
> an LSI SAS2X24 expander and an HP Expander card which contains
> a PMC Sierra PM8005 SAS-2 expander. Both report a device slot
> number of 255 (i.e. not provided) via their SMP DISCOVER
> responses. When the inbuilt SES device on each expander is
> probed, the LSI part reports device slot numbers 0 through 23
> while the PMC part reports a device slot number of 0 for all
> array devices. In both cases the SES device itself is not
> listed amongst the SES "array device slot" elements.
>
> Doug Gilbert
>
>
>
> On 12-03-07 12:44 PM, Desai, Kashyap wrote:
>>
>> Jason,
>>
>> We discussed this issue with our architect, and he strongly recommends
>> not providing any workaround where the enclosure configuration is not
>> correct. A similar issue was reported by another customer some time back,
>> and they also reconfigured their enclosure to resolve it.
>> "The enclosure configuration needs to be fixed so it advertises enough
>> slots (phys disks + num of SES devices) and it places the SES devices
>> (assigned slot numbers) above the physical disks."
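
The recommendation quoted above can be read as two separate checks. The sketch below only illustrates that reading; the function name and the plain-integer inputs are hypothetical and are not taken from the mps driver or from LSI firmware.

```python
def enclosure_layout_ok(num_slots, disk_slots, ses_slots):
    """Check the two conditions from the quoted recommendation.

    num_slots  -- slot count the enclosure advertises (NumSlots in the event log)
    disk_slots -- slot numbers assigned to physical disks (hypothetical input)
    ses_slots  -- slot numbers assigned to SES devices (hypothetical input)
    """
    # Condition 1: enough slots for the physical disks plus the SES devices.
    enough_slots = num_slots >= len(disk_slots) + len(ses_slots)
    # Condition 2: every SES device sits above every physical disk.
    highest_disk = max(disk_slots, default=-1)
    ses_above_disks = all(slot > highest_disk for slot in ses_slots)
    return enough_slots and ses_above_disks


# Layout suspected in this thread: 12 disks in slots 0-11, SES device also on slot 0.
suspected = enclosure_layout_ok(13, list(range(12)), [0])
# Layout the recommendation asks for: SES device moved up to slot 12.
recommended = enclosure_layout_ok(13, list(range(12)), [12])
print(suspected, recommended)
```

Under this reading the suspected layout fails only the second check: the slot count is sufficient, but the SES device is not placed above the disks.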
>>
>>
>> ~ Kashyap
>>
>>
>>> -----Original Message-----
>>> From: Desai, Kashyap
>>> Sent: Wednesday, February 29, 2012 10:08 PM
>>> To: 'dgilbert@interlog.com'; Jason Wolfe
>>> Cc: freebsd-scsi@freebsd.org; McConnell, Stephen
>>> Subject: RE: LSI2008 controller clobbers first disk with new LSI mps
>>> driver
>>>
>>> Hi Jason,
>>>
>>> I have started a discussion with LSI internal folks to get better clarity
>>> on this issue. Since our key person is on vacation, we may get clarity
>>> on this next week.
>>> I cannot provide a temporary workaround upstream (because this is
>>> against our design), but if you want one for your own environment, I can
>>> provide you with a temporary patch.
>>>
>>> Doug,
>>>
>>> Thanks for providing your view; I have conveyed this to our architect.
>>>
>>> ~ Kashyap
>>>
>>>> -----Original Message-----
>>>> From: Douglas Gilbert [mailto:dgilbert@interlog.com]
>>>> Sent: Tuesday, February 28, 2012 5:11 AM
>>>> To: Jason Wolfe
>>>> Cc: Desai, Kashyap; freebsd-scsi@freebsd.org; McConnell, Stephen
>>>> Subject: Re: LSI2008 controller clobbers first disk with new LSI mps
>>>> driver
>>>>
>>>> On 12-02-27 02:59 PM, Jason Wolfe wrote:
>>>>>
>>>>> On Wed, Feb 22, 2012 at 9:11 AM, Desai, Kashyap <Kashyap.Desai@lsi.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Douglas Gilbert [mailto:dgilbert@interlog.com]
>>>>>>> Sent: Wednesday, February 22, 2012 8:52 PM
>>>>>>> To: Desai, Kashyap
>>>>>>> Cc: Jason Wolfe; freebsd-scsi@freebsd.org; McConnell, Stephen
>>>>>>> Subject: Re: LSI2008 controller clobbers first disk with new LSI mps
>>>>>>> driver
>>>>>>>
>>>>>>> On 12-02-22 03:39 AM, Desai, Kashyap wrote:
>>>>>>>>
>>>>>>>> Here is a possible root cause of this issue.
>>>>>>>>
>>>>>>>> The enclosure you are using in your setup might not be configured
>>>>>>>> properly.
>>>>>>>>
>>>>>>>>
>>>>>>>> You have Enclosure with 12 Slots + 1 SES Device.
>>>>>>>> See below detail from the log.
>>>>>>>>
>>>>>>>>      EventDataLength: 5
>>>>>>>>      AckRequired: 0
>>>>>>>>      Event: SasEnclDeviceStatusChange (0x1d)
>>>>>>>>      EventContext: 0x0
>>>>>>>>      EnclosureHandle: 0x2
>>>>>>>>      ReasonCode: Added
>>>>>>>>      PhysicalPort: 0
>>>>>>>>      NumSlots: 13
>>>>>>>>      StartSlot: 0
>>>>>>>>      PhyBits: 0xff
>>>>>>>>
>>>>>>>> StartSlot is 0 in this case.
>>>>>>>> The correct behavior is for each device in your enclosure to have a
>>>>>>>> different slot number, from 0 through 12.
>>>>>>>>
>>>>>>>> I suspect that the SES device is not configured properly and is using
>>>>>>>> slot-0 by default. This can create an issue for the actual device that
>>>>>>>> is connected to slot-0.
>>>>>>>>
>>>>>>>> So in your setup, slot-0 through slot-11 are assigned to the actual
>>>>>>>> phys of your enclosure, and slot-0 is assigned again to the SES device
>>>>>>>> instead of slot-12.
>>>>>>>
>>>>>>> No. SAS-2 expanders typically have an integral SES device on an
>>>>>>> expander _virtual_ phy (see SMP DISCOVER (LIST) response). Once
>>>>>>> you see that virtual phy flag the slot number is irrelevant.
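
One way to picture how a duplicate slot number could hide a disk is a lookup table keyed by slot. The toy model below is not the mps driver's code, and whether the driver or firmware works this way is an assumption; the dictionary, the `virtual_phy` flag, and the device names are all made up for the example.

```python
def build_slot_table(devices, skip_virtual):
    """Map slot number -> device name; later entries overwrite earlier ones."""
    table = {}
    for name, slot, virtual_phy in devices:
        if skip_virtual and virtual_phy:
            # Treat the slot number as irrelevant for devices on a virtual phy.
            continue
        table[slot] = name
    return table


# (name, reported slot, on a virtual phy?) -- hypothetical values
devices = [("disk0", 0, False), ("disk1", 1, False), ("ses0", 0, True)]

naive = build_slot_table(devices, skip_virtual=False)
careful = build_slot_table(devices, skip_virtual=True)
print(naive[0], careful[0])
```

In the naive table the SES entry replaces the disk that reported slot 0; ignoring the slot number for virtual-phy devices leaves the disk in place.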
>>>>>>
>>>>>>
>>>>>> Doug,
>>>>>>
>>>>>> I need some more info so that I can understand your point better.
>>>>>>
>>>>>> I have one enclosure set up on FreeBSD. Here is the smp_discover output
>>>>>> (smp_discover_list is failing for me):
>>>>>>
>>>>>>
>>>>>>   phy   0: inaccessible (phy vacant)
>>>>>>   phy   1: inaccessible (phy vacant)
>>>>>>   phy   2: inaccessible (phy vacant)
>>>>>>   phy   3: inaccessible (phy vacant)
>>>>>>   phy   4:S:attached:[500605b012345888:03  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy   5:S:attached:[500605b012345888:02  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy   6:S:attached:[500605b012345888:01  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy   7:S:attached:[500605b012345888:00  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy  12:D:attached:[5000c5003bc2c389:00  t(SSP)]  6 Gbps
>>>>>>   phy  13:D:attached:[500000e116ee91e2:00  t(SSP)]  6 Gbps
>>>>>>   phy  14:D:attached:[5000c5003bc308e5:00  t(SSP)]  6 Gbps
>>>>>>   phy  15:D:attached:[5000c5003bc2f0d1:00  t(SSP)]  6 Gbps
>>>>>>   phy  16:D:attached:[5000c5003bc2ff3d:00  t(SSP)]  6 Gbps
>>>>>>   phy  17:D:attached:[5000c5003bae5fdd:00  t(SSP)]  6 Gbps
>>>>>>   phy  18:D:attached:[5000c5003bae5eb1:00  t(SSP)]  6 Gbps
>>>>>>   phy  19:D:attached:[5000c5003bc2d135:00  t(SSP)]  6 Gbps
>>>>>>   phy  20:D:attached:[5000c5003baea36d:00  t(SSP)]  6 Gbps
>>>>>>   phy  21:D:attached:[5000c5003bc2a8c9:00  t(SSP)]  6 Gbps
>>>>>>   phy  22:D:attached:[5000c5003bc237a9:00  t(SSP)]  6 Gbps
>>>>>>   phy  23:D:attached:[5000c5003bc2cec1:00  t(SSP)]  6 Gbps
>>>>>>   phy  24:D:attached:[500000e01d92cb52:00  t(SSP)]  3 Gbps
>>>>>>   phy  25:D:attached:[500000e01d74cfb2:00  t(SSP)]  3 Gbps
>>>>>>   phy  26:D:attached:[500000e01d656052:00  t(SSP)]  3 Gbps
>>>>>>   phy  27:D:attached:[500000e01d7cad52:00  t(SSP)]  3 Gbps
>>>>>>   phy  28:D:attached:[500c04f2b64cdd1c:00  t(SATA)]  3 Gbps
>>>>>>   phy  29:D:attached:[500c04f2b64cdd1d:00  t(SATA)]  3 Gbps
>>>>>>   phy  30:D:attached:[500000e01d73c262:00  t(SSP)]  3 Gbps
>>>>>>   phy  31:D:attached:[500000e01d536b22:00  t(SSP)]  3 Gbps
>>>>>>   phy  32:D:attached:[500000e01d92cab2:00  t(SSP)]  3 Gbps
>>>>>>   phy  33:D:attached:[500000e01afd8792:00  t(SSP)]  3 Gbps
>>>>>>   phy  34:D:attached:[5000c5003bc30301:00  t(SSP)]  6 Gbps
>>>>>>   phy  35:D:attached:[5000c5003bb09a69:00  t(SSP)]  6 Gbps
>>>>>>   phy  36:D:attached:[500c04f2b64cdd3d:00  V i(SSP) t(SSP)]  6 Gbps <--- This has virtual phy set.
>>>>>>
>>>>>>
>>>>>> What I understood from your explanation is that if the virt_phy field
>>>>>> is set, we should not trust the slot for that entry.
>>>>>>
>>>>>> You are suggesting using the phy index instead of the slot. Just for info:
>>>>>> how can we see how the slot details map to phys?
>>>>
>>>> Kashyap,
>>>> I haven't written a SAS discover algorithm but there
>>>> must be plenty of examples out there. One way to do it
>>>> is to find all the phy_ids attached to targets, in this
>>>> case there are SAS (SSP) and SATA targets. Each SATA
>>>> target phy_id will correspond to one SATA disk (or could be an
>>>> ATAPI device (e.g. DVD/BD player)). The SSP targets are a
>>>> bit trickier because two (or more) phys could be connected
>>>> to the same target (either a wide port or multiple (target)
>>>> ports). With a wide port each component phy has the same
>>>> attached SAS address (so above you have a wide initiator
>>>> port (phy ids 4,5,6,7) but no wide target ports). If a
>>>> SAS disk has multiple target ports connected, FreeBSD
>>>> probably has a device node for each. So for each SCSI (SSP)
>>>> target port you need a REPORT LUNS command issued on LUN 0
>>>> (or the REPORT LUNS well known logical unit) to find the
>>>> LUs it contains. A device node is created for each LU.
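
The grouping step described above (phys that share an attached SAS address form one wide port) can be sketched as follows. This is a minimal illustration that parses a few lines in the style of the smp_discover output from this thread; the regular expression and the function name are assumptions and are not part of smp_utils or of any driver.

```python
import re
from collections import defaultdict

# Matches lines such as
# "phy   4:S:attached:[500605b012345888:03  i(SSP+STP+SMP)]  6 Gbps"
PHY_LINE = re.compile(r"phy\s+(\d+):\w:attached:\[([0-9a-f]{16}):")


def group_phys_by_address(lines):
    """Return {attached SAS address: [phy ids]} for every attached phy."""
    ports = defaultdict(list)
    for line in lines:
        match = PHY_LINE.search(line)
        if match:  # vacant phys do not match and are skipped
            phy_id, address = match.groups()
            ports[address].append(int(phy_id))
    return dict(ports)


sample = [
    "phy   0: inaccessible (phy vacant)",
    "phy   4:S:attached:[500605b012345888:03  i(SSP+STP+SMP)]  6 Gbps",
    "phy   5:S:attached:[500605b012345888:02  i(SSP+STP+SMP)]  6 Gbps",
    "phy   6:S:attached:[500605b012345888:01  i(SSP+STP+SMP)]  6 Gbps",
    "phy   7:S:attached:[500605b012345888:00  i(SSP+STP+SMP)]  6 Gbps",
    "phy  12:D:attached:[5000c5003bc2c389:00  t(SSP)]  6 Gbps",
    "phy  28:D:attached:[500c04f2b64cdd1c:00  t(SATA)]  3 Gbps",
]

ports = group_phys_by_address(sample)
# A wide port is any attached address reached through more than one phy.
wide_ports = {addr: phys for addr, phys in ports.items() if len(phys) > 1}
print(wide_ports)
```

Note that slot numbers play no part in this grouping; only phy ids and attached SAS addresses are used.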
>>>>
>>>> Anyway I'm sure many folks in LSI know the SAS discover
>>>> process better than I do. Ask them :-) Surely most of
>>>> the above is already done in your HBA's firmware.
>>>>
>>>>
>>>> BTW I don't think slot numbers are reliable and don't apply
>>>> to things on virtual phys so they will just cause you
>>>> problems when used in the discover process, as this thread
>>>> attests. The BIOS on LSI's HBAs does a discover process
>>>> but is only interested in bootable devices so SES devices
>>>> don't appear.
>>>>
>>>>
>>>> Doug Gilbert
>>>>
>>>>> Kashyap,
>>>>>
>>>>> Let me know if there are any changes agreed upon, I'm happy to test
>>>>> out patches as this is affecting a large number of our machines.  I
>>>>> can only imagine the same for others as they start to upgrade, as this
>>>>> is standard SuperMicro hardware.
>>>>>
>>>>> Thanks,
>>>>> Jason
>>>>>
>>
>>
>


