Date: Mon, 19 Mar 2012 11:52:17 -0700
From: Jason Wolfe <nitroboost@gmail.com>
To: dgilbert@interlog.com, "Desai, Kashyap" <Kashyap.Desai@lsi.com>, "McConnell, Stephen" <Stephen.McConnell@lsi.com>
Cc: "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Subject: Re: LSI2008 controller clobbers first disk with new LSI mps driver
Message-ID: <CAAAm0r1JMVvhjiD=u%2BqQvgvu7810DHEqan8hwJmSfmpqRqLcuA@mail.gmail.com>
In-Reply-To: <4F58F9DC.2040606@interlog.com>
References: <CAAAm0r2NFhF=eh2bOPMnVN8E6e2o0KfaST0N-M_gWoJHpFOLmQ@mail.gmail.com> <CAAAm0r1pWN-F=madGdk7N%2BoRuZmSD5_MAYwLh6By126L0CTGuw@mail.gmail.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D34558@inbmail01.lsi.com> <CAAAm0r1x15_ho2MD0tX7Y7A6mnU2N6zihNOz_Qz=jpsyBkDCWQ@mail.gmail.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D3455B@inbmail01.lsi.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D34626@inbmail01.lsi.com> <CAAAm0r3_S2jTG=Te4UhLqHPqiXq7_aAOHNp=W3jb4KLJx9PTRg@mail.gmail.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D34748@inbmail01.lsi.com> <CAAAm0r3oRTcfipyVcp9nE1CL3dcK7cft8AUSf%2BfGYVK90b2A0w@mail.gmail.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D347BB@inbmail01.lsi.com> <4F450814.4020100@interlog.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96D34840@inbmail01.lsi.com> <CAAAm0r2x4wAkPzpJkCJ8FYnFkxb3RrXnENRZi-Kgfh=y3ZuqEA@mail.gmail.com> <4F4C14A8.3050105@interlog.com> <B2FD678A64EAAD45B089B123FDFC3ED72B96E62A30@inbmail01.lsi.com> <4F58F9DC.2040606@interlog.com>

Kashyap/LSI,

Any movement here? I'm also trying to test some bug fixes to the em
driver in 8.3-PRERELEASE, but am still locked into the old mps driver
from last year because of this issue. If this is something we need to
take offline I can start up an internal thread if that works.

Thanks again,
Jason

On Thu, Mar 8, 2012 at 11:26 AM, Douglas Gilbert <dgilbert@interlog.com> wrote:
> Kashyap,
> Backing up ... I thought this thread was about the mps
> driver failing to do the SAS discover process properly.
> Hence a disk did not appear because it was "hidden" by a
> SES device which had the same (device?) slot number.
>
> The SAS discover process is described in section 4.2 of
> spl2r04b.pdf (at t10.org). That is the latest draft. Please
> note that section never mentions the word "slot". I would
> hazard a guess that no SAS standard or draft has ever
> mentioned slots in the context of the discover process.
>
> The concept of device "slot" *** numbers comes from the
> SCSI Enclosure Services (SES) standards of which ses3r04.pdf
> is the latest draft. SAS provides the slot number _optionally_
> in the long form SMP DISCOVER response and does _not_ provide
> the device slot number in the SMP DISCOVER LIST short form
> response.
>
> So IMO the device slot number is just a bit of helpful
> information that SAS might provide and that slot
> number should not interfere with the SAS discover
> process.
>
>
> *** the term "slot" is used in the SAS port layer state
>     machine in a different context. It is also possible
>     that "slot" is a term used in LSI firmware.
>
>
> A few data points: I have an Intel RES2SV240 which contains
> a LSI SAS2X24 expander and a HP Expander card which contains
> a PMC Sierra PM8005 SAS-2 expander. Both report a device slot
> number of 255 (i.e. not provided) via their SMP DISCOVER
> responses.
> When the inbuilt SES device on each expander is
> probed, the LSI part reports device slot numbers 0 through 23
> while the PMC part reports a device slot number of 0 for all
> array devices. In both cases the SES device itself is not
> listed amongst the SES "array device slot" elements.
>
> Doug Gilbert
>
>
>
> On 12-03-07 12:44 PM, Desai, Kashyap wrote:
>>
>> Jason,
>>
>> We discuss this issue with our architect and he has strong recommendation
>> not to provide any work-around where Enclosure configuration is not correct.
>> Similar issue was reported by other customer sometimes back and they have
>> also configured their Enclosure to resolve this issue.
>> "The enclosure configuration needs to be fixed so it advertises enough
>> slots (phys disks + num of SES devices) and it places the SES devices
>> (assigned slot numbers) above the physical disks."
>>
>>
>> ` Kashyap
>>
>>
>>> -----Original Message-----
>>> From: Desai, Kashyap
>>> Sent: Wednesday, February 29, 2012 10:08 PM
>>> To: 'dgilbert@interlog.com'; Jason Wolfe
>>> Cc: freebsd-scsi@freebsd.org; McConnell, Stephen
>>> Subject: RE: LSI2008 controller clobbers first disk with new LSI mps
>>> driver
>>>
>>> Hi Jason,
>>>
>>> I have started discussion with LSI internal folks to get better clarity
>>> on this issue. Since our key person is on vacation, we may get clarity
>>> on this next week.
>>> I cannot provide some temporary workaround in upstream(because this is
>>> against our design), but if you want to use for your environment, I can
>>> provide you some temporary patch.
>>>
>>> Doug,
>>>
>>> Thanks for providing your view and I have convey this to our architect.
>>>
>>> ~ Kashyap
>>>
>>>> -----Original Message-----
>>>> From: Douglas Gilbert [mailto:dgilbert@interlog.com]
>>>> Sent: Tuesday, February 28, 2012 5:11 AM
>>>> To: Jason Wolfe
>>>> Cc: Desai, Kashyap; freebsd-scsi@freebsd.org; McConnell, Stephen
>>>> Subject: Re: LSI2008 controller clobbers first disk with new LSI mps
>>>> driver
>>>>
>>>> On 12-02-27 02:59 PM, Jason Wolfe wrote:
>>>>>
>>>>> On Wed, Feb 22, 2012 at 9:11 AM, Desai, Kashyap<Kashyap.Desai@lsi.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Douglas Gilbert [mailto:dgilbert@interlog.com]
>>>>>>> Sent: Wednesday, February 22, 2012 8:52 PM
>>>>>>> To: Desai, Kashyap
>>>>>>> Cc: Jason Wolfe; freebsd-scsi@freebsd.org; McConnell, Stephen
>>>>>>> Subject: Re: LSI2008 controller clobbers first disk with new LSI mps
>>>>>>> driver
>>>>>>>
>>>>>>> On 12-02-22 03:39 AM, Desai, Kashyap wrote:
>>>>>>>>
>>>>>>>> Here is a possible root cause of this issue.
>>>>>>>>
>>>>>>>> Enclosure which you are using in your setup (might be) not configured properly.
>>>>>>>>
>>>>>>>>
>>>>>>>> You have Enclosure with 12 Slots + 1 SES Device.
>>>>>>>> See below detail from the log.
>>>>>>>>
>>>>>>>>      EventDataLength: 5
>>>>>>>>      AckRequired: 0
>>>>>>>>      Event: SasEnclDeviceStatusChange (0x1d)
>>>>>>>>      EventContext: 0x0
>>>>>>>>      EnclosureHandle: 0x2
>>>>>>>>      ReasonCode: Added
>>>>>>>>      PhysicalPort: 0
>>>>>>>>      NumSlots: 13
>>>>>>>>      StartSlot: 0
>>>>>>>>      PhyBits: 0xff
>>>>>>>>
>>>>>>>> StartSlot is 0 in this case.
>>>>>>>> Correct behavior should be each device on your enclosure must have different slot number starting from 0 till 12.
>>>>>>>>
>>>>>>>> I have doubt that SES device has not configured well and it is using slot-0 as default.
>>>>>>>> This can create issue for actual device which is connected to slot-0.
>>>>>>>>
>>>>>>>> So In your setup you will have slot-0 till slot-11 assigned for actual Phys of your enclosures and again slot-0 is assigned for SES device instead of Slot-12.
>>>>>>>
>>>>>>> No. SAS-2 expanders typically have an integral SES device on an
>>>>>>> expander _virtual_ phy (see SMP DISCOVER (LIST) response). Once
>>>>>>> you see that virtual phy flag the slot number is irrelevant.
>>>>>>
>>>>>>
>>>>>> Doug,
>>>>>>
>>>>>> I need some more info so that I can understand your point better.
>>>>>>
>>>>>> I have one Enclosure setup on FreeBSD. Here is smp_discover output. (smp_discover_list is failing for me)
>>>>>>
>>>>>>
>>>>>>   phy   0: inaccessible (phy vacant)
>>>>>>   phy   1: inaccessible (phy vacant)
>>>>>>   phy   2: inaccessible (phy vacant)
>>>>>>   phy   3: inaccessible (phy vacant)
>>>>>>   phy   4:S:attached:[500605b012345888:03  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy   5:S:attached:[500605b012345888:02  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy   6:S:attached:[500605b012345888:01  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy   7:S:attached:[500605b012345888:00  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy  12:D:attached:[5000c5003bc2c389:00  t(SSP)]  6 Gbps
>>>>>>   phy  13:D:attached:[500000e116ee91e2:00  t(SSP)]  6 Gbps
>>>>>>   phy  14:D:attached:[5000c5003bc308e5:00  t(SSP)]  6 Gbps
>>>>>>   phy  15:D:attached:[5000c5003bc2f0d1:00  t(SSP)]  6 Gbps
>>>>>>   phy  16:D:attached:[5000c5003bc2ff3d:00  t(SSP)]  6 Gbps
>>>>>>   phy  17:D:attached:[5000c5003bae5fdd:00  t(SSP)]  6 Gbps
>>>>>>   phy  18:D:attached:[5000c5003bae5eb1:00  t(SSP)]  6 Gbps
>>>>>>   phy  19:D:attached:[5000c5003bc2d135:00  t(SSP)]  6 Gbps
>>>>>>   phy  20:D:attached:[5000c5003baea36d:00  t(SSP)]  6 Gbps
>>>>>>   phy  21:D:attached:[5000c5003bc2a8c9:00  t(SSP)]  6 Gbps
>>>>>>   phy  22:D:attached:[5000c5003bc237a9:00  t(SSP)]  6 Gbps
>>>>>>   phy  23:D:attached:[5000c5003bc2cec1:00  t(SSP)]  6 Gbps
>>>>>>   phy  24:D:attached:[500000e01d92cb52:00  t(SSP)]  3 Gbps
>>>>>>   phy  25:D:attached:[500000e01d74cfb2:00  t(SSP)]  3 Gbps
>>>>>>   phy  26:D:attached:[500000e01d656052:00  t(SSP)]  3 Gbps
>>>>>>   phy  27:D:attached:[500000e01d7cad52:00  t(SSP)]  3 Gbps
>>>>>>   phy  28:D:attached:[500c04f2b64cdd1c:00  t(SATA)]  3 Gbps
>>>>>>   phy  29:D:attached:[500c04f2b64cdd1d:00  t(SATA)]  3 Gbps
>>>>>>   phy  30:D:attached:[500000e01d73c262:00  t(SSP)]  3 Gbps
>>>>>>   phy  31:D:attached:[500000e01d536b22:00  t(SSP)]  3 Gbps
>>>>>>   phy  32:D:attached:[500000e01d92cab2:00  t(SSP)]  3 Gbps
>>>>>>   phy  33:D:attached:[500000e01afd8792:00  t(SSP)]  3 Gbps
>>>>>>   phy  34:D:attached:[5000c5003bc30301:00  t(SSP)]  6 Gbps
>>>>>>   phy  35:D:attached:[5000c5003bb09a69:00  t(SSP)]  6 Gbps
>>>>>>   phy  36:D:attached:[500c04f2b64cdd3d:00  V i(SSP) t(SSP)]  6 Gbps <--- This has virtual phy set.
>>>>>>
>>>>>>
>>>>>> What I understood from your explanation is if we have virt_phy field set, we should not trust slot for that entry.
>>>>>>
>>>>>> You are suggesting to use phy index instead of slot. Just for info: But how to see Slot details mapping with phy ?
>>>>
>>>> Kashyap,
>>>> I haven't written a SAS discover algorithm but there
>>>> must be plenty of examples out there. One way to do it
>>>> is to find all the phy_ids attached to targets, in this
>>>> case there are SAS (SSP) and SATA targets. Each SATA
>>>> target phy_id will correspond to one SATA disk (or could be an
>>>> ATAPI device (e.g. DVD/BD player)).
>>>> The SSP targets are a
>>>> bit trickier because two (or more) phys could be connected
>>>> to the same target (either a wide port or multiple (target)
>>>> ports). With a wide port each component phy has the same
>>>> attached SAS address (so above you have a wide initiator
>>>> port (phy ids 4,5,6,7) but no wide target ports). If a
>>>> SAS disk has multiple target ports connected, FreeBSD
>>>> probably has a device node for each. So for each SCSI (SSP)
>>>> target port you need a REPORT LUNS command issued on LUN 0
>>>> (or the REPORT LUNS well known logical unit) to find the
>>>> LUs it contains. A device node is created for each LU.
>>>>
>>>> Anyway I'm sure many folks in LSI know the SAS discover
>>>> process better than I do. Ask them :-) Surely most of
>>>> the above is already done in your HBA's firmware.
>>>>
>>>>
>>>> BTW I don't think slot numbers are reliable and don't apply
>>>> to things on virtual phys so they will just cause you
>>>> problems when used in the discover process, as this thread
>>>> attests. The BIOS on LSI's HBAs does a discover process
>>>> but is only interested in bootable devices so SES devices
>>>> don't appear.
>>>>
>>>>
>>>> Doug Gilbert
>>>>
>>>>> Kashyap,
>>>>>
>>>>> Let me know if there are any changes agreed upon, I'm happy to test
>>>>> out patches as this is affecting a large number of our machines.  I
>>>>> can only imagine the same for others as they start to upgrade, as this
>>>>> is standard SuperMicro hardware.
>>>>>
>>>>> Thanks,
>>>>> Jason
>>>>>
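
For anyone following along, here is a rough sketch (not the mps driver's code and not LSI firmware logic) of the grouping Doug describes in the quoted text: collect phys by attached SAS address, so that phys sharing an address form one wide port, and never consult a slot number. The Phy record, the function names and the shortened phy list are all made up for illustration; the addresses are copied from the smp_discover output quoted above.

```python
from collections import defaultdict
from typing import NamedTuple


class Phy(NamedTuple):
    """Hypothetical record for one expander phy (illustrative only)."""
    phy_id: int
    attached_sas_addr: str
    is_target: bool        # attached device reports a target role
    protocol: str          # "SSP" or "SATA"
    virtual: bool = False  # virtual phy, e.g. the expander's built-in SES device


# A shortened subset of the smp_discover output quoted in this thread.
PHYS = [
    Phy(4, "500605b012345888", False, "SSP"),
    Phy(5, "500605b012345888", False, "SSP"),
    Phy(6, "500605b012345888", False, "SSP"),
    Phy(7, "500605b012345888", False, "SSP"),
    Phy(12, "5000c5003bc2c389", True, "SSP"),
    Phy(28, "500c04f2b64cdd1c", True, "SATA"),
    Phy(36, "500c04f2b64cdd3d", True, "SSP", virtual=True),
]


def group_ports(phys):
    """Group phy ids by attached SAS address; slot numbers are never used."""
    ports = defaultdict(list)
    for phy in phys:
        ports[phy.attached_sas_addr].append(phy.phy_id)
    return dict(ports)


def wide_ports(phys):
    """Ports made of more than one phy with the same attached SAS address."""
    return {addr: ids for addr, ids in group_ports(phys).items() if len(ids) > 1}


def target_ports(phys):
    """Target ports keyed by SAS address; each would get a REPORT LUNS."""
    return group_ports([phy for phy in phys if phy.is_target])


if __name__ == "__main__":
    print("wide ports:", wide_ports(PHYS))
    print("target ports:", target_ports(PHYS))
```

Keyed this way, the SES device on virtual phy 36 and the disk on phy 12 are distinct targets by address, so neither can hide the other regardless of what slot number is reported for them.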