Date: Thu, 19 Oct 2017 22:23:26 +0000
From: Steven Hartland <killing@multiplay.co.uk>
To: Ken Merry <ken@freebsd.org>, Shiva Bhanujan <Shiva.Bhanujan@quorum.com>
Cc: "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
Message-ID: <CAHEMsqY1ijV659MabfRPWecDVRbzwYGynd2hUfUOwmZe_Tg=Lw@mail.gmail.com>
In-Reply-To: <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666647@QLEXC01.Quorum.local>
References: <3A5A10BE32AC9E45B4A22F89FC90EC0701C3665D5D@QLEXC01.Quorum.local>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3665E8B@QLEXC01.Quorum.local>
 <20171016144231.GA94858@mithlond.kdm.org>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C366610E@QLEXC01.Quorum.local>
 <20171017023126.GA6559@mithlond.kdm.org>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666143@QLEXC01.Quorum.local>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666345@QLEXC01.Quorum.local>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666352@QLEXC01.Quorum.local>
 <3E746990-8C6D-4CA1-BD79-B5566CFB07F4@freebsd.org>
 <32C157CE-A122-435F-8430-9531BEEB5914@freebsd.org>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666647@QLEXC01.Quorum.local>
With type 2 protection the ref tag has to match the LBA + N.

Some info about it is here:
https://www.usenix.org/legacy/event/lsf07/tech/petersen.pdf
https://www.seagate.com/files/staticfiles/docs/pdf/whitepaper/safeguarding-data-from-corruption-technology-paper-tp621us.pdf

From reading the Seagate paper the only way to change the protection
level is to format.

On Thu, 19 Oct 2017 at 22:43, Shiva Bhanujan <Shiva.Bhanujan@quorum.com>
wrote:

> Here's the output of sg_readcap.
>
> [root@Filer-20-241 ~]# sg_readcap --16 da1
> Read Capacity results:
>    Protection: prot_en=1, p_type=1, p_i_exponent=0 [type 2 protection]
>    Logical block provisioning: lbpme=0, lbprz=0
>    Last logical block address=7814037167 (0x1d1c0beaf), Number of logical blocks=7814037168
>    Logical block length=512 bytes
>    Logical blocks per physical block exponent=0
>    Lowest aligned logical block address=0
> Hence:
>    Device size: 4000787030016 bytes, 3815447.8 MiB, 4000.79 GB
> [root@Filer-20-241 ~]#
>
> [root@Filer-20-241 ~]# camcontrol modepage da1 -v -m 10 | grep DPICZ
> DPICZ: 1
> [root@Filer-20-241 ~]#
>
> I did toggle the DPICZ on the drive from 1 to 0 and back. The sg_readcap
> still shows 'type 2 protection', and that gpart still shows the SCSI
> errors. I've narrowed this down to the Seagate ST4000NM0005, w/ a DOM of
> 03/2016. We also have Constellation ES.3 drives model ST4000NM0023 that
> don't exhibit this issue.
>
> I did go through the URLs that you have mentioned and that's how I found
> that using sg_format did address this issue. Again, that works only for
> new drives and we have appliances that already have data, for which
> sg_format isn't an option.
>
> Is this boiling down to the 'type 2 protection' tag that we see in
> sg_readcap? if so, would there be a way to turn it off?
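
For reference, the prot_en and p_type fields in the sg_readcap output above map to a protection type as follows: when prot_en is set, READ CAPACITY(16) encodes the type as p_type + 1, which is why p_type=1 is reported as type 2. The sketch below is illustrative only. The function names are hypothetical, the regular expression assumes the sg_readcap line format quoted above, and expected_ref_tag() simply restates the "LBA + N" rule from this message, truncated to the 4-byte reference tag field.

```python
import re


def protection_type(prot_en: int, p_type: int) -> int:
    """Return 0 for no protection, or 1..3 for the T10 protection type."""
    if prot_en == 0:
        return 0
    if p_type not in (0, 1, 2):
        raise ValueError(f"reserved p_type value: {p_type}")
    # READ CAPACITY(16) encodes type 1 as 0, type 2 as 1, type 3 as 2.
    return p_type + 1


def parse_protection_line(line: str) -> int:
    """Extract the protection type from an sg_readcap 'Protection:' line.

    Assumes the 'prot_en=N, p_type=N' layout shown in the quoted output.
    """
    match = re.search(r"prot_en=(\d+),\s*p_type=(\d+)", line)
    if match is None:
        raise ValueError("no prot_en/p_type fields found")
    return protection_type(int(match.group(1)), int(match.group(2)))


def expected_ref_tag(start_lba: int, n: int) -> int:
    """Reference tag expected for the Nth block of a transfer (LBA + N).

    The reference tag is a 4-byte field, so only the low 32 bits are kept.
    """
    return (start_lba + n) & 0xFFFFFFFF


if __name__ == "__main__":
    line = "Protection: prot_en=1, p_type=1, p_i_exponent=0 [type 2 protection]"
    print("protection type:", parse_protection_line(line))
    print("ref tag for last LBA:", hex(expected_ref_tag(0x1D1C0BEAF, 0)))
```

Because the drive's last LBA (0x1d1c0beaf) does not fit in 32 bits, the reference tag for blocks near the end of the disk wraps to the low 32 bits of the address.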
>
> From: Ken Merry [ken@freebsd.org]
> Sent: Thursday, October 19, 2017 2:17 PM
> To: Shiva Bhanujan
> Cc: freebsd-scsi@freebsd.org
> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008
> PCI-Express Fusion-MPT SAS-3
>
> By the way, the message you referenced is here:
>
> https://lists.freebsd.org/pipermail/freebsd-scsi/2017-January/007237.html
>
> And there is more here:
>
> https://bugs.freenas.org/issues/14517
>
> Ken
> --
> Ken Merry
> ken@FreeBSD.ORG
>
> On Oct 19, 2017, at 5:15 PM, Ken Merry <ken@freebsd.org> wrote:
>
> What does sg_readcap --16 show for these drives?
>
> If it has type 2 protection turned on, check to see what the DPICZ value
> is in the control mode page:
>
> camcontrol modepage daX -v -m 10
>
> If that is set to 0, add a -e to the above command line and see if you can
> set it to 1. That may or may not help anything.
>
> The mpr(4) and mps(4) drivers try to support protection information if it
> is turned on on the drive. So, they set the protection information if
> protection information is turned on in the drive. For that reason, setting
> the DPICZ bit may not fix it.
>
> There could be a problem with how that is implemented that's causing the
> drives to reject the command, but I'm not sure.
>
> If it is, Steve (CCed) can help us debug it.
>
> Ken
> --
> Ken Merry
> ken@FreeBSD.ORG
>
> On Oct 17, 2017, at 9:54 PM, Shiva Bhanujan <Shiva.Bhanujan@Quorum.com>
> wrote:
>
> Sorry. perhaps I have failed to mention, the SCSI errors are only w/ the
> seagate drives. These are of the model ST4000NM0023.
>
> From: Shiva Bhanujan
> Sent: Tuesday, October 17, 2017 6:53 PM
> To: Kenneth D. Merry
> Cc: freebsd-scsi@freebsd.org
> Subject: RE: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008
> PCI-Express Fusion-MPT SAS-3
>
> Please note, that this isn't an issue w/ Toshiba drives. is this a
> firmware issue by any chance?
>
> From: owner-freebsd-scsi@freebsd.org [owner-freebsd-scsi@freebsd.org]
> on behalf of Shiva Bhanujan [shiva.bhanujan@quorum.net]
> Sent: Tuesday, October 17, 2017 6:08 AM
> To: Kenneth D. Merry
> Cc: freebsd-scsi@freebsd.org
> Subject: RE: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008
> PCI-Express Fusion-MPT SAS-3
>
> Since I started having the SCSI errors, I ended up running sg_format to
> format the disks. I've found that once the disks are formatted using
> sg_format, there are no SCSI errors. The errors that show up during the
> format are towards the end of the dmesg output.
>
> (da0:mpr0:0:8:0): SCSI sense: NOT READY asc:4,4 (Logical unit not ready, format in progress)
> (da0:mpr0:0:8:0): Progress: 9% (6256/65536) complete
>
> once the format is done, I can successfully format and partition using
> gpart.
>
> The errors that show up when I try to run gpart for the first time are as
> follows:
>
> (da9:mpr0:0:17:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
> (da9:mpr0:0:17:0): CAM status: SCSI Status Error
> (da9:mpr0:0:17:0): SCSI status: Check Condition
> (da9:mpr0:0:17:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
> (da9:mpr0:0:17:0): Error 22, Unretryable error
>
> It seems that it's the read that is failing, and is being tagged as an
> illegal request. While sg_format will address the issue at hand, this isn't
> an option for us, because there are appliances that were formatted using
> FreeBSD 10.2, and an upgrade to 10.3 or 11.x might be an issue?
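
The CDBs logged in this thread can be decoded by hand from the standard layouts: READ(10) (opcode 0x28) carries a 4-byte big-endian LBA in bytes 2-5 and a 2-byte transfer length in bytes 7-8, while READ(16) (opcode 0x88) carries an 8-byte LBA in bytes 2-9 and a 4-byte transfer length in bytes 10-13. In both, the top three bits of byte 1 are RDPROTECT. A minimal sketch follows; the function name and the returned dictionary keys are hypothetical, and the decoder only reports what is in the logged bytes, it does not say why the drive rejected the command.

```python
def decode_read_cdb(hex_bytes: str) -> dict:
    """Decode a READ(10) or READ(16) CDB given as space-separated hex."""
    cdb = bytes.fromhex(hex_bytes)  # fromhex() skips the spaces
    opcode = cdb[0]

    if opcode == 0x28 and len(cdb) == 10:
        command = "READ(10)"
        lba = int.from_bytes(cdb[2:6], "big")
        blocks = int.from_bytes(cdb[7:9], "big")
    elif opcode == 0x88 and len(cdb) == 16:
        command = "READ(16)"
        lba = int.from_bytes(cdb[2:10], "big")
        blocks = int.from_bytes(cdb[10:14], "big")
    else:
        raise ValueError(f"unsupported opcode/length: {opcode:#04x}/{len(cdb)}")

    return {
        "command": command,
        "rdprotect": cdb[1] >> 5,  # top three bits of byte 1
        "lba": lba,
        "blocks": blocks,
    }


if __name__ == "__main__":
    for raw in (
        "28 00 00 00 00 00 00 01 00 00",
        "88 00 00 00 00 01 d1 c0 ba 00 00 00 01 00 00 00",
    ):
        print(decode_read_cdb(raw))
```

Applied to the failing command above, this gives a 256-block read starting at LBA 0 with RDPROTECT set to 0 in the CDB as logged.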
>
> ________________________________
> From: Kenneth D. Merry [ken@FreeBSD.ORG]
> Sent: Monday, October 16, 2017 7:31 PM
> To: Shiva Bhanujan
> Cc: freebsd-scsi@freebsd.org
> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008
> PCI-Express Fusion-MPT SAS-3
>
> On Tue, Oct 17, 2017 at 01:19:27 +0000, Shiva Bhanujan wrote:
> Hi Ken,
>
> I've attached the output of dmesg. Here's the SCSI CDB for a sample drive,
> da3.
>
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 00 00 01 00 00
> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 00 00 00 01 00 00 00
> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 00 00 00 01 00 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 00 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 02 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 22 00 00 01 00 00 00
> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 22 00 00 01 00 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fc 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fe 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 00 00 01 00 00
> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 00 00 00 01 00 00 00
> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 00 00 00 01 00 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 00 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 02 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 22 00 00 01 00 00 00
> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 22 00 00 01 00 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fc 22 00 01 00 00
> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fe 22 00 01 00 00
>
> My understanding is that FreeBSD 11.1 contains the mpr(4) driver? I've
> tried this w/ 11.1, w/ the same results.
>
> Yes, the mpr(4) driver is in all recent FreeBSD releases.
>
> In looking at the dmesg, this is telling:
>
> (da0:mpr0:0:8:0): WRITE(6). CDB: 0a 00 00 00 01 00
> (da0:mpr0:0:8:0): CAM status: SCSI Status Error
> (da0:mpr0:0:8:0): SCSI status: Check Condition
> (da0:mpr0:0:8:0): SCSI sense: NOT READY asc:4,4 (Logical unit not ready, format in progress)
> (da0:mpr0:0:8:0): Progress: 9% (6256/65536) complete
> (da0:mpr0:0:8:0): Error 16, Unretryable error
>
> If the drives are in the process of formatting, I guess it may make sense
> for them to reject read commands. Otherwise, it makes no sense for a hard
> drive to reject reads.
>
> Are you able to check the status of the format? You should be able to send
> a test unit ready and figure out how far along the format is:
>
> camcontrol tur da0 -v
>
> And so on for each of the drives.
>
> Ken
>
> ________________________________
> From: Kenneth D. Merry [ken@FreeBSD.ORG]
> Sent: Monday, October 16, 2017 7:42 AM
> To: Shiva Bhanujan
> Cc: freebsd-scsi@freebsd.org
> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008
> PCI-Express Fusion-MPT SAS-3
>
> On Fri, Oct 13, 2017 at 20:12:02 +0000, Shiva Bhanujan wrote:
> Hello,
>
> I have a FreeBSD 10.3 install in a HVM on XenServer 6.5. The HBA330 SAS-3
> controller is in pcipassthrough mode to the FreeBSD VM. When I try to
> access the disks (/dev/da0...) using gpart, I get SCSI errors, like the
> following:
>
> (da0:mpr0:0:0:0): CAM status: SCSI Status Error
> (da0:mpr0:0:0:0): SCSI status: Check Condition
> (da0:mpr0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
>
> The error message above is missing the SCSI CDB. We need that in order to
> figure out what command the drive is complaining about.
>
> The error message means that FreeBSD is sending a SCSI command that the
> drive doesn't support. That can be benign, or it can cause a problem.
>
> So, what error does gpart give you when you have this problem?
>
> I get the same errors w/ FreeBSD 11.0 also. Running 10.3 natively also has
> the same result.
>
> Please note, that these errors don't show up on a Fusion-MPT SAS-2
> controller, or a MegaRAID SAS 2208 controller. Additionally, FreeBSD 10.2
> doesn't have any SCSI errors on the HBA330 SAS-3 controller either.
>
> Is there a different version of the mpr driver I should be using? I
> haven't checked the differences between the mpr driver in 10.2 vs 10.3 and
> 11.0. I do see that there are others who have experienced these issues. Can
> somebody please provide me some pointers as to why this is occurring? Or if
> there are some driver changes that I might be able to incorporate?
>
> In general, the latest mpr(4) driver is the best one. The driver itself
> generally doesn't send SCSI commands (there are a few exceptions), but
> rather passes them through from the upper layers of CAM.
>
> Please note, that I have gone through the mail titled "scsi error at
> SEAGATE ST1200MM0088 TT31" and have started sg_format on all the SEAGATE
> disks. Having said that, I still need to figure out what would happen, if
> the disks were written to using FreeBSD 10.2, which doesn't seem to have
> SCSI errors, and when I try to upgrade to 10.3. Any help is appreciated.
>
> Send the full error messages, and we may be able to figure out what's going
> on.
>
> Ken
> --
> Kenneth Merry
> ken@FreeBSD.ORG
> ________________________________
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"