Date: Fri, 20 Oct 2017 10:34:54 -0400
From: Ken Merry <ken@freebsd.org>
To: Shiva Bhanujan <Shiva.Bhanujan@Quorum.com>
Cc: Steven Hartland <killing@multiplay.co.uk>, "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
Message-ID: <95A66B26-548C-4BF5-9527-EE30F9C01D42@freebsd.org>
In-Reply-To: <3A5A10BE32AC9E45B4A22F89FC90EC0701C366674D@QLEXC01.Quorum.local>
References: <3A5A10BE32AC9E45B4A22F89FC90EC0701C3665D5D@QLEXC01.Quorum.local>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3665E8B@QLEXC01.Quorum.local>
 <20171016144231.GA94858@mithlond.kdm.org>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C366610E@QLEXC01.Quorum.local>
 <20171017023126.GA6559@mithlond.kdm.org>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666143@QLEXC01.Quorum.local>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666345@QLEXC01.Quorum.local>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666352@QLEXC01.Quorum.local>
 <3E746990-8C6D-4CA1-BD79-B5566CFB07F4@freebsd.org>
 <32C157CE-A122-435F-8430-9531BEEB5914@freebsd.org>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666647@QLEXC01.Quorum.local>
 <CAHEMsqY1ijV659MabfRPWecDVRbzwYGynd2hUfUOwmZe_Tg=Lw@mail.gmail.com>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C36666F5@QLEXC01.Quorum.local>
 <96BAD947-4AB0-4EAC-9DA8-4B1F10253287@freebsd.org>
 <3A5A10BE32AC9E45B4A22F89FC90EC0701C366674D@QLEXC01.Quorum.local>
--Apple-Mail=_C43579B3-638F-4C3F-B4A7-22EDAD9BDBE9
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=utf-8

Ok. Yes, for #2, in theory we can disable EEDP / protection information in the mpr(4) driver, and if DPICZ is set, the drive won't require setting protection information on read and write commands. That should let you access the disks normally.

That said, I've never played with protection information before, so I don't know for sure. I have a drive that supports it and I'm formatting it now to turn on type 2 protection. I may be able to debug things once I get that done.

In the meantime, I've attached a patch against the stable/10 version of the mpr(4) driver. Apply this patch, then rebuild and reinstall your kernel. After that you'll be able to disable EEDP in the driver three different ways:

1. Set hw.mpr.disable_eedp=1 in /boot/loader.conf. That will disable EEDP / Protection Information for all mpr instances.

2. Set dev.mpr.0.disable_eedp=1 in /boot/loader.conf. That will disable EEDP for mpr0.

3. Run "sysctl dev.mpr.0.disable_eedp=1". That will disable EEDP on the fly for mpr0.

Let me know what happens.

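As a rough illustration of how the three settings interact, here is a small userland sketch (not driver code). It assumes the per-unit tunable is fetched after the global one, as in the attached patch, so dev.mpr.N.disable_eedp wins when both are set, and it assumes the default is 0. The struct eedp_softc type and the fetch_disable_eedp() and use_eedp() helpers are made up for this example; only the final condition is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver softc; only the fields used here. */
struct eedp_softc {
	int eedp_enabled;	/* controller supports EEDP */
	int disable_eedp;	/* administrator override added by the patch */
};

/*
 * Mimic the two tunable fetches: a tunable that is not set (NULL here)
 * leaves the current value alone, and the per-unit value is applied
 * last, so it overrides the global one.
 */
static void
fetch_disable_eedp(struct eedp_softc *sc, const int *global, const int *unit)
{
	assert(sc != NULL);
	sc->disable_eedp = 0;			/* assumed default */
	if (global != NULL)
		sc->disable_eedp = *global;	/* hw.mpr.disable_eedp */
	if (unit != NULL)
		sc->disable_eedp = *unit;	/* dev.mpr.N.disable_eedp */
}

/* Same test as the patched condition in the I/O path. */
static bool
use_eedp(const struct eedp_softc *sc, unsigned eedp_flags)
{
	return (sc->eedp_enabled && sc->disable_eedp == 0 && eedp_flags != 0);
}
```

With neither tunable set, use_eedp() behaves like the unpatched driver; setting either one to 1 turns the EEDP path off for every command, whatever eedp_flags says.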
Ken
-- 
Ken Merry
ken@FreeBSD.ORG

--Apple-Mail=_C43579B3-638F-4C3F-B4A7-22EDAD9BDBE9
Content-Disposition: attachment; filename=mpr_eedp_diffs.stable10.20171020.txt
Content-Type: text/plain; x-unix-mode=0600; name="mpr_eedp_diffs.stable10.20171020.txt"
Content-Transfer-Encoding: quoted-printable

==== //depot/users/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr.c#5 - /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr.c ====
*** /tmp/tmp.44697.83	Fri Oct 20 08:17:31 2017
--- /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr.c	Fri Oct 20 07:46:02 2017
***************
*** 1513,1518 ****
--- 1513,1519 ----
  	TUNABLE_INT_FETCH("hw.mpr.enable_ssu", &sc->enable_ssu);
  	TUNABLE_INT_FETCH("hw.mpr.spinup_wait_time", &sc->spinup_wait_time);
  	TUNABLE_INT_FETCH("hw.mpr.use_phy_num", &sc->use_phynum);
+ 	TUNABLE_INT_FETCH("hw.mpr.disable_eedp", &sc->disable_eedp);
  
  	/* Grab the unit-instance variables */
  	snprintf(tmpstr, sizeof(tmpstr), "dev.mpr.%d.debug_level",
***************
*** 1551,1556 ****
--- 1552,1561 ----
  	snprintf(tmpstr, sizeof(tmpstr), "dev.mpr.%d.use_phy_num",
  	    device_get_unit(sc->mpr_dev));
  	TUNABLE_INT_FETCH(tmpstr, &sc->use_phynum);
+ 
+ 	snprintf(tmpstr, sizeof(tmpstr), "dev.mpr.%d.disable_eedp",
+ 	    device_get_unit(sc->mpr_dev));
+ 	TUNABLE_INT_FETCH(tmpstr, &sc->disable_eedp);
  }
  
  static void
***************
*** 1646,1651 ****
--- 1651,1660 ----
  	    "Use the phy number for enumeration");
  
  	SYSCTL_ADD_INT(sysctl_ctx, SYSCTL_CHILDREN(sysctl_tree),
+ 	    OID_AUTO, "disable_eedp", CTLFLAG_RW, &sc->disable_eedp, 0,
+ 	    "Disable Protection Info / EEDP");
+ 
+ 	SYSCTL_ADD_INT(sysctl_ctx, SYSCTL_CHILDREN(sysctl_tree),
  	    OID_AUTO, "prp_pages_free", CTLFLAG_RD,
  	    &sc->prp_pages_free, 0, "number of free PRP pages");
  
==== //depot/users/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr_sas.c#8 - /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr_sas.c ====
*** /tmp/tmp.44697.181	Fri Oct 20 08:17:31 2017
--- /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr_sas.c	Fri Oct 20 07:48:13 2017
***************
*** 2071,2077 ****
  	 * for EEDP transfer.
  	 */
  	eedp_flags = op_code_prot[req->CDB.CDB32[0]];
! 	if (sc->eedp_enabled && eedp_flags) {
  		SLIST_FOREACH(lun, &targ->luns, lun_link) {
  			if (lun->lun_id == csio->ccb_h.target_lun) {
  				break;
--- 2071,2077 ----
  	 * for EEDP transfer.
  	 */
  	eedp_flags = op_code_prot[req->CDB.CDB32[0]];
! 	if (sc->eedp_enabled && sc->disable_eedp == 0 && eedp_flags) {
  		SLIST_FOREACH(lun, &targ->luns, lun_link) {
  			if (lun->lun_id == csio->ccb_h.target_lun) {
  				break;
==== //depot/users/kenm/FreeBSD-stable/10/sys/dev/mpr/mprvar.h#3 - /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mprvar.h ====
*** /tmp/tmp.44697.281	Fri Oct 20 08:17:31 2017
--- /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mprvar.h	Fri Oct 20 07:44:26 2017
***************
*** 297,302 ****
--- 297,303 ----
  	u_int				enable_ssu;
  	int				spinup_wait_time;
  	int				use_phynum;
+ 	int				disable_eedp;
  	uint64_t			chain_alloc_fail;
  	uint64_t			prp_page_alloc_fail;
  	struct sysctl_ctx_list		sysctl_ctx;

--Apple-Mail=_C43579B3-638F-4C3F-B4A7-22EDAD9BDBE9
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=utf-8

> On Oct 20, 2017, at 9:26 AM, Shiva Bhanujan <Shiva.Bhanujan@Quorum.com> wrote:
>
> [3] isn't an option, because we're specifically moving to FreeBSD 10.3 or above to get the 'restartable zfs send/receive' feature. I believe for [2], this would require not setting the 'type 2 protection' when formatting the disks? We have appliances that are already formatted, so we'll need to figure out how to address that part.
>
> I'm not familiar w/ the mpr driver source code. I can start looking @ it, but I'm afraid I'm not going to make much progress. If you could help out w/ addressing this issue in 10.3 w/ the protection bit set for SCSI disks, that'd be great.
>
> From: Ken Merry [ken@freebsd.org]
> Sent: Thursday, October 19, 2017 7:09 PM
> To: Shiva Bhanujan
> Cc: Steven Hartland; freebsd-scsi@freebsd.org
> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>
> There are probably 3 solutions to get things fixed for you:
>
> 1. Track down why the mpr driver is sending requests that the drive doesn't like and fix it.
>
> 2. Turn on DPICZ in the drive and disable setting protection information in the mpr driver. This will take commenting out the right code and recompiling the kernel.
>
> 3. Use an older version of FreeBSD to read the data, then reformat and write it back.
>
> We need to do #1 for the sake of the users who will run into this. #2 hopefully will work and won't require running an old OS. #3 is essentially giving up.
>
> For #2, if you feel comfortable modifying the mpr driver, just look for the eedp code and turn it off. Otherwise I can try to come up with something tomorrow.
>
> Ken
> --
> Ken Merry
> ken@FreeBSD.ORG
>
>> On Oct 19, 2017, at 20:51, Shiva Bhanujan <Shiva.Bhanujan@Quorum.com> wrote:
>>
>> Would it be possible to read SCSI disks w/ type 2 protection by default in 10.3 and above? Please note that the issue I'm facing is that the partitions are created in FreeBSD 10.2, and for the Seagate drives ST4000NM0005, they are created w/ type 2 protection. Read/write to these disks in FreeBSD 10.2 works just fine. However, if I upgrade to FreeBSD 10.3 and above, I get SCSI errors for only these disks. Would it be possible for the SCSI reads to default to reading disks that have type 2 protection? In such a case, removing type 2 protection would not be needed.
>>
>> All other params in the output of sg_readcap look the same for the ST4000NM0023, where this issue isn't seen.
>>
>> root@Filer:~ # camcontrol devlist | grep da0
>> <SEAGATE ST4000NM0023 GS14> at scbus2 target 0 lun 0 (pass1,da0)
>> root@Filer:~ # camcontrol devlist | grep da2
>> <SEAGATE ST4000NM0005 MS05> at scbus2 target 2 lun 0 (pass3,da2)
>>
>> root@Filer:~ # sg_readcap --16 da0
>> Read Capacity results:
>>    Protection: prot_en=0, p_type=0, p_i_exponent=0
>>    Logical block provisioning: lbpme=0, lbprz=0
>>    Last logical block address=7814037167 (0x1d1c0beaf), Number of logical blocks=7814037168
>>    Logical block length=512 bytes
>>    Logical blocks per physical block exponent=0
>>    Lowest aligned logical block address=0
>> Hence:
>>    Device size: 4000787030016 bytes, 3815447.8 MiB, 4000.79 GB
>>
>> root@Filer:~ # sg_readcap --16 da2
>> Read Capacity results:
>>    Protection: prot_en=1, p_type=1, p_i_exponent=0 [type 2 protection]
>>    Logical block provisioning: lbpme=0, lbprz=0
>>    Last logical block address=7814037167 (0x1d1c0beaf), Number of logical blocks=7814037168
>>    Logical block length=512 bytes
>>    Logical blocks per physical block exponent=0
>>    Lowest aligned logical block address=0
>> Hence:
>>    Device size: 4000787030016 bytes, 3815447.8 MiB, 4000.79 GB
>>
>> Is there some default SCSI read that is configurable in FreeBSD 10.3 and above?
>>
>> From: Steven Hartland [killing@multiplay.co.uk]
>> Sent: Thursday, October 19, 2017 3:23 PM
>> To: Ken Merry; Shiva Bhanujan
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> With type 2 protection the ref tag has to match the LBA + N
>>
>> Some info about it is here
>>
>> https://www.usenix.org/legacy/event/lsf07/tech/petersen.pdf
>>
>> https://www.seagate.com/files/staticfiles/docs/pdf/whitepaper/safeguarding-data-from-corruption-technology-paper-tp621us.pdf
>>
>> From reading the seagate paper the only way to change the protection level is to format.
>>
>> On Thu, 19 Oct 2017 at 22:43, Shiva Bhanujan <Shiva.Bhanujan@quorum.com> wrote:
>>
>> Here's the output of sg_readcap.
>>
>> [root@Filer-20-241 ~]# sg_readcap --16 da1
>> Read Capacity results:
>>    Protection: prot_en=1, p_type=1, p_i_exponent=0 [type 2 protection]
>>    Logical block provisioning: lbpme=0, lbprz=0
>>    Last logical block address=7814037167 (0x1d1c0beaf), Number of logical blocks=7814037168
>>    Logical block length=512 bytes
>>    Logical blocks per physical block exponent=0
>>    Lowest aligned logical block address=0
>> Hence:
>>    Device size: 4000787030016 bytes, 3815447.8 MiB, 4000.79 GB
>>
>> [root@Filer-20-241 ~]# camcontrol modepage da1 -v -m 10 | grep DPICZ
>> DPICZ: 1
>>
>> I did toggle the DPICZ on the drive from 1 to 0 and back. The sg_readcap still shows 'type 2 protection', and gpart still shows the SCSI errors. I've narrowed this down to the Seagate ST4000NM0005, w/ a DOM of 03/2016. We also have Constellation ES.3 drives, model ST4000NM0023, that don't exhibit this issue.
>>
>> I did go through the URLs that you have mentioned and that's how I found that using sg_format did address this issue. Again, that works only for new drives and we have appliances that already have data, for which sg_format isn't an option.
>>
>> Is this boiling down to the 'type 2 protection' tag that we see in sg_readcap? If so, would there be a way to turn it off?
>>
>> From: Ken Merry [ken@freebsd.org]
>> Sent: Thursday, October 19, 2017 2:17 PM
>> To: Shiva Bhanujan
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> By the way, the message you referenced is here:
>>
>> https://lists.freebsd.org/pipermail/freebsd-scsi/2017-January/007237.html
>>
>> And there is more here:
>>
>> https://bugs.freenas.org/issues/14517
>>
>> Ken
>> --
>> Ken Merry
>> ken@FreeBSD.ORG
>>
>> On Oct 19, 2017, at 5:15 PM, Ken Merry <ken@freebsd.org> wrote:
>>
>> What does sg_readcap --16 show for these drives?
>>
>> If it has type 2 protection turned on, check to see what the DPICZ value is in the control mode page:
>>
>> camcontrol modepage daX -v -m 10
>>
>> If that is set to 0, add a -e to the above command line and see if you can set it to 1. That may or may not help anything.
>>
>> The mpr(4) and mps(4) drivers try to support protection information if it is turned on on the drive. So, they set the protection information if protection information is turned on in the drive. For that reason, setting the DPICZ bit may not fix it.
>>
>> There could be a problem with how that is implemented that's causing the drives to reject the command, but I'm not sure.
>>
>> If it is, Steve (CCed) can help us debug it.
>>
>> Ken
>> --
>> Ken Merry
>> ken@FreeBSD.ORG
>>
>> On Oct 17, 2017, at 9:54 PM, Shiva Bhanujan <Shiva.Bhanujan@Quorum.com> wrote:
>>
>> Sorry, perhaps I have failed to mention: the SCSI errors are only w/ the seagate drives. These are of the model ST4000NM0023.
>>
>> From: Shiva Bhanujan
>> Sent: Tuesday, October 17, 2017 6:53 PM
>> To: Kenneth D. Merry
>> Cc: freebsd-scsi@freebsd.org
>> Subject: RE: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> Please note that this isn't an issue w/ Toshiba drives. Is this a firmware issue by any chance?
>>
>> From: owner-freebsd-scsi@freebsd.org [owner-freebsd-scsi@freebsd.org] on behalf of Shiva Bhanujan [shiva.bhanujan@quorum.net]
>> Sent: Tuesday, October 17, 2017 6:08 AM
>> To: Kenneth D. Merry
>> Cc: freebsd-scsi@freebsd.org
>> Subject: RE: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> Since I started having the SCSI errors, I ended up running sg_format to format the disks. I've found that once the disks are formatted using sg_format, there are no SCSI errors. The errors that show up during the format are towards the end of the dmesg output.
>>
>> (da0:mpr0:0:8:0): SCSI sense: NOT READY asc:4,4 (Logical unit not ready, format in progress)
>> (da0:mpr0:0:8:0): Progress: 9% (6256/65536) complete
>>
>> Once the format is done, I can successfully format and partition using gpart.
>>
>> The errors that show up when I try to run gpart for the first time are as follows:
>>
>> (da9:mpr0:0:17:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
>> (da9:mpr0:0:17:0): CAM status: SCSI Status Error
>> (da9:mpr0:0:17:0): SCSI status: Check Condition
>> (da9:mpr0:0:17:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
>> (da9:mpr0:0:17:0): Error 22, Unretryable error
>>
>> It seems that it's the read that is failing, and is being tagged as an illegal request. While sg_format will address the issue at hand, this isn't an option for us, because there are appliances that were formatted using FreeBSD 10.2, and an upgrade to 10.3 or 11.x might be an issue?
>>
>> ________________________________
>> From: Kenneth D. Merry [ken@FreeBSD.ORG]
>> Sent: Monday, October 16, 2017 7:31 PM
>> To: Shiva Bhanujan
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> On Tue, Oct 17, 2017 at 01:19:27 +0000, Shiva Bhanujan wrote:
>> Hi Ken,
>>
>> I've attached the output of dmesg. Here's the SCSI CDB for a sample drive, da3.
>>
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 00 00 01 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 00 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 00 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 00 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 02 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 22 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 22 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fc 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fe 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 00 00 01 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 00 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 00 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 00 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 02 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 22 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 22 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fc 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fe 22 00 01 00 00
>>
>> My understanding is that FreeBSD 11.1 contains the mpr(4) driver? I've tried this w/ 11.1, w/ the same results.
>>
>> Yes, the mpr(4) driver is in all recent FreeBSD releases.
>>
>> In looking at the dmesg, this is telling:
>>
>> (da0:mpr0:0:8:0): WRITE(6). CDB: 0a 00 00 00 01 00
>> (da0:mpr0:0:8:0): CAM status: SCSI Status Error
>> (da0:mpr0:0:8:0): SCSI status: Check Condition
>> (da0:mpr0:0:8:0): SCSI sense: NOT READY asc:4,4 (Logical unit not ready, format in progress)
>> (da0:mpr0:0:8:0): Progress: 9% (6256/65536) complete
>> (da0:mpr0:0:8:0): Error 16, Unretryable error
>>
>> If the drives are in the process of formatting, I guess it may make sense
>> for them to reject read commands. Otherwise, it makes no sense for a hard
>> drive to reject reads.
>>
>> Are you able to check the status of the format? You should be able to send
>> a test unit ready and figure out how far along the format is:
>>
>> camcontrol tur da0 -v
>>
>> And so on for each of the drives.
>>
>> Ken
>>
>> ________________________________
>> From: Kenneth D. Merry [ken@FreeBSD.ORG]
>> Sent: Monday, October 16, 2017 7:42 AM
>> To: Shiva Bhanujan
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> On Fri, Oct 13, 2017 at 20:12:02 +0000, Shiva Bhanujan wrote:
>> Hello,
>>
>> I have a FreeBSD 10.3 install in a HVM on XenServer 6.5. The HBA330 SAS-3 controller is in pcipassthrough mode to the FreeBSD VM. When I try to access the disks (/dev/da0...) using gpart, I get SCSI errors, like the following:
>>
>> (da0:mpr0:0:0:0): CAM status: SCSI Status Error
>> (da0:mpr0:0:0:0): SCSI status: Check Condition
>> (da0:mpr0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
>>
>> The error message above is missing the SCSI CDB. We need that in order to
>> figure out what command the drive is complaining about.
>>
>> The error message means that FreeBSD is sending a SCSI command that the
>> drive doesn't support. That can be benign, or it can cause a problem.
>>
>> So, what error does gpart give you when you have this problem?
>>
>> I get the same errors w/ FreeBSD 11.0 also. Running 10.3 natively also has the same result.
>>
>> Please note that these errors don't show up on a Fusion-MPT SAS-2 controller, or a MegaRAID SAS 2208 controller. Additionally, FreeBSD 10.2 doesn't have any SCSI errors on the HBA330 SAS-3 controller either.
>>
>> Is there a different version of the mpr driver I should be using? I haven't checked the differences between the mpr driver in 10.2 vs 10.3 and 11.0. I do see that there are others who have experienced these issues. Can somebody please provide me some pointers as to why this is occurring? Or if there are some driver changes that I might be able to incorporate?
>>
>> In general, the latest mpr(4) driver is the best one. The driver itself
>> generally doesn't send SCSI commands (there are a few exceptions), but
>> rather passes them through from the upper layers of CAM.
>>
>> Please note that I have gone through the mail titled "scsi error at SEAGATE ST1200MM0088 TT31" and have started sg_format on all the SEAGATE disks. Having said that, I still need to figure out what would happen if the disks were written to using FreeBSD 10.2, which doesn't seem to have SCSI errors, and when I try to upgrade to 10.3. Any help is appreciated.
>>
>> Send the full error messages, and we may be able to figure out what's going
>> on.
>>
>> Ken
>> --
>> Kenneth Merry
>> ken@FreeBSD.ORG
>>
>> ________________________________
>>
>> --
>> Kenneth Merry
>> ken@FreeBSD.ORG
>>
>> ________________________________
>> _______________________________________________
>> freebsd-scsi@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"

--Apple-Mail=_C43579B3-638F-4C3F-B4A7-22EDAD9BDBE9--
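The sg_readcap output quoted in the thread reports prot_en=1, p_type=1 as "type 2 protection", which can look off by one at first glance. A minimal sketch of that mapping follows, assuming the usual SBC layout of byte 12 of the READ CAPACITY (16) parameter data (PROT_EN in bit 0, P_TYPE in bits 3..1). The struct prot_info type and both helper functions are made up for this example and are not taken from sg3_utils or CAM.

```c
#include <stdint.h>

/* Protection fields from byte 12 of the READ CAPACITY (16) data. */
struct prot_info {
	unsigned prot_en;	/* bit 0: protection information enabled */
	unsigned p_type;	/* bits 3..1: encoded protection type */
};

/* Split the raw byte into its two fields. */
static struct prot_info
decode_prot_byte(uint8_t byte12)
{
	struct prot_info pi;

	pi.prot_en = byte12 & 0x01;
	pi.p_type = (byte12 >> 1) & 0x07;
	return (pi);
}

/*
 * Return 0 when the drive is formatted without protection information,
 * otherwise the protection type number, which is p_type + 1.
 */
static unsigned
protection_type(struct prot_info pi)
{
	return (pi.prot_en ? pi.p_type + 1 : 0);
}
```

For the ST4000NM0005 (prot_en=1, p_type=1) the byte would be 0x03 and protection_type() gives 2; for the ST4000NM0023 (prot_en=0) it gives 0.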