Date:      Fri, 20 Oct 2017 10:34:54 -0400
From:      Ken Merry <ken@freebsd.org>
To:        Shiva Bhanujan <Shiva.Bhanujan@Quorum.com>
Cc:        Steven Hartland <killing@multiplay.co.uk>, "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Subject:   Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
Message-ID:  <95A66B26-548C-4BF5-9527-EE30F9C01D42@freebsd.org>
In-Reply-To: <3A5A10BE32AC9E45B4A22F89FC90EC0701C366674D@QLEXC01.Quorum.local>
References:  <3A5A10BE32AC9E45B4A22F89FC90EC0701C3665D5D@QLEXC01.Quorum.local> <3A5A10BE32AC9E45B4A22F89FC90EC0701C3665E8B@QLEXC01.Quorum.local> <20171016144231.GA94858@mithlond.kdm.org> <3A5A10BE32AC9E45B4A22F89FC90EC0701C366610E@QLEXC01.Quorum.local> <20171017023126.GA6559@mithlond.kdm.org> <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666143@QLEXC01.Quorum.local> <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666345@QLEXC01.Quorum.local> <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666352@QLEXC01.Quorum.local> <3E746990-8C6D-4CA1-BD79-B5566CFB07F4@freebsd.org> <32C157CE-A122-435F-8430-9531BEEB5914@freebsd.org> <3A5A10BE32AC9E45B4A22F89FC90EC0701C3666647@QLEXC01.Quorum.local> <CAHEMsqY1ijV659MabfRPWecDVRbzwYGynd2hUfUOwmZe_Tg=Lw@mail.gmail.com> <3A5A10BE32AC9E45B4A22F89FC90EC0701C36666F5@QLEXC01.Quorum.local> <96BAD947-4AB0-4EAC-9DA8-4B1F10253287@freebsd.org> <3A5A10BE32AC9E45B4A22F89FC90EC0701C366674D@QLEXC01.Quorum.local>


--Apple-Mail=_C43579B3-638F-4C3F-B4A7-22EDAD9BDBE9
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8

Ok.  Yes, for #2, in theory we can disable EEDP / protection information in the mpr(4) driver, and if DPICZ is set, the drive won’t require setting protection information on read and write commands.

That should let you access the disks normally.

That said, I’ve never played with protection information before, so I don’t know for sure.  I have a drive that supports it and I’m formatting it now to turn on type 2 protection.  I may be able to debug things once I get that done.

In the meantime, I’ve attached a patch against the stable/10 version of the mpr(4) driver.  Apply this patch, then rebuild and reinstall your kernel.  Then you’ll be able to disable EEDP in the driver three different ways:

1.  Set hw.mpr.disable_eedp=1 in /boot/loader.conf.  That will disable EEDP / Protection Information for all mpr instances.
2.  Set dev.mpr.0.disable_eedp=1 in /boot/loader.conf.  That will disable EEDP for mpr0.
3.  Run sysctl dev.mpr.0.disable_eedp=1.  That will disable EEDP on the fly for mpr0.

Let me know what happens.

Ken
--
Ken Merry
ken@FreeBSD.ORG


--Apple-Mail=_C43579B3-638F-4C3F-B4A7-22EDAD9BDBE9
Content-Disposition: attachment;
	filename=mpr_eedp_diffs.stable10.20171020.txt
Content-Type: text/plain;
	x-unix-mode=0600;
	name="mpr_eedp_diffs.stable10.20171020.txt"
Content-Transfer-Encoding: quoted-printable

==== //depot/users/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr.c#5 - /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr.c ====
*** /tmp/tmp.44697.83	Fri Oct 20 08:17:31 2017
--- /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr.c	Fri Oct 20 07:46:02 2017
***************
*** 1513,1518 ****
--- 1513,1519 ----
  	TUNABLE_INT_FETCH("hw.mpr.enable_ssu", &sc->enable_ssu);
  	TUNABLE_INT_FETCH("hw.mpr.spinup_wait_time", &sc->spinup_wait_time);
  	TUNABLE_INT_FETCH("hw.mpr.use_phy_num", &sc->use_phynum);
+ 	TUNABLE_INT_FETCH("hw.mpr.disable_eedp", &sc->disable_eedp);
  
  	/* Grab the unit-instance variables */
  	snprintf(tmpstr, sizeof(tmpstr), "dev.mpr.%d.debug_level",
***************
*** 1551,1556 ****
--- 1552,1561 ----
  	snprintf(tmpstr, sizeof(tmpstr), "dev.mpr.%d.use_phy_num",
  	    device_get_unit(sc->mpr_dev));
  	TUNABLE_INT_FETCH(tmpstr, &sc->use_phynum);
+ 
+ 	snprintf(tmpstr, sizeof(tmpstr), "dev.mpr.%d.disable_eedp",
+ 	    device_get_unit(sc->mpr_dev));
+ 	TUNABLE_INT_FETCH(tmpstr, &sc->disable_eedp);
  }
  
  static void
***************
*** 1646,1651 ****
--- 1651,1660 ----
  	    "Use the phy number for enumeration");
  
  	SYSCTL_ADD_INT(sysctl_ctx, SYSCTL_CHILDREN(sysctl_tree),
+ 	    OID_AUTO, "disable_eedp", CTLFLAG_RW, &sc->disable_eedp, 0,
+ 	    "Disable Protection Info / EEDP");
+ 
+ 	SYSCTL_ADD_INT(sysctl_ctx, SYSCTL_CHILDREN(sysctl_tree),
  	    OID_AUTO, "prp_pages_free", CTLFLAG_RD,
  	    &sc->prp_pages_free, 0, "number of free PRP pages");
  
==== //depot/users/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr_sas.c#8 - /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr_sas.c ====
*** /tmp/tmp.44697.181	Fri Oct 20 08:17:31 2017
--- /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mpr_sas.c	Fri Oct 20 07:48:13 2017
***************
*** 2071,2077 ****
  	 * for EEDP transfer.
  	 */
  	eedp_flags = op_code_prot[req->CDB.CDB32[0]];
! 	if (sc->eedp_enabled && eedp_flags) {
  		SLIST_FOREACH(lun, &targ->luns, lun_link) {
  			if (lun->lun_id == csio->ccb_h.target_lun) {
  				break;
--- 2071,2077 ----
  	 * for EEDP transfer.
  	 */
  	eedp_flags = op_code_prot[req->CDB.CDB32[0]];
! 	if (sc->eedp_enabled && sc->disable_eedp == 0 && eedp_flags) {
  		SLIST_FOREACH(lun, &targ->luns, lun_link) {
  			if (lun->lun_id == csio->ccb_h.target_lun) {
  				break;
==== //depot/users/kenm/FreeBSD-stable/10/sys/dev/mpr/mprvar.h#3 - /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mprvar.h ====
*** /tmp/tmp.44697.281	Fri Oct 20 08:17:31 2017
--- /usr/home/kenm/perforce4/kenm/FreeBSD-stable/10/sys/dev/mpr/mprvar.h	Fri Oct 20 07:44:26 2017
***************
*** 297,302 ****
--- 297,303 ----
  	u_int				enable_ssu;
  	int				spinup_wait_time;
  	int				use_phynum;
+ 	int				disable_eedp;
  	uint64_t			chain_alloc_fail;
  	uint64_t			prp_page_alloc_fail;
  	struct sysctl_ctx_list		sysctl_ctx;

--Apple-Mail=_C43579B3-638F-4C3F-B4A7-22EDAD9BDBE9
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8



> On Oct 20, 2017, at 9:26 AM, Shiva Bhanujan <Shiva.Bhanujan@Quorum.com> wrote:
>
> [3] isn't an option, because we're specifically moving to FreeBSD 10.3 or above to get the 'restartable zfs send/receive' feature.  I believe for [2], this would require not setting 'type 2 protection' when formatting the disks?  We have appliances that are already formatted, so we'll need to figure out how to address that part.
>
> I'm not familiar w/ the mpr driver source code.  I can start looking @ it, but I'm afraid I'm not going to make much progress.  If you could help out w/ addressing this issue in 10.3 w/ the protection bit set for SCSI disks, that'd be great.
>
> From: Ken Merry [ken@freebsd.org]
> Sent: Thursday, October 19, 2017 7:09 PM
> To: Shiva Bhanujan
> Cc: Steven Hartland; freebsd-scsi@freebsd.org
> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>
> There are probably 3 solutions to get things fixed for you:
>
> 1. Track down why the mpr driver is sending requests that the drive doesn’t like and fix it.
>
> 2. Turn on DPICZ in the drive and disable setting protection information in the mpr driver. This will take commenting out the right code and recompiling the kernel.
>
> 3. Use an older version of FreeBSD to read the data, then reformat and write it back.
>
> We need to do #1 for the sake of the users who will run into this. #2 hopefully will work and won’t require running an old OS. #3 is essentially giving up.
>
> For #2, if you feel comfortable modifying the mpr driver, just look for the eedp code and turn it off. Otherwise I can try to come up with something tomorrow.
>
> Ken
> --
> Ken Merry
> ken@FreeBSD.ORG
>
>> On Oct 19, 2017, at 20:51, Shiva Bhanujan <Shiva.Bhanujan@Quorum.com> wrote:
>>
>> Would it be possible to read SCSI disks w/ type 2 protection by default in 10.3 and above? Please note that the issue I'm facing is that the partitions were created in FreeBSD 10.2, and the Seagate ST4000NM0005 drives were formatted w/ type 2 protection. Reads and writes to these disks work just fine in FreeBSD 10.2. However, if I upgrade to FreeBSD 10.3 or above, I get SCSI errors for only these disks. Would it be possible for SCSI reads to default to handling disks that have type 2 protection? In that case, removing type 2 protection would not be needed.
>>
>> All other params in the output of sg_readcap look the same for the ST4000NM0023, where this issue isn't seen.
>>
>> root@Filer:~ # camcontrol devlist | grep da0
>> <SEAGATE ST4000NM0023 GS14> at scbus2 target 0 lun 0 (pass1,da0)
>> root@Filer:~ # camcontrol devlist | grep da2
>> <SEAGATE ST4000NM0005 MS05> at scbus2 target 2 lun 0 (pass3,da2)
>> root@Filer:~ # sg_readcap --16 da0
>> Read Capacity results:
>> Protection: prot_en=0, p_type=0, p_i_exponent=0
>> Logical block provisioning: lbpme=0, lbprz=0
>> Last logical block address=7814037167 (0x1d1c0beaf), Number of logical blocks=7814037168
>> Logical block length=512 bytes
>> Logical blocks per physical block exponent=0
>> Lowest aligned logical block address=0
>> Hence:
>> Device size: 4000787030016 bytes, 3815447.8 MiB, 4000.79 GB
>> root@Filer:~ # sg_readcap --16 da2
>> Read Capacity results:
>> Protection: prot_en=1, p_type=1, p_i_exponent=0 [type 2 protection]
>> Logical block provisioning: lbpme=0, lbprz=0
>> Last logical block address=7814037167 (0x1d1c0beaf), Number of logical blocks=7814037168
>> Logical block length=512 bytes
>> Logical blocks per physical block exponent=0
>> Lowest aligned logical block address=0
>> Hence:
>> Device size: 4000787030016 bytes, 3815447.8 MiB, 4000.79 GB
>> root@Filer:~ #
>>
>> Is there some default SCSI read that is configurable in FreeBSD 10.3 and above?
>>
>> From: Steven Hartland [killing@multiplay.co.uk]
>> Sent: Thursday, October 19, 2017 3:23 PM
>> To: Ken Merry; Shiva Bhanujan
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> With type 2 protection the ref tag has to match the LBA + N
>>
>> Some info about it is here
>>
>> https://www.usenix.org/legacy/event/lsf07/tech/petersen.pdf
>>
>> https://www.seagate.com/files/staticfiles/docs/pdf/whitepaper/safeguarding-data-from-corruption-technology-paper-tp621us.pdf
>>
>> From reading the Seagate paper, the only way to change the protection level is to format.
>>
>> On Thu, 19 Oct 2017 at 22:43, Shiva Bhanujan <Shiva.Bhanujan@quorum.com> wrote:
>>
>> Here's the output of sg_readcap.
>>
>> [root@Filer-20-241 ~]# sg_readcap --16 da1
>> Read Capacity results:
>> Protection: prot_en=1, p_type=1, p_i_exponent=0 [type 2 protection]
>> Logical block provisioning: lbpme=0, lbprz=0
>> Last logical block address=7814037167 (0x1d1c0beaf), Number of logical blocks=7814037168
>> Logical block length=512 bytes
>> Logical blocks per physical block exponent=0
>> Lowest aligned logical block address=0
>> Hence:
>> Device size: 4000787030016 bytes, 3815447.8 MiB, 4000.79 GB
>> [root@Filer-20-241 ~]#
>>
>> [root@Filer-20-241 ~]# camcontrol modepage da1 -v -m 10 | grep DPICZ
>> DPICZ: 1
>> [root@Filer-20-241 ~]#
>>
>> I did toggle the DPICZ on the drive from 1 to 0 and back. sg_readcap still shows 'type 2 protection', and gpart still shows the SCSI errors. I've narrowed this down to the Seagate ST4000NM0005, w/ a DOM of 03/2016. We also have Constellation ES.3 drives, model ST4000NM0023, that don't exhibit this issue.
>>
>> I did go through the URLs that you mentioned, and that's how I found that using sg_format did address this issue. Again, that works only for new drives, and we have appliances that already have data, for which sg_format isn't an option.
>>
>> Is this boiling down to the 'type 2 protection' tag that we see in sg_readcap? If so, would there be a way to turn it off?
>>
>> From: Ken Merry [ken@freebsd.org]
>> Sent: Thursday, October 19, 2017 2:17 PM
>> To: Shiva Bhanujan
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> By the way, the message you referenced is here:
>>
>> https://lists.freebsd.org/pipermail/freebsd-scsi/2017-January/007237.html
>>
>> And there is more here:
>>
>> https://bugs.freenas.org/issues/14517
>>
>> Ken
>> --
>> Ken Merry
>> ken@FreeBSD.ORG
>>
>> On Oct 19, 2017, at 5:15 PM, Ken Merry <ken@freebsd.org> wrote:
>>
>> What does sg_readcap --16 show for these drives?
>>
>> If it has type 2 protection turned on, check to see what the DPICZ value is in the control mode page:
>>
>> camcontrol modepage daX -v -m 10
>>
>> If that is set to 0, add a -e to the above command line and see if you can set it to 1. That may or may not help anything.
>>
>> The mpr(4) and mps(4) drivers try to support protection information, so they set it on commands whenever it is turned on in the drive. For that reason, setting the DPICZ bit may not fix it.
>>
>> There could be a problem with how that is implemented that’s causing the drives to reject the command, but I’m not sure.
>>
>> If it is, Steve (CCed) can help us debug it.
>>
>> Ken
>> --
>> Ken Merry
>> ken@FreeBSD.ORG
>>
>> On Oct 17, 2017, at 9:54 PM, Shiva Bhanujan <Shiva.Bhanujan@Quorum.com> wrote:
>>
>> Sorry, perhaps I failed to mention: the SCSI errors are only w/ the Seagate drives. These are of the model ST4000NM0023.
>>
>> From: Shiva Bhanujan
>> Sent: Tuesday, October 17, 2017 6:53 PM
>> To: Kenneth D. Merry
>> Cc: freebsd-scsi@freebsd.org
>> Subject: RE: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> Please note that this isn't an issue w/ Toshiba drives. Is this a firmware issue by any chance?
>>
>> From: owner-freebsd-scsi@freebsd.org [owner-freebsd-scsi@freebsd.org] on behalf of Shiva Bhanujan [shiva.bhanujan@quorum.net]
>> Sent: Tuesday, October 17, 2017 6:08 AM
>> To: Kenneth D. Merry
>> Cc: freebsd-scsi@freebsd.org
>> Subject: RE: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> Since I started having the SCSI errors, I ended up running sg_format to format the disks. I've found that once the disks are formatted using sg_format, there are no SCSI errors. The errors that show up during the format are towards the end of the dmesg output.
>>
>> (da0:mpr0:0:8:0): SCSI sense: NOT READY asc:4,4 (Logical unit not ready, format in progress)
>> (da0:mpr0:0:8:0): Progress: 9% (6256/65536) complete
>>
>> Once the format is done, I can successfully format and partition using gpart.
>>
>> The errors that show up when I try to run gpart for the first time are as follows:
>>
>> (da9:mpr0:0:17:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
>> (da9:mpr0:0:17:0): CAM status: SCSI Status Error
>> (da9:mpr0:0:17:0): SCSI status: Check Condition
>> (da9:mpr0:0:17:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
>> (da9:mpr0:0:17:0): Error 22, Unretryable error
>>
>> It seems that it's the read that is failing, and is being tagged as an illegal request. While sg_format will address the issue at hand, this isn't an option for us, because there are appliances that were formatted using FreeBSD 10.2, and an upgrade to 10.3 or 11.x might be an issue?
>>
>> ________________________________
>> From: Kenneth D. Merry [ken@FreeBSD.ORG]
>> Sent: Monday, October 16, 2017 7:31 PM
>> To: Shiva Bhanujan
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> On Tue, Oct 17, 2017 at 01:19:27 +0000, Shiva Bhanujan wrote:
>>
>> Hi Ken,
>>
>> I've attached the output of dmesg. Here's the SCSI CDB for a sample drive, da3.
>>
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 00 00 01 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 00 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 00 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 00 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 02 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 22 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 22 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fc 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fe 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 00 00 01 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 00 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 00 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 00 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 a0 02 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 22 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 22 00 00 01 00 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 00 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 00 02 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fc 22 00 01 00 00
>> (da3:mpr0:0:11:0): READ(10). CDB: 28 00 00 9f fe 22 00 01 00 00
>>
>> My understanding is that FreeBSD 11.1 contains the mpr(4) driver? I've tried this w/ 11.1, w/ the same results.
>>
>> Yes, the mpr(4) driver is in all recent FreeBSD releases.
>>
>> In looking at the dmesg, this is telling:
>>
>> (da0:mpr0:0:8:0): WRITE(6). CDB: 0a 00 00 00 01 00
>> (da0:mpr0:0:8:0): CAM status: SCSI Status Error
>> (da0:mpr0:0:8:0): SCSI status: Check Condition
>> (da0:mpr0:0:8:0): SCSI sense: NOT READY asc:4,4 (Logical unit not ready, format in progress)
>> (da0:mpr0:0:8:0): Progress: 9% (6256/65536) complete
>> (da0:mpr0:0:8:0): Error 16, Unretryable error
>>
>> If the drives are in the process of formatting, I guess it may make sense for them to reject read commands. Otherwise, it makes no sense for a hard drive to reject reads.
>>
>> Are you able to check the status of the format? You should be able to send a test unit ready and figure out how far along the format is:
>>
>> camcontrol tur da0 -v
>>
>> And so on for each of the drives.
>>
>> Ken
>>
>> ________________________________
>> From: Kenneth D. Merry [ken@FreeBSD.ORG]
>> Sent: Monday, October 16, 2017 7:42 AM
>> To: Shiva Bhanujan
>> Cc: freebsd-scsi@freebsd.org
>> Subject: Re: FreeBSD 10.3/11.0 SCSI errors with Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3
>>
>> On Fri, Oct 13, 2017 at 20:12:02 +0000, Shiva Bhanujan wrote:
>>
>> Hello,
>>
>> I have a FreeBSD 10.3 install in a HVM on XenServer 6.5. The HBA330 SAS-3 controller is in pcipassthrough mode to the FreeBSD VM. When I try to access the disks (/dev/da0...) using gpart, I get SCSI errors, like the following:
>>
>> (da0:mpr0:0:0:0): CAM status: SCSI Status Error
>> (da0:mpr0:0:0:0): SCSI status: Check Condition
>> (da0:mpr0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
>>
>> The error message above is missing the SCSI CDB. We need that in order to figure out what command the drive is complaining about.
>>
>> The error message means that FreeBSD is sending a SCSI command that the drive doesn't support. That can be benign, or it can cause a problem.
>>
>> So, what error does gpart give you when you have this problem?
>>
>> I get the same errors w/ FreeBSD 11.0 also. Running 10.3 natively also has the same result.
>>
>> Please note that these errors don't show up on a Fusion-MPT SAS-2 controller, or a MegaRAID SAS 2208 controller. Additionally, FreeBSD 10.2 doesn't have any SCSI errors on the HBA330 SAS-3 controller either.
>>
>> Is there a different version of the mpr driver I should be using? I haven't checked the differences between the mpr driver in 10.2 vs 10.3 and 11.0. I do see that there are others who have experienced these issues. Can somebody please provide me some pointers as to why this is occurring? Or if there are some driver changes that I might be able to incorporate?
>>
>> In general, the latest mpr(4) driver is the best one. The driver itself generally doesn't send SCSI commands (there are a few exceptions), but rather passes them through from the upper layers of CAM.
>>
>> Please note that I have gone through the mail titled "scsi error at SEAGATE ST1200MM0088 TT31" and have started sg_format on all the SEAGATE disks. Having said that, I still need to figure out what would happen if the disks were written to using FreeBSD 10.2, which doesn't seem to have SCSI errors, and I then try to upgrade to 10.3. Any help is appreciated.
>>
>> Send the full error messages, and we may be able to figure out what's going on.
>>
>> Ken
>> --
>> Kenneth Merry
>> ken@FreeBSD.ORG
>> ________________________________
>>
>> --
>> Kenneth Merry
>> ken@FreeBSD.ORG
>> ________________________________
>> _______________________________________________
>> freebsd-scsi@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"


--Apple-Mail=_C43579B3-638F-4C3F-B4A7-22EDAD9BDBE9--


