From: Ken Merry
To: fengyd
Cc: freebsd-scsi@freebsd.org
Subject: Re: What does the error code 82 mean?
Date: Wed, 4 Mar 2015 09:24:15 -0700
Message-Id: <8689E5CD-4E89-48D1-B0EE-3821E7174A0D@freebsd.org>
References: <20150303065052.GA98687@mithlond.kdm.org>
List-Id: SCSI subsystem

The challenge is that the data transfer rate is reset on the target for
both the initiator doing the reset, and the other initiator.

So re-negotiating from the initiator that did the reset will do no good.
You need to re-negotiate from the other initiator.

You can either detect the situation from a unit attention (that you will
get in response to a Test Unit Ready) returned from the target, or you
can communicate between the nodes so that the other node knows that it
needs to re-negotiate.

Ken
-- 
Ken Merry
ken@FreeBSD.ORG

> On Mar 4, 2015, at 2:44 AM, fengyd wrote:
>
> Hi,
>
> The code to reset the target:
> static void sym_reset_dev(hcb_p np, union ccb *ccb)
> {
>         tcb_p tp;
>         struct ccb_hdr *ccb_h = &ccb->ccb_h;
>
>         if (ccb_h->target_id == np->myaddr ||
>             ccb_h->target_id >= SYM_CONF_MAX_TARGET ||
>             ccb_h->target_lun >= SYM_CONF_MAX_LUN) {
>                 sym_xpt_done2(np, ccb, CAM_DEV_NOT_THERE);
>                 return;
>         }
>
>         tp = &np->target[ccb_h->target_id];
>
>         tp->to_reset = 1;
>         sym_xpt_done2(np, ccb, CAM_REQ_CMP);
>
>         np->istat_sem = SEM;
>         OUTB (nc_istat, SIGP|SEM);
>         return;
> }
>
> Can the target reset set the data transfer width to the size provided
> by the driver?
>
> Thanks for your help.
>
> Br.
> Yafeng
>
> On Wed, Mar 4, 2015 at 5:40 PM, fengyd wrote:
> Hi,
>
> It seems that during initialization the driver sets the data transfer
> width to 16-bit, and the target reset sets it to 8-bit.
> So does that mean the default data transfer width for the drive is 8-bit?
>
> -You might try seeing what the ahc(4) and ahd(4) drivers do in this
> situation.
> I didn't find the code related to ahc or ahd.
> Do you know in which release ahc and ahd are implemented?
>
> -If you have an idea that this may have happened, you can try doing a
> bus or target rescan.
> I have just begun to study FreeBSD drivers.
> Could you give some instructions on how to do a bus or target rescan?
>
> -Just out of curiosity, why are you doing multi-initiator with this
> hardware?
> Two units need to access the device at the same time.
>
> Thanks for your help.
>
> Br.
> Yafeng
>
> On Wed, Mar 4, 2015 at 12:28 AM, Ken Merry wrote:
> It sounds like the target reset is causing the drive to reset its
> negotiation parameters, and go back to narrow SCSI.
>
> UNIT1 still thinks it is talking wide SCSI, but the drive is actually
> talking 8 bit. So the drive sends back the 64 bytes of inquiry data in
> 64 bus clocks. The drive is only changing the bottom 8 bits, but the
> controller thinks it is driving all 16, and records the top 8 bits as
> zeros.
>
> The result is that you get 64 bytes of "extra" data, and every other
> byte is zero.
>
> So, you'll need to figure out a way for the sym(4) driver to figure out
> that the target has been reset, and re-negotiate with the drive.
>
> You might try seeing what the ahc(4) and ahd(4) drivers do in this
> situation. I don't know whether or not they actually handle it, but it
> might be instructive to look.
>
> If you have an idea that this may have happened, you can try doing a
> bus or target rescan. That may go through the domain validation path
> and trigger re-negotiation with the target.
>
> Just out of curiosity, why are you doing multi-initiator with this
> hardware? It would probably be easier to do all of this with more
> modern SAS hardware and expanders.
>
> Ken
> -- 
> Ken Merry
> ken@FreeBSD.ORG
>
>> On Mar 3, 2015, at 12:50 AM, fengyd wrote:
>>
>> Hi,
>>
>> Thanks very much for your reply.
>>
>> -How are you sending the INQUIRY command?
>> Yes.
>> -Are you sending it via the pass(4) driver?
>> Yes
>> -How many bytes are you asking for in the CDB?
>> 64
>> -How many bytes are you setting in the dxfer_len field in the CCB?
>> 64, but it seems the device wants to transfer 128 bytes.
>>
>> -What kind of device are you talking to?
>> Some kernel log:
>> da3 at sym1 bus 0 target 0 lun 0
>> da3: Fixed Direct Access SCSI-3 device
>> da3: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged Queueing Enabled
>> da3: 70136MB (143638992 512 byte sectors: 255H 63S/T 8941C)
>>
>> [diagram not preserved]
>>
>> The brief connections as above:
>> UNIT0 can access DISK0 and DISK1 by IOC0.
>> UNIT1 can access DISK0 and DISK1 by IOC1.
>>
>> The problem happens when UNIT0 sends XPT_RESET_DEV to reset one disk,
>> UNIT1 sends INQUIRY to get the basic information from the target, but
>> fails to get the correct information.
>>
>> And I added some log.
>>
>> The right information got from device:
>>
>> 00 00 03 12 5B 00 01 3A 46 55 4A 49 54 53 55 20
>> 4D 42 41 33 30 37 33 4E 50 20 20 20 20 20 20 20
>> 34 37 30 32 42 42 53 32 50 41 41 30 31 31 46 34
>> 00 00 00 00 00 00 00 00 0F 00 00 40 0B 54 01 3C
>>
>> The wrong information got from device:
>>
>> 00 00 00 00 03 00 12 00 5B 00 00 00 01 00 3A 00
>> 46 00 55 00 4A 00 49 00 54 00 53 00 55 00 20 00
>> 4D 00 42 00 41 00 33 00 30 00 37 00 33 00 4E 00
>> 50 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00
>>
>> Compared to the right log, it seems one extra byte 00 is added after
>> every byte.
>>
>> Thanks for your help.
>>
>> Br.
>> Yafeng
>>
>> On Tue, Mar 3, 2015 at 2:50 PM, Kenneth D. Merry wrote:
>>
>> An overrun is exactly what the comment below indicates. It is when the
>> target sends back more data than you asked for. You will generally see it
>> on commands that receive data from a target.
>>
>> How are you sending the INQUIRY command? Are you sending it via the
>> pass(4) driver? How many bytes are you asking for in the CDB? How many
>> bytes are you setting in the dxfer_len field in the CCB?
>>
>> What kind of device are you talking to?
>> Obviously, you're using the sym(4)
>> driver, so I'm guessing this is a parallel SCSI device (unless there is a
>> virtualization stack that emulates the sym(4) hardware).
>>
>> Ken
>>
>> On Mon, Mar 02, 2015 at 15:49:57 +0800, fengyd wrote:
>> > Hi,
>> >
>> > I found the related code in the function sym_int_sir:
>> >         /*
>> >          * The device wants us to transfer more data than
>> >          * expected or in the wrong direction.
>> >          * The number of extra bytes is in scratcha.
>> >          * It is a data overrun condition.
>> >          */
>> >         case SIR_DATA_OVERRUN:
>> >                 if (cp) {
>> >                         OUTONB (HF_PRT, HF_EXT_ERR);
>> >                         cp->xerr_status |= XE_EXTRA_DATA;
>> >                         cp->extra_bytes += INL (nc_scratcha);
>> >                 }
>> >                 goto out;
>> >
>> > I'm not familiar with SCSI.
>> > What does DATA_OVERRUN actually mean?
>> > How can it be triggered?
>> > Could you give more details about it?
>> >
>> > Thanks for your help.
>> >
>> > Br.
>> > Yafeng
>> >
>> > On Sat, Feb 28, 2015 at 4:50 PM, fengyd wrote:
>> >
>> > > Hi,
>> > >
>> > > It seems the error code 82 & 3F is 0x12.
>> > > And the definition of the error code in the file cam.h:
>> > >         CAM_AUTOSENSE_FAIL = 0x10,/* Autosense: request sense cmd fail */
>> > >         CAM_NO_HBA,               /* No HBA Detected error */
>> > >         CAM_DATA_RUN_ERR,         /* Data Overrun error */
>> > >
>> > > So, it means data overrun error?
>> > >
>> > > Thanks.
>> > >
>> > > Br.
>> > > Yafeng
>> > >
>> > > On Sat, Feb 28, 2015 at 4:32 PM, fengyd wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> INQUIRY command is sent to the target, but error code 82 is returned.
>> > >> I added some log in the driver:
>> > >> SIR_COMPLETE_ERROR
>> > >> (pass0:sym0:0:0:0): sym_complete_error status = 18
>> > >> (pass0:sym0:0:0:0): status = 82
>> > >>
>> > >> Do you know what the error code 82 means?
>> > >>
>> > >> Thanks in advance.
>> > >>
>> > >> Br.
>> > >> Yafeng
>> > >>
>> > _______________________________________________
>> > freebsd-scsi@freebsd.org mailing list
>> > http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>> > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
>>
>> --
>> Kenneth Merry
>> ken@FreeBSD.ORG
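
Two observations in this thread can be checked with a small standalone C sketch. It is not taken from the sym(4) driver. It first masks the logged status value 82 (decimal) down to its status code, as in the "82 & 3F" calculation above, and then models a drive sending 8-bit data to a controller that still latches 16 bits per bus clock. The function names cam_status_code and narrow_on_wide and the local macro names are hypothetical. The 0x3F mask and the 0x12 code are the values quoted in the thread. The reading of bit 0x40 as the device-queue-frozen flag from cam.h is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as quoted in the thread; the macro names are local and hypothetical. */
#define STATUS_MASK    0x3Fu  /* low six bits carry the CAM status code */
#define DEV_QFRZN_BIT  0x40u  /* assumption: device-queue-frozen flag */
#define DATA_RUN_ERR   0x12u  /* CAM_DATA_RUN_ERR: 0x10, 0x11, then 0x12 */

/* Strip the flag bits and return only the status code. */
unsigned cam_status_code(unsigned status)
{
    return status & STATUS_MASK;
}

/*
 * Model a narrow (8-bit) target feeding a controller that still believes
 * the bus is wide (16-bit). Each byte sent takes one bus clock, and the
 * controller stores two bytes per clock: the real low byte followed by
 * 0x00 for the undriven high byte. Returns the number of bytes that did
 * not fit in buf, which the initiator would see as a data overrun.
 */
size_t narrow_on_wide(const uint8_t *sent, size_t n,
                      uint8_t *buf, size_t cap)
{
    assert(sent != NULL && buf != NULL);

    size_t produced = 2 * n;  /* two stored bytes per byte sent */
    for (size_t i = 0; i < produced && i < cap; i++)
        buf[i] = (i % 2 == 0) ? sent[i / 2] : 0x00;

    return produced > cap ? produced - cap : 0;
}
```

Under these assumptions, cam_status_code(82) gives 0x12, the data overrun code. Passing the 64 bytes of correct INQUIRY data through narrow_on_wide with a 64-byte buffer fills the buffer with the zero-interleaved pattern from the "wrong" dump and reports 64 bytes left over, which matches the 128 bytes the device appeared to want to transfer.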