Date: Mon, 17 Dec 2012 15:10:52 -0700
From: Jim Harris <jim.harris@gmail.com>
To: Willem Jan Withagen <wjw@digiware.nl>
Cc: FreeBSD Stable Users <freebsd-stable@freebsd.org>
Subject: Re: Strange CAM errors
Message-ID: <CAJP=Hc-4mjWON5=Qi=WVzZ_wzGzz06MjX6w9S5t=xFfyAQ7jbA@mail.gmail.com>
In-Reply-To: <50CF925C.5040106@digiware.nl>
References: <50CEFAC5.8000002@digiware.nl> <572946ED30FA47C69D6DCDD511CF6EB2@multiplay.co.uk> <50CF47A5.4090008@digiware.nl> <CAJP=Hc9q50qe4tXxmek_ZD6j=1CNQFwiO9XxtniOLdHZz6gWxw@mail.gmail.com> <50CF925C.5040106@digiware.nl>

On Mon, Dec 17, 2012 at 2:45 PM, Willem Jan Withagen <wjw@digiware.nl> wrote:

> On 17-12-2012 20:16, Jim Harris wrote:
> > The timeouts are occurring on inquiry commands to non-zero LUNs.
> > arcmsr(4) is returning CAM_SEL_TIMEOUT instead of CAM_DEV_NOT_THERE for
> > inquiry commands to this device and LUN > 0. CAM_DEV_NOT_THERE is
> > preferred to remove these types of warnings, and similar patches have
> > gone into other SCSI drivers recently.
> >
> > Can you try this patch?
> >
> > Index: sys/dev/arcmsr/arcmsr.c
> > ===================================================================
> > --- sys/dev/arcmsr/arcmsr.c (revision 244190)
> > +++ sys/dev/arcmsr/arcmsr.c (working copy)
> > @@ -2439,7 +2439,7 @@
> >         char *buffer=pccb->csio.data_ptr;
> >
> >         if (pccb->ccb_h.target_lun) {
> > -               pccb->ccb_h.status |= CAM_SEL_TIMEOUT;
> > +               pccb->ccb_h.status |= CAM_DEV_NOT_THERE;
> >                 xpt_done(pccb);
> >                 return;
> >         }
> >
>
> Hi Jim,
>
> The noise has gone down by a factor of 5, now I get:
>
> (probe6:arcmsr0:0:16:1): INQUIRY. CDB: 12 20 0 0 24 0
> (probe6:arcmsr0:0:16:1): CAM status: Unable to terminate I/O CCB request
> (probe6:arcmsr0:0:16:1): Error 5, Unretryable error
> (probe6:arcmsr0:0:16:2): INQUIRY. CDB: 12 40 0 0 24 0
>
> Which is defined in sys/cam/cam.c ....
> as CAM_UA_TERMIO, but that error is nowhere set in the arcmsr code....
>

There is something out of sync on your system. I just noticed this, but
your original error messages were showing "Command timeout"
(CAM_CMD_TIMEOUT) even though the driver was returning CAM_SEL_TIMEOUT.
Now, in this case, the driver is returning CAM_DEV_NOT_THERE, but CAM is
printing the error message for CAM_UA_TERMIO.

In both cases, the driver is returning value X, but CAM is interpreting it
as X+1. So CAM and arcmsr(4) seem to have a different idea of the values
of the cam_status enumeration.

Can you provide details on your build environment?
Are you building arcmsr as a loadable module, or do you specify
"device arcmsr" in your kernel config to link it statically? I suspect a
loadable module, although I have no idea how these values would get out
of sync, since this enumeration probably hasn't changed in 10+ years.

-Jim
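
To make the "driver returns X, CAM reports X+1" observation above concrete, here is a minimal sketch in C. It is not code from arcmsr(4) or CAM. The DEMO_ enumeration is a local stand-in for four of the cam_status codes, with numeric values that I believe match sys/cam/cam.h (please verify against your own tree), and the helper functions are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Local stand-in for four cam_status codes.
 * ASSUMPTION: the numeric values mirror sys/cam/cam.h; check your tree.
 */
enum demo_cam_status {
	DEMO_CAM_DEV_NOT_THERE = 0x08,
	DEMO_CAM_UA_TERMIO     = 0x09,
	DEMO_CAM_SEL_TIMEOUT   = 0x0a,
	DEMO_CAM_CMD_TIMEOUT   = 0x0b
};

/* Hypothetical helper: map a status value to its symbolic name. */
static const char *
demo_status_name(int status)
{
	switch (status) {
	case DEMO_CAM_DEV_NOT_THERE:
		return "CAM_DEV_NOT_THERE";
	case DEMO_CAM_UA_TERMIO:
		return "CAM_UA_TERMIO";
	case DEMO_CAM_SEL_TIMEOUT:
		return "CAM_SEL_TIMEOUT";
	case DEMO_CAM_CMD_TIMEOUT:
		return "CAM_CMD_TIMEOUT";
	default:
		return "(not in this demo table)";
	}
}

/*
 * Hypothetical helper: the name that would be reported if the value
 * set by the driver (X) were decoded as X+1.
 */
static const char *
demo_name_if_off_by_one(int driver_status)
{
	return demo_status_name(driver_status + 1);
}

/* Restate the two observations from this thread as assertions. */
static void
demo_self_check(void)
{
	/* Before the patch: CAM_SEL_TIMEOUT set, "Command timeout" printed. */
	assert(strcmp(demo_name_if_off_by_one(DEMO_CAM_SEL_TIMEOUT),
	    "CAM_CMD_TIMEOUT") == 0);
	/* After the patch: CAM_DEV_NOT_THERE set, CAM_UA_TERMIO printed. */
	assert(strcmp(demo_name_if_off_by_one(DEMO_CAM_DEV_NOT_THERE),
	    "CAM_UA_TERMIO") == 0);
}
```

Calling demo_self_check() from a small test program passes only because each reported code sits exactly one above the code the driver sets. That is the pattern described above. The sketch says nothing about why the shift occurs on this system.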