Date:      Mon, 17 Dec 2012 15:10:52 -0700
From:      Jim Harris <jim.harris@gmail.com>
To:        Willem Jan Withagen <wjw@digiware.nl>
Cc:        FreeBSD Stable Users <freebsd-stable@freebsd.org>
Subject:   Re: Strange CAM errors
Message-ID:  <CAJP=Hc-4mjWON5=Qi=WVzZ_wzGzz06MjX6w9S5t=xFfyAQ7jbA@mail.gmail.com>
In-Reply-To: <50CF925C.5040106@digiware.nl>
References:  <50CEFAC5.8000002@digiware.nl> <572946ED30FA47C69D6DCDD511CF6EB2@multiplay.co.uk> <50CF47A5.4090008@digiware.nl> <CAJP=Hc9q50qe4tXxmek_ZD6j=1CNQFwiO9XxtniOLdHZz6gWxw@mail.gmail.com> <50CF925C.5040106@digiware.nl>

On Mon, Dec 17, 2012 at 2:45 PM, Willem Jan Withagen <wjw@digiware.nl> wrote:

> On 17-12-2012 20:16, Jim Harris wrote:
> > The timeouts are occurring on inquiry commands to non-zero LUNs.
> > arcmsr(4) is returning CAM_SEL_TIMEOUT instead of CAM_DEV_NOT_THERE for
> > inquiry commands to this device and LUN > 0.  CAM_DEV_NOT_THERE is
> > preferred to remove these types of warnings, and similar patches have
> > gone into other SCSI drivers recently.
> >
> > Can you try this patch?
> >
> > Index: sys/dev/arcmsr/arcmsr.c
> > ===================================================================
> > --- sys/dev/arcmsr/arcmsr.c     (revision 244190)
> > +++ sys/dev/arcmsr/arcmsr.c     (working copy)
> > @@ -2439,7 +2439,7 @@
> >                 char *buffer=pccb->csio.data_ptr;
> >
> >                 if (pccb->ccb_h.target_lun) {
> > -                       pccb->ccb_h.status |= CAM_SEL_TIMEOUT;
> > +                       pccb->ccb_h.status |= CAM_DEV_NOT_THERE;
> >                         xpt_done(pccb);
> >                         return;
> >                 }
> >
>
> Hi Jim,
>
> The noise has gone down by a factor of 5, now I get:
>
> (probe6:arcmsr0:0:16:1): INQUIRY. CDB: 12 20 0 0 24 0
> (probe6:arcmsr0:0:16:1): CAM status: Unable to terminate I/O CCB request
> (probe6:arcmsr0:0:16:1): Error 5, Unretryable error
> (probe6:arcmsr0:0:16:2): INQUIRY. CDB: 12 40 0 0 24 0
>
> Which is defined in sys/cam/cam.c ....
> as CAM_UA_TERMIO, but that error is nowhere set in the arcmsr code....
>
>
There is something out of sync on your system.  I only just noticed this,
but your original error messages showed "Command timeout"
(CAM_CMD_TIMEOUT) even though the driver was returning CAM_SEL_TIMEOUT.
Now the driver is returning CAM_DEV_NOT_THERE, but CAM is printing the
error message for CAM_UA_TERMIO.  In both cases the driver returns value X
and CAM interprets it as X+1, so CAM and arcmsr(4) seem to disagree about
the values of the cam_status enumeration.
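
To make the X+1 pattern concrete, here is a minimal sketch rather than
anything taken from the kernel.  The enum is an abridged stand-in for
cam_status; the numeric values are assumed from sys/cam/cam.h and should be
checked against your own tree, and every sk_ name is hypothetical.  It only
encodes the two observations from this thread.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Abridged stand-in for the relevant run of cam_status values.
 * ASSUMPTION: the numeric values mirror sys/cam/cam.h; verify locally.
 */
enum sk_cam_status {
	SK_CAM_DEV_NOT_THERE = 0x08,
	SK_CAM_UA_TERMIO     = 0x09,
	SK_CAM_SEL_TIMEOUT   = 0x0a,
	SK_CAM_CMD_TIMEOUT   = 0x0b
};

/* Map a status value from the sketch enum back to its name. */
const char *
sk_status_name(int status)
{
	switch (status) {
	case SK_CAM_DEV_NOT_THERE: return "CAM_DEV_NOT_THERE";
	case SK_CAM_UA_TERMIO:     return "CAM_UA_TERMIO";
	case SK_CAM_SEL_TIMEOUT:   return "CAM_SEL_TIMEOUT";
	case SK_CAM_CMD_TIMEOUT:   return "CAM_CMD_TIMEOUT";
	default:                   return "(not in sketch)";
	}
}

/* The status CAM appeared to print for a given driver return value. */
int
sk_status_as_printed(int returned)
{
	return returned + 1;
}

/* Check both observations from the thread against the X+1 hypothesis. */
void
sk_check_thread_observations(void)
{
	assert(sk_status_as_printed(SK_CAM_SEL_TIMEOUT) == SK_CAM_CMD_TIMEOUT);
	assert(sk_status_as_printed(SK_CAM_DEV_NOT_THERE) == SK_CAM_UA_TERMIO);
}

/* Print one "driver returned / CAM printed" pair. */
void
sk_report(int returned)
{
	printf("driver returned %s, CAM printed %s\n",
	    sk_status_name(returned),
	    sk_status_name(sk_status_as_printed(returned)));
}
```

Calling sk_report(SK_CAM_SEL_TIMEOUT) and sk_report(SK_CAM_DEV_NOT_THERE)
from a small main() reproduces the two pairs seen before and after the
patch.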

Can you provide details on your build environment?  Are you building arcmsr
as a loadable module, or do you specify "device arcmsr" in your kernel
config to link it statically?  I suspect a loadable module, although I
have no idea how these values would get out of sync, since this enumeration
hasn't changed in probably 10+ years.

-Jim


