Date:      Sat, 28 Apr 2001 23:33:06 -0600
From:      "Kenneth D. Merry" <ken@kdm.org>
To:        Joerg Wunsch <joerg_wunsch@uriah.heep.sax.de>
Cc:        freebsd-scsi@FreeBSD.ORG
Subject:   Re: sa(4) jamming
Message-ID:  <20010428233306.A37621@panzer.kdm.org>
In-Reply-To: <20010428210359.Q50185@uriah.heep.sax.de>; from j@uriah.heep.sax.de on Sat, Apr 28, 2001 at 09:03:59PM +0200
References:  <200104271649.f3RGmts35017@aslan.scsiguy.com> <200104271700.f3RH01s35435@aslan.scsiguy.com> <20010428210359.Q50185@uriah.heep.sax.de>


--k1lZvvs/B4yU6o8G
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Apr 28, 2001 at 21:03:59 +0200, J Wunsch wrote:
> As Justin T. Gibbs wrote:
> 
> > >That depends on whether the sa driver relies on any settings (mode
> > >page or otherwise) that are invalidated by a reset, but aren't
> > >restored when the bus reset async event occurs.
> > 
> > Just to be more clear here, it is the peripheral driver's responsibility
> > to restore any of this state if it needs to.
> 
> I don't doubt this.  It only surprises me how many driver bugs have
> been unobscured by the CAM error handling reorg.  All those things (sa
> driver ILI handling, pt driver invalidating the device after a
> power-on UA, sa driver running into obscure problems after a bus reset
> without reloading the tape) didn't happen with earlier versions of the
> CAM subsystem.

I think some problems are inevitable with a rewrite as large as the error
recovery rewrite.
 
What surprised me is that we didn't get any bug reports after the changes
first went into the tree on March 27th.  Your bug reports are (as far as I
can remember) the first.

> Please get me right, i don't want to grumble about CAM here, but i'd
> like to see bugs fixed as they become obvious in -current, and the
> first step is always to make sure where the bug actually is.

Yep, we definitely want to fix them.

> Also Justin, you still didn't respond whether the patch for the `NO
> SENSE' handling i posted is what you were referring to.  Right now,
> cam_periph_error() effectively returns ERESTART in that case.

I think your patch was on target, but the action string needs to be set as
well.  Even with an action of SS_NOP, there are cases where sense will be
printed.  (e.g. when a block is reallocated and a drive informs us of it)

IIRC, in Justin's original patches SS_NOP wasn't even in the case
statement.  So you'd get a panic whenever SS_NOP came back from
scsi_error_action().  I added it to the retry case, which wasn't really the
correct thing to do in hindsight.
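
To illustrate the fall-through described above, here is a minimal userland
sketch, not the actual cam_periph.c code.  The SS_* values, the EIO default,
and the function names are hypothetical stand-ins chosen for illustration;
only the two action strings for SS_NOP and SS_RETRY and the ERESTART/0
results come from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Assumption: the kernel's ERESTART may not be visible in userland. */
#ifndef ERESTART
#define ERESTART (-1)
#endif

/* Hypothetical values; the real ones live in the CAM headers. */
enum { SS_NOP = 0x00, SS_RETRY = 0x01, SS_MASK = 0x0f };

/* Behaviour with SS_NOP sharing the retry case: NOP falls through. */
static int
error_with_fallthrough(int err_action, const char **action_string)
{
	int error = EIO;		/* illustrative default only */

	*action_string = "Unhandled";
	switch (err_action & SS_MASK) {
	case SS_NOP:			/* no break: ends up retrying */
	case SS_RETRY:
		*action_string = "Retrying Command";
		error = ERESTART;
		break;
	default:
		break;
	}
	return (error);
}

/* Behaviour with SS_NOP given its own action string and error = 0. */
static int
error_with_nop_case(int err_action, const char **action_string)
{
	int error = EIO;		/* illustrative default only */

	*action_string = "Unhandled";
	switch (err_action & SS_MASK) {
	case SS_NOP:
		*action_string = "No Recovery Action Needed";
		error = 0;
		break;
	case SS_RETRY:
		*action_string = "Retrying Command";
		error = ERESTART;
		break;
	default:
		break;
	}
	return (error);
}

/* Small self-check contrasting the two variants for SS_NOP. */
static void
sketch_self_check(void)
{
	const char *s;

	assert(error_with_fallthrough(SS_NOP, &s) == ERESTART);
	assert(strcmp(s, "Retrying Command") == 0);
	assert(error_with_nop_case(SS_NOP, &s) == 0);
	assert(strcmp(s, "No Recovery Action Needed") == 0);
}
```

In this sketch the only difference between the two functions is the SS_NOP
case, which is why a NO SENSE result would be restarted in the first variant
and treated as success in the second.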

Anyway, I've attached a patch.  It needs to be tested.  I won't be able to
test it until I get one of my machines upgraded to -current.  (My
buildworld blew up in xlint.)

Ken
-- 
Kenneth Merry
ken@kdm.org

--k1lZvvs/B4yU6o8G
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="cam_periph.c.20010428"

==== //depot/FreeBSD-ken/src/sys/cam/cam_periph.c#12 - /usr/home/ken/perforce/FreeBSD-ken/src/sys/cam/cam_periph.c ====
*** /tmp/tmp.44005.0	Sat Apr 28 23:24:03 2001
--- /usr/home/ken/perforce/FreeBSD-ken/src/sys/cam/cam_periph.c	Sat Apr 28 23:23:52 2001
***************
*** 1369,1374 ****
--- 1369,1377 ----
  
  		switch (err_action & SS_MASK) {
  		case SS_NOP:
+ 			action_string = "No Recovery Action Needed";
+ 			error = 0;
+ 			break;
  		case SS_RETRY:
  			action_string = "Retrying Command";
  			error = ERESTART;

--k1lZvvs/B4yU6o8G--

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
