From owner-freebsd-scsi Sat Apr 14 22: 4:36 2001
Delivered-To: freebsd-scsi@freebsd.org
Received: from aslan.scsiguy.com (aslan.scsiguy.com [63.229.232.106])
	by hub.freebsd.org (Postfix) with ESMTP id 44DF137B423
	for ; Sat, 14 Apr 2001 22:04:25 -0700 (PDT)
	(envelope-from gibbs@scsiguy.com)
Received: from scsiguy.com (localhost [127.0.0.1])
	by aslan.scsiguy.com (8.11.2/8.9.3) with ESMTP id f3F544s00932;
	Sat, 14 Apr 2001 23:04:05 -0600 (MDT)
	(envelope-from gibbs@scsiguy.com)
Message-Id: <200104150504.f3F544s00932@aslan.scsiguy.com>
To: Joerg Wunsch
Cc: freebsd-scsi@FreeBSD.ORG
Subject: Re: Problem with current sa(4) driver
In-Reply-To: Your message of "Sat, 14 Apr 2001 20:39:25 +0200." <20010414203925.A63281@uriah.heep.sax.de>
Date: Sat, 14 Apr 2001 23:04:04 -0600
From: "Justin T. Gibbs"
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>So now, i think it's rather in the domain of the sa(4) driver to
>handle an illegal length indication properly by itself, since it's
>rather special to handling tapes that a `short read' from the tape
>(supplied blocksize to read(2) is larger than logical block size on
>the tape) is basically a normal operating condition which is never to
>be returned as an EIO, but always to be reported (in the b_resid)
>filed to the bio layer.

While it is true that the sa driver should be filtering out this
particular case because there is no error, returning ERESTART for
NO_SENSE is also wrong.  You should be able to fix that by changing
the table entry for that sense code in cam_periph.c.

>Somehow i thought it's necessary to actually call cam_periph_error(),
>in order to fetch the check condition from the device.

This would only be the case if a controller either did not support
"auto sense" or the CAM status was "auto sense failed".

>OK, i did so,
>and rearranged my modification to first call cam_periph_error(), but
>to then ignore its returned ERESTART, and have saerror() return 0.
>This reproducibly causes a segmentation violation panic with the
>following stack trace:

ERESTART means the error recovery code has already re-queued the
CCB to retry the operation.  By ignoring this code, you are telling
the caller of saerror() to complete the command normally, resulting
in an eventual release of this particular ccb back to the free pool.
A ccb can only be doing one thing at a time. 8-)

To understand the symptoms of the panic, look at these definitions
in cam.h:

#define CAM_UNQUEUED_INDEX      -1
#define CAM_ACTIVE_INDEX        -2
#define CAM_DONEQ_INDEX         -3

So, depending on exactly when the CCB was reused, the ccbq was
manipulated, and perhaps the error recovery code's retry completed,
you might see any of these three indexes or a valid index.

>[I've finally subscribed to this list again, so no need to send
>me a personal Cc.]

I can feel a David O'Brien moment coming on. 8-)

--
Justin

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
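
The ERESTART explanation in the message above can be illustrated with a
small toy model in C.  This is only a sketch under stated assumptions: it
is not code from cam_periph.c or the sa(4) driver, and every toy_* and
TOY_* name is hypothetical.  Only the three CAM_*_INDEX values are taken
from the message.  The model shows that when the error routine has
already re-queued a CCB, a caller that discards the ERESTART return and
completes the command normally leaves the same CCB both queued for retry
and released to the free pool.

```c
#include <assert.h>

/* Index values as quoted from cam.h in the message above. */
#define CAM_UNQUEUED_INDEX -1
#define CAM_ACTIVE_INDEX   -2
#define CAM_DONEQ_INDEX    -3

/* Hypothetical stand-in for the kernel's ERESTART; value is arbitrary. */
#define TOY_ERESTART 1000

/* Hypothetical, heavily simplified CCB: just enough state for the model. */
struct toy_ccb {
    int index;        /* a CAM_*_INDEX value, or >= 0 while in a queue */
    int in_free_pool; /* nonzero once released for reuse */
};

/* Models error recovery that re-queues the CCB and reports ERESTART. */
static int toy_periph_error(struct toy_ccb *ccb)
{
    ccb->index = 0; /* now sitting in a queue slot, awaiting retry */
    return TOY_ERESTART;
}

/* Models returning the CCB to the free pool; must happen at most once. */
static void toy_release(struct toy_ccb *ccb)
{
    assert(!ccb->in_free_pool);
    ccb->in_free_pool = 1;
}

/* Models the caller of the error routine finishing the command. */
static void toy_complete(struct toy_ccb *ccb, int error)
{
    if (error == TOY_ERESTART)
        return; /* the retry path owns the CCB; do not release it */
    toy_release(ccb);
}

/* A CCB is consistent if it is not queued and free at the same time. */
static int toy_consistent(const struct toy_ccb *ccb)
{
    return !(ccb->index >= 0 && ccb->in_free_pool);
}

/* Runs one error path; returns 1 if the CCB ends up consistent. */
static int toy_run(int ignore_restart)
{
    struct toy_ccb ccb = { CAM_ACTIVE_INDEX, 0 };
    int error = toy_periph_error(&ccb);

    if (ignore_restart)
        error = 0; /* the modification described in the quoted text */
    toy_complete(&ccb, error);
    return toy_consistent(&ccb);
}
```

In this model toy_run(0) honors the restart and leaves the CCB with a
single owner, while toy_run(1) leaves it queued and in the free pool at
once.  The real code has more states, which is why the observed index at
panic time can vary as described in the message.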