From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 218572] pass(4) driver sometimes does error recovery when CAM_PASS_ERR_RECOVER is not set
Date: Tue, 11 Apr 2017 21:27:59 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218572

            Bug ID: 218572
           Summary: pass(4) driver sometimes does error recovery when
                    CAM_PASS_ERR_RECOVER is not set
           Product: Base System
           Version: 10.3-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: terry-freebsd@glaver.org

[This is a summation of a long discussion between me, ken@ and mav@]

After SVN rev 236814 in FreeBSD/head, the pass(4) driver does some error
recovery, though not in all cases, when the retry_count is set in the CCB
and CAM_PASS_ERR_RECOVER is not set. Previously, the pass(4) driver would
only do error recovery if CAM_PASS_ERR_RECOVER was set.

This can be seen with 'camcontrol tur -v'. camcontrol sets the retry_count
to 1 by default, so that the user gets at least one retry when turning on
retries with -E.

If you reset a hard drive:

# camcontrol reset 1:172:0
Reset of 1:172:0 was successful

There should be a Unit Attention pending:

# camcontrol tur 1:172:0 -v
Unit is ready

But the Unit Attention is never reported, because the kernel is doing error
recovery even though we have not turned it on with -E (which sets the
CAM_PASS_ERR_RECOVER flag on the CCB).

Retrying the experiment:

# camcontrol reset 1:172:0
Reset of 1:172:0 was successful

Now set the retry count to 0:

# camcontrol tur 1:172:0 -v -C 0
Unit is not ready
(pass42:mps1:0:172:0): TEST UNIT READY. CDB: 00 00 00 00 00 00
(pass42:mps1:0:172:0): CAM status: SCSI Status Error
(pass42:mps1:0:172:0): SCSI status: Check Condition
(pass42:mps1:0:172:0): SCSI sense: UNIT ATTENTION asc:29,2 (SCSI bus reset occurred)
(pass42:mps1:0:172:0): Field Replaceable Unit: 2

We get the unit attention.

Also, the "Filemark detected" asc/ascq entry (0x00,0x01) and other, similar
tape error recovery entries should probably have an error recovery action
of SS_NOP instead of SS_RDEF.
The application should be notified of filemarks, setmarks, end of medium,
etc.

[This affects everything after r237326 in 9-STABLE, so the affected releases
are 9.1/2/3, 10.0/1/2/3, 11.0, and HEAD. As everything before 10.3 is EoL,
the fix only needs to be MFC'd back to 10.3.]