From owner-freebsd-scsi Fri Dec 8 08:58:30 1995
Return-Path: owner-freebsd-scsi
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.3/8.7.3) id IAA17547
          for freebsd-scsi-outgoing; Fri, 8 Dec 1995 08:58:30 -0800 (PST)
Received: from Sysiphos (Sysiphos.MI.Uni-Koeln.DE [134.95.212.10])
          by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id IAA17231
          for ; Fri, 8 Dec 1995 08:55:01 -0800 (PST)
Received: by Sysiphos id AA26892
          (5.67b/IDA-1.5 for scsi@freebsd.org); Fri, 8 Dec 1995 16:23:21 +0100
Message-Id: <199512081523.AA26892@Sysiphos>
From: se@zpr.uni-koeln.de (Stefan Esser)
Date: Fri, 8 Dec 1995 16:23:21 +0100
In-Reply-To: Andrew Russell
        "Check back Re: Problem with IBM 2 gig" (Dec 3, 23:33)
X-Mailer: Mail User's Shell (7.2.6 alpha(2) 7/9/95)
To: rmallory@wiley.csusb.edu, scsi@freebsd.org
Subject: Re: Check back Re: Problem with IBM 2 gig
Sender: owner-freebsd-scsi@freebsd.org
Precedence: bulk

} Subject: Check back Re: Problem with IBM 2 gig

} looks like a power glitch.. are those devices in an external cabinet?
} OR
} possibly the ncd drive detected a hang and reset the scsi bus..
}
} what do you think stefan?
} julian

Thanks for forwarding the message ...

} > [asus MB, ncr825 bios3.04, plextor 6x]
} > off a new kernel from sunday night, I got the following when doing
} > a `df` with a mounted cdrom... any clues?
} > ps: the new clustering code on cd9660's works excellent!
} > I can (almost) watch two ~100MB qt movies off a cd at the same time!
} >
} > root@kickme$ df
} > Dec 3 22:56:03 kickme /kernel: ncr0:6: ERROR (80:140) (8-2a-0) (88/13) @ (bd4:900b0000).
} > Dec 3 22:56:03 kickme /kernel: ncr0:6: ERROR (80:140) (8-2a-0) (88/13) @ (bd4:900b0000).

It is quite funny to see this (and the other messages) appear
twice ... Never observed that before ...

dstat and istat registers = (80:140):

        80:  dma fifo empty (Ok)
        140: arbitration complete + handshake timeout

SCSI bus state:

        out:  ATN              (NCR issues ATN)
        bus:  BSY + ATN + C_D  (SCSI lines are: Command phase + Attention)
        data: 0

DISPATCH:
        ...
        ...
bd4:    900b0000 00000000       return when (data_out)
        910a0000 00000000       return if (data_in)

Hmmm, this is where it fails ...

The DISPATCH code waits for the SCSI phase to stabilize
(that's the WHEN clause). It will just return (to the
data transfer code), if either a data input or output
phase is detected.

Obviously the NCR chip blocked at the WHEN, because it
considered the bus to be in an inconsistent state (e.g.
not connected, arbitrating, ...)

} > root@kickme$ Dec 3 22:56:04 kickme /kernel: script cmd = 910a0000
} > Dec 3 22:56:04 kickme /kernel: script cmd = 910a0000
} > Dec 3 22:56:04 kickme /kernel: reg: da 10 80 13 47 88 06 0f 01 08 02 2a 80 00 0a 00.
} > Dec 3 22:56:04 kickme /kernel: reg: da 10 80 13 47 88 06 0f 01 08 02 2a 80 00 0a 00.
} > Dec 3 22:56:04 kickme /kernel: ncr0: handshake timeout
} > Dec 3 22:56:04 kickme /kernel: ncr0: handshake timeout
} > Dec 3 22:56:04 kickme /kernel: cd0(ncr0:6:0): COMMAND FAILED (6 ff) @f0aa8a00.
} > Dec 3 22:56:04 kickme /kernel: cd0(ncr0:6:0): COMMAND FAILED (6 ff) @f0aa8a00.

The command failed because of the timeout.
The NCR did not go on, because it considered
the SCSI bus to not be in a valid state.

} > Dec 3 22:56:04 kickme /kernel: sd0(ncr0:0:0): COMMAND FAILED (6 ff) @f0aa7c00.
} > Dec 3 22:56:04 kickme /kernel: sd0(ncr0:0:0): COMMAND FAILED (6 ff) @f0aa7c00.

This is a secondary effect. The hard disk had an active
command, which also was terminated because of the timeout.
It had not got a chance to connect, so this is a little
unfair ...

} > Dec 3 22:56:04 kickme /kernel: sd0(ncr0:0:0): UNIT ATTENTION asc:29,0
} > Dec 3 22:56:04 kickme /kernel: sd0(ncr0:0:0): UNIT ATTENTION asc:29,0

The hard disk complains about the bus reset.

Why was there no message from the NCR driver that it
was about to send a SCSI bus reset ??? Hmmm.

} > Dec 3 22:56:05 kickme /kernel: sd0(ncr0:0:0): Power on, reset, or bus device reset occurred
} > Dec 3 22:56:05 kickme /kernel: sd0(ncr0:0:0): Power on, reset, or bus device reset occurred

The same in text form ...

I'm a little confused about the two identical error
messages. The commands in question are one and the
same (as the @f0aa8a00 command control block address
proves).

If this remains a single case, then I'd say it most
likely was a glitch. The driver was in a consistent
state, and the NCR seems to have missed the fact that
the bus was ready for the requested command transfer,
according to the SCSI control lines printed in the
error message.

The timeout led to a SCSI bus reset, but the generic
SCSI code should have resent the command to the hard
disk, and if the CDROM did not lock up internally,
then it should have been able to continue normal
operation as well.

Or did the system crash as a result ???

Regards, Stefan

--
 Stefan Esser, Zentrum fuer Paralleles Rechnen        Tel: +49 221 4706021
 Universitaet zu Koeln, Weyertal 80, 50931 Koeln      FAX: +49 221 4705160
 ==============================================================================
 http://www.zpr.uni-koeln.de/~se
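
Note on the decoding in this message: the following is a minimal
Python sketch, not code from the ncr driver, of how the status pair
(80:140) and the two script words 900b0000 / 910a0000 could be
decoded by hand. The bit positions and field layout are assumptions
inferred from the values and annotations in the message; only the
conditions, opcode and phases named there are covered, and all
function and table names are hypothetical.

```python
# Assumed bit assignments, inferred from "80: dma fifo empty" and
# "140: arbitration complete + handshake timeout" in the message.
DSTAT_BITS = {0x80: "dma fifo empty"}
ISTAT_BITS = {0x040: "arbitration complete", 0x100: "handshake timeout"}

# Assumed field values for the script words quoted in the message.
OPCODES = {0b010: "return"}                      # only opcode seen here
PHASES = {0b000: "data_out", 0b001: "data_in"}   # only phases seen here


def decode_bits(value, names):
    """Return the names of all known set bits, lowest bit first."""
    found = [name for bit, name in sorted(names.items()) if value & bit]
    unknown = value & ~sum(names)  # bits this sketch has no name for
    if unknown:
        found.append("unknown bits 0x%x" % unknown)
    return found


def decode_transfer_control(word):
    """Render a transfer-control script word as 'op if|when (phase)'.

    Assumed layout: bits 31-30 = 10, bits 29-27 = opcode,
    bits 26-24 = SCSI phase, bit 16 = wait for a valid phase.
    """
    if word >> 30 != 0b10:
        raise ValueError("not a transfer-control word: %08x" % word)
    opcode = (word >> 27) & 0x7
    phase = (word >> 24) & 0x7
    # With the wait bit set the chip waits for a valid phase (WHEN);
    # without it the phase is sampled immediately (IF).
    keyword = "when" if word & 0x00010000 else "if"
    return "%s %s (%s)" % (
        OPCODES.get(opcode, "opcode %d" % opcode),
        keyword,
        PHASES.get(phase, "phase %d" % phase),
    )


if __name__ == "__main__":
    print("dstat 80 :", " + ".join(decode_bits(0x80, DSTAT_BITS)))
    print("istat 140:", " + ".join(decode_bits(0x140, ISTAT_BITS)))
    for word in (0x900B0000, 0x910A0000):
        print("%08x: %s" % (word, decode_transfer_control(word)))
```

Run as a script, this prints the same annotations that appear in
the message: "return when (data_out)" for 900b0000 and
"return if (data_in)" for 910a0000. The only difference between the
WHEN and IF forms in this sketch is the assumed wait bit, which is
why the chip can block at the first word but not at the second.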