Date: Tue, 21 Nov 95 20:48 WET
From: uhclem%nemesis@fw.ast.com
To: FreeBSD-gnats-submit@freebsd.org
Subject: i386/833: SCSI hard disks time out during tape rewind - FDIV039
Message-ID: <m0tI5EH-000IvKC@nemesis.lonestar.org>
Resent-Message-ID: <199511220400.UAA25206@freefall.freebsd.org>

>Number:         833
>Category:       i386
>Synopsis:       SCSI hard disks time out during tape rewind - FDIV039
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 21 20:00:03 PST 1995
>Last-Modified:
>Originator:     Frank Durda IV
>Organization:
>Release:        FreeBSD 2.1.0-RELEASE (also FreeBSD 2.0.5-RELEASE)
>Environment:

Three different systems, all 486 (25MHz or faster, 8Meg or more RAM),
with Adaptec 1540B or 1542CF SCSI adapters, all with latest firmware/BIOS.

At least one SCSI hard disk at aha0:0:0
SCSI tape drive always at aha0:2:0
QIC150 using 600ft or longer tape (including 250Meg 1020ft tapes).
Tape drive is QIC-150 Archive Viper 150, or Archive Viper 2150eS,
or WangDAT Model 2600 DAT tape

Normal combinations are:

    33MHz  16Meg  1542CF  WangDAT            One hard disk
    25MHz   8Meg  1542CF  Archive Viper 150  (Internal)
    33MHz  12Meg  1540B   Archive 2150eS     (external)

>Description:

If the tape is more than 20 seconds or so from BOT and a "mt rewind"
command is issued, after 10 seconds or so the message:

    sd0(aha0:0:0): timed-out

is reported, and continues to be reported at roughly five second
intervals until the rewind is completed and BOT acquired. Note that
the tape is st0(aha0:2:0), not the device named in the message.

On DAT this operation can take up to a minute. On 250Meg (1020ft) QIC
tapes, over two minutes can elapse. During this time, all system SCSI
I/O seems to come to a halt.

This problem was not noticed on the SCO UNIX software that used to be
run on these systems, and we are fairly certain the system didn't
"hang" for a minute when a DAT tape was rewound, as this would have
been noticed.

The higher priority on this report is more out of concern that I/O
destined for the hard disk is aborted or otherwise lost because of
the timeouts. That hopefully isn't the case.

>How-To-Repeat:

On the QIC 150 tape, run it (nrst0) until you hear the drive pause to
reverse direction, then abort the operation. Now issue a "mt rewind".
On a different screen, type "sync" or do something else that will
access the SCSI hard disks. Within 15 seconds you should see an error
on the console.

On the DAT, I found that writing/reading 50Meg (nrst0) into the tape
got you far enough down the tape to see the error. Then abort the
function and do a "mt rewind".

>Fix:

Issue rewind with bus disconnect commands when allowed. If these
drives can't be disconnected from the bus while performing rewinds,
set time-out timers higher when removable media is present. It is not
sufficient to set longer timers on commands sent just to the removable
media; requests to devices blocked by the slower devices must also get
more time, as in the above errors.

>Audit-Trail:
>Unformatted:
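
As a rough illustration of the second suggestion in the Fix field, the
sketch below picks one command timeout for every device on a bus and
lengthens it whenever a slow removable-media device that cannot
disconnect shares that bus. Everything in it is hypothetical: the
struct, the function name, and both timeout values are made up for
illustration and are not taken from the FreeBSD aha or SCSI driver
sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device record; NOT a FreeBSD kernel structure. */
struct dev_info {
    int  target;          /* SCSI target ID on the bus */
    bool slow_removable;  /* e.g. a tape drive that may rewind for minutes */
    bool can_disconnect;  /* device releases the bus during long commands */
};

/* Assumed values for illustration only. */
#define BASE_TIMEOUT_MS   10000   /* ordinary command timeout */
#define SLOW_TIMEOUT_MS  180000   /* long enough for a rewind of over two minutes */

/*
 * Return the timeout to apply to a command sent to ANY device on the
 * bus. If a slow device that cannot disconnect is present, every
 * device gets the long timeout, because a disk request may sit
 * blocked behind the tape for the whole rewind.
 */
int bus_timeout_ms(const struct dev_info *devs, size_t ndevs)
{
    for (size_t i = 0; i < ndevs; i++) {
        if (devs[i].slow_removable && !devs[i].can_disconnect)
            return SLOW_TIMEOUT_MS;
    }
    return BASE_TIMEOUT_MS;
}
```

With a disk at target 0 and a non-disconnecting tape at target 2, as in
the configurations described above, a request to the disk would get the
long timeout too, which is the point of the last sentence of the Fix
field.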