From owner-freebsd-scsi Sat Aug 16 23:01:55 1997
Return-Path:
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id XAA08979
          for freebsd-scsi-outgoing; Sat, 16 Aug 1997 23:01:55 -0700 (PDT)
Received: from nico.telstra.net (nico.telstra.net [139.130.204.16])
          by hub.freebsd.org (8.8.5/8.8.5) with SMTP id XAA08971
          for ; Sat, 16 Aug 1997 23:01:49 -0700 (PDT)
Received: from freebie.lemis.com (gregl1.lnk.telstra.net [139.130.136.133])
          by nico.telstra.net (8.6.10/8.6.10) with ESMTP id QAA06501;
          Sun, 17 Aug 1997 16:01:16 +1000
Received: (grog@localhost)
          by freebie.lemis.com (8.8.7/8.6.12) id PAA06503;
          Sun, 17 Aug 1997 15:31:15 +0930 (CST)
Message-ID: <19970817153114.20533@lemis.com>
Date: Sun, 17 Aug 1997 15:31:14 +0930
From: Greg Lehey
To: Joerg Wunsch
Cc: FreeBSD SCSI Mailing List
Subject: Re: Bus resets. Grrrr.
References: <199708170129.KAA03776@freebie.lemis.com> <19970817075001.XE28042@uriah.heep.sax.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.81e
In-Reply-To: <19970817075001.XE28042@uriah.heep.sax.de>; from J Wunsch on Sun, Aug 17, 1997 at 07:50:01AM +0200
Organisation: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8250
Fax: +61-8-8388-8250
Mobile: +61-41-739-7062
WWW-Home-Page: http://www.lemis.com/~grog
Sender: owner-freebsd-scsi@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Sun, Aug 17, 1997 at 07:50:01AM +0200, J Wunsch wrote:
> As Greg Lehey wrote:
>
>> This is the third time in a row that I haven't been able to complete
>> a backup because of "recoverable" SCSI errors.
>
> What makes you think these are `recoverable'?

The disks recover.

> Be reminded that this is the typical failure picture one can see from
> a bad SCSI chain.

Yup.  But that's not the only thing.

> I'm also seeing it occasionally on our new Seagate/Conner DAT drive at
> work, where even the older ahc driver used to work with the previous
> HP DAT (that is dead now).
> I'm not fully sure yet, but i tend to
> blame the Conner drive there.

Interesting.  The tape in question is a
Conner^H^H^H^H^H^HArchive^H^H^H^H^H^H^HSeagate changer--see the dmesg
output below for more info.  But that doesn't seem to be the problem:
it's always the Micropolis disk which has the timeout.

>> If I understand this correctly, this means that the abort SCB wasn't
>> received either, so the driver does a bus reset:
>
> Which is typical for a SCSI chain where ``Nichts geht mehr''.

But which can happen as well at other times.

>> Aug 17 10:27:32 freebie /kernel: sd1: UNIT ATTENTION asc:29,0
>> Aug 17 10:27:32 freebie /kernel: sd1: Power on, reset, or bus device reset occurred
>
> That's the consequence from the bus reset.  As you wrote, no harm done
> for the disks.  The unit attention is typically caught by the first
> (out of 4) retries.

My question (which you omitted): does this have to be fatal for the
tape?  Is there indeterminate data loss (i.e. can we not be sure
whether a block has been written or not?)

>> Is anybody doing anything about this?
>
> You, checking your termination and term power first?

No, been there, done that.  Do you think I'd ask a question like that
without doing my homework first?  Also, this config has been running
smoothly for weeks.

In that connection, however, I suspect problems with the IBM
DORS-32160 drives I have connected to that host adapter.  They just
plain Would Not Work on any host adapter together with my Conner
CFP4207S.  The BIOS wouldn't even get through the scan.

Here are some relevant parts of the config:

ahc0: rev 0x03 int a irq 12 on pci0.18.0
ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs
ahc0: waiting for scsi devices to settle
scbus0 at ahc0 bus 0
scbus0 target 0 lun 0: type 0 fixed SCSI 2
sd0 at scbus0 target 0 lun 0
sd0: Direct-Access 1001MB (2051615 512 byte sectors)
sd0: with 1760 cyls, 15 heads, and an average 77 sectors/track
scbus0 target 3 lun 0: type 0 fixed SCSI 2
sd1 at scbus0 target 3 lun 0
sd1: Direct-Access 2063MB (4226725 512 byte sectors)
sd1: with 6703 cyls, 5 heads, and an average 126 sectors/track
scbus0 target 4 lun 0: type 1 removable SCSI 2
st0 at scbus0 target 4 lun 0
st0: Sequential-Access density code 0x24, 512-byte blocks, write-enabled
scbus0 target 4 lun 1: type 8 removable SCSI 2
uk0 at scbus0 target 4 lun 1
uk0: Unknown
scbus0 target 5 lun 0: type 1 removable SCSI 1
st1 at scbus0 target 5 lun 0
st1: Sequential-Access density code 0x0, drive empty
scbus1 at aha0 bus 0
scbus1 target 2 lun 0: type 0 fixed SCSI 2
sd2 at scbus1 target 2 lun 0

Any ideas?  I was thinking of moving the Micropolis drive to the aha,
but that suffers from other problems, over and above the performance
loss.

Greg