Date:      Mon, 6 Jan 1997 20:58:50 +0100
From:      se@freebsd.org (Stefan Esser)
To:        kelly@fsl.noaa.gov (Kelly)
Cc:        scsi@freebsd.org
Subject:   Re: Problem appears from  migration from bt0 to ncr0
Message-ID:  <Mutt.19970106205850.se@x14.mi.uni-koeln.de>
In-Reply-To: <32D125EB.4E3F@fsl.noaa.gov>; from Kelly on Jan 6, 1997 09:18:51 -0700
References:  <32D125EB.4E3F@fsl.noaa.gov>

On Jan 6, kelly@fsl.noaa.gov (Kelly) wrote:
> Hello SCSI experts.
> 
> I've recently migrated a host using a Buslogic 946C to an NCR 53C810. 

Which FreeBSD version is that?
Please test with 2.2-BETA, if possible ...

> This machine contains a DAT drive and I use it as an Amanda dumphost. 
> It also contains a DFRS hard drive.  That's where the problem is.
> 
> The DFRS is a good capacity drive that's nice 'n fast and doesn't even
> run too hot, but it goes asleep ever so often for some kind of thermal
> calibration period or something.  During this time, the Buslogic driver
> would report "bt0: try to abort" but everything would be fine afterward.
> 
> Now, with the NCR driver, things get confused.  The thermal calibration
> period will cause syslogd to have an input/output error trying to write
> messages to disk.  The kernel spits out

Well, the DFRS is known to go to sleep for more than a minute :(
You can extend the time-outs (in the GENERIC SCSI code) to be
longer than that, for all commands. That should keep the code
that suspects a dead NCR when no progress is made for a long
time from triggering, so no NCR chip and SCSI bus reset would occur ...
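
To illustrate the time-out reasoning, here is a minimal sketch. It is
not the driver's code: the function name and all numbers are made up
for illustration, and 75 seconds merely stands in for "more than a
minute".

```python
def watchdog_fires(stall_seconds, timeout_seconds):
    """Return True if a command stalled this long would exceed its time-out."""
    return stall_seconds > timeout_seconds


# Assumed values, for illustration only (not taken from the driver).
DRIVE_SLEEP_SECONDS = 75

for timeout in (10, 60, 120):
    verdict = "reset" if watchdog_fires(DRIVE_SLEEP_SECONDS, timeout) else "wait"
    print(f"time-out {timeout:3d}s -> {verdict}")
```

Only a time-out longer than the drive's sleep period avoids the reset
in this model, which is why I ask how long the drive really sleeps.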

> ncr0: restart (ncr dead ?).
> Jan  6 03:47:07 sage /kernel: sd0(ncr0:0:0): FAST SCSI-2 100ns (10
> Mb/sec) offset 8.
> Jan  6 03:47:07 sage /kernel: sd0(ncr0:0:0): UNIT ATTENTION asc:29,0
> Jan  6 03:47:07 sage /kernel: sd0(ncr0:0:0):  Power on, reset, or bus
> device reset occurred
> 
> Now, although alarming, the filesystems seem OK.  The disk works
> afterwards as does the CD-ROM.
> 
> But the worse part is this: access to any sequential devices on the SCSI
> bus stops working.  Any tape activity results in input/output error and
> messages like the following:
> 
> Jan  6 06:57:22 sage /kernel: st1(ncr0:3:0): 275ns (4 Mb/sec) offset 8.
> Jan  6 06:57:22 sage /kernel: st1(ncr0:3:0): NOT READY asc:4,1
> Jan  6 06:57:22 sage /kernel: st1(ncr0:3:0):  Logical unit is in process
> of becoming ready
> Jan  6 06:57:37 sage /kernel: ncr0: restart (ncr dead ?).

What kind of tape device is that?
Please disable synchronous transfers for the tape:

# ncrcontrol -t 3 -s sync=0

My guess is that the tape drive and the NCR driver disagree about
the transfer mode after the SCSI bus reset. If that is the case,
it must be fixed, but I need more input ...

> It looks like any tape activity causes a bus reset, and the tape drives
> can't handle it.  But I'm probably wrong.

I had been wondering whether the tape is reset to asynchronous
transfers after the bus reset, but looking at the messages above,
it is obvious that the 4 MB/s transfer rate has been successfully
negotiated again. Hmmm ... No idea what's going on here ...
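
For reference, the rates in those kernel messages follow from the
negotiated synchronous period. A small sketch, assuming a narrow
(8-bit) bus with one byte per period; the function name is mine and
not part of any driver:

```python
def sync_rate_mb_per_s(period_ns, bus_width_bytes=1):
    """Approximate synchronous transfer rate in MB/s for a given period in ns."""
    # One transfer of bus_width_bytes every period_ns nanoseconds.
    return 1000.0 / period_ns * bus_width_bytes


for period in (100, 275):
    print(f"{period} ns -> {sync_rate_mb_per_s(period):.2f} MB/s")
```

A 100 ns period gives 10 MB/s (the disk), and 275 ns gives roughly
3.6 MB/s, which the message for the tape apparently rounds to 4.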

> Now, I wouldn't mind chucking the DFRS drive into the nearest active
> volcano complete with massive cinder dome and flying sparks and other
> nifty special effects, but my wife would frown on more computer
> equipment appearing on the credit card bill.  So, is there anything you
> could recommend I try in software?

Please try the ncrcontrol command given above, and
report how long the drive "sleeps".
If possible, switch the tape drive off and on again and see
whether that makes it operational again ...

Regards, STefan


