Date:      Tue, 19 Aug 1997 21:19:59 -0600
From:      "Justin T. Gibbs" <gibbs@plutotech.com>
To:        Greg Lehey <grog@lemis.com>
Cc:        FreeBSD SCSI Mailing List <freebsd-scsi@freebsd.org>
Subject:   Re: Bus resets. Grrrr. 
Message-ID:  <199708200320.VAA08483@pluto.plutotech.com>
In-Reply-To: Your message of "Wed, 20 Aug 1997 09:08:10 +0930." <19970820090810.54774@lemis.com>

>On Tue, Aug 19, 1997 at 10:53:54AM -0600, Justin T. Gibbs wrote:
>>>> What version of the kernel are you using
>>>
>>> Recent versions of -current.  The ones I reported it against were some
>>> time last week.  I've just rebuilt with a version supped this morning.
>>
>> And it is still reproducible?
>
>I changed the configuration file and added (inter alia)
>AHC_SCBPAGING_ENABLE.  The resultant kernel hung solid three times in
>the course of a couple of hours, once with a disk activity light on
>solid, and the other two without.  I removed AHC_SCBPAGING_ENABLE, and
>last night the backup went through for the first time in a week.  It
>ran fine until last Wednesday, however, so this could be a
>coincidence.

Did the system hang solid with no kernel messages, or were you in X
so you couldn't see them?  There is no guarantee that driver messages
will make it into the log file if the SCSI bus is wedged.  I wasn't
aware of any problems with SCB paging, so I'd be very interested
in any information you can provide on this problem.  In most cases,
BTW, SCB paging isn't a win unless you are also using tagged queueing
(option AHC_TAGENABLE).

>No, at the moment the chain only has four devices connected, but I
>notice it finds two LUNs for the tape changer:

Ahh.  I thought you said you had a 2940A, not a 2940.  This pretty
much rules out the QOUTFIFO overflow problem (assuming you are not
using tagged queueing, which your dmesg output seems to confirm),
since the aic7870 has 16 slots, meaning 9 active devices would be
necessary to overflow it.

>ahc0: <Adaptec 2940 SCSI host adapter> rev 0x03 int a irq 12 on pci0.18.0
>ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs
>ahc0: waiting for scsi devices to settle
>scbus0 at ahc0 bus 0
>scbus0 target 0 lun 0: <MICROP 2112-15MQ1094802 HQ48> type 0 fixed SCSI 2
>sd0 at scbus0 target 0 lun 0
>sd0: Direct-Access 1001MB (2051615 512 byte sectors)
>sd0: with 1760 cyls, 15 heads, and an average 77 sectors/track

It looks like there is newer firmware available for this drive from
Micropolis:

ftp://techsupport.micropolis.com/pub/files/firmware/Aquaris/2105-2108-2112/4930010f.bin
ftp://techsupport.micropolis.com/pub/files/Utils/ASPIUTIL.EXE

>> Could it be that you don't have disconnections enabled for your tape drive?
>> You should check both SCSI-Select for the 2940 and any relevant jumpers
>> on the tape drive itself.  If disconnections are disabled, a tape write that
>> required multiple retries could easily tie up the SCSI bus for the 10s
>> needed to make a disk command time out.
>
>You'd see that on the activity light, right?  In any case, the host
>adapter is set correctly, and the tape doesn't seem to have any such
>config switch.  Would there be another way to test that?

Not really.  Since the timeout was "while idle", chances are that
disconnection is enabled and working.

>> The first one probably fails because the device isn't ready.  
>
>That's what I thought, too, so I put a sleep 30 into the script.  It
>still works the second time.

Then it probably fails because there is a unit attention that needs to
be cleared.  The console error message would be enough to determine
what is really happening.

--
Justin T. Gibbs
===========================================
  FreeBSD: Turning PCs into workstations
===========================================


