From owner-freebsd-scsi  Tue Jul 27 12:38:28 1999
Delivered-To: freebsd-scsi@freebsd.org
Received: from dialup124.zpr.uni-koeln.de (dialup124.zpr.Uni-Koeln.DE [134.95.219.124])
	by hub.freebsd.org (Postfix) with ESMTP
	id 6295C14CC8; Tue, 27 Jul 1999 12:37:55 -0700 (PDT)
	(envelope-from se@zpr.uni-koeln.de)
Received: by dialup124.zpr.uni-koeln.de (Postfix, from userid 200)
	id 38EB5D36; Tue, 27 Jul 1999 20:17:08 +0200 (CEST)
Date: Tue, 27 Jul 1999 20:17:07 +0200
From: Stefan Esser <se@zpr.uni-koeln.de>
To: Thierry.Besancon@lps.ens.fr
Cc: scsi@FreeBSD.ORG, Stefan Esser <se@freebsd.org>
Subject: Re: tagged openings
Message-ID: <19990727201707.A371@dialup124.zpr.uni-koeln.de>
Reply-To: se@freebsd.org
References: <wnnn1woobqg.fsf@excalibur.lps.ens.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.95.6i
In-Reply-To: <wnnn1woobqg.fsf@excalibur.lps.ens.fr>; from Thierry.Besancon@lps.ens.fr on Thu, Jul 22, 1999 at 07:06:31PM +0200
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On 1999-07-22 19:06 +0200, Thierry.Besancon@lps.ens.fr wrote:
> I'm running FreeBSD 3.1 and whenever my workstation reboots I get the
> message :
> 
>         (da0:ncr0:0:0:0): tagged openings now 15
> 
> What does it mean ?

More tagged commands were issued to the drive than it is able to
queue. This is not really a problem: the driver will resend the
command that was not accepted and will reduce the number of
commands in progress on that drive to one less than the number that
caused the failure ...
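
To make the rule concrete, here is a minimal sketch of that adjustment. It is
not the CAM code; the function name and the floor of one command are my own
assumptions for illustration.

```python
def openings_after_queue_full(commands_in_flight):
    """Return the new limit on tagged commands after a rejected command.

    commands_in_flight is the number of commands outstanding when the
    drive refused one, including the refused command itself.
    The lower bound of 1 is an assumption made for this sketch.
    """
    # One less than the number that caused the failure.
    return max(1, commands_in_flight - 1)


# A drive that rejects the 16th command ends up with 15 openings,
# which matches the "tagged openings now 15" message.
print(openings_after_queue_full(16))
```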

There are quite a number of Quantum drives that had to be entered into
the "quirks" table in /sys/cam/cam_xpt.c. You may want to add an entry
for your drive that limits "mintags" to 8 and "maxtags" to 15. See the
other Quantum entries for reference.

> I must say that I encounter scsi problems with this host but I can't
> find where they're coming from.
> Generally the machine freezes with messages saying ncr1 is on timeout.
> 
> For example :
> 
>         ncr1:5: ERROR (0:91) (9-ae-800) (8/13) @ (script 6dc:190001cb).

This message indicates a SCSI bus problem. SIST code 0x91 has the 
following error bits set:

0x80 = phase mismatch
0x10 = reselected by another device
0x01 = parity error
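
As a quick check of the arithmetic, the following sketch splits a SIST value
into the three bits listed above. The table holds only those three bits; any
other bit is reported as undecoded rather than guessed at. The names
SIST_BITS and decode_sist are made up for this example.

```python
# Bit meanings exactly as listed above; all other bits stay undecoded.
SIST_BITS = {
    0x80: "phase mismatch",
    0x10: "reselected by another device",
    0x01: "parity error",
}


def decode_sist(value):
    """Return (names of the known bits that are set, mask of other set bits)."""
    names = [name for mask, name in SIST_BITS.items() if value & mask]
    known = 0
    for mask in SIST_BITS:
        known |= mask
    return names, value & ~known


names, undecoded = decode_sist(0x91)
print(names)           # the three conditions listed above
print(hex(undecoded))  # 0x0, so 0x80 + 0x10 + 0x01 accounts for all of 0x91
```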

This happened during a read (DATA IN phase) after quite some data had
already been transferred. While it is not OK that the driver does not
recover from this situation, you may want to check your SCSI cables
and terminators to prevent the parity error, which most often is the
result of too long a SCSI bus or a bad cable.

> Another one :
> 
>         ncr1:4:ERROR (81:0) (f-aa-0) (0/3) @ (script 3f0: 48000000)

This one is different, but may well also be caused by spurious SCSI
bus pulses. The NCR chip reports an illegal instruction error, which
is most often caused by overly optimistic PCI performance options chosen
in the BIOS setup. Some chip-sets could not really support as many
active bus-masters as claimed (often only a single bus-master was
allowed, and the ISA legacy DMA counted as one). Intel chip-sets
should be OK, but I'm not sure whether Ali or VIA Super-7 chip-sets
are as reliable.

> The precise configuration is 2 Tekram 390F cards + 2 towers of disks
> (4 disks each, IBM 9.1 Go), one DLT and one QUANTUM for the system :
> 
> ncr0: <ncr 53c875 fast20 wide scsi> rev 0x26 int a irq 9 on pci0.9.0
> ncr1: <ncr 53c875 fast20 wide scsi> rev 0x26 int a irq 12 on pci0.10.0

Do you know about the safe cable length limits for ULTRA-SCSI?
Most of your devices are operating at 20MHz synch. SCSI rate, which
means the maximum specified SCSI bus length is *at most* 3m.
This value does of course include the internal ribbon cable in your
drive boxes, which often is already 90cm in a 2-drive enclosure.

If you are not sure that your SCSI bus cable is specified for 20MHz
transfer rates, you had better consider 1.5m to be the maximum total bus
length, or you will see sporadic transfer failures (with a certain
probability of undetected data corruption, since parity only detects
single bit errors (or rather odd numbers of flipped bits)).
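
The parity remark is easy to demonstrate. The sketch below uses one odd
parity bit per data byte, as on the SCSI data bus; the function names and
the sample byte are arbitrary choices for illustration.

```python
def odd_parity_bit(byte):
    """Parity bit that makes the count of 1 bits (data plus parity) odd."""
    ones = bin(byte & 0xFF).count("1")
    return 0 if ones % 2 else 1


def parity_ok(byte, parity):
    """True if data plus parity bit still contain an odd number of 1 bits."""
    return (bin(byte & 0xFF).count("1") + parity) % 2 == 1


sent = 0b10110010                  # four 1 bits, so the parity bit is 1
parity = odd_parity_bit(sent)

one_flip = sent ^ 0b00000001       # one bit corrupted on the bus
two_flips = sent ^ 0b00000011      # two bits corrupted on the bus

print(parity_ok(sent, parity))       # True: clean transfer
print(parity_ok(one_flip, parity))   # False: the error is detected
print(parity_ok(two_flips, parity))  # True: corrupted data passes the check
```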

> The DLT is daisy chained with one UW tower and I don't use the narrow
> connector on the tekram 390F. If I do so, the workstation just freezes
> during the boot with an error like :
> 
>         ncr1:5: ERROR (0:91) (9-ae-800) (8/13) @ (script 6dc:190001cb).

This is again the same parity error as in the first message and it
points to the real source of your trouble: SCSI bus data corruption.

> I must say too that I had the same problems with the same PC in
> another configuration : the DLT was the same, the system disk was the
> same, all other disks were different and not UW, the scsi cards were
> NCR 810.

Since the 810 only supported 10MHz rates, you could have a 6m SCSI bus,
but only if termination at both ends was OK. There have been a few
cheap 810 based SCSI cards with only passive terminators (single in-line
resistor packs), though the original NCR and all Symbios cards (as well
as Tekram and other high quality cards) always used active terminators,
AFAIK.

Again: If your cable quality is not up to the spec, you had better stay
below half the maximum specified for perfect cables and terminators.

> The scsi bus going timeout is always the one with the DLT.
> Might it be faulty ?

No, I just think that you violate the Fast-20 specs by daisy chaining
the DLT with the UW tower, which may already be at its limits because
of the sum of external and internal cables between the SCSI card and
the last disk drive in the chain. (External cables are often on the
order of 1.5m and I guess that the internal cable will be at least 0.9m
long ...)
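
A rough length budget, using only the figures from this message (3m maximum
at 20MHz, 1.5m if the cable quality is unknown, and my guesses of 1.5m
external plus 0.9m internal cable). The DLT's extra cable is not included,
since I don't know its length; it can only make things worse. Lengths are
in centimetres and the names are made up for this sketch.

```python
FAST20_MAX_CM = 300    # "at most 3m" at 20MHz synchronous rate
DERATED_MAX_CM = 150   # suggested limit when the cable quality is unknown


def check_bus(segments_cm, limit_cm):
    """Return (total bus length, whether it stays within the limit)."""
    total = sum(segments_cm)
    return total, total <= limit_cm


# Guessed segments: external cable to the tower, ribbon cable inside it.
segments = [150, 90]

print(check_bus(segments, FAST20_MAX_CM))   # (240, True): little margin left
print(check_bus(segments, DERATED_MAX_CM))  # (240, False): too long for unknown cables
```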

Isn't the DLT4000 a non-wide SCSI device?

If you connect an 8bit SCSI cable to the end of the 16bit SCSI bus,
you need quite some extra safety margin (i.e. restrict the total cable 
length even further).


If you really need to connect that number of drives, you may be
better off with an Ultra-2 (Fast-40) card with the bus operating in
LVD mode. Your IBM DDRS drives may already be U2W; I can't tell from
the probe message. At least over here they are the same price in either
UW or U2W versions. LVD supports a SCSI bus length of 12.5m, which should
satisfy your requirements ;-)

But if you have another free PCI slot, you may instead just install
another 8bit SCSI card (Sym8100 or 8600) for the DLT (and the boot disk).

Regards, Stefan

