From owner-freebsd-hardware  Fri Jan 24 12:27:58 1997
Return-Path: <owner-hardware>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.5/8.8.5) id MAA17513
          for hardware-outgoing; Fri, 24 Jan 1997 12:27:58 -0800 (PST)
Received: from Sisyphos.MI.Uni-Koeln.DE (Sisyphos.MI.Uni-Koeln.DE [134.95.212.10])
          by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id MAA17506
          for <hardware@freebsd.org>; Fri, 24 Jan 1997 12:27:53 -0800 (PST)
Received: from x14.mi.uni-koeln.de (annexr2-49.slip.Uni-Koeln.DE) by Sisyphos.MI.Uni-Koeln.DE with SMTP id AA01204
  (5.67b/IDA-1.5 for <hardware@freebsd.org>); Fri, 24 Jan 1997 21:27:42 +0100
Received: (from se@localhost) by x14.mi.uni-koeln.de (8.8.4/8.6.9) id VAA00878; Fri, 24 Jan 1997 21:27:50 +0100 (CET)
Message-Id: <Mutt.19970124212750.se@x14.mi.uni-koeln.de>
Date: Fri, 24 Jan 1997 21:27:50 +0100
From: se@freebsd.org (Stefan Esser)
To: joe@via.net (Joe McGuckin)
Cc: hardware@freebsd.org
Subject: Re: NCR SCSI problem
References: <199701241911.LAA13945@monk.via.net>
X-Mailer: Mutt 0.55-PL15
Mime-Version: 1.0
In-Reply-To: <199701241911.LAA13945@monk.via.net>; from Joe McGuckin on Jan 24, 1997 11:11:38 -0800
Sender: owner-hardware@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

On Jan 24, joe@via.net (Joe McGuckin) wrote:
> FreeBSD 2.1.6-RELEASE.
> 
> Jan 24 11:02:29 news /kernel: ncr0 <ncr 53c875 wide scsi> rev 3 int a irq 10 on pci0:10
> Jan 24 11:02:29 news /kernel: ncr0 waiting for scsi devices to settle
> Jan 24 11:02:29 news /kernel: (ncr0:0:0): "IBM XP31070W      !x 81K6" type 0 fixed SCSI 2
> Jan 24 11:02:29 news /kernel: sd0(ncr0:0:0): Direct-Access 
> Jan 24 11:02:29 news /kernel: sd0(ncr0:0:0): WIDE SCSI (16 bit) enabled.
> Jan 24 11:02:30 news /kernel: sd0(ncr0:0:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.

> I'm using a Symbios '875 PCI card with 1 Fast-Wide drive and 3 narrow drives.
> 
> About twice a week we have a crash. Luckily, last night it didn't reboot, so 
> we could see the messages.
> 
> sd0(ncr:0:0:0) COMMAND FAILED (4 28) @f11b0a00

The 28 above indicates a "Queue Full" condition, which means 
that the command failed because of resource exhaustion. (The
default of 4 tagged commands should not cause this, but it is
possible that some devices can't deal with even that number
of active commands in certain situations.)
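
To illustrate how the second value in "COMMAND FAILED (4 28)" can be
read, here is a small Python sketch that maps SCSI-2 status bytes to
their names. It assumes the driver prints the value in hex; the table
name and the helper decode_scsi_status are made up for this example
and are not part of the ncr driver.

```python
# SCSI-2 status byte codes (hex values).
SCSI_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x04: "CONDITION MET",
    0x08: "BUSY",
    0x10: "INTERMEDIATE",
    0x14: "INTERMEDIATE-CONDITION MET",
    0x18: "RESERVATION CONFLICT",
    0x22: "COMMAND TERMINATED",
    0x28: "QUEUE FULL",
}


def decode_scsi_status(text):
    """Hypothetical helper: decode a status value printed as hex digits."""
    code = int(text, 16)
    return SCSI_STATUS.get(code, "unknown status 0x%02x" % code)


print(decode_scsi_status("28"))  # prints: QUEUE FULL
```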

You may try allowing fewer tags for target 0:

# ncrcontrol -t 0 -s tags=2

(Or completely disable tags with "tags=0".)

> assertion "cp" failed: file "../../pci/ncr.c line 5560
> 
> swap_pager: I/O error - pagein failed error 5
> vm_fault: pager input (probably hardware error)

The SCSI code in 2.2 is much more robust. It retries commands in
such situations, and most probably would have recovered from the 
Queue Full condition.
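
The idea of retrying after Queue Full can be sketched with a toy
model in Python. This is only an illustration and not the actual
2.2 SCSI code: the ToyTarget class, the submit_with_retry function
and the retry limit are assumptions made up for this example.

```python
GOOD = 0x00
QUEUE_FULL = 0x28


class ToyTarget:
    """Toy model of a drive that accepts only `depth` commands at once."""

    def __init__(self, depth):
        self.depth = depth
        self.active = 0

    def submit(self):
        # Reject the command if the device queue is already full.
        if self.active >= self.depth:
            return QUEUE_FULL
        self.active += 1
        return GOOD

    def complete_one(self):
        # One outstanding command finishes and frees a queue slot.
        if self.active > 0:
            self.active -= 1


def submit_with_retry(target, max_retries=3):
    """Resubmit after QUEUE FULL instead of failing the command.

    Returns (status, number of retries used).
    """
    for attempt in range(max_retries + 1):
        status = target.submit()
        if status != QUEUE_FULL:
            return status, attempt
        # A real driver would wait for a completion here; the toy
        # model simply lets one outstanding command finish.
        target.complete_one()
    return QUEUE_FULL, max_retries


target = ToyTarget(depth=2)
target.submit()
target.submit()
print(submit_with_retry(target))  # prints: (0, 1)
```

Without the retry loop, the third command would simply fail with
Queue Full, which corresponds to the behaviour seen in 2.1.x.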

> Are there known problem with this (or any other NCR) card? This 
> machine is used as a news server, so it gets lots of activity.

The most likely cause of your problem is the lack of SCSI command 
retries in 2.1.x.

Regards, STefan