From owner-freebsd-scsi  Sun Sep 28 14:04:51 1997
Return-Path: <owner-freebsd-scsi>
Received: (from root@localhost)
          by hub.freebsd.org (8.8.7/8.8.7) id OAA22721
          for freebsd-scsi-outgoing; Sun, 28 Sep 1997 14:04:51 -0700 (PDT)
Received: from Octopussy.MI.Uni-Koeln.DE (Octopussy.MI.Uni-Koeln.DE [134.95.166.20])
          by hub.freebsd.org (8.8.7/8.8.7) with SMTP id OAA22712
          for <scsi@freebsd.org>; Sun, 28 Sep 1997 14:04:42 -0700 (PDT)
Received: from x14.mi.uni-koeln.de ([134.95.219.124]) by Octopussy.MI.Uni-Koeln.DE with SMTP id AA19973
  (5.67b/IDA-1.5 for <scsi@FreeBSD.ORG>); Sun, 28 Sep 1997 23:04:27 +0200
Received: (from se@localhost) by x14.mi.uni-koeln.de (8.8.7/8.6.9) id LAA00968; Sun, 28 Sep 1997 11:01:43 +0200 (CEST)
X-Face: "<d]#=8pzx);RzeqSKI86OVa7=!0/(uRa.+B.9Z9\eNUn@UG?!`y7yt2dFNn%k4'.}](uE%
 yCO)$e&Y1%3EO~ifu6Q-#YUM&JZ't,}JkPnAz,8Dj33u%@GBi%[Y#LHz$]h7a<p4)-jKI7~sKjlP-^
 EvA[G;]v&0]W!EL%shs,{7x0|oqN4YVIs5,NI#,V{9"WF):5&RkOhyj*#-IAG}Tnu;YCF,d
Message-Id: <19970928110142.45106@mi.uni-koeln.de>
Date: Sun, 28 Sep 1997 11:01:42 +0200
From: Stefan Esser <se@FreeBSD.ORG>
To: David Langford <langfod@dihelix.com>
Cc: scsi@FreeBSD.ORG
Subject: Re: Possible ncr problem?
References: <199709280439.SAA00307@caliban.dihelix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.74
In-Reply-To: <199709280439.SAA00307@caliban.dihelix.com>; from David Langford on Sat, Sep 27, 1997 at 06:39:31PM -1000
Sender: owner-freebsd-scsi@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Sep 27, David Langford <langfod@dihelix.com> wrote:
> Any thought on what this means?

Yes, again the QUEUE_FULL problem of the DORS.
This drive apparently can't handle a reasonable
number of tagged commands at times. Since the
generic SCSI layer immediately re-issues the
failed command, there is no adverse effect as
long as it succeeds within 5 retries.
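
To illustrate the retry behaviour described above, here is
a minimal user-space sketch. It is not the code from the
generic SCSI layer: issue_with_retry(), issue_fn and
fake_target() are names invented for this example, and only
the limit of 5 retries and the QUEUE FULL status value
(0x28 in SCSI-2) come from the discussion.

```c
#include <stddef.h>

#define SCSI_STATUS_GOOD       0x00
#define SCSI_STATUS_QUEUE_FULL 0x28	/* SCSI-2 status byte value */
#define MAX_RETRIES            5	/* retry limit mentioned above */

/* Hypothetical callback: submits one command, returns its status byte. */
typedef int (*issue_fn)(void *ctx);

/*
 * Re-issue a command immediately while the target answers QUEUE FULL.
 * Returns the final status; *retries_used (if non-NULL) receives the
 * number of re-issues that were needed.
 */
static int
issue_with_retry(issue_fn issue, void *ctx, int *retries_used)
{
	int status = issue(ctx);
	int retries = 0;

	while (status == SCSI_STATUS_QUEUE_FULL && retries < MAX_RETRIES) {
		retries++;
		status = issue(ctx);
	}
	if (retries_used != NULL)
		*retries_used = retries;
	return status;
}

/* Toy target: reports QUEUE FULL a fixed number of times, then GOOD. */
static int
fake_target(void *ctx)
{
	int *queue_full_left = ctx;

	if (*queue_full_left > 0) {
		(*queue_full_left)--;
		return SCSI_STATUS_QUEUE_FULL;
	}
	return SCSI_STATUS_GOOD;
}
```

With a toy target that rejects the command three times, the
call returns GOOD after three re-issues; with one that keeps
rejecting, it gives up after five and returns QUEUE FULL.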

The new generic SCSI code currently being prepared
by Justin Gibbs will take care of this problem.

> After reconfiguring my SCSI chain several times I think I can
> rule bad termination.
> 
> Bad drive could be. I only have seen these errors if I "dump"
> the main partition on the drive or place my "obj" directory and do a
> "make world".

I have been wondering whether the problem is
caused by sending an unnecessary START_STOP_UNIT
when the raw partition is opened. IBM drives
never liked that ...
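
For reference, the command in question is the 6-byte
START STOP UNIT CDB from SCSI-2 (opcode 0x1B, START bit in
byte 4). The sketch below only shows that layout; the helper
build_start_unit_cdb() and the macro names are assumptions
made for this example and are not taken from sd.c.

```c
#include <stdint.h>
#include <string.h>

#define START_STOP_UNIT_OPCODE 0x1B
#define SSU_IMMED 0x01	/* byte 1, bit 0: return status immediately */
#define SSU_START 0x01	/* byte 4, bit 0: 1 = start the unit */

/*
 * Fill a 6-byte CDB that asks the drive to start (spin up).
 * If immed is non-zero, the IMMED bit is set as well.
 */
static void
build_start_unit_cdb(uint8_t cdb[6], int immed)
{
	memset(cdb, 0, 6);
	cdb[0] = START_STOP_UNIT_OPCODE;
	if (immed)
		cdb[1] = SSU_IMMED;
	cdb[4] = SSU_START;
}
```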

You may want to remove the call to scsi_start_unit()
from sd.c (there is only one occurrence) and see
whether the error messages are still printed ...

> Fsck doesnt show problems and bad144 at least seemed to scan the drive without 
> any console messages showing up.

Fsck also opens the raw partition, but at that
time there should not be any other activity. I'm
guessing that on IBM drives even a START_STOP_UNIT
command that is a NOP, because the drive motor was
started long ago, collides with any other command.

> assertion "cp" failed: file "../../pci/ncr.c", line 6228
> sd1: COMMAND FAILED (4 28) @f0497000.
> assertion "cp" failed: file "../../pci/ncr.c", line 6228
> sd1: COMMAND FAILED (4 28) @f0497000.   

The 28 identifies QUEUE_FULL status; the 4 just
indicates that processing of the command ran to
completion as far as the NCR controller and the
driver are concerned.
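
A small sketch of how such a console line could be decoded
by hand. It assumes both numbers are printed in hexadecimal
(consistent with 28 meaning QUEUE FULL, which is 0x28 in
SCSI-2); parse_command_failed() and scsi_status_name() are
helper names made up for this example, and the status table
lists only the SCSI-2 status codes.

```c
#include <stdio.h>
#include <string.h>

/* Map a SCSI-2 status byte to its name. */
static const char *
scsi_status_name(unsigned status)
{
	switch (status) {
	case 0x00: return "GOOD";
	case 0x02: return "CHECK CONDITION";
	case 0x04: return "CONDITION MET";
	case 0x08: return "BUSY";
	case 0x10: return "INTERMEDIATE";
	case 0x14: return "INTERMEDIATE-CONDITION MET";
	case 0x18: return "RESERVATION CONFLICT";
	case 0x22: return "COMMAND TERMINATED";
	case 0x28: return "QUEUE FULL";
	default:   return "UNKNOWN";
	}
}

/*
 * Extract the two hex values from a "COMMAND FAILED (h s)" line.
 * Returns 1 on success, 0 if the line does not match.
 */
static int
parse_command_failed(const char *line, unsigned *host, unsigned *scsi)
{
	const char *p = strstr(line, "COMMAND FAILED (");

	if (p == NULL)
		return 0;
	return sscanf(p, "COMMAND FAILED (%x %x)", host, scsi) == 2;
}
```

Fed the line "sd1: COMMAND FAILED (4 28) @f0497000.", the
parser yields 4 as the first value and 0x28 as the second,
which the table names QUEUE FULL.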

> sd0: <IBM DORS-32160 S82C> type 0 fixed SCSI 2
> sd1: <IBM DORS-32160 S82C> type 0 fixed SCSI 2

There was a rumor that the non-wide DORS does not
support as many tags as the wide version, though I
never had a chance to confirm this myself.

If you want to help debug the problem, then please
try a kernel that does not start the SCSI drives
in /sys/scsi/sd.c:sdopen().


Regards, STefan