From owner-freebsd-hackers  Sun Jun  4 09:59:44 1995
Return-Path: hackers-owner
Received: (from majordom@localhost)
          by freefall.cdrom.com (8.6.10/8.6.6) id JAA17183
          for hackers-outgoing; Sun, 4 Jun 1995 09:59:44 -0700
Received: from luke.pmr.com (luke.pmr.com [199.98.84.132])
          by freefall.cdrom.com (8.6.10/8.6.6) with ESMTP id JAA17172
          for <freebsd-hackers@freebsd.org>; Sun, 4 Jun 1995 09:59:40 -0700
Received: (from bob@localhost) by luke.pmr.com (8.6.11/8.6.9) id MAA05079; Sun, 4 Jun 1995 12:00:17 -0500
From: Bob Willcox <bob@luke.pmr.com>
Message-Id: <199506041700.MAA05079@luke.pmr.com>
Subject: Re: NCR810 problem?
To: asami@cs.berkeley.edu (Satoshi Asami)
Date: Sun, 4 Jun 1995 12:00:17 -0500 (CDT)
Cc: rgrimes@gndrsh.aac.dev.com, mdomsch@dellgate.us.dell.com,
        freebsd-hackers@freebsd.org
In-Reply-To: <199506040720.AAA01779@silvia.HIP.Berkeley.EDU> from "Satoshi Asami" at Jun 4, 95 00:20:03 am
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Content-Length: 3181      
Sender: hackers-owner@freebsd.org
Precedence: bulk

Satoshi Asami wrote:
> 
>  * > > > assertion "cp == np->header.cp" failed: file "../../pci/ncr.c", line 5235
>  * > > > assertion "cp" failed: file "../../pci/ncr.c", line 5236
>  * > > > ncr0 targ0?: ERROR (80:100) (e-ab-2) (8/13) @ (10d4:e000000).
>  * > > >              reg: da 10 0 13 47 8 0 1f 0 e 80 ab 80 0 3 0.
>  * > > > ncr0: restart (fatal error).
>  * > > > ncr0: reset by timeout.
>  * > > > sd0: error reading primary partition table from fsbn 0 (sd0 bn 0; cn 0
>  * > > > tn 0 sn 0)
> 
> Just another datapoint, I am seeing this from time to time on my
> -current system too.  NCR 53C825, with Quantum Atlas 2.1GB as the lone
> SCSI device, SiS chipset, Pentium-90.  The only other device on PCI
> bus is video card (#9 GXE64 Pro).
> 
> It's not predictable though, my guess is that it happens about 1/3 of
> the time.  If the boot gets through the fsck stage, it will run fine
> for days.  Otherwise, it will spew a river of errors and I need to
> open the lid and press the reset button.  It always succeeds after the
> reset (or so it seems, I don't remember having to reset twice in a
> row).

I have seen many of these messages here on a system that I just
upgraded to 2.0.5 and a Pentium-100 (I placed a copy of one
representative example at the end of this note).  Previously, under
1.1.5.1 using 3 BusLogic BT-747S controllers and a 486DX4/100, all
15 of my disk drives were happy: the system would run for weeks w/o
incident and I had *no* disk problems.  Now, with the current
configuration:

  100MHz Pentium, Triton chipset, 256KB cache, 32MB DRAM
  4 NCR 53C810 PCI SCSI controllers
  1 disk, 2 tape drives, 2 CDROM drives on the first controller (internal)
  Remaining disks in an external box, divided among the
    remaining controllers
  
5 of my disks that previously worked will not work with this configuration.
They are:

  3 QUANTUM PD700S 3110 (these drives work when internal w/short cables)
  1 DEC DSP3105S X385
  1 TOSHIBA MK438FB         5133

The remaining working disks are:

  1 MICROP 2217-15MZ1001901 HZ30 (internally mounted root volume)
  4 DEC DSP3105S 386C
  1 C2247-300 0BE4
  2 DEC DSP3210S 435E
  1 SEAGATE ST12550N 0013
  1 SEAGATE ST12550N 0014

I tried various combinations of different drives on different busses
to no avail.  Note that termination and ID assignments are correct.
I considered hooking up my SCSI bus analyzer to see whether the bus
activity would give a clue to what was going wrong, but ran out of
time.  (This is my primary server and it had already been down a
long time.)

A sample error message:

assertion "cp == np->header.cp" failed: file "../../pci/ncr.c", line 5235
assertion "cp" failed: file "../../pci/ncr.c", line 5236
ncr1 targ 4?: ERROR (80:100) (e-ae-0) (8/13) @ (10d4:e000000).
              reg: da 10 0 13 47 8 4 1f 0 e 84 ae 80 0 6 0.
ncr1: restart (fatal error).
sd3(ncr1:4:0): COMMAND FAILED (9 ff) @f0b6d600.
sd3: error reading primary partition table reading fsbn 0 (sd3 bn 0; cn 0 tn 0 sn 0)
ncr1: timeout ccb=f0b6d600 (skip)
ncr1: reset by timeout.
sd2(ncr1:3:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.


-- 
Bob Willcox
bob@luke.pmr.com (or obiwan%bob@uunet.uu.net)
Austin, TX