From owner-freebsd-hackers Sun Jun 4 09:59:44 1995
Return-Path: hackers-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id JAA17183 for hackers-outgoing; Sun, 4 Jun 1995 09:59:44 -0700
Received: from luke.pmr.com (luke.pmr.com [199.98.84.132]) by freefall.cdrom.com (8.6.10/8.6.6) with ESMTP id JAA17172 for ; Sun, 4 Jun 1995 09:59:40 -0700
Received: (from bob@localhost) by luke.pmr.com (8.6.11/8.6.9) id MAA05079; Sun, 4 Jun 1995 12:00:17 -0500
From: Bob Willcox
Message-Id: <199506041700.MAA05079@luke.pmr.com>
Subject: Re: NCR810 problem?
To: asami@cs.berkeley.edu (Satoshi Asami)
Date: Sun, 4 Jun 1995 12:00:17 -0500 (CDT)
Cc: rgrimes@gndrsh.aac.dev.com, mdomsch@dellgate.us.dell.com, freebsd-hackers@freebsd.org
In-Reply-To: <199506040720.AAA01779@silvia.HIP.Berkeley.EDU> from "Satoshi Asami" at Jun 4, 95 00:20:03 am
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Content-Length: 3181
Sender: hackers-owner@freebsd.org
Precedence: bulk

Satoshi Asami wrote:
>
>  * > > > assertion "cp == np->header.cp" failed: file "../../pci/ncr.c", line 5235
>  * > > > assertion "cp" failed: file "../../pci/ncr.c", line 5236
>  * > > > ncr0 targ0?: ERROR (80:100) (e-ab-2) (8/13) @ (10d4:e000000).
>  * > > >         reg: da 10 0 13 47 8 0 1f 0 e 80 ab 80 0 3 0.
>  * > > > ncr0: restart (fatal error).
>  * > > > ncr0: reset by timeout.
>  * > > > sd0: error reading primary partition table from fsbn 0 (sd0 bn 0; cn 0
>  * > > > tn 0 sn 0)
>
> Just another datapoint, I am seeing this from time to time on my
> -current system too.  NCR 53C825, with Quantum Atlas 2.1GB as the lone
> SCSI device, SiS chipset, Pentium-90.  The only other device on PCI
> bus is video card (#9 GXE64 Pro).
>
> It's not predictable though, my guess is that it happens about 1/3 of
> the time.  If the boot gets through the fsck stage, it will run fine
> for days.  Otherwise, it will spew a river of errors and I need to
> open the lid and press the reset button.
> It always succeeds after the
> reset (or so it seems, I don't remember having to reset twice in a
> row).

I have seen many of these messages here on a system that I just
upgraded to 2.0.5 and a Pentium-100 (I placed a copy of one
representative example at the end of this note).  Previously, under
1.1.5.1 using 3 BusLogic BT-747S controllers and a 486DX4/100, all 15
of my disk drives were happy...system would run for weeks w/o incident
and I had *no* disk problems.

Now, with the current configuration:

   100MHz Pentium, Triton chipset, 256KB cache, 32MB DRAM
   4 NCR 53C810 PCI SCSI controllers
   1 disk, 2 tape drives, 2 CDROM drives on first controller internal
   Remaining disks in an external box, divided among each of the
     remaining controllers

5 of my disks that previously worked will not work with this
configuration.  They are:

   3  QUANTUM PD700S 3110 (these drives work when internal w/short cables)
   1  DEC DSP3105S X385
   1  TOSHIBA MK438FB 5133

The remaining working disks are:

   1  MICROP 2217-15MZ1001901 HZ30 (internally mounted root volume)
   4  DEC DSP3105S 386C
   1  C2247-300 0BE4
   2  DEC DSP3210S 435E
   1  SEAGATE ST12550N 0013
   1  SEAGATE ST12550N 0014

I tried various combinations of different drives on different busses
to no avail.  Note that termination and ID assignments are correct.

I considered connecting my SCSI bus analyzer up to see if I could get
a clue to what was going wrong from the bus activity but ran out of
time.  (This is my primary server and it had been down a long time
already.)

A sample error message:

assertion "cp == np->header.cp" failed: file "../../pci/ncr.c", line 5235
assertion "cp" failed: file "../../pci/ncr.c", line 5236
ncr1 targ 4?: ERROR (80:100) (e-ae-0) (8/13) @ (10d4:e000000).
        reg: da 10 0 13 47 8 4 1f 0 e 84 ae 80 0 6 0.
ncr1: restart (fatal error).
sd3(ncr1:4:0): COMMAND FAILED (9 ff) @f0b6d600.
sd3: error reading primary partition table reading fsbn 0 (sd3 bn 0; cn 0 tn 0 sn 0)
ncr1: timeout ccb=f0b6d600 (skip)
ncr1: reset by timeout.
sd2(ncr1:3:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.

-- 
Bob Willcox
bob@luke.pmr.com (or obiwan%bob@uunet.uu.net)
Austin, TX
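
For anyone sifting through a longer console log of these messages, here
is a minimal sketch in Python that tallies which controllers restarted
and which devices reported a failed command.  The regular expressions,
the names summarize, DEVICE_RE and FATAL_RE, and the reading of
"(ncr1:4:0)" as controller:target:lun are all assumptions inferred only
from the sample lines in this note; they are not taken from the ncr
driver source and may not match other message variants.

```python
import re
from collections import Counter

# Sample console lines copied from the error report in this note.
SAMPLE_LOG = """\
assertion "cp == np->header.cp" failed: file "../../pci/ncr.c", line 5235
assertion "cp" failed: file "../../pci/ncr.c", line 5236
ncr1 targ 4?: ERROR (80:100) (e-ae-0) (8/13) @ (10d4:e000000).
        reg: da 10 0 13 47 8 4 1f 0 e 84 ae 80 0 6 0.
ncr1: restart (fatal error).
sd3(ncr1:4:0): COMMAND FAILED (9 ff) @f0b6d600.
sd3: error reading primary partition table reading fsbn 0 (sd3 bn 0; cn 0 tn 0 sn 0)
ncr1: timeout ccb=f0b6d600 (skip)
ncr1: reset by timeout.
sd2(ncr1:3:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
"""

# Hypothetical patterns, guessed from the sample above.
# Assumption: "sd3(ncr1:4:0)" means device(controller:target:lun).
DEVICE_RE = re.compile(
    r"^(?P<dev>[a-z]+\d+)\((?P<ctrl>ncr\d+):(?P<target>\d+):(?P<lun>\d+)\):"
    r"\s*(?P<msg>.*)$"
)
FATAL_RE = re.compile(r"^(?P<ctrl>ncr\d+): restart \(fatal error\)\.$")


def summarize(log_text):
    """Return (fatal restarts per controller, list of failed commands)."""
    fatal = Counter()
    failed = []
    for raw in log_text.splitlines():
        line = raw.strip()
        m = FATAL_RE.match(line)
        if m:
            fatal[m.group("ctrl")] += 1
            continue
        m = DEVICE_RE.match(line)
        if m and m.group("msg").startswith("COMMAND FAILED"):
            failed.append(
                (m.group("dev"), m.group("ctrl"),
                 int(m.group("target")), int(m.group("lun")))
            )
    return fatal, failed


if __name__ == "__main__":
    fatal, failed = summarize(SAMPLE_LOG)
    for ctrl, count in sorted(fatal.items()):
        print(f"{ctrl}: {count} fatal restart(s)")
    for dev, ctrl, target, lun in failed:
        print(f"{dev} on {ctrl} target {target} lun {lun}: command failed")
```

Run against a full boot log, a tally like this might make it easier to
see whether the failures follow particular drives or particular
controllers.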