From owner-freebsd-hardware Tue Sep 23 15:32:58 1997
Received: (from root@localhost)
          by hub.freebsd.org (8.8.7/8.8.7) id PAA09638
          for hardware-outgoing; Tue, 23 Sep 1997 15:32:58 -0700 (PDT)
Received: from Octopussy.MI.Uni-Koeln.DE (Octopussy.MI.Uni-Koeln.DE [134.95.166.20])
          by hub.freebsd.org (8.8.7/8.8.7) with SMTP id PAA09544;
          Tue, 23 Sep 1997 15:32:37 -0700 (PDT)
Received: from x14.mi.uni-koeln.de ([134.95.219.124])
          by Octopussy.MI.Uni-Koeln.DE with SMTP id AA18856 (5.67b/IDA-1.5);
          Wed, 24 Sep 1997 00:32:34 +0200
Received: (from se@localhost)
          by x14.mi.uni-koeln.de (8.8.7/8.6.9) id AAA04367;
          Wed, 24 Sep 1997 00:20:10 +0200 (CEST)
Date: Wed, 24 Sep 1997 00:20:10 +0200
From: Stefan Esser
To: Walter Hafner
Cc: freebsd-scsi@FreeBSD.ORG, freebsd-hardware@FreeBSD.ORG, Stefan Esser
Subject: Re: Is my NCR controller broken?
References: <199709180857.IAA03695@pccog4.forwiss.tu-muenchen.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.74
In-Reply-To: <199709180857.IAA03695@pccog4.forwiss.tu-muenchen.de>; from Walter Hafner on Thu, Sep 18, 1997 at 08:57:34AM +0000
Sender: owner-freebsd-hardware@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Sep 18, Walter Hafner wrote:
> Hello!

Hallo! Sorry for the late reply ...

> I just want to make sure I don't miss something before changing my
> mainboard. Please enlighten me.
>
> I run a 486/DX2-66 (ASUS SP-3 with onboard NCR-810 SCSI
> controller). This computer runs for about 3 years now (2.0.5, 2.1.0,
> 2.1.5)

Is this the original ASUS SP3 with the Saturn I (revision 2)
chip set ? That chip set is known buggy, and you'll have to
disable one of the PCI bus performance options. I don't remember
if it was "PCI bursts" or some buffer option ("write buffers" ??)

> Since about four weeks I keep getting SCSI resets and then the bus is
> dead. No recovery! And it's really strange because the NCR controller
> reports totally different errors before hanging.
> Here are the error
> reports from the last three crashes (typed in by hand, so the actual
> format may differ):

Did you by chance do any of the following:

- modify PCI BIOS setup options (bursts, ...)
- add another PCI card (even a bus-master)
- add some ISA card
- change the amount of memory in the system

> -------------------------------------------------------------------------------
>
> sd1(ncr0:1:0): internal error: cmd00 != 91=(vdsp[0] >> 24)
> ncr0: timeout ccb=f19fbc00 (skip)

This is a "can't happen" case, and the first time I have seen it
reported. Some value in a register is different from the data at
the address from which this register was loaded.

> -------------------------------------------------------------------------------
>
> ncr0:1: ERROR (a0:0) (f-28-0) (8/13) @ (260:00000000).
>         script cmd=fc00001c.
>         reg: da 10 80 13 47 08 01 1f 00 0f 81 28 80 00 00 00.
> ncr0: restart (fatal error).
> sd1(ncr0:1:0): command failed (9ff)@f19fbc00.
> nrc0: timeout ccb=f19fbc00 (skip)

Another indication of a hardware problem: The NCR status has the
bus fault bit set in DSTAT, which indicates a problem accessing
the PCI bus.

> -------------------------------------------------------------------------------
>
> ncr0: SCSI phase error fixup: CCB already dequeued (0xf19fbc00)
> nrc0: timeout ccb=f19fbc00 (skip)

Hmmm, another "first" ...

There definitely is something wrong with your hardware.

> I changed everything:
>
> * disconnected everything except the system drive -> still errors
> * changed cables (three different ones) -> still errors
> * changed termination (two different external ones, internal, different
>   termpower settings etc.) -> still errors
> * turned all devices to 5MB synchr. and finally to async via
>   'ncrcontrol' -> still errors
> * finally replaced the system drive (old DEC 5200 against new IBM DAHC
>   34330) and put 2.2.1 on it -> still errors. Actually, the errors above
>   are from that setup.
>
> The only thing I didn't change was the mainboard.

Well, and I think that's the problem :)

But please try with conservative PCI options. This helped other
people with an ASUS SP3, too. I just don't remember the exact
option that caused the problem. Just disable everything that the
BIOS setup offers :)

> I'd be glad if anyone can confirm my suspicion that the NCR controller
> has gone nuts. I just can't imagine why ...

No, I don't think this is a controller going bad.
Though such a thing has happened before ...

> I'd also appreciate it very much if someone with more insight than
> myself could explain the error reports to me. I'd especially like to
> know what this 'f19fbc00' means: it shows up in all three errors (what's
> a 'ccb' anyway?)

The CCB is a Command Control Block, a structure that contains
all the information the NCR needs to issue and execute a SCSI
command. It is in fact surprising that the same address is
printed in each case, but depending on the number of drives and
whether tags are enabled, it is possible that only one CCB is
in use.

Regards, STefan
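
For reference, the bus fault remark above can be checked by hand from the
quoted "ERROR (a0:0)" line. The following is only a minimal sketch: it
assumes that the first hex value in that line is the DSTAT register, and it
uses the 53C810 DSTAT bit names as I recall them from the driver's register
header (DFE 0x80, MDPE 0x40, BF 0x20, ABRT 0x10, SSI 0x08, SIR 0x04,
IID 0x01). The helper names are made up for illustration and are not part
of the driver.

```python
import re

# Assumed DSTAT bit layout for the 53C810 (verify against the data sheet).
DSTAT_BITS = [
    (0x80, "DFE"),   # DMA FIFO empty (status only)
    (0x40, "MDPE"),  # master data parity error
    (0x20, "BF"),    # bus fault
    (0x10, "ABRT"),  # script aborted
    (0x08, "SSI"),   # single step interrupt
    (0x04, "SIR"),   # script interrupt instruction
    (0x01, "IID"),   # illegal instruction detected
]


def decode_dstat(value):
    """Return the names of all known bits set in an 8-bit DSTAT value."""
    return [name for mask, name in DSTAT_BITS if value & mask]


def dstat_from_error_line(line):
    """Hypothetical helper: pull the first hex field out of 'ERROR (x:y)'.

    Assumes the first value of the pair is DSTAT.
    """
    match = re.search(r"ERROR \(([0-9a-fA-F]+):[0-9a-fA-F]+\)", line)
    if match is None:
        raise ValueError("no ERROR (x:y) field found")
    return int(match.group(1), 16)


if __name__ == "__main__":
    line = "ncr0:1: ERROR (a0:0) (f-28-0) (8/13) @ (260:00000000)."
    print(decode_dstat(dstat_from_error_line(line)))
```

Under these assumptions 0xa0 decodes to DFE plus BF, which matches the
bus fault reading given above.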