From owner-freebsd-scsi Wed Oct 8 14:13:43 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id OAA14836 for freebsd-scsi-outgoing; Wed, 8 Oct 1997 14:13:43 -0700 (PDT) (envelope-from owner-freebsd-scsi)
Received: from Octopussy.MI.Uni-Koeln.DE (Octopussy.MI.Uni-Koeln.DE [134.95.166.20]) by hub.freebsd.org (8.8.7/8.8.7) with SMTP id OAA14829 for ; Wed, 8 Oct 1997 14:13:33 -0700 (PDT) (envelope-from se@zpr.uni-koeln.de)
Received: from x14.mi.uni-koeln.de ([134.95.219.124]) by Octopussy.MI.Uni-Koeln.DE with SMTP id AA04305 (5.67b/IDA-1.5 for ); Wed, 8 Oct 1997 23:13:12 +0200
Received: (from se@localhost) by x14.mi.uni-koeln.de (8.8.7/8.6.9) id WAA01232; Wed, 8 Oct 1997 22:55:53 +0200 (CEST)
Date: Wed, 8 Oct 1997 22:55:52 +0200
From: Stefan Esser
To: Philippe Regnauld
Cc: freebsd-scsi@FreeBSD.ORG
Subject: Re: 2.2.2 anc NCR875 failures
References: <19971008113725.46245@deepo.prosa.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.84
In-Reply-To: <19971008113725.46245@deepo.prosa.dk>; from Philippe Regnauld on Wed, Oct 08, 1997 at 11:37:25AM +0200
Sender: owner-freebsd-scsi@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On 1997-10-08 11:37 +0200, Philippe Regnauld wrote:
> I just got (a week ago) a new machine to run a keyserver on...
> The configuration is
>
> TX97/K6-180, NCR-875, 64MB RAM, 2 x 2.2 Atlas II UW disks.
>
> I've had the following failure three times so far, I would
> guess during some fair amount of disk i/o: (written on paper,
> I'm trying to reread myself):
>
>
> ncr0: ERROR (81:0) 8af80 (10/1b) @24:00000000

The NCR is failing on one of the first instructions, and the error
code indicates that an illegal instruction has been fetched. This
was most probably caused by a jump to the immediate operand of an
instruction:

/*--------------------------< START >-----------------------*/ {
	/*
	**	Claim to be still alive ...
	*/
	SCR_COPY (sizeof (((struct ncb *)0)->heartbeat)),
		KVAR (KVAR_TIME_TV_SEC),
		NADDR (heartbeat),
	/*
	**	Make data structure address invalid.
	**	clear SIGP.
	*/
	SCR_LOAD_REG (dsa, 0xff),
		0,
	SCR_FROM_REG (ctest2),
===>>>		0,

The NCR processor tried to execute that constant 0, and it was not
recognized as a valid instruction ...

Hmmm, the (10/1b) in the error message indicates that synchronous
transfers have been negotiated (the offset is set to 0x10 == 16 bytes),
but the clock pre-scaler (0x1b) is not set correctly for the 53c875,
it appears!

But I don't understand how you can possibly complete a single SCSI
transfer at twice the correct clock rate.

You did not say which version of the NCR driver (and FreeBSD) that is.
The pre-scaler may be correct if you are running the NCR driver as of
FreeBSD-2.2.2 and if the 53c875 is revision 2 or newer.

> In the two other cases, I had some other message, every
> 30 sec. or so, like "retrying block = xxxyyy". No crash,
> no reboot...

Hmmm, there is no such message anywhere in the NCR driver.

> I had to go and manually reset the machine (off-site!) every
> time.

Sorry to hear that ...

> I tried reducing TAG number in ncrcontrol -- nada.

No, your problem is different from the QUEUE FULL situation others
are suffering from. But that may still hurt you if you have revision
LXY4 firmware in your Atlas II drives ...

> Help ?

Please let me know what version of FreeBSD and the NCR driver you are
using. Booting with "-v -v" will enable extra verbose boot messages,
and there will be more information on the NCR initialization. I'd like
to know those messages.

I'm very sorry for the inconvenience. I'll try to help you get this
problem solved as quickly as possible, but it currently looks like a
hardware problem to me. It may also be caused by the timing loop used
to measure the NCR 875 clock frequency, which may fail on your
particular hardware for reasons as yet unknown.

Regards, STefan
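
[Editorial sketch, not part of the original message.] The reading of
the error line above can be illustrated with a small decoder. This is
only a sketch under stated assumptions: that the "(81:0)" pair is
DSTAT:SIST and the "(10/1b)" pair is SXFER/SCNTL3, and that the bit
layouts follow the usual 53C8xx register descriptions (illegal
instruction flag in DSTAT bit 0, sync offset in SXFER bits 4-0, clock
divisor fields in SCNTL3 bits 6-4 and 2-0). The struct and function
names are hypothetical and do not come from the NCR driver.

```c
#include <stdio.h>

/* Fields of the driver's error line. The field meanings are assumptions
   (DSTAT:SIST and SXFER/SCNTL3); names are illustrative only. */
struct ncr_err {
    unsigned dstat;   /* DMA status at the time of the error   */
    unsigned sist;    /* SCSI interrupt status                  */
    unsigned sxfer;   /* sync transfer register (period/offset) */
    unsigned scntl3;  /* clock conversion / pre-scaler register */
};

/* Parse a line such as
   "ncr0: ERROR (81:0) 8af80 (10/1b) @24:00000000".
   Returns 1 if all four hex fields were read, 0 otherwise. */
int parse_ncr_error(const char *line, struct ncr_err *e)
{
    return sscanf(line, "ncr0: ERROR (%x:%x) %*x (%x/%x)",
                  &e->dstat, &e->sist, &e->sxfer, &e->scntl3) == 4;
}

/* Assumed layout: DSTAT bit 0 = illegal instruction detected. */
int dstat_illegal_instruction(unsigned dstat)
{
    return (dstat & 0x01) != 0;
}

/* Assumed layout: SXFER bits 4-0 = maximum synchronous offset. */
unsigned sxfer_offset(unsigned sxfer)
{
    return sxfer & 0x1f;
}

/* Assumed layout: SCNTL3 bits 6-4 = synchronous clock divisor code. */
unsigned scntl3_scf(unsigned scntl3)
{
    return (scntl3 >> 4) & 0x07;
}

/* Assumed layout: SCNTL3 bits 2-0 = asynchronous clock divisor code. */
unsigned scntl3_ccf(unsigned scntl3)
{
    return scntl3 & 0x07;
}
```

Fed the line quoted above, the parser yields dstat = 0x81 (illegal
instruction bit set) and sxfer = 0x10 (offset 16), which matches the
reading given in the message. Whether the divisor codes in 0x1b suit a
given 53c875 depends on the chip revision and the measured clock, so
the sketch only extracts the fields and does not judge them.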