Date:      Sat, 27 Jun 1998 15:18:34 +0200
From:      Stefan Esser <se@FreeBSD.ORG>
To:        Doug Russell <drussell@saturn-tech.com>, stable@FreeBSD.ORG
Cc:        Stefan Esser <se@FreeBSD.ORG>
Subject:   Re: Make world failures (ncr wierdness?)
Message-ID:  <19980627151834.05015@mi.uni-koeln.de>
In-Reply-To: <Pine.BSF.3.95.980626105150.9412A-100000@hobbes.saturn-tech.com>; from Doug Russell on Fri, Jun 26, 1998 at 11:22:08AM -0600
References:  <Pine.BSF.3.95.980626105150.9412A-100000@hobbes.saturn-tech.com>

On 1998-06-26 11:22 -0600, Doug Russell <drussell@saturn-tech.com> wrote:
> 
> Does anyone know what would be causing this during make world?
> 
> Jun 26 10:19:46 doug /kernel: assertion "cp" failed: file "../../pci/ncr.c", line 6191
> Jun 26 10:19:46 doug /kernel: sd0(ncr0:6:0): COMMAND FAILED (4 28) @f05a9800.

This is your drive complaining about too many simultaneous commands.
Some drives support varying numbers of tags depending on parameters
we have no control over, some have firmware bugs and can't even deal
with 2 tagged commands at times. IIRC, the LXY4 revision of the
Quantum Atlas II firmware is known to cause such problems. You may
want to upgrade to a later revision (available from the Quantum FTP
server at ftp://ftp.qntm.com/pub/support/Firmware). The upgrade can
be performed from a DOS floppy with ASPI drivers for SDMS (came with
your NCR card). The upgrade utility tries to be as safe to use as
possible and the actual write of the firmware takes less than a second.
(I'm just mentioning this because I was a little worried before I sent
new firmware to my Atlas for the first time ;-)

The <assertion "cp" failed> message is caused by a benign mismatch
between the code executed by the NCR and the driver (I'm running a 
driver with some modifications locally, but can't test them under 
-stable. I'm not sure whether they could fix your problem, since I'm 
using the CAM version of the driver under -current.)

One workaround is to reduce the number of simultaneous commands (tags)
to a value that is always acceptable to your drive.

> It appears to be something to do with 'ncr chip exception handler for
> programmed interrupts'.  I can do a make world with no problem, (it
> produced this error about 30 times during my build last night, but
> completed anyway.)  However, if I try a make -j4 world (or -j8) it dies
> after roughly 40 minutes.  (If I do another make -j4 world after it dies,
> it dies again in the same place, but if I do a make world, it goes through
> just fine.  No reboot in between.)

> Unless parallel makes on 2.2.6-STABLE have broken in the last couple
> weeks, I think it must be due to this ncr strangeness, but I don't know
> what is causing the errors.  (I last did a -j8 with just the IDE drive)

Well, the NCR can't easily postpone execution of a command that has
been started (i.e. the command phase can only be initiated once), and
when the drive returns a QUEUE FULL status, the driver has to return
a "soft failure" code to the generic layer. Pre-CAM, the generic layer
will re-issue the failed command immediately, and the drive will often
still lack resources to accept it. The CAM code throttles back (cuts
down the number of tags used), but the old SCSI code predates the time
when SCSI drives started to commonly support tags.
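
The difference between the two behaviours can be illustrated with a
toy model. This is only a sketch under simplifying assumptions, not
code from the ncr driver or from CAM: the Drive class, the run()
function and all parameter names are made up for illustration, the
drive is assumed to accept a fixed number of tagged commands, and one
command completes between retries (the real pre-CAM layer retries
immediately, which is worse).

```python
from dataclasses import dataclass


@dataclass
class Drive:
    """Hypothetical drive accepting at most `capacity` tagged commands."""
    capacity: int
    in_flight: int = 0

    def submit(self) -> bool:
        """Return True if the command is accepted, False for QUEUE FULL."""
        if self.in_flight >= self.capacity:
            return False
        self.in_flight += 1
        return True

    def complete_one(self) -> None:
        """Finish one outstanding command, if there is one."""
        if self.in_flight > 0:
            self.in_flight -= 1


def run(drive: Drive, commands: int, max_tags: int, throttle: bool):
    """Issue `commands` commands; return (queue_full_count, final_max_tags)."""
    queue_full = 0
    pending = commands
    while pending > 0:
        # The host keeps sending until it reaches its own tag limit.
        while pending > 0 and drive.in_flight < max_tags:
            if drive.submit():
                pending -= 1
            else:
                queue_full += 1
                if throttle:
                    # CAM-like behaviour: cut the tag limit down to what
                    # the drive demonstrably accepted.
                    max_tags = max(1, drive.in_flight)
                break
        drive.complete_one()
    return queue_full, max_tags


# Drive that really handles only 2 tags, host configured for 4.
print(run(Drive(capacity=2), 10, 4, throttle=False))
print(run(Drive(capacity=2), 10, 4, throttle=True))
# Host limit reduced in advance to what the drive can take.
print(run(Drive(capacity=2), 10, 2, throttle=False))
```

In this model the unthrottled host keeps running into QUEUE FULL for
as long as it has work queued, the throttled one hits it once and then
stays at the lower limit, and a host that starts with a limit the
drive can always accept never sees it at all.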

> The system is:
> 
> Jun 25 16:04:06 doug /kernel: FreeBSD 2.2.6-STABLE #0: Thu Jun 25 14:43:11 MDT 1998
> Jun 25 16:04:06 doug /kernel:     drussell@doug.saturn-tech.com:/usr/src/sys/compile/DOUG
> Jun 25 16:04:06 doug /kernel: CPU: AMD-K6tm w/ multimedia extensions (233.86-MHz 586-class CPU)
> Jun 25 16:04:06 doug /kernel:   Origin = "AuthenticAMD"  Id = 0x562  Stepping=2
> Jun 25 16:04:06 doug /kernel:   Features=0x8001bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,MMX>
> Jun 25 16:04:06 doug /kernel: real memory  = 33554432 (32768K bytes)
> Jun 25 16:04:06 doug /kernel: avail memory = 30969856 (30244K bytes)
> ...
> Jun 25 16:04:06 doug /kernel: ncr0 <ncr 53c810a fast10 scsi> rev 18 int a irq 5 on pci0:9:0
> Jun 25 16:04:06 doug /kernel: ncr0 waiting for scsi devices to settle
> Jun 25 16:04:06 doug /kernel: (ncr0:6:0): "QUANTUM XP32275S LXY4" type 0 fixed SCSI 2
> Jun 25 16:04:06 doug /kernel: sd0(ncr
> Jun 25 16:04:06 doug /kernel: 0:6:0): Direct-Access 
> Jun 25 16:04:06 doug /kernel: sd0(ncr0:6:0): 10.0 MB/s (100 ns, offset 8)
> Jun 25 16:04:06 doug /kernel: 2170MB (4445380 512 byte sectors)

> I note that ncr0 is on irq5 as it is in the bottom PCI slot... I was going
> to move it up a slot, but there is nothing on irq5 anyway, and I can't see
> how this would make any difference.  (It just feels strange to have the
> card on irq5.  :) )

You should be able to reserve IRQ 5 for ISA devices in the BIOS, and the
driver will use a different IRQ next time. (The IRQ is chosen by the
PCI BIOS.)

> Everything except /usr/obj is on wd0.  /usr/obj is on the Quantum.  Both
> are mounted normally (ie. NOT async, etc.)  Sources are the same as the
> binaries that are currently running.  (About 10 AM yesterday MDT.  The
> only changes that were applied were Peter's if_ppp makefile change,
> and about 10 changes to various files by jkh, if my memory is correct. :) )
> 
> At first I thought parallel makes were broken until I had the brainwave to
> look at /var/log/messages.
> Any ideas?

Yes, I've told you about them already ;-)

Let me know if you have further questions.

Regards, STefan

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message