Date:      Mon, 29 May 1995 01:13:44 -0700 (PDT)
From:      "Rodney W. Grimes" <rgrimes@gndrsh.aac.dev.com>
To:        agl@mac.glas.apc.org (Anthony Graphics)
Cc:        freebsd-hackers@FreeBSD.org
Subject:   Re: 950412 hangs on ncr0 probing:
Message-ID:  <199505290813.BAA00655@gndrsh.aac.dev.com>
In-Reply-To: <Pine.LNX.3.91.950529110037.322A-100000@mail.redline.ru> from "Anthony Graphics" at May 29, 95 11:36:57 am

> 
> On Sun, 28 May 1995, Rodney W. Grimes wrote:
> 
> > I think I matched your problem to another person.  But do realize
> > scsi bus termination errors are the #1 cause for failure so I will
> > always remind people right from the start to check this.  Even after
> > doing that I have found myself on site to repair a ``broken'' system,
> > only to find that they did not check what I told them on the phone to
> > check, and in fact had an incorrectly terminated bus.
> > 
> > There are still 2 places for you to have an error in your scsi termination
> > even with this simple set up.  Are the terminators enabled and powered up
> > on the NCR controller and on the Quantum Lightning drive?
> > 
> Ok, ASUS SP3G always comes with NCR termination enabled.

There is a jumper on the motherboard to turn it off though.

> As for Quantum lightning, there are two physical units representing
> terminators and they are in place.

I need to update my Quantum drive manual; it doesn't have the Lightning or
the Fireball series data in it.  There may also be an enable/disable
jumper besides the resistor packs.  (It is true, some Quantum drives
have both removable r-packs and jumpers to enable/disable termination.)

> Anyway, kernel -c
> disable wt0
> disable mcd0
> disable mcd1
> helped (well mcd0 and mcd1 was on irq 10 and 11 and these ones
> are configured for pci Slots 1 & 2 on asus, I have no idea how
> pci bus works, but when I was installing Mach32 into the slot 1
> on the very same board in the machine running linux I had
> a conflict with AGUS which was using the very same irq.
> I wonder whether the problem was IRQ conflict or mcd autoprobing)

Something is going funky here with these drivers and the ASUS board,
it is not the interrupts since the probes fail and we never do the
IRQ attach.  I'll have to wait until my back ordered SP3G boards
come in so I can debug this :-(.

> > Just because a scsi bus works on one controller does not make it right,
> > the termination could have been enabled on the 1542 and disabled on
> > the NCR for example.
> > 
> Sure thing, but it's enabled in my case.

Okay, just making sure.

> > > I booted 2.0-RELEASE ok, but then I was unable to recompile 0412
> > > under it (gcc-2.6.3 with -O0 was giving various signals trapped:
> >             ^^^^^^^^^^^^^^^^^^
> > 
> > Please don't try to fuss with compiler options when building the kernel.
> > It can cause problems; we compile the way we set things up because that
> > is what we know to work.  You put more variables into the equations when
> > you do this type of stuff!!
> > 
> Ok, I ran 'config MYSYSTEMNAME' again and tried to compile:
> cc1 still bails out with the error 'exited on signal 11' (sometimes another
> random signal, often 10)

:-(, well, at least I got you booting SNAP-950412.

> 
> Well, somebody pointed out optimization code is broken in 2.6.3
> so I tried to simplify freebsd-hacker's life (and can't compile 2.6.2
> because I have only one FreeBSD box at the site :-( )
> Anyway, 2.6.3 seemed to come with 940412 if my memory serves
> me and I don't keep 2.6.3 sources here, and 2.6.3 used to compile
> MYSYSTEMNAME before I switched to the machine with the ASUS SP3G board...

Well, one problem at a time: since we got the boot going, we need to
move on to this signal 10/11 problem.  Have you set the external cache
to write-through mode?  The Saturn chip set used on the ASUS PCI/I-486SP3G
has a bus master DMA cache invalidation bug (it fails to invalidate
cache entries for data the NCR controller writes to main memory).  Setting
write-through mode should fix your signal 10/11 problem.
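
For illustration only, here is a toy Python model of the failure mode
described above: a bus master writes to main memory, the cache entry for
that address is not invalidated, and the CPU keeps reading the old value.
The class and method names are hypothetical, and this is a sketch of the
general stale-read mechanism, not a model of the Saturn chip set or of its
write-through behavior.

```python
class ToySystem:
    """Toy model: main memory plus a CPU-side cache, both plain dicts."""

    def __init__(self, invalidate_on_dma):
        self.memory = {}
        self.cache = {}
        # Hypothetical switch: True models a correct chip set,
        # False models one that forgets to invalidate on DMA writes.
        self.invalidate_on_dma = invalidate_on_dma

    def cpu_read(self, addr):
        # On a miss, fill the cache from memory; on a hit, trust the cache.
        if addr not in self.cache:
            self.cache[addr] = self.memory.get(addr, 0)
        return self.cache[addr]

    def dma_write(self, addr, value):
        # A bus master (e.g. a SCSI controller) writes straight to memory.
        self.memory[addr] = value
        if self.invalidate_on_dma:
            self.cache.pop(addr, None)


def read_after_dma(invalidate_on_dma):
    """Return what the CPU sees after a DMA write to a cached address."""
    system = ToySystem(invalidate_on_dma)
    system.memory[0x1000] = 1
    system.cpu_read(0x1000)        # pulls the old value into the cache
    system.dma_write(0x1000, 2)    # controller overwrites main memory
    return system.cpu_read(0x1000)


print("correct chip set reads:", read_after_dma(True))
print("buggy chip set reads:  ", read_after_dma(False))
```

In the buggy case the CPU works on data that no longer matches what the
controller put in memory, which is the kind of corruption that can show up
as random signals in cc1.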

> One thing to mention: I have compiled and installed gnu make into
> /usr/local/bin/make and /usr/local/bin precedes /usr/bin in my path,
> so I cd /sys/compile/MYSYSTEMNAME
> and /usr/bin/make
> it has nothing to do with the problems with unexpected signals trapped by
> cc1, I suppose?

Nope, I am sure the cache bug in the board is what you are seeing.  Set the
*External* cache to *Write-through* and your problem will vanish.

> > > most often it was 11 sometimes 10 rarely 5)
> > > Well, rerunning /usr/bin/make was helpful at times but I've
> > > got stuck with vfs_func.c or something like that: no matter
> > > how many times I rerun make: signal 11 trapped.
> > > Well I'm running the 0412-SNAP distribution, still under 0322
> > > I have _the same problem_ as in the 0412 :-(
> > > And I can't roll back to 2.0-RELEASE because kvm_make or something like
> > > this coredumps (what is probably what should be expected but I was surprised
> > > ;-)
> > > So, the kernel hangs on the ncr detection :-(
> > 
> Now after successfully booting 0412SNAP kvm_mkdb still causes this message
> to appear on the console:
> 
> May 29 11:24:11 relay /kernel: pid 209: kvm_mkdb: uid 0: exited on signal 10
> Bus error (core dumped) 

See above...

> > That was the exact symptom Boyd had on his system, see my other email.
> > I wish you had said up front you were using the on-board NCR 810 of
> > an ASUS PCI/I-486SP3G, it would have tripped my memory sooner that I
> > had been here before.
> > 
> I'm sorry: never knew which part would be essential for the
> trouble report.

And I forgot to tell you about the cache bug on this board, gee, how
stupid of me!!!

> Thanx!
> Partially succeeded: 0412 boots at least. Strange: the system
> I used before was some VLB of unknown origin with SIMMS of unknown
> origin and it worked, now when I assembled the machine from parts 
> supplied by the well known producers it ain't working ;-)

But you didn't buy it from a ``well known source'' who works all this
out before he ships them :-) :-) :-)

> Ok here what it is:
> super tower case with 2 fans and 400 Watt PSU (nearly empty)

EEkksss... a 400W power supply with less than 100W of load on it
can easily go out of regulation.  Most power supplies have a minimum
load requirement to do proper regulation.  This is often 10 to 25% of
full load.  Your little SP3G, DX4-100(3.3V chip, not much draw there!),
16MB, and probably 1 disk drive are drawing <100W.  Please put a
high quality DVM on the +5 supply at the motherboard connector and
check it.  Also check the ripple with a scope, if you can.
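
To put rough numbers on that, here is a minimal Python sketch of the
arithmetic, using only the 400W rating and the 10 to 25% range from above.
The function names are made up for illustration, and the 80W figure is a
hypothetical estimate for a system drawing under 100W, not a measurement.

```python
def min_load_watts(rated_watts, min_load_percent):
    """Minimum load a supply needs, given its rating and a percentage."""
    return rated_watts * min_load_percent / 100


def meets_min_load(load_watts, rated_watts, min_load_percent):
    """True if the load reaches the supply's minimum load requirement."""
    return load_watts >= min_load_watts(rated_watts, min_load_percent)


RATED_WATTS = 400            # supply rating from the message
ESTIMATED_LOAD_WATTS = 80    # hypothetical estimate, "<100W" per the message

for percent in (10, 25):     # "often 10 to 25% of full load"
    needed = min_load_watts(RATED_WATTS, percent)
    ok = meets_min_load(ESTIMATED_LOAD_WATTS, RATED_WATTS, percent)
    print(f"{percent}% minimum: need {needed:.0f}W, "
          f"have {ESTIMATED_LOAD_WATTS}W, ok={ok}")
```

So depending on where in that range the particular supply falls, a load of
under 100W may or may not be enough, which is why measuring the +5 rail is
the only way to know.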

> ASUS SP3G with 256k cache (that came with the board) and Intel-100Mhz DX4
> 	(to tell you the truth we've purchased Intel 100Mhz just to minimize
> 	dip switching on the ASUS ;-)
> two 16MB NEC SIMMS (with parity sure thing)
> Then came SMC Ultra 8013TPC
> and Cirrus Logic (some ISA model) SVGA board.
> AST4 board & Cronyx.
> Well all four ISA slots are occupied and no PCI slots are occupied

:-)


-- 
Rod Grimes                                      rgrimes@gndrsh.aac.dev.com
Accurate Automation Company                   Custom computers for FreeBSD


