Date:      Thu, 9 May 1996 21:55:38 -0700 (PDT)
From:      "Rodney W. Grimes" <rgrimes@GndRsh.aac.dev.com>
To:        matt@lkg.dec.com (Matt Thomas)
Cc:        jgreco@brasil.moneng.mei.com, hackers@freebsd.org
Subject:   Re: Problems with a SMC EtherPower 10/100 on Triton-II?
Message-ID:  <199605100455.VAA06630@GndRsh.aac.dev.com>
In-Reply-To: <199605092110.VAA02322@whydos.lkg.dec.com> from Matt Thomas at "May 9, 96 09:10:10 pm"

> 
> > We've recently upgraded a news server to an ASUS Triton-II board (P133,
> > 256MB RAM, AHA-3940, NCR-810, SMC 10/100).  There's an intermittent memory
> > problem that we are tracking, but this did not seem to be related.
> > 
> > It has crashed twice, spewing the following errors:
> > 
> > May  9 18:44:06 daily-planet xntpd[85]: time reset (step) -0.296390 s
> > May  9 18:51:04 daily-planet /kernel: de0: abnormal interrupt: 0xffffffff [0x1b14b]
> 
> Yikes!  It's not possible for the DC21140 status register to be all 1s.
> There is something seriously broken here.
> 
> > May  9 18:51:04 daily-planet /kernel: de0: abnormal interrupt: 0xfcefa044 [0x1a040]  
> > May  9 18:51:04 daily-planet /kernel: de0: abnormal interrupt: 0xfceea004 [0x0a000]  
> 
> This is bad.  Very bad.  Very very very very bad.
> 
> Abnormal interrupt + Fatal Bus Error, type Master Abort.

Could this also explain the all 1's??

> From the DC21140 bible:
> 
> Fatal Bus Error -- Indicates a fatal bus error occurred.  If a system
> error occurs, the 21140 disables all bus access.

Which to me would result in reading all 1's from a CSR, right?
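
For reference, the logged status words can be decoded by hand.  The
following is a minimal sketch in C, assuming the usual 21140 CSR5 layout
(bit 15 abnormal interrupt summary, bit 13 fatal bus error, bits 25:23
the bus error type, where 0 is parity error, 1 is master abort and 2 is
target abort).  The macro and function names are hypothetical and are
not taken from the de driver.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed CSR5 (status register) layout; names are illustrative only. */
#define CSR5_AIS       (1u << 15)   /* abnormal interrupt summary */
#define CSR5_FBE       (1u << 13)   /* fatal bus error */
#define CSR5_EB_SHIFT  23           /* error bits live in CSR5<25:23> */
#define CSR5_EB_MASK   0x7u

/* An all-ones word is what a read typically returns when nothing answers. */
int
csr5_is_all_ones(uint32_t csr5)
{
	return csr5 == 0xffffffffu;
}

/* Returns -1 if no fatal bus error is flagged, else the raw error type. */
int
csr5_error_type(uint32_t csr5)
{
	if ((csr5 & CSR5_FBE) == 0)
		return -1;
	return (int)((csr5 >> CSR5_EB_SHIFT) & CSR5_EB_MASK);
}

/* Print a one-line summary of the fields discussed in this thread. */
void
csr5_print(uint32_t csr5)
{
	static const char *names[] = { "parity error", "master abort",
	    "target abort" };
	int type = csr5_error_type(csr5);

	printf("0x%08x: abnormal=%d fatal=%d", (unsigned)csr5,
	    (csr5 & CSR5_AIS) != 0, (csr5 & CSR5_FBE) != 0);
	if (csr5_is_all_ones(csr5))
		printf(" (all ones, fields not trustworthy)");
	else if (type >= 0)
		printf(" type=%s", type <= 2 ? names[type] : "reserved");
	printf("\n");
}
```

Calling csr5_print(0xfcefa044) and csr5_print(0xfceea004) reports both
words as abnormal interrupt plus fatal bus error of type master abort,
which agrees with Matt's reading.  The all-ones word is flagged
separately because its fields carry no information.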

> 5.5.2.1.3  Master Abort   If the target does not assert _devsel_l_ within
> five cycles from the assertion of _frame_l_, the 21140 performs a normal
> completion.  It then releases the bus and asserts both master abort
> (CFCS<29>) and fatal bus error (CSR5<13>).
> 
> I have never ever seen this happen.
> 
> This is basically either the TritonII host-bridge is screwing
> up big time or some other device is screwing up the PCI bus.

How about a parity error in main memory during a PCI bus master write?
The machine is known to have bad memory in it (it has panicked two or
three times now with an NMI parity error).  Joe, until that system
stops getting parity errors, suspect all problems to be related to the
memory system.  The BIOS is set to assert SERR on the PCI bus if a
parity error occurs; this probably caused the DC21140 to go offline.
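
One way to check this would be to look at the error bits the 21140
latches in its PCI command/status word (CFCS) after a crash.  A minimal
sketch in C, assuming the standard PCI layout with the status register
in the upper 16 bits of CFCS, so that received master abort is CFCS<29>
as in the quoted manual text.  The macro and function names are
hypothetical, and obtaining the CFCS value itself is left out.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed standard PCI status bits as they appear in the 32-bit CFCS. */
#define CFCS_DETECTED_PARITY    (1u << 31)  /* device saw a parity error */
#define CFCS_SIGNALED_SERR      (1u << 30)  /* device itself asserted SERR# */
#define CFCS_RCVD_MASTER_ABORT  (1u << 29)  /* device's own cycle got no DEVSEL# */
#define CFCS_RCVD_TARGET_ABORT  (1u << 28)  /* target aborted device's cycle */

#define CFCS_ERROR_BITS (CFCS_DETECTED_PARITY | CFCS_SIGNALED_SERR | \
    CFCS_RCVD_MASTER_ABORT | CFCS_RCVD_TARGET_ABORT)

/* Return only the latched error bits of a CFCS value. */
uint32_t
cfcs_errors(uint32_t cfcs)
{
	return cfcs & CFCS_ERROR_BITS;
}

/* Print which error conditions are latched in a CFCS value. */
void
cfcs_describe(uint32_t cfcs)
{
	if (cfcs_errors(cfcs) == 0) {
		printf("no PCI errors latched\n");
		return;
	}
	if (cfcs & CFCS_DETECTED_PARITY)
		printf("detected parity error\n");
	if (cfcs & CFCS_SIGNALED_SERR)
		printf("signaled system error\n");
	if (cfcs & CFCS_RCVD_MASTER_ABORT)
		printf("received master abort\n");
	if (cfcs & CFCS_RCVD_TARGET_ABORT)
		printf("received target abort\n");
}
```

If only the master abort bit were set, that would point at the bridge
or another device not answering.  A detected parity error bit would fit
the bad memory theory better.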

> Didn't Intel recently (as of a few weeks ago) introduce a new
> rev of the TritonII chipset(s)?
Not that I am aware of, but I'll do some digging.

> What does FreeBSD think your machine has in it?
> 
> I would worry about your motherboard and not the SMC 10/100.

That motherboard was AAC's FCS board; it has been through some pretty
grueling test sequences.  Unfortunately, the memory that it has been
populated with is definitely suspect as being bad.


-- 
Rod Grimes                                      rgrimes@gndrsh.aac.dev.com
Accurate Automation Company                 Reliable computers for FreeBSD
