From owner-freebsd-hackers Sat Sep 16 13:11:55 1995
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.6.12/8.6.6)
        id NAA09669 for hackers-outgoing; Sat, 16 Sep 1995 13:11:55 -0700
Received: from GndRsh.aac.dev.com (GndRsh.aac.dev.com [198.145.92.241])
        by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id NAA09629 ;
        Sat, 16 Sep 1995 13:11:45 -0700
Received: (from rgrimes@localhost) by GndRsh.aac.dev.com (8.6.12/8.6.12)
        id NAA01803; Sat, 16 Sep 1995 13:11:17 -0700
From: "Rodney W. Grimes"
Message-Id: <199509162011.NAA01803@GndRsh.aac.dev.com>
Subject: Re: looking for REALLY good hardware diagnostics
To: pst@shockwave.com (Paul Traina)
Date: Sat, 16 Sep 1995 13:11:16 -0700 (PDT)
Cc: hackers@freebsd.org, current@freebsd.org
In-Reply-To: <199509161825.LAA07664@precipice.shockwave.com> from "Paul Traina" at Sep 16, 95 11:25:30 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 2816
Sender: owner-hackers@freebsd.org
Precedence: bulk

>
> I've got two systems, one an early-model pentium, the other a Cx486DLC both
> experiencing the occasional odd failure when running under FreeBSD and I want
> to double-check the hardware on these machines.

Given that I do this day in, day out, 7 days a week, I can lend a hand
in diagnosing hardware-related failures. I don't go out of my way to do
this for others any longer, as I have enough of it to deal with here,
but I see you making some assumptions that, from my data, are incorrect.

> (Yes, I know about the cache weirdness on the 486DLC, I've even disabled the
> internal cache completely as part of my testing).

Good, that eliminates that screwy mess :-).

> I think the Pentium either has a bad CPU (likely)
                                           ^^^^^^^^
Very unlikely. In the last 2 years I have seen 0 (yes, 0) defects in a
Pentium CPU chip. If the thing will power up and load a kernel, it is
good, from my data.

> or a bad cache chip (unlikely)

Very likely. In the last 120 days I have had to replace 3 SRAM cache
chips that would cause system crashes, sig 11's or other strangeness.
In fact bad ``cache'' is my #1 failure mode of incoming motherboard
products, and my #1 failure during burn-in as far as electronic
components go (overall #1 is disk drives that die during my 720 minute
hourglass seek/random seek pre-burn-in test :-().

> and the DLC either has a bad cache chip (likely) or bad dram (unlikely).

DRAM is my #2 electronic failure. It most often shows up during initial
incoming inspection, during a 3 hour make world pass using a 100MHz PB
Pentium that can really knock the snot out of the memory subsystem.

> Does anyone have ANY pointers whatsoever to a really really really good and
> thorough set of diagnostics that could be used to check for hardware faults?

There are none other than a VLSI tester for memory chips and SIMMs, and
ICE for MB problems, so unless you have access to a multi-million-dollar
test equipment lab it is real tough. What I use here is ``make world'';
if it fails, then I use my years of correlating FreeBSD panics or
signals to the most likely hardware component, and a deep gut feeling
for what I have seen over the years.

> Specifically, anything that can be used to diagnose external caches, memory,
> (and in the case of the pentium, perform cpu diagnostics) would be cool.

If it runs ``make world'' it is good; I have found no better test of
functionality than this as far as MB/CPU/Cache/Memory goes. If it fails
I use the big shotgun approach and start replacing each one of those
with ``known good'' units (another benefit I have that you may not is
that I have ``known good'' units around).

--
Rod Grimes                                      rgrimes@gndrsh.aac.dev.com
Accurate Automation Company                 Reliable computers for FreeBSD
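
The ``make world'' pass/fail routine described in the message can be
sketched as a small loop that runs a command several times and records
how each pass ended. This is only an illustration, not the procedure
used in the post: the names classify and burn_in are made up here, the
choice of Python is an assumption, and the sketch sees only the
top-level exit status. A compiler dying of signal 11 underneath make
would surface as a nonzero exit status from make, so the build log
still has to be read by hand.

```python
import signal
import subprocess
import sys


def classify(returncode):
    """Turn a subprocess return code into a short label (hypothetical helper)."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exit status %d" % returncode


def burn_in(command, passes):
    """Run command `passes` times; return a list of (pass number, label)."""
    results = []
    for n in range(1, passes + 1):
        proc = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results.append((n, classify(proc.returncode)))
    return results


if __name__ == "__main__":
    # Assumed invocation: python burnin.py 3 make world
    passes = int(sys.argv[1])
    for n, label in burn_in(sys.argv[2:], passes):
        print("pass %d: %s" % (n, label))
```

Intermittent failures across otherwise identical passes are the
pattern of interest; a failure that repeats at the same point every
time is more likely a software problem than flaky cache or DRAM.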