Date: Tue, 28 Jul 1998 10:31:13 -0400 (EDT)
From: "Robert G. Brown" <rgb@phy.duke.edu>
To: Jess Johnson <jester@feeding.frenzy.com>
Cc: aic7xxx Mailing List <AIC7xxx@FreeBSD.ORG>
Subject: Re: Puzzle for Doug...
Message-ID: <Pine.LNX.3.96.980728102203.31650A-100000@ganesh.phy.duke.edu>
In-Reply-To: <199807280020.TAA14182@feeding.frenzy.com>
On Mon, 27 Jul 1998, Jess Johnson wrote:

> The BIOS memory test is a pretty pathetic memory test on 90% of pc's. Only
> on critical errors will it find anything wrong. I would suggest swapping it
> around to see if it makes a difference
>
> Jess

Well, I saw the NMI error pop up on ANOTHER of the five systems overnight, although this one recovered. I have to say that I seriously doubt that 3/5 of Dell's delivered systems have bad memory, especially given that I've run these systems diskless for around 3 weeks now "flawlessly" under a heavy load of big-memory applications. A memory problem with any significant probability of occurring (which clearly must be the case, given that it happens at boot time in low memory) would almost certainly have created havoc -- repeated kernel crashes, bad answers, segment violation errors as loop/jump addresses were corrupted -- none of which has been observed.

The phenomenon thus far seems confined to the aic7xxx driver, and more so to the 7890 device -- I ran the old aic7xxx driver in diskless kernels for a week or so (the one that found the 7860 but not the 7890) and observed none of this.

It COULD be memory, and if I don't get a more promising response I'll of course (sigh) take down a box and see if I can improve things (or at least change things) by swapping memory out two banks at a time. I really don't think that it IS memory, though. You can like Dell or not as an "Intel/Microsoft lackey" (as a wag on the linux-smp list is fond of calling them), but I really think that they do sell excellent, if expensive, hardware. I'd never expect a memory failure rate in the 10-20% range, which is what it would have to be to explain the phenomena, and I'd further not expect to see only MARGINAL failures instead of out-and-out "won't boot the system, period" failures.

   rgb

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb@phy.duke.edu

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe aic7xxx" in the body of the message