Date: Wed, 24 Apr 1996 10:51:15 +0930 (CST)
From: Michael Smith <msmith@atrad.adelaide.edu.au>
To: scrappy@ki.net (Marc G. Fournier)
Cc: current@FreeBSD.org, hackers@FreeBSD.org
Subject: Re: Intelligent Debugging Tools...
Message-ID: <199604240121.KAA13482@genesis.atrad.adelaide.edu.au>
In-Reply-To: <Pine.NEB.3.93.960423154553.23204B-100000@freebsd.ki.net> from "Marc G. Fournier" at Apr 23, 96 04:02:04 pm

Marc G. Fournier stands accused of saying:
>
> What would it take to either create software for debugging
> hardware, and/or add appropriate debugging to the kernel that would
> improve debugging of hardware problems?

Ah. As someone with a foot in both the hardware and software camps, all
I can say is "forget it". Any software has to make a few assumptions
about the hardware it runs on. If the hardware fails to meet those
assumptions (e.g. random parts of memory change), there's no hope for
the software.

To answer your question absolutely, this sort of software does exist.
You find it in board-level test equipment with price tags starting in
the mid six figures. Configuring such software usually requires access
to the manufacturer's specification for the DUT. (If such information
actually exists in the first place - often it's easier for a board
vendor to just throw a prototype together, and if it runs Windows,
commit to manufacturing it.)

> Erk...as far as software is concerned, maybe something that
> you could run in single user mode that would completely thrash the
> RAM, doing read/writes to *all* the memory looking for any corruption?

"make world".

The issue here is that it's not _just_ memory, but the interaction
between processor memory accesses, busmastering activity, refresh,
chipset timing and random system noise. Simulating such an environment
is _impossible_.

If the memory was legitimately altered in an incorrect fashion (e.g. a
bus latch was late and caught data from the master as it transited out
of a valid state, and subsequently wrote it into memory), even ECC
memory won't help you.

> As far as the kernel is concerned, I'm getting panics in VM
> and keep getting told its hardware problems...fine, but there *has*
> to be a better way of isolating the problem then replacing bits and
> pieces until the problem seems to go away. For instance, when I get
> a VM fault...what exactly *is* the problem? Is it a problem with
> the swap space (ie. hard drives) or RAM?

Find a spare $10K or so and buy a _good_ DRAM tester. Discover, much to
your surprise, that most of the DRAMs on the market fail to operate to
spec. Become Enlightened. Purchase a pile of Triton-II motherboards,
fork out _lots_ of money for fast ECC memory, and _maybe_ your problems
will go away.

What is worth bearing in mind is that other people are doing
essentially the same things that you are doing, but aren't having the
problems you are. They don't have access to any magical software fixes;
it's just that their (our) hardware appears to work OK.

> Does this make any sense?

Yes. The problem is that PCs are built like toasters, and making a
souffle' in a toaster is very difficult.

> Marc G. Fournier                                  scrappy@ki.net

--
]] Mike Smith, Software Engineer        msmith@atrad.adelaide.edu.au    [[
]] Genesis Software                     genesis@atrad.adelaide.edu.au   [[
]] High-speed data acquisition and      (GSM mobile) 0411-222-496       [[
]] realtime instrument control          (ph/fax)  +61-8-267-3039        [[
]] Collector of old Unix hardware.      "Where are your PEZ?" The Tick  [[