Date:      Sun, 06 Jul 2003 14:10:21 -0400
From:      Bill Moran <wmoran@potentialtech.com>
To:        Adam <blueeskimo@gmx.net>
Cc:        FreeBSD-Questions <freebsd-questions@freebsd.org>
Subject:   Re: More hardware problems (advice needed)
Message-ID:  <3F08660D.7010105@potentialtech.com>
In-Reply-To: <1057511651.581.27.camel@elwood>
References:  <1057511651.581.27.camel@elwood>

Adam wrote:
> My main FreeBSD (4.8) box has died on me again, and I'm 99% certain it's
> due to hardware failure. However, I'm having a very hard time
> determining what hardware is going bad, due to the nature of the crash.
> 
> Let me describe the scenario.
> 
> I was working on the machine, not doing anything out of the ordinary.
> All of a sudden, my mouse stopped responding. I thought maybe moused had
> crashed, so I did 'ps -aux |fgrep moused'. This caused ps to segfault,
> which caused me to nearly soil myself. So, I decided to quickly kill all
> my apps and exit X so I could reboot. When I closed X, I noticed a lot
> of errors on my console about dc0 (my Linksys NIC interface, external)
> having underruns, and that ad2 was timed out. I also noticed that my LAN
> connection to my other box was dead. I tried to reboot, and all went
> well until it got to the 'Rebooting...', at which point it hung. I
> waited for 10+ minutes, thinking it might eventually reboot, but it was
> stuck, so I turned it off. 
> 
> When I powered back up, I got tons of errors that the kernel couldn't be
> loaded, and I couldn't even get into single-user mode. So, I made a
> fixit floppy and fired up the fixit shell, and started poking around to
> see what happened. I was able to mount ad3 and ad2 just fine, but
> mounting ad0 caused fixit to panic and the machine to reboot.
> 
> So, this is where I am now. For those of you that remember, I had
> another crash & burn experience on that machine a couple months ago,
> where the machine just suddenly froze completely and my ad0 was trashed
> when I booted back up. That time, I didn't have backups. This time, I do.
> But, before I work on that computer again, I think I need to replace
> some hardware.
> 
> I've heard pretty good arguments for both the ad0 drive (Western Digital
> 120GB, 2MB cache), and for the motherboard/CPU (Asus A7V266-E, Athlon
> 1600+). I used memtest86 to test the RAM, which came up clean. 
> 
> I doubt it's a power problem, since I've got a very nice case (Antec
> 1080, 400+ watts). Also, I've got another machine in my apartment that
> hasn't experienced any weird problems like this. 
> 
> The CPU might be overheating, but it's hard to tell. Roughly 5 minutes
> after the crash, I checked the CPU temperature from the BIOS, which
> registered 63C for the CPU. I have no idea how hot the CPU was at the
> time of the crash, but it definitely had to have cooled off a bit in
> those 5 minutes.

Sounds like an HDD going ... I had a similar scenario a few months ago
and it was the HDD.
You could get a FreeSBIE CD, boot it and run cpuburn to test the CPU.

> I don't have enough $$ to replace all the hardware, so I'd like some
> expert advice as to what is the most likely culprit. I don't know if
> I'll be able to convince any of Asus, AMD, or Western Digital to give me
> an RMA number, but I can try (also would like some advice on this to
> maximize my chances).


-- 
Bill Moran
Potential Technologies
http://www.potentialtech.com


