Date: Mon, 13 Apr 2009 20:31:47 +0200
From: Roland Smith <rsmith@xs4all.nl>
To: John Almberg <jalmberg@identry.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: How to diagnose hardware problem?
Message-ID: <20090413183147.GA82769@slackbox.xs4all.nl>
In-Reply-To: <648C2025-CD72-4BA1-8D5D-48D4CC781250@identry.com>
References: <648C2025-CD72-4BA1-8D5D-48D4CC781250@identry.com>

On Mon, Apr 13, 2009 at 12:07:25PM -0400, John Almberg wrote:
> I have what looks like a hardware problem with an Intel 1U server,
> which I am using mainly as a mysql database server for some of my
> bigger website clients.
>
> The server went down last week with a badly corrupted file system.
>
> After spending a day trying to fix the file system, we gave up and
> did a fresh install of FreeBSD, PF, and mysql, using our daily
> backups to restore the database. It all seemed to work fine until I
> switched the websites from the temporary database server that I had
> been using, onto the restored server.
>
> The database ran well for about 2 minutes, then the server crashed
> again. The filesystem was again corrupted so badly that we could not
> even log in to look at the logs.
>
> We've reinstalled FreeBSD again, just to be able to SSH into the box.
> It looks like there is probably a hardware problem, like a bad power
> supply or overheating CPU that fails when the load of the database is
> applied.
>
> Problem is, I have no idea how to determine which bits are failing.
> Can anyone suggest a favorite book or website that focuses on how to
> troubleshoot hardware issues?

First things first: if the machine is still under warranty, don't mess
with it; send it back to the manufacturer and demand a replacement.

If the machine is out of warranty, you might consider replacing it
altogether. My employer's IT department ditches PCs and servers at the
first failure after the warranty runs out. According to them, it's
cheaper than repairing them.

But if you want to have a go, this might help:
http://www.daileyint.com/hmdpc/manual.htm

Basically, it's just a process of elimination. First check if your
machine is the only one having problems at the hosting site. Maybe they
have unstable electrical power.

Then make sure that all expansion cards and RAM are well seated, and
that all connectors are OK. Also check that there is no dust build-up
on e.g. fans and heatsinks. If necessary, clean carefully with (dry,
oil-free) compressed air. Dust can lead to short circuits or reduced
cooling.

Next, look on the motherboard for capacitors that have leaked fluid or
have bulging metal end plates; those are dead or dying. It's a leading
cause of motherboard failure. It is possible to replace them, but
you'll need the right equipment:
http://www.tomshardware.com/reviews/fixing-motherboard,1606.html

Install a monitoring program like mbmon or healthd, and have it log to
another machine or to a USB stick mounted synchronously, so that the
last readings before a crash are not lost. Monitor CPU temperature, fan
speeds and the different voltages.

Not all power supplies are created equal. See the articles at Tom's
Hardware:
http://www.tomshardware.com/reviews/Components,1/Power-Supplies,6/

If you've found nothing so far, it's time to start swapping out
components, starting with the power supply.

Roland
-- 
R.F.Smith                                   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)
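
The monitoring step in the reply (log temperature and voltages, then
look for readings that drift under load) could be checked with
something like the sketch below. It is not part of the original
message. The log format (one "name value" pair per line), the sensor
names and the 70 C limit are all assumptions made for illustration, and
it is not the actual output format of mbmon or healthd. The +/-5%
window is the usual ATX tolerance for the positive supply rails.

```python
# Hypothetical helper: flag logged sensor readings that are out of tolerance.
# Assumed input: lines such as "vcc12 11.2" or "temp_cpu 71.0". This is an
# illustrative format, not what mbmon or healthd actually print.

NOMINAL_VOLTS = {"vcc3": 3.3, "vcc5": 5.0, "vcc12": 12.0}  # assumed names
VOLT_TOLERANCE = 0.05    # +/-5 %, the usual ATX window for positive rails
MAX_CPU_TEMP_C = 70.0    # assumed limit; check the CPU's datasheet


def parse_line(line):
    """Split 'name value' into (name, float); return None for junk lines."""
    parts = line.split()
    if len(parts) != 2:
        return None
    try:
        return parts[0], float(parts[1])
    except ValueError:
        return None


def check_reading(name, value):
    """Return a warning string for a suspicious reading, else None."""
    if name in NOMINAL_VOLTS:
        nominal = NOMINAL_VOLTS[name]
        if abs(value - nominal) > nominal * VOLT_TOLERANCE:
            return f"{name}: {value:.2f} V outside +/-5% of {nominal} V"
    elif name == "temp_cpu" and value > MAX_CPU_TEMP_C:
        return f"{name}: {value:.1f} C above {MAX_CPU_TEMP_C} C"
    return None


def scan(lines):
    """Collect warnings for every parsable line in the log."""
    warnings = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            warning = check_reading(*parsed)
            if warning is not None:
                warnings.append(warning)
    return warnings
```

Feeding scan() the lines of the log kept on the other machine or the
USB stick gives a list of warnings. A rail that sags only while the
database is under load would point towards the power supply, which fits
the advice to swap that component first.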