Date: Sun, 21 Dec 2003 00:27:12 +0000
From: Matthew Seaman <m.seaman@infracaninophile.co.uk>
To: freebsd@ryansandridge.com
Cc: freebsd-questions@freebsd.org
Subject: Re: Troubleshooting a Freeze
Message-ID: <20031221002712.GA22276@happy-idiot-talk.infracaninophile.co.uk>
In-Reply-To: <30A62BA7-3312-11D8-8048-000393DEE4B4@ryansandridge.com>
References: <30A62BA7-3312-11D8-8048-000393DEE4B4@ryansandridge.com>

On Sat, Dec 20, 2003 at 12:30:22PM -0500, Ryan Sandridge wrote:

> Are there any general or specific tips on how to figure out what is
> causing FreeBSD to just freeze up?  The machine is not at my physical
> location, and I log in using ssh.  I've checked all the logs in
> /var/log for some kind of clue, but I haven't found anything helpful.
> The machine has worked fine for over 6 months, until the last few days
> it freezes up at apparently random times, normally within the first 12
> hours of being up.  While many ports have been added over the past 6
> months, nothing was added on or around the time I started experiencing
> the problem.

Sounds hardware-ish to me.  Thermal cut-out?  Fans going AWOL?  Power
supply going flaky?  Try using the sysutils/xmbmon port to monitor fan
speeds, CPU and motherboard temperatures, and PSU voltages.  (It works
well if you hook it into rrdtool or mrtg to produce graphs against
time.)  Get the NOC people to check that the machine is getting an
unobstructed flow of cooling air and that the filters are not clogged
with dust.  While they're there, have them check whether anything is
printed on the system console when the system goes down.

If that's all OK, try running a few cycles of memtest86
(http://www.memtest86.com) to see if it turns up any anomalies.

If memtest86 finds no memory problems, try enabling crash dumps by
setting, e.g.:

    dumpdev="/dev/da0s1b"

in /etc/rc.conf.  (NB: this requires that your swap partition be
slightly larger than the amount of RAM in your machine.)  Compile a
kernel with debugging symbols.  When the machine conks out again, see
if you can get a core image, then use gdb to try to get a traceback
showing what went wrong.
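
The swap-size requirement above is easy to check before waiting for the next freeze. What follows is only a rough sketch in Python, not anything from the handbook: it assumes the usual column layout of `swapinfo -k` (Device, 1K-blocks, Used, Avail, Capacity) and that RAM size comes from something like `sysctl -n hw.physmem`. The function names and the `margin_kb` safety margin are hypothetical, chosen here to stand in for "slightly larger", and the sample figures are illustrative rather than taken from any real machine.

```python
"""Sketch: check that a swap device is big enough to hold a kernel crash dump."""


def parse_swapinfo_kb(swapinfo_output: str) -> dict:
    """Map each swap device to its size in 1K blocks.

    Assumes the column layout printed by `swapinfo -k`:
    Device, 1K-blocks, Used, Avail, Capacity.
    """
    sizes = {}
    for line in swapinfo_output.splitlines():
        fields = line.split()
        # Skip the header, any "Total" summary row, and blank lines.
        if len(fields) < 2 or not fields[0].startswith("/dev/"):
            continue
        sizes[fields[0]] = int(fields[1])
    return sizes


def dump_fits(ram_bytes: int, swap_kb: int, margin_kb: int = 1024) -> bool:
    """Return True if the swap device exceeds RAM by at least margin_kb.

    margin_kb is a hypothetical safety margin, not a documented figure.
    """
    ram_kb = (ram_bytes + 1023) // 1024  # round up to whole kilobytes
    return swap_kb >= ram_kb + margin_kb


if __name__ == "__main__":
    # Illustrative values only, not measurements from a real machine.
    sample = (
        "Device          1K-blocks     Used    Avail Capacity\n"
        "/dev/da0s1b       1048576        0  1048576     0%\n"
    )
    ram_bytes = 512 * 1024 * 1024  # e.g. what `sysctl -n hw.physmem` reports
    swap_kb = parse_swapinfo_kb(sample)["/dev/da0s1b"]
    print(dump_fits(ram_bytes, swap_kb))
```

If the check fails, the dump device is probably too small to hold a full core image, and it would be worth sorting that out before relying on a crash dump.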

If that doesn't give you any clues as to how you can fix things,
submit a PR with the traceback information and anything else
pertinent, and see if any of the developers have any suggestions.

http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
explains all about kernel crash dumps.

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                       26 The Paddocks
                                                      Savill Way
PGP: http://www.infracaninophile.co.uk/pgpkey         Marlow
Tel: +44 1628 476614                                  Bucks., SL7 1TH UK