Date: Mon, 13 Oct 2003 17:48:26 +0100
From: Matthew Seaman
To: Hani Mouneimne
Message-ID: <20031013164826.GB20434@happy-idiot-talk.infracaninophile.co.uk>
In-Reply-To: <40e792e94c57e8fc779e568f066edcfb@194.83.224.1>
Subject: Re: Crashing box
List-Id: Production branch of FreeBSD source code

On Mon, Oct 13, 2003 at 04:19:58PM +0200, Hani Mouneimne wrote:
> Hey all,
>
> I was wondering if you could help with this issue.
>
> Every time I run a make/compile on my freebsd 4.8 p10 system it has a
> complete spaz and reboots. Usually cores and sometimes gives no messages at
> all in the logfiles.
> Here is the latest output of a makeworld I am doing
> ="sh /usr/src/tools/install.sh"
> PATH=/usr/obj/usr/src/i386/usr/sbin:/usr/obj/usr/src/i386/usr/bin:/usr/obj/usr/src/i386/usr/games:/sbin:/bin:/usr/sbin:/usr/bin
> make -f Makefile.inc1 par-depend
> *** Signal 11
> *** Signal 11
> Killed

I assume you've read the FAQ entry on Sig11:

    http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#SIGNAL11

Signal 11, especially if it occurs in an unpredictable place during
compiles or other heavyweight operations, is a clear sign of hardware
problems, but I think you know that from what you say next.

> This is just one of many crashes of similar scale, Segfaulting is also
> common.
> I have changed the entire server hardware including the hard drive and it is
> still doing this. It was fine with FreeBSD p0 so I am wondering if it could
> be some code issue.

Tricky. Are you sure you've swapped out *all* of the hardware? SEGVs
are typically due to memory or CPUs going bad, but there are several
other considerations.

- Memory can be marginal: tests like running memtest86 won't
  necessarily pick up all failure cases, although when they do find a
  problem they are generally right.
  If the memory timing isn't quite in spec, or if there's a problem
  that only occurs when the memory stick heats up due to high
  activity, then you may not pick it up except under load.

- SEGVs can also occur due to bad memory in such devices as RAID or
  graphics controllers, or even in the CPU cache.

- Overheating will generally cause stressed components to fail in
  this sort of way. Such failures will definitely be correlated with
  high system activity. CPUs generally do have thermal cutouts that
  just halt the machine, but thermal problems in other components can
  crash the system as you've seen. Northbridge and Southbridge
  chipsets on the motherboard can be an Achilles' heel in this
  respect. Check that all of the fans are working correctly, that all
  of the ventilation holes/dust filters are clear, and that there is
  sufficient room around the machine to permit free flow of air. If
  you've added extra components inside the system, is the cooling
  airflow still adequate?

- PSUs are also capable of causing such symptoms, especially if they
  aren't actually quite powerful enough to drive all your hardware.
  If the system voltages aren't properly stable then all sorts of
  undefined behaviour can occur. Modern 1GHz+ boxes generally need a
  300W PSU, and the PSU tends to be both one of the least reliable
  parts of the system and one of the items where box manufacturers
  will be most aggressive on price when sourcing components.

- Even the machine *case* can cause this sort of problem. I've seen a
  machine where all of the electronics, PSU, fans etc. were swapped
  out, but the machine still keeled over when the case was screwed
  back together. Turned out that the case itself was a bit distorted,
  and screwing the case on resulted in bending the motherboard in a
  way that was clearly not good for it, especially when it warmed up
  a bit as well. Changing out the case produced a working system...

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.
26 The Paddocks, Savill Way, Marlow, Bucks., SL7 1TH UK
PGP: http://www.infracaninophile.co.uk/pgpkey
Tel: +44 1628 476614
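
As a rough illustration of the point above that marginal memory may only misbehave under load, here is a minimal sketch in Python that repeats the same deterministic, memory-heavy computation and checks that every round gives the same answer. The function names, buffer size and round count are assumptions made up for this example; they come neither from the FAQ nor from memtest86. A clean run proves very little, but a mismatch, or a crash at a different point each time, points at hardware rather than code.

```python
import hashlib


def fill_and_hash(size_bytes: int, seed: int) -> str:
    """Fill a buffer with a deterministic pattern and return its SHA-256."""
    # The same seed always yields the same 32-byte pattern.
    pattern = hashlib.sha256(seed.to_bytes(8, "big")).digest()
    # Repeat the pattern to occupy roughly size_bytes of memory.
    buf = bytearray(pattern * (size_bytes // len(pattern)))
    # Hashing reads every byte back out of memory.
    return hashlib.sha256(buf).hexdigest()


def find_mismatches(rounds: int = 20,
                    size_bytes: int = 16 * 1024 * 1024,
                    seed: int = 1) -> list:
    """Return the indices of rounds whose hash differs from the first run."""
    reference = fill_and_hash(size_bytes, seed)
    return [i for i in range(rounds)
            if fill_and_hash(size_bytes, seed) != reference]


if __name__ == "__main__":
    bad = find_mismatches()
    if bad:
        print("hash mismatch in rounds:", bad)
    else:
        print("all rounds agreed")
```

An empty list from find_mismatches() only means that every round agreed on this run. It is worth running several copies at once, for a long time, to get the machine warm, since that is when the problems described above tend to show up.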