Date: Tue, 1 Jul 2014 18:56:03 +0200
From: "O. Hartmann"
To: Willem Jan Withagen
Subject: Re: [CURRENT]: weird memory/linker problem?
Message-ID: <20140701185603.00be87ef.ohartman@zedat.fu-berlin.de>
In-Reply-To: <53B2DA66.9010506@digiware.nl>
Organization: FU Berlin
Cc: "Rang, Anton", Adrian Chadd, FreeBSD CURRENT, Dimitry Andric
List-Id: Discussions about the use of FreeBSD-current

On Tue, 01 Jul 2014 17:57:26 +0200, Willem Jan Withagen wrote:

> On 2014-07-01 17:33, O. Hartmann wrote:
> > On Tue, 01 Jul 2014 17:23:14 +0200,
> > Willem Jan Withagen wrote:
> >
> >> On 2014-07-01 16:48, Rang, Anton wrote:
> >>> DOT => DOD
> >>>
> >>> 444F54 => 444F44
> >>>
> >>> That's a single-bit flip. Bad memory, perhaps?
> >>
> >> Very likely, especially if the system does not have ECC....
> >> It just happens on rare occasions that an alpha particle, power cycle, or
> >> anything else disruptive damages a memory cell. And it could be that
> >> it requires a special pattern of accesses to actually exhibit the error.
> >>
> >> In the past (199x's) 'make buildworld' used to be a rather good memory
> >> tester.
> >> But nowadays look at
> >> http://www.memtest.org/
> >>
> >> This tool has found all of the bad memory in all the systems I used
> >> and/or built for others...
> >> Note that it might take a few runs and some more heat to actually
> >> trigger the faulty cell, but memtest86 will usually find it.
> >>
> >> Note that on big systems with lots of memory it can take a loooooong
> >> time to run just one full test set to completion.
> >>
> >> --WjW
> >
> > I already tested with memtest86+ (had to download the Linux image, the port on FreeBSD
> > is broken on CURRENT). It didn't find anything strange so far.
> >
> > I will do another test.
> >
> > I realised that on that specific box, the chipset temperature is 81 degrees Celsius.
> > The chipset is an Eaglelake P45 - in which the memory controller resides on that old
> > platform. dmidecode gives:
> >
> > Manufacturer: ASUSTeK Computer INC.
> > Product Name: P5Q-WS
> > Version: Rev 1.xx
>

Hello Willem,

> Hi Oliver,
>
> I've built several (5+) systems with these boards (from memory they date
> around 2009??). And if I recall right, one of them is still functional.
> The first one broke down in a couple of weeks, and the others did not
> survive time either.
>
> The auxiliary chips on that board do run hot, but I never realized this
> hot. Is 81C the CPU temp from sysctl, or did you measure the cooling
> body on the motherboard? In the latter case it is just too hot, probably.
> But even if it is the temp on the chip itself, I've rarely seen temps
> go up this high.

The temperature is seen in the BIOS and via one of those health daemons
found in ports (I forgot the name).
There is no sysctl MIB showing the chipset temperature on that board, as far as I know.

>
> You may need to run memtest86 for more than 6-10 complete runs with
> all the tests.

Last time I ran memtest86+ it took ~ 1 1/2 days to finish.
>
> If the memtests do not reveal anything broken, then you get into even
> more wizardry stuff, like bad power etc... Especially since it only
> occurs on occasion, it is going to be a nightmare to find the root cause
> of this. Other than replacing hardware piece by piece, which won't be
> easy given the age of the board and parts.
>
> You could go into the BIOS, and try to configure RAM access at a slower
> speed and see if the problem goes away. Then it could be that you are
> running on the edge of the spec with regards to RAM timing.
>
> But like I said, it is all lots of funky details that can interact in
> strange and unexpected ways.
>
> --WjW

I will check the memory again in the next few days.

Regards,
Oliver
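
The single-bit-flip observation quoted in this thread (DOT => DOD, i.e. 444F54 => 444F44) can be checked by XORing the two byte strings and listing the bits that differ. The following is only a minimal sketch in Python; the helper name `bit_diff` is hypothetical and does not come from the thread or from any FreeBSD tool.

```python
def bit_diff(a: bytes, b: bytes) -> list:
    """Return (byte_index, bit_index) for every bit that differs between a and b.

    bit_index 0 is the least significant bit of the byte.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    flipped = []
    for i, (x, y) in enumerate(zip(a, b)):
        delta = x ^ y  # set bits mark the positions where the two bytes differ
        for bit in range(8):
            if delta & (1 << bit):
                flipped.append((i, bit))
    return flipped


# 'T' is 0x54 and 'D' is 0x44, so only the third byte differs, by 0x10.
diff = bit_diff(b"DOT", b"DOD")
print(diff)  # [(2, 4)]
```

A result with exactly one entry means the two strings differ in a single bit, which is consistent with (though not proof of) a flipped memory cell.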