From: Ian Lepore
To: Warren Block
Cc: freebsd-stable@FreeBSD.org, Ronald Klop
Subject: Re: Spontaneous reboots on Intel i5 and FreeBSD 9.0
Date: Fri, 18 Jan 2013 09:48:05 -0700
Message-ID: <1358527685.32417.237.camel@revolution.hippie.lan>

On Fri, 2013-01-18 at 08:04 -0700, Warren Block wrote:
> On Fri, 18 Jan 2013, Ronald Klop wrote:
>
> > Memory chips gone bad?  Power (or other) cables gone loose?
>
> Memory failures will cause intermittent and mysterious things.  Easy to
> test, too, just run memtest86 on it for a while.  Do that before
> rebuilding.  If memory is failing, corrupted data could be written to
> disk.
>
> I had a Crucial DIMM fail spontaneously a couple of weeks ago.  Working
> one minute, totally failed the next.  The machine rebooted, for no
> visible reason.  After it came back up, compiles failed, always with
> different errors and in different places.
>
> Power supplies also fail, as do motherboards.  These are both harder to
> swap out than memory, so test the memory first.

I tend to agree: a machine that starts rebooting spontaneously when
nothing significant changed and it used to be stable is usually a sign
of a failing power supply or memory.

But I disagree about memtest86.  It's probably not completely without
value, but to me its value is only negative: if it tells you memory is
bad, it is.  If it tells you it's good, you know nothing.  Over the
years I've had 5 DIMMs fail.  memtest86 found the error in one of them,
but said all the others were fine in continuous 48-hour tests.  I even
tried running the tests on multiple systems.

The thing that always reliably finds bad memory for me is
/usr/ports/math/mprime run in test/benchmark mode.  It often takes 24 or
more hours of runtime, but it will find your bad memory.

-- Ian