From owner-freebsd-questions@FreeBSD.ORG Fri Nov 26 04:49:29 2004
Message-ID: <001801c4d372$b66b3bf0$1200a8c0@gsicomp.on.ca>
From: "Matt Emmerton"
To: "Chuck Robey", "Haulmark, Chris"
References: <6FC9F9894A9F8C49A722CF9F2132FC220276581D@ms05.mailstreet2003.net> <20041125234948.R27818@april.chuckr.org>
Date: Thu, 25 Nov 2004 23:45:06 -0500
cc: Jonathon McKitrick
cc: freebsd-questions@freebsd.org
Subject: Re: Is this a sign of memory going bad?

> On Thu, 25 Nov 2004, Haulmark, Chris wrote:
>
> > Someone broke the silence:
> >
> > > On Thu, Nov 25, 2004 at 04:05:53PM -0500, Lowell Gilbert wrote:
> > >> Jonathon McKitrick writes:
> > >>
> > >>> This is what I get from make buildworld. I've gotten signal 10,
> > >>> 11, and now 5.
> > >>>
> > >>> Is this bad memory?
> > >>
> > >> That's a reasonable guess, but the only way to tell for sure is to
> > >> test it.
> > >
> > > Is there a port to do this, or do I have to take it out and take it
> > > somewhere else to get it tested?
> > >
> > > jm
> >
> > sysutils/memtest in the ports.
>
> I don't want to embarrass anyone here, but something needs to be said.
> Note this next sentence carefully: THERE IS NO SUCH THING AS A WORKING
> MEMORY TEST PROGRAM!!!
>
> Anyone who tells you otherwise is no friend of yours, because they are
> making your life hard. It's very alluring to assume that programs written
> to do a job actually do that job, and most especially in the case of
> memory test, one would *really* **REALLY** wish that Chuck here was lying,
> cause you honestly need a memory test program, but the truth is otherwise:
> memory test programs don't work. At the very best, if they spend 30
> minutes carefully exercising memory, you get a factor that is maybe 10%
> reliable, and 90% wishful guessing.
>
> With that in mind, sometimes, the very best memory test programs can give
> you better ideas that memory you thought was failing IS failing. The
> opposite, proving that memory is good, is just totally, totally useless,
> you cannot take any data home at all about your memory being good.

And it's for this very reason that I often keep a few extra sticks of
memory lying around my office. When a system starts acting wonky
(intermittent crashes, especially under load like during a buildworld or
heavy spamassassin/razor activity), I take it offline, swap memory, and see
if the bad behaviour continues. If it does, I'm no worse off than before.
If it doesn't, I have a pretty good confidence level in saying that the
memory was bad.

While this method may seem somewhat brute-force-ish, it's often much
quicker and easier than futzing around with memtest and guessing. Given
the cost of memory these days, swapping it out is generally cheaper than
the cost of random downtime and recovering from crashes in a production
environment.

--
Matt Emmerton