Date: Sun, 26 Apr 2015 17:02:32 -0700
From: Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
To: galtsev@kicp.uchicago.edu
Cc: Fernando Apesteguía <fernando.apesteguia@gmail.com>, User Questions <freebsd-questions@freebsd.org>
Subject: Re: Debugging bad memory problems
Message-ID: <CAOgwaMs8ePhmD9%2BX6C87atHu-RxO5Q0%2Bce%2BRLMfhMDPfcmpxGQ@mail.gmail.com>
In-Reply-To: <5793.69.209.235.143.1430086547.squirrel@cosmo.uchicago.edu>
References: <CAGwOe2Y%2BRuT7MuCTBq_swn-Ny-BS-WH1J=bZTbE9L4tuv8LmCA@mail.gmail.com>
 <5480.69.209.235.143.1430078703.squirrel@cosmo.uchicago.edu>
 <CAGwOe2a7UZxSsaV4T2pcU0K1MA-OH1=123pb%2BsM=pTgSFEDLFg@mail.gmail.com>
 <5793.69.209.235.143.1430086547.squirrel@cosmo.uchicago.edu>
On Sun, Apr 26, 2015 at 3:15 PM, Valeri Galtsev <galtsev@kicp.uchicago.edu> wrote:

>
> On Sun, April 26, 2015 4:05 pm, Fernando Apesteguía wrote:
> > On Sun, Apr 26, 2015 at 10:05 PM, Valeri Galtsev
> > <galtsev@kicp.uchicago.edu> wrote:
> >>
> >> On Sun, April 26, 2015 12:11 pm, Fernando Apesteguía wrote:
> >>> Hi,
> >>>
> >>> I suspect my old and beloved AMD64 laptop is suffering from bad memory
> >>> problems: I get random crashes of well tested programs like sh, which,
> >>> etc even when I executed some of them from /rescue.
> >>
> >> If RAM is a suspect the first thing I would do is re-seat memory
> >> modules. Open the box. (Observe static precautions!) Remove memory
> >> modules. Install them again.
> >>
> >> Do memtest86 (by booting into memtest86, you can have that in your
> >> boot options, or you can boot off external media as others suggested).
> >>
> >> If you still have problems: try to run with one memory module instead
> >> of two. At some point when they went to higher RAM speeds memory bus
> >> amplifier became more fragile (some chips, some manufacturers, as now
> >> it is part of CPU, this may be true only about some of the CPU
> >> models). You sometimes can slightly fry it if you merely leave laptop
> >> running on battery, letting battery run down and laptop powering off
> >> due to that. With some of chips this may lead to slightly frying it -
> >> memory controller portion of it, address bus amplifier in particular.
> >> Bus amplifier becomes slightly lower frequency, which results in
> >> poorer handling capacitive load (which is larger if you have more
> >> RAM), and it is marginally OK, occasionally having address errors.
> >> Going to one module may resolve this. You will know if this is likely
> >> the case if memtest86 is successful with each of single RAM modules,
> >> but fails (in random places, often not reproducible) with both.
> >>
> >> Good luck!
> >
> > I booted from a memtest CD-ROM. It passed a couple of tests fine and
> > then it rebooted while doing a "bit fade" test at around 93%. Removing
> > the modules is tricky since this laptop has screws all around in dark
> > corners (even removing the battery needs a screw driver). I will try
> > to limit physical memory with hw.physmem and see if it makes any
> > difference.
>
> The last will not help against what I mentioned, as capacitive load on
> memory address bus is defined by what is physically attached to it.
>
> One usually runs memtest86 for 24 hours at least. One loop will catch
> "solid defects" like adjacent line on the board connected (while they
> shouldn't). Memory related failures to the contrary are often
> intermittent. In worst case I've seen, they only manifested under intense
> load of the box (whereas memtest86 is equivalent to almost zero load).
>
> Good luck!
>
> Valeri
>
> ++++++++++++++++++++++++++++++++++++++++
> Valeri Galtsev
> Sr System Administrator
> Department of Astronomy and Astrophysics
> Kavli Institute for Cosmological Physics
> University of Chicago
> Phone: 773-702-4247
> ++++++++++++++++++++++++++++++++++++++++
>

The failure may be in the memory management circuits instead of the
memory chips. To test this, the existing memory modules may be replaced
by modules that are known to work (if it can be done).

Thank you very much.

Mehmet Erol Sanliturk
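
The isolation procedure discussed in this thread (test each module alone, then all together, then optionally swap in known-good modules) can be summarized as a small decision helper. This is only an illustrative sketch: the function name `diagnose`, its parameters, and the slot names are hypothetical and not part of memtest86 or FreeBSD, and the verdicts merely restate the heuristics given above, which are hints rather than proof.

```python
from typing import Dict, Optional


def diagnose(single_module_pass: Dict[str, bool],
             all_modules_pass: Optional[bool] = None,
             known_good_pass: Optional[bool] = None) -> str:
    """Map memtest86 outcomes to a likely culprit (hypothetical helper).

    single_module_pass: result of testing each module installed alone.
    all_modules_pass:   result with every module installed (None = not run).
    known_good_pass:    result with known-good replacement modules
                        (None = not run).
    """
    # Known-good modules failing points away from the RAM chips themselves.
    if known_good_pass is False:
        return "suspect memory management circuits"

    if not single_module_pass:
        return "inconclusive: test each module alone first"

    # A module that fails on its own is the simplest explanation.
    failing = sorted(name for name, ok in single_module_pass.items() if not ok)
    if failing:
        return "suspect module(s): " + ", ".join(failing)

    # Every module passes alone but the full set fails: the larger
    # capacitive load on the address bus is the likely trigger.
    if all_modules_pass is False:
        return "suspect memory controller / address bus"

    if all_modules_pass is None:
        return "inconclusive: rerun with all modules installed"

    return "no fault found: keep testing for 24 hours or under load"
```

For example, `diagnose({"slot0": True, "slot1": True}, all_modules_pass=False)` reports the memory controller / address bus as the suspect, matching the case where each module passes alone but the pair fails.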