Date: Tue, 31 Jul 2012 20:50:23 -0400
From: Mark Saad <nonesuch@longcount.org>
To: Julian Elischer <julian@freebsd.org>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: How to diagnose system freezes?
Message-ID: <B675E827-07F5-4F41-9983-1E1B1B095326@longcount.org>
In-Reply-To: <50187853.7080206@freebsd.org>
References: <501871FD.601@rawbw.com> <50187853.7080206@freebsd.org>

On Jul 31, 2012, at 8:29 PM, Julian Elischer <julian@freebsd.org> wrote:

> On 7/31/12 5:02 PM, Yuri wrote:
>> One of my 9.1-BETA1 systems periodically freezes. If sound was playing, it would usually cycle with a very short period. And system stops being sensitive to keyboard/mouse. Also ping of this system doesn't get a response.
>> I would normally think that this is the faulty memory. But memory was recently replaced and tested with memtest+ for hours both before and after freezes and it passes all tests.
>> One out of the ordinary thing that is running on this system is nvidia driver. But the freezes happen even when there is no graphics activity.
>> Another out of the ordinary thing is that the kernel is built for DTrace. But DTrace was never used in the sessions that had a freeze.
>>
>> What is the way to diagnose this problem?
> The answer depends on a number of things but an NMI can be useful if you have some way of
> generating them. (Some IPMI implementations can allow you to generate them and some motherboards have
> jumpers to allow you to attach an 'nmi-button'.)
>
> The fact that ping is not responsive is important, as that is done at a very low level but
> it may still be alive down there somewhere.
>
> Make sure you have debugging enabled in your kernel. That will catch quite a few 'hangs'.
>
> As also mentioned by others... a serial console and DDB may also be useful in some hangs.
>
>
> Julian
>> CPU: i7 CPU 920 @ 2.67GHz
>> Memory: 24GB
>> MB: P2T
>>
>> Yuri
>>

Yuri

Install sysutils/mcelog and try running the included example. While it is not a complete, definitive hardware test, it can report hardware issues that memtest86+ misses, and it can be run online in multiuser mode and via cron.

---
Mark saad | mark.saad@longcount.org