Date: Thu, 29 Jan 2004 09:12:04 +0100 (CET)
From: Per von Zweigbergk <pvz@e.kth.se>
To: Tony Holmes <tony@crosswinds.net>
Cc: freebsd-hardware@freebsd.org
Subject: Re: Signal 10?
Message-ID: <Pine.LNX.4.58.0401290857350.21584@quetzalcoatlite.e.kth.se>
In-Reply-To: <20040128121913.A54789@crosswinds.net>
References: <20040128121913.A54789@crosswinds.net>

On Wed, 28 Jan 2004, Tony Holmes wrote:

> Quick question.
>
> I am getting occasional processes dying from Sig 10 and 11.
> It has been a long time since I saw these and to narrow down
> where I start my debugging, wanted to ask what the usual source
> of these signals (problems) are from?
>
> IIRC sig11 is bad memory, but sig 10?

Signal 11 is Segmentation Fault. This happens when programs try to
write to or read from memory they're not allowed to access. (This is
quite common if the program attempts to dereference uninitialized
pointers, or in case of a buffer overflow.) The most common source of
this problem is quite simply a bug in the software in question. But if
this is happening to many programs in general, it could be a sign of
hardware error, quite probably memory error.

Signal 10 is Bus Error. This is much rarer, but could still plausibly
be caused by incorrectly written software. (I think I've seen Netscape 4
crash with this message once or twice -- but it's rare.) Below is the
definition from FOLDOC:

    bus error

    <processor> A fatal failure in the execution of a machine
    language instruction resulting from the processor detecting an
    anomalous condition on its bus. Such conditions include invalid
    address alignment (accessing a multi-byte number at an odd
    address), accessing a physical address that does not correspond
    to any device, or some other device-specific hardware error. A
    bus error triggers a processor-level exception which Unix
    translates into a "SIGBUS" signal which, if not caught, will
    terminate the current process.

This can quite plausibly be caused by hardware error, or memory
problems in particular. But note that random Signal 11s are a symptom
of a problem, and their appearance alone isn't enough to make a
diagnosis. I suggest you download the excellent utility MemTest86
(www.memtest86.com) for more information on possible memory problems.

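For a quick check of what a signal number actually means on the machine
at hand, something like the following Python sketch may help. It is only
an illustration: the helper names signal_name and run_and_report, and the
self-crashing child command, are made up here and are not part of any
tool mentioned above. It relies on the standard signal and subprocess
modules, where a negative return code means the child was killed by that
signal number. Keep in mind that the numbering is platform-specific: 10
is SIGBUS on FreeBSD, but SIGUSR1 on x86 Linux.

```python
import signal
import subprocess
import sys


def signal_name(number):
    """Return the symbolic name of a signal number on this platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return "unknown signal %d" % number


def run_and_report(argv):
    """Run a command; return the name of the signal that killed it, or None."""
    proc = subprocess.run(argv)
    if proc.returncode < 0:
        # subprocess reports "killed by signal N" as return code -N.
        return signal_name(-proc.returncode)
    return None


# Hypothetical stand-in for a crashing program: a child process that
# restores the default SIGSEGV action and then sends itself that signal.
CRASHER = [
    sys.executable,
    "-c",
    "import os, signal; "
    "signal.signal(signal.SIGSEGV, signal.SIG_DFL); "
    "os.kill(os.getpid(), signal.SIGSEGV)",
]

if __name__ == "__main__":
    for number in (10, 11):
        print(number, signal_name(number))
    print("child killed by:", run_and_report(CRASHER))
```

Wrapping a flaky program in run_and_report (in place of CRASHER) gives
the signal by name rather than by number, which avoids guessing at the
numbering when moving between systems.
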
If you're too lazy or too poor to have the memory replaced (either
under warranty or out of pocket), and the system is not particularly
mission critical, there is a kernel patch for Linux called BadRAM or
something along those lines, which allows you to simply not use the
parts of the memory which are bad. There are no comparable patches for
FreeBSD as far as I am aware.

If I had to put my money on what the problem with your setup was, I'd
most likely bet it was the memory -- but don't take my word for it --
use memtest86!

Hope this helps.

-- 
Per von Zweigbergk <pvz@e.kth.se>