Date: Mon, 17 Dec 2007 13:43:47 -0800 (PST)
From: Nate Eldredge <nge@cs.hmc.edu>
To: Jordi Espasa Clofent <jordi.espasa@opengea.org>
Cc: freebsd-amd64@freebsd.org
Subject: Re: Random reboots
Message-ID: <Pine.LNX.4.64.0712171336250.32093@knuth.cs.hmc.edu>
In-Reply-To: <4766CF56.7030308@opengea.org>
References: <47656FB7.4070807@opengea.org> <Pine.LNX.4.64.0712171127330.32093@knuth.cs.hmc.edu> <4766CF56.7030308@opengea.org>

On Mon, 17 Dec 2007, Jordi Espasa Clofent wrote:

>> That would be especially helpful, since from this information we don't
>> know whether the cause is a kernel panic or a hardware problem. Is your
>> kernel configured to reboot automatically on panic? Also, are you by any
>> chance using the watchdog?
>
> Yes Nate, I'm working on this way. The idea is attach another HD and expand
> the /swap value and get a coredump file.

Great. I got your other message where you mention this just after I sent
mine. Not trying to hound you :)

> Besides of that, I was looking at watchdog but I don't understand their
> operation yet. It's a time question.

The reason I ask is that I've run into a couple of issues where the machine
hangs. If you were using a watchdog, that would cause the system to reboot.
So as far as debugging goes, it's just as well that you aren't using it.

I have run into some issues with snapshots, are you using them?

You might also check the SMART data on your disks since FreeBSD has some
bugs where failing drives are not handled gracefully. See the smartmontools
port.

One other idea: you might configure a serial console so you can see any
messages the machine generates as it's dying. (These wouldn't necessarily
appear in the log files, since the system is too dead to write to them.)
You could connect the serial port to another machine which logs it.

--
Nate Eldredge
nge@cs.hmc.edu
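
For the serial-console idea, the logging side could be sketched roughly as
below. This is only an illustration, not something from the thread: the
function name log_console_stream and its parameters are made up for the
example, and it assumes the serial port has already been opened (and its
line settings configured) as an ordinary binary stream on the second
machine. It timestamps each line and flushes after every write, so the last
messages before a reboot are not lost in a buffer.

```python
import io
import time
from typing import BinaryIO, Callable, TextIO


def log_console_stream(
    stream: BinaryIO,
    log: TextIO,
    now: Callable[[], float] = time.time,
) -> int:
    """Copy console lines from `stream` to `log` with a UTC timestamp prefix.

    Hypothetical helper for illustration. Returns the number of lines
    written. Reading stops when the stream reports end of file.
    """
    count = 0
    # readline returns b"" at end of file, which ends the loop.
    for raw in iter(stream.readline, b""):
        # Console output may contain stray bytes; replace rather than fail.
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now()))
        log.write(f"{stamp} {text}\n")
        # Flush every line so the tail of a crash is not left in a buffer.
        log.flush()
        count += 1
    return count


if __name__ == "__main__":
    # Stand-in for a real serial device: two lines of fake console output.
    fake_console = io.BytesIO(b"panic: page fault\r\ncpuid = 0\r\n")
    captured = io.StringIO()
    log_console_stream(fake_console, captured)
    print(captured.getvalue(), end="")
```

In real use the BytesIO stand-in would be replaced by the serial device
opened in binary mode, and the StringIO by a log file opened for appending;
the device path depends on the logging machine and is not given here.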