Date: Mon, 17 Dec 2007 13:43:47 -0800 (PST)
From: Nate Eldredge
To: Jordi Espasa Clofent
Cc: freebsd-amd64@freebsd.org
Subject: Re: Random reboots
In-Reply-To: <4766CF56.7030308@opengea.org>
References: <47656FB7.4070807@opengea.org> <4766CF56.7030308@opengea.org>

On Mon, 17 Dec 2007, Jordi Espasa Clofent wrote:

>> That would be especially helpful, since from this information we don't
>> know whether the cause is a kernel panic or a hardware problem. Is your
>> kernel configured to reboot automatically on panic? Also, are you by any
>> chance using the watchdog?
>
> Yes Nate, I'm working on this way. The idea is attach another HD and expand
> the /swap value and get a coredump file.

Great. I got your other message where you mention this just after I sent
mine.
Not trying to hound you :)

> Besides of that, I was looking at watchdog but I don't understand their
> operation yet. It's a time question.

The reason I ask is that I've run into a couple of issues where the
machine hangs. If you were using a watchdog, that would cause the system
to reboot. So as far as debugging goes, it's just as well that you
aren't using it.

I have run into some issues with snapshots; are you using them? You
might also check the SMART data on your disks, since FreeBSD has some
bugs where failing drives are not handled gracefully. See the
smartmontools port.

One other idea: you might configure a serial console so you can see any
messages the machine generates as it's dying. (These wouldn't
necessarily appear in the log files, since the system is too dead to
write to them.) You could connect the serial port to another machine
which logs it.

-- 
Nate Eldredge
nge@cs.hmc.edu