From: Robert Watson
Date: Fri, 29 Dec 2006 21:33:25 +0000 (GMT)
To: Jan Knepper
Cc: Tim McCullagh, FreeBSD ISP, FreeBSD Hackers, Kris Kennaway
Subject: Re: 6.1-RELEASE / 6.2 Kernel Crash...

On Thu, 28 Dec 2006, Jan Knepper wrote:

>> Sounds like a bug in the support for your ATA hardware, or your hardware is
>> broken. The very least you'll need to do is to obtain a crashdump and
>> debugging backtrace (see the developers handbook) and CC it to sos@
>>
> This is getting funnier...
> I added:
>    dumpdev="AUTO"
> to: rc.conf
> Rebooted the system and tried to get it to crash again...
> And indeed it does in process 9: taskq
>
> Then it starts dumping which takes a couple of seconds as the machine has 2
> GB Ram...
>
> Than it reboots... and the next thing you know... savecore does NOT
> recognize a dump on the swap file system. If does not save anything to
> /var/crash... Tried this about 10 times... No luck...
>
> Any other idea's?

Yeah, unfortunately if some combination of storage driver and hardware isn't
working, it's hard to get a dump... The usual fallback here is to use a
serial console to capture debugging information from DDB and to skip the dump
side of things. In fact, I prefer debugging that way most of the time. The
reason for using a serial console (or firewire) is to avoid having to
hand-copy trap and debugging information, which gets very painful very
quickly.

Compile DDB and KDB into your kernel, and configure a serial console, and a
panic should lead to the system entering the debugger. The usual first
command to type is "trace" to generate a backtrace; it's often useful also to
do "show pcpu", "show allpcpu", "alltrace", and "ps", although for the
problem you're seeing the last two may be less useful.

The 0x50 trap address in your post suggests this is a NULL pointer
dereference. What we now need to do is work out what piece of code is
dereferencing the pointer improperly, which is where the backtrace comes in.
If you could copy and paste all that DDB/KDB output into an e-mail (or,
perhaps more ideally, a PR), that would be great.

Robert N M Watson
Computer Laboratory
University of Cambridge
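
For reference, the setup described above might look roughly like the sketch below. It is not taken from the original message: it assumes a custom kernel configuration file (for example a copy of GENERIC) and that the first serial port is used as the console, so adjust it to the actual machine. The DDB commands are the ones named in the message.

```
# Kernel configuration file: add the debugger framework and the DDB backend,
# then rebuild and install the kernel.
options 	KDB
options 	DDB

# /boot/loader.conf: send the console to the serial port
# (assumption: the first serial port is the one wired to the capturing host).
console="comconsole"

# At the debugger prompt after the panic, with the serial session being logged:
db> trace
db> show pcpu
db> show allpcpu
db> alltrace
db> ps
```

Logging the serial session on the capturing host is what makes this worthwhile, since the whole DDB transcript can then be pasted into an e-mail or PR instead of being copied by hand.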