Date: Sun, 23 Dec 2007 12:58:14 +0000 (GMT)
From: Robert Watson
To: Erwin Lansing
Cc: amd64@FreeBSD.org
Subject: Re: Can't panic from debugger
In-Reply-To: <20071223125236.GM1616@droso.net>
Message-ID: <20071223125714.K79882@fledge.watson.org>
References: <20071223125236.GM1616@droso.net>
List-Id: Porting FreeBSD to the AMD64 platform

On Sun, 23 Dec 2007, Erwin Lansing wrote:

> The amd64 nodes in the pointyhat cluster are starting to behave quite
> interestingly. They stop to respond to ssh, but are still answering ping.
> More worrying is that I cannot get a useful dump out of it, as a panic from
> the debugger just hangs there, and all I am left with is to pull the plug.
> This even happens on a normal working system after entering the debugger, of
> which there is a typescript below.
>
> FreeBSD hammer1.isc.gumbysoft.com 8.0-CURRENT FreeBSD 8.0-CURRENT #11: Sat
> Dec 8 17:18:09 UTC 2007
> root@danner.isc.gumbysoft.com:/usr/obj/usr/src/sys/HAMMER amd64
>
> http://people.freebsd.org/~erwin/hammer1

I discovered yesterday that I was seeing the same problem on a dual-cpu,
dual-core box in the netperf cluster:

cheetah-rwatson# sysctl debug.kdb.enter=1
debug.kdb.enter:K DB0: enter: sysctl debug.kdb.enter
[thread pid 1136 tid 100091 ]
Stopped at      kdb_enter+0x3d: movq    $0,0x603f38(%rip)
db> show pcpu
cpuid        = 0
curthread    = 0xffffff00032e56a0: pid 1136 "sysctl"
curpcb       = 0xffffffffabae7d40
fpcurthread  = none
idlethread   = 0xffffff00010d56a0: pid 11 "idle: cpu0"
spin locks held:
db> panic
panic: from debugger
cpuid = 0

I *can* get a coredump if I directly "call doadump" and then "reset", but I
can't get one if I just do "panic".

Robert N M Watson
Computer Laboratory
University of Cambridge