From owner-freebsd-amd64@FreeBSD.ORG  Sun Dec 23 13:16:31 2007
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: amd64@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 76ACC16A417
	for <amd64@FreeBSD.org>; Sun, 23 Dec 2007 13:16:31 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 5F55E13C459
	for <amd64@FreeBSD.org>; Sun, 23 Dec 2007 13:16:31 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [209.31.154.41])
	by cyrus.watson.org (Postfix) with ESMTP id B799F47936;
	Sun, 23 Dec 2007 07:58:14 -0500 (EST)
Date: Sun, 23 Dec 2007 12:58:14 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Erwin Lansing <erwin@FreeBSD.org>
In-Reply-To: <20071223125236.GM1616@droso.net>
Message-ID: <20071223125714.K79882@fledge.watson.org>
References: <20071223125236.GM1616@droso.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Mailman-Approved-At: Sun, 23 Dec 2007 13:18:25 +0000
Cc: amd64@FreeBSD.org
Subject: Re: Can't panic from debugger
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 23 Dec 2007 13:16:31 -0000

On Sun, 23 Dec 2007, Erwin Lansing wrote:

> The amd64 nodes in the pointyhat cluster are starting to behave quite 
> interestingly.  They stop responding to ssh, but are still answering ping. 
> More worrying is that I cannot get a useful dump out of it, as a panic from 
> the debugger just hangs there, and all I am left with is to pull the plug. 
> This even happens on a normal working system after entering the debugger, of 
> which there is a typescript below.
>
> FreeBSD hammer1.isc.gumbysoft.com 8.0-CURRENT FreeBSD 8.0-CURRENT #11: Sat 
> Dec 8 17:18:09 UTC 2007 
> root@danner.isc.gumbysoft.com:/usr/obj/usr/src/sys/HAMMER amd64
>
> http://people.freebsd.org/~erwin/hammer1

I discovered yesterday that I was seeing the same problem on a dual-cpu, 
dual-core box in the netperf cluster:

cheetah-rwatson# sysctl debug.kdb.enter=1
debug.kdb.enter:K DB0: enter: sysctl debug.kdb.enter
[thread pid 1136 tid 100091 ]
Stopped at      kdb_enter+0x3d: movq    $0,0x603f38(%rip)
db> show pcpu
cpuid        = 0
curthread    = 0xffffff00032e56a0: pid 1136 "sysctl"
curpcb       = 0xffffffffabae7d40
fpcurthread  = none
idlethread   = 0xffffff00010d56a0: pid 11 "idle: cpu0"
spin locks held:
db> panic
panic: from debugger
cpuid = 0
<wedge>

I *can* get a coredump if I run "call doadump" directly from the debugger 
and then "reset", but I can't get one if I just do "panic".
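
The workaround sequence at the DDB prompt, as a sketch rather than a 
captured typescript (it assumes a dump device is already configured; the 
text after each # is annotation, not DDB input):

db> call doadump    # write the dump directly, without going through panic()
db> reset           # reboot; savecore(8) can then recover the dump at boot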

Robert N M Watson
Computer Laboratory
University of Cambridge