From: Randall Stewart
Date: Wed, 07 Feb 2007 15:02:43 -0500
To: John Baldwin
Message-ID: <45CA3063.3000300@cisco.com>
In-Reply-To: <200702071154.38250.jhb@freebsd.org>
References: <45A533F6.7030405@cisco.com> <45BC9943.9010300@cisco.com> <200702071154.38250.jhb@freebsd.org>
Cc: freebsd-current@freebsd.org
Subject: Re: how to find out what the other CPU is doing

John Baldwin wrote:
> On Sunday 28 January 2007 07:38, Randall Stewart wrote:
>> All:
>>
>> Ok, I did not get an answer to this.. and of course
>> I hit the bug again (which I now figured out how to
>> fix :-D)
>>
>> So let me explain what I did.. so that way I
>> can come back and find this email later when it
>> someday happens again ;-) (and for anyone else
>> curious).
>>
>> 1) I had to do this from DDB ... I could not find a
>>    way in kgdb.
>>
>> 2) When you stop the machine in ddb (at least on i386) it
>>    dumps BOTH CPUs' info in something called
>>    stoppcbs[num-cpus]
>> 3) It's an array of struct pcb .. which has all the info
>>    you need to get started.
>> 4) With a trusty x/ stoppcbs you can work your way through
>>    and gather the info you need.. For x86 the second CPU
>>    started at stoppcbs+0x270 .. if you don't want to look
>>    at all those 0's (of course the offset could change and
>>    will vary from CPU type to CPU type :-D)
>> 5) Dig out the ebp from here. You can look at the IP
>>    but it will be in some NMI stop CPU routine.
>> 6) You can use the bp to trace backward through the stack
>>    and figure out the running stack trace... I went back
>>    to kgdb after getting the ebp (with CPU still spinning away).
>> 7) You have to go several frames back to get by all the NMI
>>    stuff before you find your guilty party :-)
>>
>> There might be a better way to do this.. and I am thinking
>> about adding a machine dependent trace that can take
>> an ebp argument (if one does not already exist in kgdb.. I
>> suppose I need to poke around in the macros a bit).. anyway
>> it's primitive .. but it allows you to find that spinning
>> kernel routine :-)
>
> When you use 'thread/tid/proc' in kgdb it uses stoppcbs[] automatically, so
> you can do 'proc 437' and do 'bt' to get a trace as I explained earlier. ddb
> can also do this for you as 'tr' in ddb can take a pid or tid as an argument,
> so in ddb you can do 'tr 437' to trace proc 437. Note if you want to use
> the 'tid' in kgdb you use 'tid '. 'proc' takes PIDs not TIDs in kgdb.
>

Hmm.. I tried that the first time I had a crash in kgdb (I did
not do anything in DDB) and it did not work for me... I flustered
around with it for a very very long time too.

Maybe I have an old kgdb or something, but I could not get it to
work :-(

R

-- 
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 803-317-4952 (cell)
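
Editor's sketch for steps 5-7 of the quoted message (tracing backward
from a saved ebp): this is only an illustration of the idea, not a kgdb
or ddb feature. It assumes the conventional i386 frame layout for code
built with frame pointers, where [ebp] holds the caller's saved ebp and
[ebp+4] holds the return address. The names walk_frames, read_word and
demo_stack are hypothetical, and the memory image is made up for the
example rather than taken from a real kernel.

```python
import struct

WORD = 4  # i386: saved ebp and return address are 32-bit words


def read_word(memory, base, addr):
    """Read one little-endian 32-bit word at virtual address addr."""
    off = addr - base
    if off < 0 or off + WORD > len(memory):
        raise ValueError("address 0x%x is outside the memory image" % addr)
    return struct.unpack_from("<I", memory, off)[0]


def walk_frames(memory, base, ebp, max_frames=32):
    """Follow the saved-ebp chain; return return addresses, innermost first."""
    frames = []
    while ebp != 0 and len(frames) < max_frames:
        saved_ebp = read_word(memory, base, ebp)         # caller's ebp
        ret_addr = read_word(memory, base, ebp + WORD)   # where we return to
        frames.append(ret_addr)
        # The stack grows down, so each caller's frame must sit at a
        # higher address; anything else means a corrupt or finished chain.
        if saved_ebp != 0 and saved_ebp <= ebp:
            break
        ebp = saved_ebp
    return frames


def demo_stack():
    """Build a made-up three-frame stack image (illustrative values only)."""
    base = 0x1000
    memory = bytearray(0x40)
    struct.pack_into("<II", memory, 0x08, 0x1018, 0xC0501111)  # innermost
    struct.pack_into("<II", memory, 0x18, 0x1028, 0xC0502222)
    struct.pack_into("<II", memory, 0x28, 0x0000, 0xC0503333)  # outermost
    return memory, base, 0x1008


if __name__ == "__main__":
    memory, base, ebp = demo_stack()
    for depth, ret in enumerate(walk_frames(memory, base, ebp)):
        print("#%d return address 0x%08x" % (depth, ret))
```

On a real stopped CPU the first few return addresses found this way
would fall inside the NMI stop code, as step 7 says, so the interesting
frame is some way down the list.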