From owner-freebsd-questions@FreeBSD.ORG  Wed May 30 15:58:14 2012
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F05AB106567D;
	Wed, 30 May 2012 15:58:14 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
	[IPv6:2001:470:1f10:75::2])
	by mx1.freebsd.org (Postfix) with ESMTP id C46808FC1A;
	Wed, 30 May 2012 15:58:14 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 35BB7B99A;
	Wed, 30 May 2012 11:58:14 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Date: Wed, 30 May 2012 11:06:13 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p13; KDE/4.5.5; amd64; ; )
References: <op.wbwe9s0k34t2sn@tech304>
	<CAJ-Vmoneopo8xNpThbewfE2tg6HrdH74DXurO38P_aVs=YS9+A@mail.gmail.com>
	<op.wete9wbq34t2sn@tech304>
In-Reply-To: <op.wete9wbq34t2sn@tech304>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <201205301106.13885.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
	(bigwig.baldwin.cx); Wed, 30 May 2012 11:58:14 -0400 (EDT)
Cc: Mark Felder <feld@feld.me>, dene@ilovedene.com,
	freebsd-questions@freebsd.org, Adrian Chadd <adrian@freebsd.org>
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 30 May 2012 15:58:15 -0000

On Thursday, May 24, 2012 9:47:46 am Mark Felder wrote:
> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd <adrian@freebsd.org>  
> wrote:
> 
> > Hi,
> >
> > can you please, -please- file a PR? And place all of the above
> > information in it so we don't lose it?
> >
> 
> I'd be glad to post a PR and assist in helping to get it permanently  
> fixed. I certainly don't want this data to get lost and honestly our  
> business uses FreeBSD on VMWare so much that we really need a permanent  
> fix as much as anyone else :-)
> 
> The reason I've hesitated to post a PR so far is that I didn't have any  
> truly useful or concrete evidence of where the problem lies. After Dane  
> Foster contacted me and told me he could recreate the crash on demand with  
> his workload it was easier to narrow things down. The suggestion that it  
> was an interrupts issue (by possibly Bjoern Zeeb?) and Dane's discovery  
> that his crashes ceased when em0 and mpt0 share an IRQ, but em0 is  
> completely unused was starting to prove there is some strong evidence here  
> in favor of the interrupts issue.
> 
> Dane, what's the status on your end? Has your fix still been successful?  
> Is it also stable if you simply set hint.mpt.0.msi_enable="1" ?

Hmm, so the ps output you captured from DDB shows a lot of runnable
processes, with swi6 (Giant taskq) as the only running thread.  That is
all consistent with your hang.  (And that output is from your
Ctrl-Alt-Esc.)

Do you only have one CPU in this VM?  If not, do you know which threads
the other CPUs were running (e.g. do you have ps7.png, etc.)?

-- 
John Baldwin