From: John Baldwin
To: freebsd-hackers@freebsd.org
Cc: Mark Felder , dene@ilovedene.com, freebsd-questions@freebsd.org, Adrian Chadd
Date: Wed, 30 May 2012 11:06:13 -0400
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
Message-Id: <201205301106.13885.jhb@freebsd.org>

On Thursday, May 24, 2012 9:47:46 am Mark Felder wrote:
> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd wrote:
>
> > Hi,
> >
> > can you please, -please- file a PR? And place all of the above
> > information in it so we don't lose it?
> >
>
> I'd be glad to post a PR and assist in helping to get it permanently
> fixed. I certainly don't want this data to get lost and honestly our
> business uses FreeBSD on VMWare so much that we really need a permanent
> fix as much as anyone else :-)
>
> The reason I've hesitated to post a PR so far is that I didn't have any
> truly useful or concrete evidence of where the problem lies. After Dane
> Foster contacted me and told me he could recreate the crash on demand with
> his workload it was easier to narrow things down. The suggestion that it
> was an interrupts issue (by possibly Bjoern Zeeb?) and Dane's discovery
> that his crashes ceased when em0 and mpt0 share an IRQ, but em0 is
> completely unused was starting to prove there is some strong evidence here
> in favor of the interrupts issue.
>
> Dane, what's the status on your end? Has your fix still been successful?
> Is it also stable if you simply set hint.mpt.0.msi_enable="1" ?

Hmm, so the set of ps output you have from DDB shows a lot of runnable
processes and swi6 (Giant taskq) as the only running thread (all consistent
with your hang). (And that is from your Ctrl-Alt-Esc.) Do you only have one
CPU in this VM? If not, do you know which threads the other CPUs were
running (e.g. do you have ps7.png, etc.)?

-- 
John Baldwin