From: Andrew Boyer <aboyer@averesystems.com>
Date: Mon, 21 May 2012 13:01:19 -0400
To: Mark Felder <feld@feld.me>
Cc: freebsd-hackers@freebsd.org, freebsd-questions@freebsd.org
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash


On May 21, 2012, at 12:41 PM, Mark Felder wrote:

> OK guys, I've been talking with another user who can recreate this
> crash, and the latest information we have points towards interrupt/IRQ
> issues, as someone (bz@ perhaps?) suggested.
>
> I'm still trying to test this myself, but the other user was able to
> recreate my crash pretty much on demand. The fix was to not use the
> first NIC in the VM, because it will always share an IRQ with mpt0.
> Once mpt0 is on its own IRQ, the crash no longer seems to be
> reproducible.
>
> Before:
>
> $ vmstat -i
> interrupt                          total       rate
> irq1: atkbd0                         378          0
> irq6: fdc0                             9          0
> irq15: ata1                           34          0
> irq16: em1                        687237          1
> irq18: em0 mpt0                319094024        539
> cpu0: timer                    236770821        400
> Total                          556552503        940
>
> After:
>
> $ vmstat -i
> interrupt                          total       rate
> irq1: atkbd0                          38          0
> irq6: fdc0                             9          0
> irq15: ata1                           34          0
> irq16: em1                          2811         15
> irq17: em2                             5          0
> cpu0: timer                        71013        398
> irq256: mpt0                       12163         68
> Total                              86073        483
>
>
> Is there any other way to give mpt0 its own dedicated IRQ without
> having to do this? The problem is that it forces us to make rc.conf
> changes and pf.conf changes, and who knows what other software on
> these machines is trying to bind to a specific NIC...
>
>
> Thanks!
>

You could try switching mpt to MSI.  MSI interrupts are never shared.
Add this to /boot/device.hints:

> hint.mpt.0.msi_enable="1"
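
To check whether the hint took effect, it may help to look for interrupt
lines in `vmstat -i` that still list more than one device. Below is a
minimal sketch of that check, not a tested tool. The function name
`shared_irqs` and the regular expression are my own assumptions, based
only on the output format quoted above. The two sample strings are copied
from that output with the quote prefix removed.

```python
import re

# Assumed line shape, taken from the quoted output:
#   "irq18: em0 mpt0                319094024        539"
# Other FreeBSD releases may format this differently.
IRQ_LINE = re.compile(r"^\s*(irq\d+):\s+(.*?)\s+\d+\s+\d+\s*$")


def shared_irqs(vmstat_output):
    """Return {irq: [devices]} for each interrupt line naming 2+ devices."""
    shared = {}
    for line in vmstat_output.splitlines():
        match = IRQ_LINE.match(line)
        if not match:
            continue  # header, "cpu0: timer", and "Total" lines are skipped
        devices = match.group(2).split()
        if len(devices) > 1:
            shared[match.group(1)] = devices
    return shared


BEFORE = """\
interrupt                          total       rate
irq1: atkbd0                         378          0
irq6: fdc0                             9          0
irq15: ata1                           34          0
irq16: em1                        687237          1
irq18: em0 mpt0                319094024        539
cpu0: timer                    236770821        400
Total                          556552503        940
"""

AFTER = """\
interrupt                          total       rate
irq1: atkbd0                          38          0
irq6: fdc0                             9          0
irq15: ata1                           34          0
irq16: em1                          2811         15
irq17: em2                             5          0
cpu0: timer                        71013        398
irq256: mpt0                       12163         68
Total                              86073        483
"""

print(shared_irqs(BEFORE))  # {'irq18': ['em0', 'mpt0']}
print(shared_irqs(AFTER))   # {}
```

On a live system the same function could be fed the text of `vmstat -i`
captured after a reboot, since device.hints is only read at boot. An
empty result would suggest that mpt0 no longer shares its interrupt line.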


-Andrew

--------------------------------------------------
Andrew Boyer	aboyer@averesystems.com