From: Andrew Boyer
Date: Mon, 21 May 2012 13:01:19 -0400
To: Mark Felder
Cc: freebsd-hackers@freebsd.org, freebsd-questions@freebsd.org
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash

On May 21, 2012, at 12:41 PM, Mark Felder wrote:

> OK guys I've been talking with another user who can recreate this crash and the last bit of information we've learned seems to be
> leaning towards interrupts/IRQ issues like someone (bz@ perhaps?) suggested.
>
> I'm still trying to test this myself, but the other user was able to recreate my crash pretty much on demand. The fix was to not use the first NIC in the VM because it will always share an IRQ with mpt0. Once mpt0 is on its own the crash does not seem to be reproducible anymore.
>
> Before:
>
> $ vmstat -i
> interrupt                          total       rate
> irq1: atkbd0                         378          0
> irq6: fdc0                             9          0
> irq15: ata1                           34          0
> irq16: em1                        687237          1
> irq18: em0 mpt0                319094024        539
> cpu0: timer                    236770821        400
> Total                          556552503        940
>
> After:
>
> $ vmstat -i
> interrupt                          total       rate
> irq1: atkbd0                          38          0
> irq6: fdc0                             9          0
> irq15: ata1                           34          0
> irq16: em1                          2811         15
> irq17: em2                             5          0
> cpu0: timer                        71013        398
> irq256: mpt0                       12163         68
> Total                              86073        483
>
>
> Is there any other way we can make mpt0 get its own dedicated IRQ without having to do this? The problem is that it causes us to have to make rc.conf changes, pf.conf changes, and who knows what other software could be on these machines that is trying to bind to a specific NIC...
>
>
> Thanks!
>

You could try switching mpt to MSI. MSI interrupts are never shared. Add this to /boot/device.hints:

> hint.mpt.0.msi_enable="1"

-Andrew

--------------------------------------------------
Andrew Boyer	aboyer@averesystems.com
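
To check whether mpt0 still shares an interrupt line after a change like this, something along the following lines might help. It is only a sketch: it assumes `vmstat -i` prints rows in the format quoted above (source, colon, one or more device names, total, rate), and the function name `shared_irqs` and the `BEFORE` sample are made up for illustration, not part of any FreeBSD tool.

```python
# Sketch: list interrupt sources that are used by more than one device.
# Assumes the `vmstat -i` row layout quoted in this thread:
#   <source>: <device> [<device> ...] <total> <rate>

# Sample input copied from the "Before" output in the quoted message.
BEFORE = """\
interrupt                          total       rate
irq1: atkbd0                         378          0
irq6: fdc0                             9          0
irq15: ata1                           34          0
irq16: em1                        687237          1
irq18: em0 mpt0                319094024        539
cpu0: timer                    236770821        400
Total                          556552503        940
"""


def shared_irqs(vmstat_output):
    """Return {source: [devices]} for sources with more than one device."""
    shared = {}
    for line in vmstat_output.splitlines():
        source, sep, rest = line.partition(":")
        if not sep:
            continue  # the header row and the "Total" row have no colon
        fields = rest.split()
        devices = fields[:-2]  # the last two fields are total and rate
        if len(devices) > 1:
            shared[source.strip()] = devices
    return shared


if __name__ == "__main__":
    print(shared_irqs(BEFORE))
```

On a live system the same function could be fed the text captured from running `vmstat -i`; an empty result would mean no source in the listing has more than one device attached.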