From: dane foster
Date: Fri, 25 May 2012 08:54:04 +1200
To: Mark Felder
Cc: freebsd-hackers@freebsd.org, Adrian Chadd, freebsd-questions@freebsd.org
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
Message-Id: <62F1D149-FC1C-4E00-98FD-DF6C46A5DC55@ilovedene.com>
References: <490F2075-3E4D-4F85-9935-937CED8FB10B@averesystems.com>

Hey all,

On 25/05/2012, at 1:47 AM, Mark Felder wrote:

> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd wrote:
>
>> Hi,
>>
>> can you please, -please- file a PR? And place all of the above
>> information in it so we don't lose it?
>>
>
> I'd be glad to post a PR and assist in helping to get it permanently
> fixed. I certainly don't want this data to get lost and honestly our
> business uses FreeBSD on VMWare so much that we really need a permanent
> fix as much as anyone else :-)
>
> The reason I've hesitated to post a PR so far is that I didn't have
> any truly useful or concrete evidence of where the problem lies. After
> Dane Foster contacted me and told me he could recreate the crash on
> demand with his workload it was easier to narrow things down. The
> suggestion that it was an interrupts issue (by possibly Bjoern Zeeb?)
> and Dane's discovery that his crashes ceased when em0 and mpt0 share an
> IRQ, but em0 is completely unused was starting to prove there is some
> strong evidence here in favor of the interrupts issue.
>
> Dane, what's the status on your end? Has your fix still been
> successful? Is it also stable if you simply set
> hint.mpt.0.msi_enable="1" ?
>

The situation I've got that's stable now is:

hw.pci.enable_msi="0"
hw.pci.enable_msix="0"

in /boot/loader.conf and:

samael:~:% vmstat -i                                        [ 6:31PM]
interrupt                          total       rate
irq1: atkbd0                           6          0
irq18: em0 mpt0                  3061100         15
irq19: em1                       6891706         35
cpu0: timer                    166383735        868
cpu1: timer                    166382123        868
cpu3: timer                    166382123        868
cpu2: timer                    166382121        868
Total                          675482914       3525

Not using em0.

This works for 8 (FreeBSD samael.slush.ca 8.3-STABLE FreeBSD 8.3-STABLE
#1: Mon May 7 11:51:03 NZST 2012
root@samael.slush.ca:/usr/obj/usr/src/sys/DENE amd64).

Neither of those settings on its own seems to stop it from happening.

The 9 box I've tried this on still hangs almost every time I run
handbrake, no matter whether MSI/MSIX is enabled, or I have separate
IRQs for mpt0 and em0/1.

I can cause the hang mostly on demand, but I'm not quite sure what
information to provide from the hung system. If somebody can let me know
what they need, including root access, I can make that happen.

Cheers,

Dane

>
> Thanks!
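
For anyone comparing their own machines against the interrupt table above, here is a minimal sketch of how the shared-IRQ check could be automated. It is not something used in this thread: the function name shared_irqs and the SAMPLE text are made up for illustration, and the parsing assumes `vmstat -i` output laid out exactly like the listing in this message (other FreeBSD versions may format it differently).

```python
import re

# Sample text copied from the `vmstat -i` listing in this message.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                           6          0
irq18: em0 mpt0                  3061100         15
irq19: em1                       6891706         35
cpu0: timer                    166383735        868
Total                          675482914       3525
"""

# Matches lines such as "irq18: em0 mpt0   3061100   15".
# Assumption: device names never consist of digits only.
IRQ_LINE = re.compile(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def shared_irqs(vmstat_output):
    """Return {irq: [devices]} for every IRQ line with more than one device."""
    shared = {}
    for line in vmstat_output.splitlines():
        match = IRQ_LINE.match(line)
        if match is None:
            continue  # header, cpu timer and Total lines are skipped
        devices = match.group(2).split()
        if len(devices) > 1:
            shared[match.group(1)] = devices
    return shared


if __name__ == "__main__":
    for irq, devices in shared_irqs(SAMPLE).items():
        print(irq, "is shared by", ", ".join(devices))
```

On a live system the same function could be fed the text captured from running `vmstat -i`; it only reports which devices share a line and says nothing about whether the sharing is the cause of a hang.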