From: Alan Cox
To: Mark Felder
Cc: freebsd-hackers@freebsd.org, freebsd-questions@freebsd.org, Hans Petter Selasky
Reply-To: alc@freebsd.org
Date: Thu, 29 Mar 2012 11:53:02 -0500
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
References: <201203291549.q2TFnUc7080406@aurora.sol.net> <201203291755.36651.hselasky@c2i.net>
List-Id: Technical Discussions relating to FreeBSD

On Thu, Mar 29, 2012 at 11:27 AM, Mark Felder wrote:

> On Thu, 29 Mar 2012 10:55:36 -0500, Hans Petter Selasky wrote:
>
>> It almost sounds like the lost interrupt issue I've seen with USB EHCI
>> devices, though disk I/O should have a retry timeout?
>>
>> What does "vmstat -i" output?
>>
>> --HPS
>
> Here's a server that has a week uptime and is due for a crash any hour now:
>
> root@server:/# vmstat -i
> interrupt                          total       rate
> irq1: atkbd0                          34          0
> irq6: fdc0                             9          0
> irq15: ata1                           34          0
> irq16: em1                        778061          1
> irq17: mpt0                     19217711         31
> irq18: em0                     283674769        460
> cpu0: timer                    246571507        400
> Total                          550242125        892

Not so long ago, VMware implemented a clever scheme for reducing the
overhead of virtualized interrupts that must be delivered by at least some
(if not all) of their emulated storage controllers:

http://static.usenix.org/events/atc11/tech/techAbstracts.html#Ahmad

Perhaps there is a bad interaction between this scheme and FreeBSD's mpt
driver.

Alan
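
As a rough cross-check of the vmstat -i figures quoted above, the sketch below parses that output and derives the uptime implied by the timer row. It assumes that the rate column is the total divided by uptime in seconds and that the timer really fires at a steady 400 per second; neither is confirmed in this thread. The helper names (parse_vmstat_i, implied_uptime_seconds) are made up for illustration and are not part of any FreeBSD tool.

```python
# Illustrative sketch only: parses the `vmstat -i` text quoted in this thread.
# Helper names are hypothetical; nothing here is a FreeBSD API.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                          34          0
irq6: fdc0                             9          0
irq15: ata1                           34          0
irq16: em1                        778061          1
irq17: mpt0                     19217711         31
irq18: em0                     283674769        460
cpu0: timer                    246571507        400
Total                          550242125        892
"""


def parse_vmstat_i(text):
    """Return {source name: (total, rate)} for every data row."""
    rows = {}
    for line in text.splitlines():
        # The last two whitespace-separated fields are total and rate;
        # everything before them is the interrupt source name.
        parts = line.rsplit(None, 2)
        if len(parts) != 3 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue  # skips the column header and blank lines
        name, total, rate = parts
        rows[name.strip()] = (int(total), int(rate))
    return rows


def implied_uptime_seconds(rows, source="cpu0: timer"):
    """Estimate uptime as total / rate for a source with a steady rate (assumption)."""
    total, rate = rows[source]
    return total / rate


rows = parse_vmstat_i(SAMPLE)
uptime = implied_uptime_seconds(rows)
print(f"implied uptime: {uptime / 86400:.1f} days")

# Recompute per-source rates with fractional precision from the implied uptime.
for name, (total, _rate) in rows.items():
    if name != "Total":
        print(f"{name:<14} {total / uptime:10.2f} intr/s")
```

On the numbers above this gives an implied uptime of a little over seven days, which agrees with the "week uptime" Mark mentions. Comparing the recomputed mpt0 rate between a healthy host and one close to a hang might show whether storage interrupts tail off beforehand, though that is only a suggestion.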