From owner-freebsd-questions@FreeBSD.ORG  Fri Mar 30 04:48:31 2012
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
MIME-Version: 1.0
Received: by 10.68.213.202 with SMTP id nu10mr5700413pbc.37.1333082910394;
	Thu, 29 Mar 2012 21:48:30 -0700 (PDT)
Sender: adrian.chadd@gmail.com
Received: by 10.143.33.5 with HTTP; Thu, 29 Mar 2012 21:48:30 -0700 (PDT)
In-Reply-To: <op.wbykhgrl34t2sn@cr48.lan>
References: <201203300027.q2U0RVZS085304@aurora.sol.net>
	<op.wbykhgrl34t2sn@cr48.lan>
Date: Thu, 29 Mar 2012 21:48:30 -0700
X-Google-Sender-Auth: -UFPX1kx74a2vgImmp6dWXfr7Uc
Message-ID: <CAJ-VmokU23NNnFd6FY4nr-_FRBfSftYgebJNroxOwrfxtCzepQ@mail.gmail.com>
From: Adrian Chadd <adrian@freebsd.org>
To: Mark Felder <feld@feld.me>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-hackers@freebsd.org, freebsd-questions@freebsd.org,
	Joe Greco <jgreco@ns.sol.net>
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash

Again, it's starting to sound like an interrupt handling issue which
may or may not be limited to the storage device.

You'll have to engage someone who knows those device drivers and
likely have them add some debugging to the driver which can be easily
flipped on (via binaries in a ramdisk - very important if you can't
run sysctl because your disk IO has locked up!) to see what the
current state of things is.

It's likely that the BSD mpt(4) and other storage drivers, and/or our
interrupt handling code, are just different enough to confuse the
snot out of VMware. I'd first look at the obvious (e.g., whether
you've simply stopped receiving interrupts even though new IO is
being scheduled). I'd also ask VMware if they have any tools that
they can run on a VM to get the state of the internal emulated
device. For example, register dumps of the device to see if it's in
a hung state, register dumps of the PIC/APIC to see what state
they're in, etc.
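
As a rough illustration of the "have we stopped receiving interrupts"
check, here is a minimal sketch that compares two snapshots of
`vmstat -i` output and lists the sources whose totals did not move.
It assumes the snapshots were captured as text while the VM was
wedged (say over a serial console, or from a binary in the ramdisk)
and are analysed somewhere else. The function names and the sample
numbers are made up for the example, and the parser assumes the usual
"name, total, rate" column layout.

```python
def parse_vmstat_i(text):
    """Parse `vmstat -i` style text into {interrupt source: total count}."""
    counts = {}
    for line in text.splitlines():
        # The last two whitespace-separated fields are total and rate;
        # everything before them is the source name (e.g. "irq18: mpt0").
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, total, _rate = parts
        if not total.isdigit():
            continue  # header line ("interrupt total rate")
        name = name.strip()
        if name == "Total":
            continue  # summary row, not a real source
        counts[name] = int(total)
    return counts


def stalled_sources(before, after):
    """Return sources present in both snapshots whose totals did not change."""
    first = parse_vmstat_i(before)
    second = parse_vmstat_i(after)
    return sorted(
        name for name, total in second.items()
        if name in first and total == first[name]
    )


if __name__ == "__main__":
    # Made-up sample data, NOT taken from the affected machines.
    before = """interrupt                          total       rate
irq1: atkbd0                          10          0
irq18: mpt0                         5000         12
cpu0: timer                       100000       1000
Total                             105010       1012
"""
    after = """interrupt                          total       rate
irq1: atkbd0                          11          0
irq18: mpt0                         5000         12
cpu0: timer                       130000       1000
Total                             135011       1012
"""
    for name in stalled_sources(before, after):
        print("no new interrupts on", name)
```

A flat counter is only suspicious if IO was actually queued to that
device between the two snapshots; an idle device will look the same.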

Maybe pull in someone like iXsystems and see if they can help debug
this kind of thing? If you're paying VMware for support, you could
bring them in alongside iXsystems and see if the two of them can
help you sort this out.

Thanks,



Adrian