From owner-freebsd-hackers@FreeBSD.ORG Fri Mar 30 21:18:22 2012
Date: Fri, 30 Mar 2012 17:18:18 -0400
From: "Dieter BSD" <dieterbsd@engineer.com>
To: freebsd-hackers@freebsd.org
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
Message-ID: <20120330211819.155070@gmx.com>

> Subsequent inspection suggested that it was happening during the
> periodic daily, though we never managed to get it to happen by manually
> forcing periodic daily, so that's only a theory.

Perhaps due to a bunch of VMs all running periodic daily at the same
time?  (See the crontab sketch at the end of this message.)

> We had a perfectly functional, nearly zero-traffic VM, since Jabber
> traffic averages no more than a few messages per hour.  It was working
> for quite some time.
>
> We moved it from a local datastore to an iSCSI datastore that ended up
> getting periodically crushed by the load (in particular during the
> periodic daily load imposed by a bunch of VMs all running at once).
> At this point, this one VM started hanging on I/O.  We expected that
> this would clear up upon return to a host with a local datastore.  It
> did not.
>
> This ended up as a broken VM, one that would hang overnight, maybe
> not every night, but several times a week at least.

...

> That the problem "follows" the VM in this manner, and afflicts *only*
> the one VM, strongly suggests that it is something contained within
> the files that constitute this VM.  That is consistent with the
> observation that the problem arose at a point where the VM is known
> to have had all those files moved from one location to a dodgy
> location.
>
> That's why I believe the evidence points to corruption of some sort.

Compare a backup of the VM from before it broke to a backup of the same
VM from after it broke.  Hopefully the haystack of insignificant
differences isn't too large, or the significant-difference needle might
be a lot of "fun" to find.
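
A minimal sketch of one way to run that comparison, assuming both
backups have been unpacked into directory trees (the BEFORE/AFTER paths
and the /tmp filenames below are hypothetical, substitute your own):

    # Compare two unpacked backups of the VM's on-disk files.
    BEFORE=/backup/jabber-vm.before
    AFTER=/backup/jabber-vm.after

    # Quick pass: list every file that differs at all.
    diff -rq "$BEFORE" "$AFTER"

    # Checksum pass: record per-file MD5s so the comparison is
    # repeatable and each huge .vmdk only gets read once per backup.
    ( cd "$BEFORE" && find . -type f -exec md5 -r {} + | sort -k 2 ) > /tmp/before.md5
    ( cd "$AFTER"  && find . -type f -exec md5 -r {} + | sort -k 2 ) > /tmp/after.md5
    diff /tmp/before.md5 /tmp/after.md5

Within a differing virtual disk image you would still have to narrow
down which blocks changed (cmp -l can help there), but the file-level
pass at least shrinks the haystack.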
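
On the earlier point about all the guests firing periodic daily at
once: a sketch of one way to stagger them, by editing each guest's
/etc/crontab.  The stock FreeBSD entry runs daily at 3:01 in every
guest, hence the pileup; the offsets below are illustrative only.

    # Stock /etc/crontab entry (identical in every guest):
    1   3   *   *   *   root    periodic daily

    # Staggered variants -- give each guest its own minute:
    #   guest 1:
    1   3   *   *   *   root    periodic daily
    #   guest 2:
    21  3   *   *   *   root    periodic daily
    #   guest 3:
    41  3   *   *   *   root    periodic daily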