From: Mark Felder
To: freebsd-hackers@freebsd.org, freebsd-questions@FreeBSD.org
Cc: Joe Greco
Date: Thu, 29 Mar 2012 19:47:29 -0500
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
In-Reply-To: <201203300027.q2U0RVZS085304@aurora.sol.net>

On Thu, 29 Mar 2012 19:27:31 -0500, Joe Greco wrote:

> It also doesn't explain the experience here, where one VM
> basically crapped out but only after a migration - and then stayed
> crapped out. It would be interesting to hear about your datastore, how
> busy it is, what technology, whether you're using thin, etc. I just have
> this real strong feeling that it's some sort of corruption with the vmfs3
> and thin provisioned disk format, but it'd be interesting to know if
> that's totally off-track.

We've ruled out SAN, but we haven't ruled out VMFS. Even FreeBSD guests on standalone ESXi servers with no SAN exhibit this crash.

For the record, we only use thick provisioning, and if it were corruption I'm not sure what layer the corruption could be at. The crashy servers show no abnormalities when I run either `freebsd-update IDS` or `pkg_libchk` to check all installed programs. Now the other data on there... it's not exactly verified, but our backups via rsnapshot suggest there is no issue there, or we'd have lots of new files each run.
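
For the data that `freebsd-update IDS` and `pkg_libchk` don't cover, one possible check is a checksum manifest taken at two points in time. The sketch below is only an illustration and not something from this thread: the function names are made up, and it assumes the suspect data sits under a single directory tree. It flags files whose content changed while size and modification time stayed the same. That case matters because rsync's default quick check compares only size and modification time, so unless rsnapshot is configured to pass `--checksum`, such a file would probably not show up as new in a backup run.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root to (mtime_ns, size, sha256)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks, sockets, and anything else that is not a plain file.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            info = os.stat(full)
            rel = os.path.relpath(full, root)
            manifest[rel] = (info.st_mtime_ns, info.st_size, sha256_of(full))
    return manifest


def silent_changes(old, new):
    """List files whose digest differs although mtime and size are unchanged."""
    suspects = []
    for rel, (mtime, size, digest) in old.items():
        if rel not in new:
            continue
        new_mtime, new_size, new_digest = new[rel]
        if (mtime, size) == (new_mtime, new_size) and digest != new_digest:
            suspects.append(rel)
    return sorted(suspects)
```

Running `build_manifest()` on the same tree before and after a migration (or on two successive days) and passing both results to `silent_changes()` gives a list of candidates for corruption below the filesystem. An empty list does not prove the storage is healthy, but a non-empty one would point at something worth chasing.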