From: Joe Greco
Message-Id: <201203301444.q2UEilmj097567@aurora.sol.net>
To: feld@feld.me (Mark Felder)
Date: Fri, 30 Mar 2012 09:44:47 -0500 (CDT)
Cc: freebsd-hackers@freebsd.org, freebsd-questions@freebsd.org
Subject: Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash

> On Thu, 29 Mar 2012 19:27:31 -0500, Joe Greco wrote:
>
> > It also doesn't explain the experience here, where one VM basically
> > crapped out but only after a migration - and then stayed crapped out.
> > It would be interesting to hear about your datastore, how busy it is,
> > what technology, whether you're using thin, etc. I just have this real
> > strong feeling that it's some sort of corruption with the vmfs3 and thin
> > provisioned disk format, but it'd be interesting to know if that's
> > totally off-track.
>
> We've ruled out SAN, but we haven't ruled out VMFS. Even FreeBSD Guests on
> standalone ESXi servers with no SAN exhibit this crash.
>
> For the record, we only use thick provisioning and if it was corruption
> I'm not sure what layer the corruption could be at. The crashy servers
> show no abnormalities when I run either `freebsd-update IDS` or
> `pkg_libchk` to confirm checksums of all installed programs. Now the other
> data on there... it's not exactly verified, but our backups via rsnapshot
> seem to prove there is no issue there or we'd have lots of new files each
> run.

Crud, there goes part of my theory :-)

Have you migrated these hosts, or were they installed in-place and never
moved?

fwiw the apparent integrity of things on the VM is consistent with our
experience too.

... JG
-- 
Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net
"We call it the 'one bite at the apple' rule. Give me one chance [and] then I
won't contact you again." - Direct Marketing Ass'n position on e-mail spam(CNN)
With 24 million small businesses in the US alone, that's way too many apples.