Date: Thu, 7 Jun 2012 18:50:16 GMT
Message-Id: <201206071850.q57IoGaT088571@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Mark Felder
Subject: Re: kern/168416: [hang] OS hangs when guest on VMWare ESX

The following reply was made to PR kern/168416; it has been noted by GNATS.

From: Mark Felder
To: bug-followup@freebsd.org
Subject: Re: kern/168416: [hang] OS hangs when guest on VMWare ESX
Date: Thu, 7 Jun 2012 13:42:33 -0500

I have wonderful news: we can now reproduce the crash on demand. We
discovered that if we stress em and mpt at the same time by doing I/O on a
HAST device, we can easily reproduce this issue.
I also have a coredump I took by breaking into DDB and running "dump", and
a picture of the backtrace:

http://feld.me/pub/freebsd/esx_crash/bt.png
http://feld.me/pub/freebsd/esx_crash/vmcore.0.gz

Requirements:

VMWare ESXi 5: RAM 1GB, CPUs 1
FreeBSD 9 (-RELEASE or -STABLE; I produced this coredump on -STABLE from
Jun 3rd)
HAST
iozone or bonnie++ (iozone seems to crash it faster and more consistently)

/                                   40GB  UFS+SUJ
/dev/hast/hast0 (mounted on /mnt)    8GB  UFS+SUJ
SWAP                                 2GB

(I put my swap on a separate disk as well to help get a successful dump.)

In this environment I have two servers (node1 and node2) for a proper HAST
setup, so the primary really does try to transfer changes to the secondary.
The secondary is merely there to receive the data; it's not otherwise
involved in this test.

hast.conf:

# global section
timeout 5
compression hole

resource hast0 {
        on node1 {
                local /dev/da1
                remote 192.168.44.2
        }
        on node2 {
                local /dev/da1
                remote 192.168.44.1
        }
}

Kernel config "DEBUG" I used for getting this coredump:

include GENERIC
makeoptions DEBUG=-g
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DEBUG_LOCKS
options DEBUG_VFS_LOCKS
options DIAGNOSTIC
options KDB
options DDB
options BREAK_TO_DEBUGGER
options ALT_BREAK_TO_DEBUGGER
options KTR
options KTR_ENTRIES=1024
options KTR_COMPILE=(KTR|KTR_PROC)
options KTR_MASK=KTR_SCHED
options KTR_CPUMASK=("0x3")
options KTR_VERBOSE

And the iozone command that works quite consistently (I ran it in a loop
just in case it wouldn't crash the first time):

iozone -M -e -+u -T -t 8 -r 128k -s 40960 -i 0 -i 1 -i 2 -i 8 -+p 70 -C \
    -F /mnt/io.1 /mnt/io.2 /mnt/io.3 /mnt/io.4 /mnt/io.5 /mnt/io.6 \
    /mnt/io.7 /mnt/io.8

Bonnie++ command we can get to cause the crash sometimes:

bonnie++ -u root -d /mnt/ -s 3552M -n 10:102400:1024:1024

The only other tip I have for you if you want to rebuild this entire
environment is to change your keybind.
You can't break into the debugger on VMWare with CTRL+ALT+ESC because
CTRL+ALT releases focus from the VM. You have to override this. I tend to
use CTRL+ALT+SHIFT. The following is how you change that:

XP: C:\Documents and Settings\USERNAME\Application Data\VMware\preferences.ini
Vista/7: C:\users\\appdata\roaming\vmware\preferences.ini

pref.hotkey.shift = "true"
pref.hotkey.control = "true"
pref.hotkey.alt = "true"

Please let me know if there is anything I can do to help resolve this
issue.
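
For anyone rebuilding the environment, the "ran it in a loop" step for iozone could look roughly like the sketch below. This is only an illustration: the helper name run_until_fail and the pass limit of 100 are made up here, and the exact loop used for the report is not shown in this message. On an affected guest the expectation is that the VM hangs before the loop finishes.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): rerun a command
# until it fails or a pass limit is reached, printing each pass number.
run_until_fail() {
    max=$1
    shift
    pass=0
    while [ "$pass" -lt "$max" ]; do
        pass=$((pass + 1))
        echo "pass $pass: $*"
        # Stop at the first non-zero exit status.
        "$@" || return 1
    done
    return 0
}

# Example invocation with the iozone command from the report. It is left
# commented out because it needs iozone and the HAST-backed /mnt:
# run_until_fail 100 iozone -M -e -+u -T -t 8 -r 128k -s 40960 \
#     -i 0 -i 1 -i 2 -i 8 -+p 70 -C \
#     -F /mnt/io.1 /mnt/io.2 /mnt/io.3 /mnt/io.4 \
#        /mnt/io.5 /mnt/io.6 /mnt/io.7 /mnt/io.8
```

Printing the pass number to the console makes it easy to see how many iterations completed before the hang when you break into DDB.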