From owner-freebsd-hackers@FreeBSD.ORG  Wed Jun  9 18:11:59 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B4582106567B
	for <freebsd-hackers@freebsd.org>; Wed,  9 Jun 2010 18:11:59 +0000 (UTC)
	(envelope-from rhfb@akira.stdio.com)
Received: from akira.stdio.com (akira.stdio.com [204.152.114.29])
	by mx1.freebsd.org (Postfix) with SMTP id 861168FC19
	for <freebsd-hackers@freebsd.org>; Wed,  9 Jun 2010 18:11:46 +0000 (UTC)
Received: from akira (localhost [127.0.0.1])
	by akira.stdio.com (Postfix) with SMTP id 9769650815
	for <freebsd-hackers@freebsd.org>; Wed,  9 Jun 2010 13:52:40 -0400 (EDT)
From: rhfb@akira.stdio.com
To: freebsd-hackers@freebsd.org
Message-Id: <20100609175244.9769650815@akira.stdio.com>
Date: Wed,  9 Jun 2010 13:52:40 -0400 (EDT)
Subject: NFSD lockup running ESXi 4
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 09 Jun 2010 18:11:59 -0000

I have an amd64 FreeBSD 8 machine running 8-STABLE from around
2010/04/25 19:13:08.

Setup: ZFS storage, nfsd flags "-t -n 16", a private network used exclusively
for NFS, no jumbo frames, HZ=1000, DEVICE_POLLING, ZERO_COPY_SOCKETS, and the
following sysctl settings:
net.inet.tcp.recvspace=232140
net.inet.tcp.sendspace=232140
net.inet.tcp.slowstart_flightsize=159
net.inet.tcp.mssdflt=1460
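
For anyone trying to reproduce the setup, here is a rough sketch of where
these settings would normally live. The file locations are the stock ones;
the nfs_server_enable line and the split across files are assumptions for
illustration, not copied from the machine.

```
# /etc/rc.conf (assumed location for the nfsd flags)
nfs_server_enable="YES"          # assumption: NFS server enabled the usual way
nfs_server_flags="-t -n 16"      # TCP only, 16 server threads

# /etc/sysctl.conf (values as listed above)
net.inet.tcp.recvspace=232140
net.inet.tcp.sendspace=232140
net.inet.tcp.slowstart_flightsize=159
net.inet.tcp.mssdflt=1460

# Kernel config file (assumed: these were set as kernel options)
options         HZ=1000
options         DEVICE_POLLING
options         ZERO_COPY_SOCKETS
```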

The FreeBSD box exports a 6 TB zpool over NFS to three ESXi 4 hosts (patch
level 193498, the newest at the time).  This had been working reliably for
months.

I then added a new ESXi host, patched to the newest (Post Update 1) patch
level 256968.  I created a bunch of VMs on it and booted them all from the
Windows Server 2008 R2 install DVD.  When I attempted to run the installs in
parallel, the NFS server started locking up.  nfsd would wedge at 100% CPU in
state "rc_lo", which I presume is a truncated "rc_lock".  Once it is wedged,
"/etc/rc.d/nfsd restart" can't kill nfsd, so a reboot is required.  The
reboot causes all my active VMs with pending disk writes to get disk errors
inside the VM (the default timeout for disk writes in the VM is 10 seconds).
This was very reproducible.

Has anyone noticed this problem?  Is this an ESXi problem with the newest
updates?  Is this a problem with NFS on FreeBSD 8?