Date: Sun, 8 Aug 2010 19:58:09 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: rhfb@akira.stdio.com
Cc: freebsd-hackers@freebsd.org, dfr@freebsd.org
Subject: Re: NFS server hangs (was no subject)
Message-ID: <282423324.419135.1281311889612.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20100729094046.AD3F3C2@akira.stdio.com>
[-- Attachment #1 --]
> I have a similar problem.
>
> I have a NFS server (8.0 upgraded a couple times since Feb 2010) that
> locks up and requires a reboot.
>
> The clients are busy vm's from VMWare ESXi using the NFS server for
> vmdk virtual disk storage.
>
> The ESXi reports nfs server inactive and all the vm's post disk write
> errors when trying to write to their disk.
>
> /etc/rc.d/nfsd restart fails to work (it can not kill the nfsd process)
>
> The nfsd process runs at 100% cpu at rc_lo state in top.
>
> reboot is the only fix.
>
> It has only happened under two circumstances.
> 1) Installation of a VM using Windows 2008.
> 2) Migrating 16 million mail messages from a physical server to a VM
> running FreeBSD with ZFS file system as a VM on the ESXi box that uses
> NFS to store the VM's ZFS disk.
>
> The NFS server uses ZFS also.
I don't think what you are seeing is the same as what others have reported.
(I have a hunch that your problem might be a replay cache problem.)
Please try the attached patch and make sure that your sys/rpc/svc.c
is at r205562 (upgrade if it isn't).
If this patch doesn't help, you could try using the experimental nfs
server (which doesn't use the generic replay cache), by adding "-e" to
mountd and nfsd.
Please let me know if the patch or switching to the experimental nfs
server helps.

rick
[-- Attachment #2 --]
--- rpc/replay.c.sav	2010-08-08 18:05:50.000000000 -0400
+++ rpc/replay.c	2010-08-08 18:16:43.000000000 -0400
@@ -90,8 +90,10 @@
 replay_setsize(struct replay_cache *rc, size_t newmaxsize)
 {
+	mtx_lock(&rc->rc_lock);
 	rc->rc_maxsize = newmaxsize;
 	replay_prune(rc);
+	mtx_unlock(&rc->rc_lock);
 }
 void
@@ -144,8 +146,8 @@
 	bool_t freed_one;
 	if (rc->rc_count >= REPLAY_MAX || rc->rc_size > rc->rc_maxsize) {
-		freed_one = FALSE;
 		do {
+			freed_one = FALSE;
 			/*
 			 * Try to free an entry. Don't free in-progress entries
 			 */
