Date:      Fri, 21 Oct 2016 21:47:28 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Marek Salwerowicz <marek.salwerowicz@misal.pl>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: ZFS - NFS server for VMware ESXi issues
Message-ID:  <YTXPR01MB018980B341213CA43277464ADDD40@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <930df17b-8db8-121a-a24b-b4909b8162dc@misal.pl>
References:  <930df17b-8db8-121a-a24b-b4909b8162dc@misal.pl>

Marek Salwerowicz wrote:

Stuff snipped for brevity...

> Today, after two weeks of working, we experienced the same situation.
> The nfsd service was in following state:
>
>  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME WCPU COMMAND
>   984 root        128  20    0 12344K  4020K vq->vq  8 346:27 0.00% nfsd
>
> nfsd service didn't respond to service nfsd restart, but this time
> machine was able to reboot using "# reboot" command.
I am not sure how "top" got a STATE of "vq->vq", but I suspect that refers to the
vdev section of the ZFS code. (The only other place in the kernel where "vq->vq"
shows up is in virtio and I doubt you are using that?)

I'm not a ZFS guy so I can't help, but I'd guess that it's looping around in the vdev
code, possibly competing for the vq->vq_lock?

Hopefully someone with ZFS expertise can help out?

Btw, about the only area of the NFS server that might need tuning is the DRC
(duplicate request cache), and this doesn't suggest that. If you run "nfsstat -e -s"
on the server and see large #s for the last line under "Server Cache Stats:", there
are tunables that can be used.
I'd also suggest you capture the output of "ps axHl" on the server when it happens
again, which tells you what all the nfsd threads are up to.
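
Once the "ps axHl" output has been saved to a file, something along these lines
could summarize where the nfsd threads are sleeping. This is only a rough sketch,
not a tested tool: it assumes the header line names the wait-channel column
MWCHAN and that COMMAND is the last column, and the function name
count_wait_channels is made up for illustration.

```python
from collections import Counter


def count_wait_channels(ps_output, command="nfsd"):
    """Count threads per wait channel for one command in saved `ps axHl` text.

    Assumptions (check them against your own output): the first non-blank
    line is the header, it contains a column called MWCHAN, and the command
    is the last column and may contain spaces.
    """
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    if not lines:
        return Counter()

    header = lines[0].split()
    chan_idx = header.index("MWCHAN")  # raises ValueError if the column is absent

    counts = Counter()
    for line in lines[1:]:
        # Split into as many fields as the header has; the remainder of the
        # line (the command, possibly with spaces) stays in the last field.
        fields = line.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # short or wrapped line, skip it
        if command in fields[-1]:
            counts[fields[chan_idx]] += 1
    return counts
```

For example, after "ps axHl > /tmp/ps.out" (the path is just a placeholder),
calling count_wait_channels(open("/tmp/ps.out").read()).most_common() would list
the wait channels with the most nfsd threads first, which should show whether
they are all stuck in the same place.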

Good luck with it, rick




