Date: Fri, 21 Oct 2016 21:47:28 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Marek Salwerowicz <marek.salwerowicz@misal.pl>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS - NFS server for VMware ESXi issues
Message-ID: <YTXPR01MB018980B341213CA43277464ADDD40@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <930df17b-8db8-121a-a24b-b4909b8162dc@misal.pl>
References: <930df17b-8db8-121a-a24b-b4909b8162dc@misal.pl>

Marek Salwerowicz wrote:
[Stuff snipped for brevity...]
> Today, after two weeks of working, we experienced the same situation.
> The nfsd service was in following state:
>
>   PID USERNAME THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>   984 root     128  20    0 12344K  4020K vq->vq  8 346:27  0.00% nfsd
>
> nfsd service didn't respond to service nfsd restart, but this time
> machine was able to reboot using "# reboot" command.

I am not sure how "top" got a STATE of "vq->vq", but I suspect that refers to
the vdev section of the ZFS code. (The only other place in the kernel where
"vq->vq" shows up is in virtio and I doubt you are using that?)

I'm not a ZFS guy so I can't help, but I'd guess that it's looping around in
the vdev code, possibly competing for the vq->vq_lock?

Hopefully someone with ZFS expertise can help out?

Btw, about the only area of the NFS server that might need tuning is the DRC
and this doesn't suggest that. If you "nfsstat -e -s" on the server and see
large #s for the last line under "Server Cache Stats:" there are tunables that
can be used.

I'd also suggest you capture the output of "ps axHl" on the server when it
happens again, which tells you what all the nfsd threads are up to.

Good luck with it, rick
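
For anyone wanting to summarize a saved "ps axHl" capture, here is a minimal sketch (not part of the original message) that counts how many nfsd threads sit on each wait channel. The function name count_wait_channels is made up for illustration. It assumes the capture starts with a header line containing MWCHAN (or WCHAN) and COMMAND columns, and that no column is left blank, so whitespace splitting lines up with the header. Check both assumptions against the real output before trusting the counts.

```python
from collections import Counter


def count_wait_channels(ps_output, command="nfsd"):
    """Count threads per wait channel in `ps axHl`-style text.

    Hypothetical helper. Assumes a header row that names the wait-channel
    column MWCHAN (or WCHAN) and ends with COMMAND, and that every column
    is populated so whitespace splitting matches the header.
    """
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    if not lines:
        return Counter()

    header = lines[0].split()
    chan_name = "MWCHAN" if "MWCHAN" in header else "WCHAN"
    if chan_name not in header or "COMMAND" not in header:
        raise ValueError("header lacks a wait-channel or COMMAND column")
    chan_idx = header.index(chan_name)
    cmd_idx = header.index("COMMAND")

    counts = Counter()
    for ln in lines[1:]:
        # Split into as many fields as the header has; the final field
        # keeps the whole command string, spaces included.
        fields = ln.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # skip rows that do not match the header layout
        if fields[cmd_idx].startswith(command):
            counts[fields[chan_idx]] += 1
    return counts
```

One way to use it: save the capture with "ps axHl > /tmp/ps.out" while the server is hung, then call count_wait_channels(open("/tmp/ps.out").read()).most_common() to see which wait channels dominate among the nfsd threads.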