Date: Thu, 24 Nov 2016 12:53:43 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Konstantin Belousov <kostikbel@gmail.com>, Alan Somers <asomers@freebsd.org>
Cc: FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject: Re: NFSv4 performance degradation with 12.0-CURRENT client
Message-ID: <YTXPR01MB0189E0B1DB5B16EE6B388B7DDDB60@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <20161124090811.GO54029@kib.kiev.ua>
References: <CAOtMX2jJ2XoQyVG1c04QL7NTJn1pg38s=XEgecE38ea0QoFAOw@mail.gmail.com>, <20161124090811.GO54029@kib.kiev.ua>

On Wed, Nov 23, 2016 at 10:17:25PM -0700, Alan Somers wrote:
> I have a FreeBSD 10.3-RELEASE-p12 server exporting its home
> directories over both NFSv3 and NFSv4. I have a TrueOS client (based
> on 12.0-CURRENT on the drm-next-4.7 branch, built on 28-October)
> mounting the home directories over NFSv4. At first, everything is
> fine and performance is good. But if the client does a buildworld
> using sources on NFS and locally stored objects, performance slowly
> degrades. The degradation is most noticeable with metadata-heavy
> operations. For example, "ls -l" in a directory with 153 files takes
> less than 0.1 seconds right after booting. But the longer the
> buildworld goes on, the slower it gets. Eventually that same "ls -l"
> takes 19 seconds. When the home directories are mounted over NFSv3
> instead, I see no degradation.
>
> top shows negligible CPU consumption on the server, and very high
> consumption on the client when using NFSv4 (nearly 100%). The
> NFS-using process is spending almost all of its time in system mode,
> and dtrace shows that almost all of its time is spent in
> ncl_getpages().
>

A couple of things you could do when it is slow (as well as what Kostik
suggested):
- nfsstat -c -e on the client and nfsstat -e -s on the server, to see what
  RPCs are being done and how quickly.
  (nfsstat -s -e will also show you how big the DRC is, although a large
  DRC should show up as increased CPU consumption on the server)
- capture packets with tcpdump -s 0 -w test.pcap host <other-one>
  - then you can email me test.pcap as an attachment. I can look at it in
    wireshark and see if there seem to be protocol and/or TCP issues. (You
    can look at it in wireshark yourself, then look for NFS4ERR_xxx, TCP
    segment retransmits...)

If you are using either "intr" or "soft" on the mounts, try without those
mount options. (The Bugs section of mount_nfs recommends against using them.
If an RPC fails due to these options, something called a seqid# can be
"out of sync" between client/server and that causes serious problems.)
--> These seqid#s are not used by NFSv4.1, so you could try that by adding
    "minorversion=1" to your mount options.

Good luck with it, rick
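
The diagnostic steps suggested in this message could be gathered into a small helper. This is only a minimal sketch: the host name, export path, mount point and the SERVER/EXPORT/MNT/PCAP variables are placeholders assumed for illustration, not values from the thread. It prints the commands for review rather than running them, since tcpdump and mount need root and the capture should be taken while the "ls -l" is actually slow.

```shell
#!/bin/sh
# Sketch only: print the diagnostic commands suggested in the message.
# SERVER, EXPORT, MNT and PCAP are hypothetical placeholders; override
# them in the environment to match your setup.
SERVER="${SERVER:-nfs-server.example.org}"
EXPORT="${EXPORT:-/home}"
MNT="${MNT:-/mnt}"
PCAP="${PCAP:-test.pcap}"

# RPC counts on the client, including the NFSv4 operations (-e).
client_stats_cmd() { echo "nfsstat -c -e"; }

# RPC counts on the server; per the message this also shows the DRC size.
server_stats_cmd() { echo "nfsstat -e -s"; }

# Full-size packet capture (-s 0) written to a file (-w), limited to
# traffic to or from the other host.
capture_cmd() { echo "tcpdump -s 0 -w ${PCAP} host ${SERVER}"; }

# NFSv4.1 mount without "intr" or "soft", using minorversion=1.
remount_cmd() {
    echo "mount -t nfs -o nfsv4,minorversion=1 ${SERVER}:${EXPORT} ${MNT}"
}

echo "# on the client, while it is slow:"
client_stats_cmd
capture_cmd
echo "# on the server:"
server_stats_cmd
echo "# to retry over NFSv4.1 without intr/soft:"
remount_cmd
```

On the client, SERVER is the NFS server's host name; the resulting test.pcap can then be opened in wireshark and searched for NFS4ERR_xxx replies and TCP retransmits.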