Date: Thu, 24 Nov 2016 11:08:11 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Alan Somers <asomers@freebsd.org>
Cc: FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject: Re: NFSv4 performance degradation with 12.0-CURRENT client
Message-ID: <20161124090811.GO54029@kib.kiev.ua>
In-Reply-To: <CAOtMX2jJ2XoQyVG1c04QL7NTJn1pg38s=XEgecE38ea0QoFAOw@mail.gmail.com>
References: <CAOtMX2jJ2XoQyVG1c04QL7NTJn1pg38s=XEgecE38ea0QoFAOw@mail.gmail.com>

On Wed, Nov 23, 2016 at 10:17:25PM -0700, Alan Somers wrote:
> I have a FreeBSD 10.3-RELEASE-p12 server exporting its home
> directories over both NFSv3 and NFSv4. I have a TrueOS client (based
> on 12.0-CURRENT on the drm-next-4.7 branch, built on 28-October)
> mounting the home directories over NFSv4. At first, everything is
> fine and performance is good. But if the client does a buildworld
> using sources on NFS and locally stored objects, performance slowly
> degrades. The degradation is most noticeable with metadata-heavy
> operations. For example, "ls -l" in a directory with 153 files takes
> less than 0.1 seconds right after booting. But the longer the
> buildworld goes on, the slower it gets. Eventually that same "ls -l"
> takes 19 seconds. When the home directories are mounted over NFSv3
> instead, I see no degradation.
>
> top shows negligible CPU consumption on the server, and very high
> consumption on the client when using NFSv4 (nearly 100%). The
> NFS-using process is spending almost all of its time in system mode,
> and dtrace shows that almost all of its time is spent in
> ncl_getpages().
>
> I have delegations disabled on the server. On the client, the home
> directories are nullfs mounted to two additional locations, and the
> buildworld was actually using one of those nullfs mounts, not the NFS
> mount directly.
>
> Any ideas?

Try stock FreeBSD first. If it is reproducible on stock HEAD, can you
point to the lines of ncl_getpages() where the time is spent?

Does reading the problematic files, as opposed to mmapping them, also
cause the behaviour? E.g. try dd.

There are really no time-unbounded loops in ncl_getpages() itself.
I could understand the situation if e.g. the time were spent in
getpbuf() or ncl_readrpc(), but not in ncl_getpages() directly.

Also, as an experiment, you could see whether HEAD after r308980
demonstrates any difference.
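
The read-versus-mmap comparison suggested above could be scripted roughly as follows. This is only a sketch, not something from the thread. The helper names time_read() and time_mmap() and the 1 MiB block size are illustrative assumptions. The first helper reads the file sequentially, as dd would. The second maps it and touches one byte per page, so the data arrives through page faults rather than read() calls.

```python
import mmap
import os
import time


def time_read(path, block_size=1 << 20):
    """Read the whole file with plain read() calls, roughly like dd.

    Returns (bytes_read, elapsed_seconds). block_size is an arbitrary
    illustrative choice, not a value taken from the thread.
    """
    start = time.perf_counter()
    total = 0
    # buffering=0 gives a raw file object, so each f.read() maps
    # closely to one read(2) system call.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def time_mmap(path, page_size=mmap.PAGESIZE):
    """Map the file read-only and touch one byte in every page.

    Returns (file_size, elapsed_seconds). Touching a byte per page
    forces each page to be faulted in through the mapping.
    """
    size = os.path.getsize(path)
    if size == 0:
        return 0, 0.0  # an empty file cannot be mapped
    start = time.perf_counter()
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            checksum = 0
            for offset in range(0, size, page_size):
                checksum ^= m[offset]  # the access is the point; the value is unused
    return size, time.perf_counter() - start
```

Calling both helpers on the same file under the NFS mount (and again through the nullfs mount), early in the buildworld and then once "ls -l" has become slow, would show whether only the mapped path degrades. Results are affected by caching, so the order of the two calls matters and each measurement is best taken on a file that has not just been read.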