Date:      Wed, 9 Mar 2022 14:39:39 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Yoshihiro Ota <ota@j.email.ne.jp>, freebsd-stable <freebsd-stable@freebsd.org>
Subject:   Re: nfsd becomes slow when machine CPU usage is at or over 100% on STABLE/13
Message-ID:  <YT2PR01MB9730D7B51D325258AAA29828DD0A9@YT2PR01MB9730.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <20220309034601.ea3135e31aec3ffb2623f145@j.email.ne.jp>

Yoshihiro Ota <ota@j.email.ne.jp> wrote:
> Hi,
>
> I'm on stable/13 with latest code base.
> I started testing pre-13.1 branch.
>
> I noticed a major NFS performance degradation when all CPUs are fully
> utilized.
>
> This happens with stable/13 but not with releng/13.0 or releng/12.3.
NFS performance is sensitive to RPC response time.
Since this only happens when the CPUs are busy, I'd suspect:
- Kernel thread scheduling changes
or
- Timing of receive socket upcalls (which wake up the nfsd kernel threads).

I suspect bisecting to the actual commit that causes this is the only way
to find it.
If you know of a working stable/13 that is more recent than 13.0, it would
help. If not, you could start at this commit (which did make socket upcall changes):
commit 55cc0a478506ee1c2db7b2f9aadb9855e5490af3
which was done on May 21, 2021.
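
For anyone unfamiliar with bisecting: it is a binary search over the commits between a known-good and a known-bad revision, so N commits need roughly log2(N) build-and-test rounds. In practice `git bisect` does the bookkeeping. The Python sketch below only illustrates the idea; the function name, the placeholder commit names, and the `is_bad` test are all hypothetical, and it assumes a linear history in which every commit after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is True.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and the history flips from good to bad once.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


if __name__ == "__main__":
    # Hypothetical history: placeholder names, not real FreeBSD commits.
    history = [f"c{i}" for i in range(100)]
    tested = []

    def is_bad(commit: str) -> bool:
        # Stand-in for "build this kernel, run the NFS test, check throughput".
        tested.append(commit)
        return int(commit[1:]) >= 37

    culprit = first_bad_commit(history, is_bad)
    print(culprit, "found after", len(tested), "tests")
```

Each call to `is_bad` here stands for a full kernel build, reboot, and NFS throughput test, which is why the number of rounds matters.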

Maybe others can suggest commits related to thread scheduling (which I
know nothing about).

If you don't have the time/resources to bisect, I doubt this will get resolved.

Good luck with it, rick

> I had an NFS server running each of the above versions and rsynced from an NFS mount to a UFS mount on the NFS clients.
> My NFS server has 4 cores.
> With a load average of 3 from make buildworld -j3, the NFS server was fine.
> After adding one more unit of load, NFS server throughput dropped to about 10% of what it was before.
> After going back to a load average of 3, performance recovered, and it dropped again once the load went over 4.
> The disk was fully available for rsync; buildworld was run on another disk.
>
> Someone told me his smbfs was also slow, and he suspected a TCP/IP regression rather than NFS, by the way.
>
> Hiro
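
A repeatable measurement makes the good/bad decision at each bisect step less subjective. The sketch below is one possible way to do that in Python, not something from the original report. `read_throughput` would be run on an NFS client against a file on the NFS mount, and `load_per_core` on the NFS server. Both function names are hypothetical, and the sketch assumes the test file is large enough, or freshly written, so that the client's cache does not hide the network path.

```python
import os
import sys
import time


def read_throughput(path: str, chunk_size: int = 1 << 20) -> float:
    """Read the file at path sequentially and return throughput in MiB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return total / (1 << 20) / max(elapsed, 1e-9)


def load_per_core() -> float:
    """Return the 1-minute load average divided by the CPU count."""
    one_minute, _, _ = os.getloadavg()
    return one_minute / (os.cpu_count() or 1)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # On the client: pass a file on the NFS mount as the argument.
        print(f"{read_throughput(sys.argv[1]):.1f} MiB/s")
    else:
        # On the server: report load relative to the number of cores.
        print(f"load per core: {load_per_core():.2f}")
```

In the report above, the slowdown appeared once the load average on the 4-core server went over 4, that is, once load per core exceeded 1.0, so recording both numbers for each tested kernel should show whether a given commit reproduces the problem.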




