Date:      Sat, 28 May 2022 16:00:07 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Andreas Kempe <kempe@lysator.liu.se>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: FreeBSD 12.3/13.1 NFS client hang
Message-ID:  <YQBPR0101MB9742087275E46E400FDE31E0DDDB9@YQBPR0101MB9742.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YpFM2bSMscG4ekc9@shipon.lysator.liu.se>
References:  <YpEwxdGCouUUFHiE@shipon.lysator.liu.se> <YQBPR0101MB9742280313FC17543132A61CDDD89@YQBPR0101MB9742.CANPRD01.PROD.OUTLOOK.COM> <YpFM2bSMscG4ekc9@shipon.lysator.liu.se>

Andreas Kempe <kempe@lysator.liu.se> wrote:
> On Fri, May 27, 2022 at 08:59:57PM +0000, Rick Macklem wrote:
> > Andreas Kempe <kempe@lysator.liu.se> wrote:
> > > Hello everyone!
> > >
> > > I'm having issues with the NFS clients on FreeBSD 12.3 and 13.1
> > > systems hanging when using a CentOS 7 server.
Here are a few other things to consider:
Delegations - They are complex and seldom improve performance.
       I think I finally have them implemented reliably, but???
       They are disabled by default in the FreeBSD server and can be
       avoided by not running the nfscbd(8) daemon when mounting
       non-FreeBSD NFS servers.
       # nfsstat -E -c
       - If it shows non-zero "Delegs", consider disabling them.

TSO - Some net chips/drivers don't get this quite right. NFS is very
      good at finding the flaws, since it generates all kinds of small and
      weird sized TSO/TCP segments.
      - Consider disabling TSO if intermittent hangs persist.

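A minimal sketch of turning TSO off for a test, assuming an interface named
em0 (a placeholder, substitute your own). "ifconfig <if> -tso" clears the
capability on one interface and the net.inet.tcp.tso sysctl turns it off for
the whole TCP stack; either should be enough on its own. The function name
and the DRYRUN variable are hypothetical conveniences, not part of any
FreeBSD tool.

```sh
#!/bin/sh
# Sketch: disable TSO to rule out NIC/driver segmentation bugs.
# disable_tso and DRYRUN are hypothetical names used for illustration.

disable_tso() {
    ifname="$1"
    run=""
    # With DRYRUN set, print the commands instead of executing them.
    if [ -n "${DRYRUN:-}" ]; then run="echo"; fi
    # Per-interface: clear the TSO capability. This does not survive a
    # reboot unless "-tso" is also added to the interface's rc.conf line.
    $run ifconfig "$ifname" -tso
    # Stack-wide: stop TCP from using TSO on any interface.
    $run sysctl net.inet.tcp.tso=0
}
```

Usage: "DRYRUN=1; disable_tso em0" shows what would be run; unset DRYRUN and
run as root to apply it, then see whether the hangs still occur.
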
Jumbo mbuf clusters - Some net interfaces use jumbo mbuf clusters
      when jumbo frames are in use.  These can fragment the memory
      pool that mbuf clusters are being allocated from.
      # vmstat -z | fgrep mbuf_jumbo
      - and look to see if the third numbers are non-zero.
      Reducing the mtu may be a performance hit, but if the memory
      pool that clusters are allocated from becomes too fragmented,
      NFS will come to a grinding halt.

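To make the vmstat check easier to read, the jumbo zones can be pulled out
and their counters labelled. This is only a sketch: it assumes the column
order ITEM, SIZE, LIMIT, USED, FREE, REQ, FAIL that "vmstat -z" prints in its
header line on FreeBSD 12/13, so check the header on your own system before
trusting the labels. The function name is hypothetical.

```sh
#!/bin/sh
# Sketch: label the counters of the mbuf_jumbo_* UMA zones.
# Reads "vmstat -z" output on stdin. Assumed column order:
#   ITEM: SIZE, LIMIT, USED, FREE, REQ, FAIL, ...

mbuf_jumbo_report() {
    awk -F'[:,]' '
        /^mbuf_jumbo/ {
            gsub(/ /, "")    # drop padding; awk re-splits the fields
            printf "%s used=%s free=%s fail=%s\n", $1, $4, $5, $7
        }'
}
```

Usage: "vmstat -z | mbuf_jumbo_report". Zones whose used count is non-zero
are the ones actually allocating jumbo clusters.
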
An NFSv4 server that does not reply to an RPC. This is a badly broken
server. NFSv4 servers are supposed to reply NFSERR_DELAY if they cannot
do an RPC at the time requested. They are not supposed to throw away
the request without replying.
Hopefully, such servers do not exist. If they do, the mount will hang.
About the only way to detect this would be a packet capture when it
happens.
About the only fix is a different NFS server or using NFSv3 mounts, which
are stateless and might work better in this case.

rick


> First, make sure you are using hard mounts. "soft" or "intr" mounts won't
> work and will mess up the session sooner or later. (A messed up session could
> result in no free slots on the session and that will wedge threads in
> nfsv4_sequencelookup() as you describe.)
> (This is briefly described in the BUGS section of "man mount_nfs".)
>

I had totally missed that soft and interruptible mounts have these
issues. I switched the FreeBSD machines to soft and intr on purpose
to be able to fix hung mounts without having to restart the machine on
NFS hangs. Since they are shared machines, it is an inconvenience for
other users if one user causes a hang.

Switching our test machine back to hard mounts did prevent recursive
grep from immediately causing the slot type hang again.

> Do a:
> # nfsstat -m
> on the clients and look for "hard".
>
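With many mounts it can help to have the check scripted. The sketch below
assumes "nfsstat -m" prints each mount as a "server:/path on /mountpoint"
line followed by a line of comma-separated options; that layout is an
assumption here, so compare it with the real output first. It reports any
mount whose options include "soft" or "intr". The function name is
hypothetical.

```sh
#!/bin/sh
# Sketch: flag NFS mounts that use "soft" or "intr".
# Reads "nfsstat -m" output on stdin. Assumed layout per mount:
#   server:/path on /mountpoint
#   opt1,opt2,...

check_hard_mounts() {
    awk '
        / on / { mount = $0; next }    # remember the mount line
        NF {
            n = split($0, opt, ",")
            bad = ""
            for (i = 1; i <= n; i++)
                if (opt[i] == "soft" || opt[i] == "intr")
                    bad = bad (bad ? "," : "") opt[i]
            if (bad) printf "%s: %s\n", mount, bad
        }'
}
```

Usage: "nfsstat -m | check_hard_mounts". No output means no soft or intr
mounts were found.
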
> Next, is there anything logged on the console for the 13.1 client(s)?
> (13.1 has some diagnostics for things like a server replying with the
>  wrong session slot#.)
>

The one thing we have seen logged are messages along the lines of:
kernel: newnfs: server 'mail' error: fileid changed. fsid 4240eca6003a052a:0: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)

> Also, maybe I'm old fashioned, but I find "ps axHl" useful, since it shows
> where all the processes are sleeping.
> And "procstat -kk" covers all of the locks.
>

I don't know if it is a matter of being old fashioned as much as one
of taste. :) In future dumps, I can provide both ps axHl and procstat -kk.

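A rough sketch of capturing both in one go when a hang happens, so nothing
is forgotten in the moment. The function name and the default directory are
hypothetical; the output directory must be on local storage, not on the hung
NFS mount. "procstat -kk -a" is used to cover all processes.

```sh
#!/bin/sh
# Sketch: snapshot client state while an NFS hang is in progress.
# collect_hang_info is a hypothetical helper name.

collect_hang_info() {
    outdir="${1:-/var/tmp/nfs-hang.$(date +%Y%m%d-%H%M%S)}"
    mkdir -p "$outdir" || return 1
    # Where every thread of every process is sleeping.
    ps axHl         > "$outdir/ps-axHl.txt"     2>&1 || true
    # Kernel stacks for all threads (-a = all processes).
    procstat -kk -a > "$outdir/procstat-kk.txt" 2>&1 || true
    # Mount options in effect, to confirm "hard".
    nfsstat -m      > "$outdir/nfsstat-m.txt"   2>&1 || true
    echo "$outdir"
}
```

Usage: run "collect_hang_info" as root during the hang; it prints the
directory holding the three files.
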
> > Below are procstat kstack $PID invocations showing where the processes
> > have hung. In the nfsv4_sequencelookup it seems hung waiting for
> > nfsess_slots to have an available slot. In the second nfs_lock case,
> > it seems the processes are stuck waiting on vnode locks.
> >
> > These issues appear seemingly at random, but also if
> > operations that open a lot of files or create a lot of file locks are
> > used. An example that can often provoke a hang is performing a
> > recursive grep through a large file hierarchy like the FreeBSD
> > codebase.
> >
> > The NFS code is large and complicated so any advice is appreciated!
> Yea. I'm the author and I don't know exactly what it all does ;-)
>
> > Cordially,
> > Andreas Kempe
> >
>
> [...]
>
> Not very useful unless you have all the processes and their locks to try and figure out what is holding
> the vnode locks.
>

Yes, I sent this mostly in the hope that it might be something that
someone has seen before. I understand that more verbose information is
needed to track down the lock contention.

I'll switch our machines back to using hard mounts and try to get as
much diagnostic information as possible when the next lockup happens.

Do you have any good suggestions for tracking down the issue? I've
been contemplating enabling WITNESS or building with debug information
to be able to hook in the kernel debugger.

Thank you very much for your reply!
Cordially,
Andreas Kempe

> rick
>
>



