Date:      Sat, 16 Jul 2022 13:43:11 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Peter <pmc@citylink.dinoex.sub.org>, "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   Re: nfs stalls client: nfsrv_cache_session: no session
Message-ID:  <YQBPR0101MB97420DEC6E7CCAB309BDC1C7DD8A9@YQBPR0101MB9742.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YtKpqjBITR/ocjiF@gate.intra.daemon.contact>
References:  <YtKpqjBITR/ocjiF@gate.intra.daemon.contact>

Peter <pmc@citylink.dinoex.sub.org> wrote:
> Hija,
>  I have a problem with NFSv4:
>
> The configuration:
>   Server Rel. 13.1-RC2
>     nfs_server_enable="YES"
>     nfs_server_flags="-u -t --minthreads 2 --maxthreads 20 -h ..."
Allowing it to go down to 2 threads is very low. I've never even
tried to run a server with less than 4 threads. Since kernel threads
don't generate much overhead, I'd suggest replacing the
minthreads/maxthreads with "-n 32" for a very small server.
(I didn't write the code that allows number of threads to vary and
 never use that either.)
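
As a rough sketch of that suggestion, the server's rc.conf entry might
look like the following. It assumes the other flags stay exactly as
quoted above; the "-h ..." argument is elided in the original post and
is left elided here.

```shell
# /etc/rc.conf on the NFS server (sketch, not a tested configuration).
# "-n 32" replaces "--minthreads 2 --maxthreads 20" with a fixed thread count.
# "-h ..." is elided in the original post and must be filled in locally.
nfs_server_enable="YES"
nfs_server_flags="-u -t -n 32 -h ..."
```

nfsd would need to be restarted (for example with "service nfsd restart")
for the new thread count to take effect.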

>     mountd_enable="YES"
>     mountd_flags="-S -p 803 -h ..."
>     rpc_lockd_enable="YES"
>     rpc_lockd_flags="-h ..."
>     rpc_statd_enable="YES"
>     rpc_statd_flags="-h ..."
>     rpcbind_enable="YES"
>     rpcbind_flags="-h ..."
>     nfsv4_server_enable="YES"
>     sysctl vfs.nfs.enable_uidtostring=1
>     sysctl vfs.nfsd.enable_stringtouid=1
>
>   Client bhyve Rel. 13.1-RELEASE on the same system
>     nfs_client_enable="YES"
>     nfs_access_cache="600"
>     nfs_bufpackets="32"
>     nfscbd_enable="YES"
>
>   Mount-options: nfsv4,readahead=1,rw,async
I would expect the behaviour you are seeing for "intr" and/or "soft"
mounts, but since you are not using those, I don't know how you
broke the session? (10052 is NFSERR_BADSESSION)
You might want to do "nfsstat -m" on the client to see what options
were actually negotiated for the mount and then check that neither
"soft" nor "intr" are there.

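The "nfsstat -m" check can be scripted. The helper below is only a
sketch: the function name check_nfs_opts is made up for illustration,
and it assumes "nfsstat -m" prints the mount options as a
comma-separated list, so each option can be matched as a whole word.

```shell
# Hypothetical helper: reads "nfsstat -m" output on stdin and reports
# whether "soft" or "intr" appears among the comma-separated options.
# Returns 1 if either is present, 0 otherwise.
check_nfs_opts() {
    # One option per line, then keep only exact matches.
    found=$(tr ',' '\n' | grep -x -e soft -e intr | sort -u)
    if [ -n "$found" ]; then
        echo "risky options: $(echo $found)"
        return 1
    fi
    echo "neither soft nor intr in use"
}
```

On the client it would be run as "nfsstat -m | check_nfs_opts".
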
I suspect that the recovery thread in the client (called "nfscl") is
somehow wedged and cannot do the recovery from the bad session,
as well.
A "ps axHl" on the client would be useful to see what the
processes/threads are up to on the client when it is hung.
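
To narrow that listing down, a small filter can keep the header line
plus any line mentioning the recovery thread. This is a sketch only:
nfs_threads is a made-up name, and it assumes the thread shows up with
"nfscl" somewhere on its line of "ps axHl" output.

```shell
# Hypothetical filter: reads "ps axHl" output on stdin and keeps the
# header (first line) plus every line that mentions "nfscl".
nfs_threads() {
    awk 'NR == 1 || /nfscl/'
}
```

On the hung client it would be run as "ps axHl | nfs_threads". The
full, unfiltered listing is still the more useful thing to post, since
other threads may be blocked as well.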

If increasing the number of nfsd threads in the server doesn't resolve
the problem, I'd guess it is some network weirdness caused by how
the bhyve instance is networked to its host. (I always use bridging
for bhyve instances and do NFS mounts, but I don't work those
mounts hard.)

Btw, "umount -N <mnt_path>" on the client will normally get rid
of a hung mount, although it can take a couple of minutes to complete.

rick


Access to the share suddenly stalled. Server reports this in messages,
every second:
   nfsrv_cache_session: no session IPaddr=192.168...

Restarting nfsd and mountd didn't help, only now the client started to
also report in messages, every second:
   nfs server 192.168...:/var/sysup/mnt/tmp.6.56160: is alive again

Mounting the same share anew to a different place works fine.

The network babble is this, every second:
   NFS request xid 1678997001 212 getattr fh 0,6/2
   NFS reply xid 1678997001 reply ok 52 getattr ERROR: unk 10052

Forensics: I tried to build openoffice on that share, a couple of
   times. So there was a bit of traffic, and some things may have
   overflown.

There seems to be no way to recover, only crashing the client.