Date: Sat, 16 Jul 2022 13:43:11 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Peter <pmc@citylink.dinoex.sub.org>, "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject: Re: nfs stalls client: nfsrv_cache_session: no session
Message-ID: <YQBPR0101MB97420DEC6E7CCAB309BDC1C7DD8A9@YQBPR0101MB9742.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YtKpqjBITR/ocjiF@gate.intra.daemon.contact>
References: <YtKpqjBITR/ocjiF@gate.intra.daemon.contact>

Peter <pmc@citylink.dinoex.sub.org> wrote:
> Hija,
> I have a problem with NFSv4:
>
> The configuration:
> Server Rel. 13.1-RC2
> nfs_server_enable="YES"
> nfs_server_flags="-u -t --minthreads 2 --maxthreads 20 -h ..."
Allowing it to go down to 2 threads is very low. I've never even
tried to run a server with less than 4 threads. Since kernel threads
don't generate much overhead, I'd suggest replacing the
minthreads/maxthreads with "-n 32" for a very small server.
(I didn't write the code that allows number of threads to vary and
never use that either.)

> mountd_enable="YES"
> mountd_flags="-S -p 803 -h ..."
> rpc_lockd_enable="YES"
> rpc_lockd_flags="-h ..."
> rpc_statd_enable="YES"
> rpc_statd_flags="-h ..."
> rpcbind_enable="YES"
> rpcbind_flags="-h ..."
> nfsv4_server_enable="YES"
> sysctl vfs.nfs.enable_uidtostring=1
> sysctl vfs.nfsd.enable_stringtouid=1
>
> Client bhyve Rel. 13.1-RELEASE on the same system
> nfs_client_enable="YES"
> nfs_access_cache="600"
> nfs_bufpackets="32"
> nfscbd_enable="YES"
>
> Mount-options: nfsv4,readahead=1,rw,async
I would expect the behaviour you are seeing for "intr" and/or "soft"
mounts, but since you are not using those, I don't know how you
broke the session?
(10052 is NFSERR_BADSESSION)
You might want to do "nfsstat -m" on the client to see what options
were actually negotiated for the mount and then check that neither
"soft" nor "intr" are there.

I suspect that the recovery thread in the client (called "nfscl") is
somehow wedged and cannot do the recovery from the bad session,
as well.
A "ps axHl" on the client would be useful to see what the
processes/threads are up to on the client when it is hung.

If increasing the number of nfsd threads in the server doesn't resolve
the problem, I'd guess it is some network weirdness caused by how
the bhyve instance is networked to its host. (I always use bridging
for bhyve instances and do NFS mounts, but I don't work those
mounts hard.)

Btw, "umount -N <mnt_path>" on the client will normally get rid
of a hung mount, although it can take a couple of minutes to complete.

rick


Access to the share suddenly stalled. Server reports this in messages,
every second:
nfsrv_cache_session: no session IPaddr=192.168...

Restarting nfsd and mountd didn't help, only now the client started to
also report in messages, every second:
nfs server 192.168...:/var/sysup/mnt/tmp.6.56160: is alive again

Mounting the same share anew to a different place works fine.

The network babble is this, every second:
NFS request xid 1678997001 212 getattr fh 0,6/2
NFS reply xid 1678997001 reply ok 52 getattr ERROR: unk 10052

Forensics: I tried to build openoffice on that share, a couple of
times. So there was a bit of traffic, and some things may have
overflown.

There seems to be no way to recover, only crashing the client.
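
The "nfsstat -m" check suggested above could be scripted roughly as follows. This is only a sketch: the helper name has_risky_opts is made up for illustration, and it assumes the negotiated options show up as comma-separated tokens in the "nfsstat -m" output, so it simply looks for a token that is exactly "soft" or "intr".

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): reads mount option text on
# stdin and succeeds if a token is exactly "soft" or "intr".
# Assumption: options are separated by commas and/or whitespace.
has_risky_opts() {
    # Put every token on its own line, then match whole lines only,
    # so that e.g. "nointr" or "readahead=1" do not count as hits.
    tr ', \t' '\n\n\n' | grep -E -q -x 'soft|intr'
}
```

On the client it might be used as: nfsstat -m | has_risky_opts && echo "soft or intr is set". Since the mount line itself is also tokenised, a path component that is literally "soft" or "intr" would give a false hit.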