Date: Thu, 18 Mar 2021 13:58:30 +0100
From: tuexen@freebsd.org
To: "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>
Cc: Rick Macklem <rmacklem@uoguelph.ca>, Alan Somers <asomers@freebsd.org>,
    "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject: Re: NFS Mount Hangs
Message-ID: <B11F5CD9-A026-4549-89A3-57D1E180C628@freebsd.org>
In-Reply-To: <202103181253.12ICrF35016815@gndrsh.dnsmgr.net>
References: <202103181253.12ICrF35016815@gndrsh.dnsmgr.net>
> On 18. Mar 2021, at 13:53, Rodney W. Grimes <freebsd-rwg@gndrsh.dnsmgr.net> wrote:
> 
> Note I am NOT a TCP expert, but know enough about it to add a comment...
> 
>> Alan Somers wrote:
>> [stuff snipped]
>>> Is the 128K limit related to MAXPHYS? If so, it should be greater in 13.0.
>> For the client, yes. For the server, no.
>> For the server, it is just a compile-time constant, NFS_SRVMAXIO.
>> 
>> It's mainly related to the fact that I haven't gotten around to testing
>> larger sizes yet.
>> - kern.ipc.maxsockbuf needs to be several times the limit, which means it
>>   would have to increase for 1Mbyte.
>> - The session code must negotiate a maximum RPC size > 1Mbyte.
>>   (I think the server code does do this, but it needs to be tested.)
>> And, yes, the client is limited to MAXPHYS.
>> 
>> Doing this is on my todo list, rick
>> 
>> The client should acquire the attributes that indicate the server's limit
>> and set rsize/wsize to that. "# nfsstat -m" on the client should show you
>> what the client is actually using. If it is larger than 128K, set both
>> rsize and wsize to 128K.
>> 
>>> Output from the NFS Client when the issue occurs
>>> # netstat -an | grep NFS.Server.IP.X
>>> tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 FIN_WAIT2
>> I'm no TCP guy. Hopefully others might know why the client would be
>> stuck in FIN_WAIT2 (I vaguely recall this means it is waiting for a
>> FIN/ACK, but could be wrong?)
> 
> The most common way to get stuck in FIN_WAIT2 is to call
> shutdown(2) on a socket but never follow up with a
> close(2) after some timeout period. The "client" is still
> connected to the socket and can stay in this shutdown state
> forever; the kernel will not reap the socket as it is
> associated with a process, i.e. not orphaned. I suspect
> that the Linux client has a corner condition that is leading
> to this socket leak.
> 
> If on the Linux client you can look at the sockets to see
> if they are still associated with a process, a la fstat or
> whatever Linux tool does this, that would be helpful.
> If they are in fact connected to a process, it is that
> process that must call close(2) to clean them up.
> 
> IIRC the server-side socket would be gone at this point
> and there is nothing the server can do that would allow
> a FIN_WAIT2 to close down.
Jason reported that the server is in CLOSE-WAIT. This would mean that
the server received the FIN and ACKed it, but has not initiated the
teardown of the server->client direction. So the server-side socket is
still there and close has not been called yet.
> 
> The real TCP experts can now correct my 30-year-old TCP
> stack understanding...
I wouldn't count myself as a real TCP expert, but the behaviour hasn't
changed in the last 30 years, I think...

Best regards
Michael
> 
>> 
>>> # cat /sys/kernel/debug/sunrpc/rpc_xprt/*/info
>>> netid: tcp
>>> addr: NFS.Server.IP.X
>>> port: 2049
>>> state: 0x51
>>> 
>>> syslog
>>> Mar 4 10:29:27 hostname kernel: [437414.131978] -pid- flgs status -client- --rqstp- ->timeout ---ops--
>>> Mar 4 10:29:27 hostname kernel: [437414.133158] 57419 40a1 0 9b723c73 143cfadf 30000 4ca953b5 nfsv4 OPEN_NOATTR a:call_connect_status [sunrpc] q:xprt_pending
>> I don't know what OPEN_NOATTR means, but I assume it is some variant
>> of the NFSv4 Open operation.
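
Coming back to the shutdown(2)/close(2) point above: the state pair seen
here, FIN_WAIT_2 on the end that half-closed and CLOSE_WAIT on the peer
that has not called close(2) yet, is easy to reproduce in isolation.
Here is a minimal sketch; it is purely a hypothetical demo on an
arbitrary loopback port, not the actual NFS client or server code, and
error checking is omitted for brevity:

/* fin_wait2_demo.c - hypothetical demo: shutdown(2) without close(2).
 * Build: cc -o fin_wait2_demo fin_wait2_demo.c
 * While it sleeps, "netstat -an | grep 19999" should show the
 * connecting socket in FIN_WAIT_2 (FIN_WAIT2 on Linux) and the
 * accepted socket in CLOSE_WAIT. Error checks omitted for brevity. */
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    struct sockaddr_in sin;
    int lfd, cfd, afd;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(19999);          /* arbitrary demo port */
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* "Server" side: listen on 127.0.0.1:19999. */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    bind(lfd, (struct sockaddr *)&sin, sizeof(sin));
    listen(lfd, 1);

    /* "Client" side: connect to it, then pick up the connection. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    connect(cfd, (struct sockaddr *)&sin, sizeof(sin));
    afd = accept(lfd, NULL, NULL);

    /* Half-close from the client: a FIN goes out, but the descriptor
     * stays open, so the kernel cannot reap the connection. */
    shutdown(cfd, SHUT_WR);

    /* Neither side calls close(2): cfd sits in FIN_WAIT_2 and
     * afd sits in CLOSE_WAIT until the sleep ends. */
    printf("sleeping 60s; inspect the states with netstat -an\n");
    sleep(60);

    close(afd);
    close(cfd);
    close(lfd);
    return 0;
}

Both sockets stay in those states for as long as the owning process
keeps the descriptors open, which is exactly why checking on the Linux
client whether the FIN_WAIT2 sockets are still attached to a process,
as Rod suggests, is the next useful step.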
>> [stuff snipped]
>>> Mar 4 10:29:30 hostname kernel: [437417.110517] RPC: 57419 xprt_connect_status: connect attempt timed out
>>> Mar 4 10:29:30 hostname kernel: [437417.112172] RPC: 57419 call_connect_status (status -110)
>> I have no idea what status -110 means?
>>> Mar 4 10:29:30 hostname kernel: [437417.113337] RPC: 57419 call_timeout (major)
>>> Mar 4 10:29:30 hostname kernel: [437417.114385] RPC: 57419 call_bind (status 0)
>>> Mar 4 10:29:30 hostname kernel: [437417.115402] RPC: 57419 call_connect xprt 00000000e061831b is not connected
>>> Mar 4 10:29:30 hostname kernel: [437417.116547] RPC: 57419 xprt_connect xprt 00000000e061831b is not connected
>>> Mar 4 10:30:31 hostname kernel: [437478.551090] RPC: 57419 xprt_connect_status: connect attempt timed out
>>> Mar 4 10:30:31 hostname kernel: [437478.552396] RPC: 57419 call_connect_status (status -110)
>>> Mar 4 10:30:31 hostname kernel: [437478.553417] RPC: 57419 call_timeout (minor)
>>> Mar 4 10:30:31 hostname kernel: [437478.554327] RPC: 57419 call_bind (status 0)
>>> Mar 4 10:30:31 hostname kernel: [437478.555220] RPC: 57419 call_connect xprt 00000000e061831b is not connected
>>> Mar 4 10:30:31 hostname kernel: [437478.556254] RPC: 57419 xprt_connect xprt 00000000e061831b is not connected
>> Is it possible that the client is trying to (re)connect using the same
>> client port#? I would normally expect the client to create a new TCP
>> connection using a different client port# and then retry the outstanding
>> RPCs.
>> --> Capturing packets when this happens would show us what is going on.
>> 
>> If there is a problem on the FreeBSD end, it is most likely a broken
>> network device driver.
>> --> Try disabling TSO, LRO.
>> --> Try a different driver for the net hardware on the server.
>> --> Try a different net chip on the server.
>> If you can capture packets when (not after) the hang
>> occurs, then you can look at them in wireshark and see
>> what is actually happening. (Ideally on both client and
>> server, to check that your network hasn't dropped anything.)
>> --> I know, if the hangs aren't easily reproducible, this isn't
>> easily done.
>> --> Try a newer Linux kernel and see if the problem persists.
>> The Linux folk will get more interested if you can reproduce
>> the problem on 5.12. (Recent bakeathon testing of the 5.12
>> kernel against the FreeBSD server did not find any issues.)
>> 
>> Hopefully the network folk have some insight w.r.t. why
>> the TCP connection is sitting in FIN_WAIT2.
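
One small note on the "status -110" in the log above: if I read the
Linux sunrpc code right, it passes failures around as negated errno
values, and errno 110 on Linux is ETIMEDOUT, which matches the
"connect attempt timed out" lines right next to it. A trivial way to
double-check, as a hypothetical standalone helper compiled on a Linux
box:

/* errno110.c - hypothetical helper: decode the "status -110" above.
 * Build (on Linux): cc -o errno110 errno110.c */
#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    int status = -110;   /* value taken from the RPC debug log */

    /* On Linux this prints "-110 is ETIMEDOUT: yes" and
     * "Connection timed out"; errno numbers differ on FreeBSD. */
    printf("%d is ETIMEDOUT: %s\n", status,
           (-status == ETIMEDOUT) ? "yes" : "no");
    printf("%s\n", strerror(-status));
    return 0;
}

So the retries are failing because the TCP connect itself times out,
not because of anything at the NFS/RPC level above it.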
>> 
>> rick
>> 
>> Jason Breitman
>> 
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> 
> -- 
> Rod Grimes                                                 rgrimes@freebsd.org
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"