From owner-freebsd-net@freebsd.org Wed Mar 17 22:48:52 2021
From: Peter Eriksson
Message-Id: <3CF50285-AD1F-4D0C-B298-0B6263B4AB45@lysator.liu.se>
Subject: Re: NFS Mount Hangs
Date: Wed, 17 Mar 2021 23:48:40 +0100
In-Reply-To: <789BCFA9-D6BC-4C5A-AEA2-E6F7C6E26CB5@tildenparkcapital.com>
Cc: "freebsd-net@freebsd.org"
To: Jason Breitman
References: <3750001D-3F1C-4D9A-A9D9-98BCA6CA65A4@tildenparkcapital.com> <33693DE3-7FF8-4FAB-9A75-75576B88A566@tildenparkcapital.com> <789BCFA9-D6BC-4C5A-AEA2-E6F7C6E26CB5@tildenparkcapital.com>
List-Id: Networking and TCP/IP with FreeBSD

CLOSE_WAIT on the server side usually indicates that the kernel has sent the ACK for the client’s FIN packet (the start of a shutdown) but hasn’t sent its own FIN packet yet. That usually happens once the server has read all the data queued up from the client and taken whatever actions it needs to shut down its service…

Here’s a fine piece of ASCII art. Probably needs to be viewed using a monospaced font :-)

Client
> ESTABLISHED --> FIN-WAIT-1 +-----> FIN-WAIT-2 +-----> TIME-WAIT ---> CLOSED
>             :              ^                  ^           :
>         FIN :              : ACK          FIN :       ACK :
>             v              :                  :           v
> ESTABLISHED +--> CLOSE-WAIT --....---> LAST-ACK           +--------> CLOSED
Server

TSO/LRO and/or “intelligence” in some smart network cards can cause all kinds of interesting bugs. What ethernet cards are you using?
(TSO/LRO seems to be working better these days for our Intel X710 cards, but a couple of years ago they would freeze up on us so we had to disable it)

Hmm.. Perhaps the NFS server is waiting for some locks to be released before it can close down its end of the TCP link?
Reservations?

But I’d suspect something else since we’ve been running NFSv4.1/Kerberos on our FreeBSD 11.3/12.2 servers for a long time with many Linux clients and most issues (the last couple of years) we’ve seen have been on the Linux end of things… Like the bugs in the Linux gss daemons or their single-threaded mount() sys call, or automounter freezing up... and other fun bugs.

- Peter

> On 17 Mar 2021, at 23:17, Jason Breitman wrote:
>
> Thank you for the responses.
> The NFS Client does properly negotiate down to 128K for the rsize and wsize.
>
> The client port should be changing as we are using the noresvport option.
>
> On the NFS Client
> cat /proc/mounts
> nfs-server.domain.com:/data /mnt/data nfs4 rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,noresvport,proto=tcp,timeo=600,retrans=2,sec=krb5,clientaddr=NFS.Client.IP.X,lookupcache=pos,local_lock=none,addr=NFS.Server.IP.X 0 0
>
> When the issue occurs, this is what I see on the NFS Server.
> tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.51550 CLOSE_WAIT
>
> Capturing packets right before the issue is a great idea, but I am concerned about running tcpdump for such an extended period of time on an active server.
> I have gone 9 days with no issue which would be a lot of data and overhead.
>
> I will look into disabling the TSO and LRO options and let the group know how it goes.
> Below are the current options on the NFS Server.
> lagg0: flags=8943 metric 0 mtu 1500
> options=e507bb
>
> Please share other ideas if you have them.
>
> Jason Breitman
>
>
> On Mar 17, 2021, at 5:58 PM, Rick Macklem wrote:
>
> Alan Somers wrote:
> [stuff snipped]
>> Is the 128K limit related to MAXPHYS? If so, it should be greater in 13.0.
> For the client, yes. For the server, no.
> For the server, it is just a compile time constant NFS_SRVMAXIO.
>
> It's mainly related to the fact that I haven't gotten around to testing larger
> sizes yet.
> - kern.ipc.maxsockbuf needs to be several times the limit, which means it would
> have to increase for 1Mbyte.
> - The session code must negotiate a maximum RPC size > 1 Mbyte.
> (I think the server code does do this, but it needs to be tested.)
> And, yes, the client is limited to MAXPHYS.
>
> Doing this is on my todo list, rick
>
> The client should acquire the attributes that indicate that and set rsize/wsize
> to that. "# nfsstat -m" on the client should show you what the client
> is actually using. If it is larger than 128K, set both rsize and wsize to 128K.
>
>> Output from the NFS Client when the issue occurs
>> # netstat -an | grep NFS.Server.IP.X
>> tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 FIN_WAIT2
> I'm no TCP guy. Hopefully others might know why the client would be
> stuck in FIN_WAIT2 (I vaguely recall this means it is waiting for a fin/ack,
> but could be wrong?)
>
>> # cat /sys/kernel/debug/sunrpc/rpc_xprt/*/info
>> netid: tcp
>> addr: NFS.Server.IP.X
>> port: 2049
>> state: 0x51
>>
>> syslog
>> Mar 4 10:29:27 hostname kernel: [437414.131978] -pid- flgs status -client- --rqstp- -timeout ---ops--
>> Mar 4 10:29:27 hostname kernel: [437414.133158] 57419 40a1 0 9b723c73 143cfadf 30000 4ca953b5 nfsv4 OPEN_NOATTR a:call_connect_status [sunrpc] q:xprt_pending
> I don't know what OPEN_NOATTR means, but I assume it is some variant
> of NFSv4 Open operation.
> [stuff snipped]
>> Mar 4 10:29:30 hostname kernel: [437417.110517] RPC: 57419 xprt_connect_status: connect attempt timed out
>> Mar 4 10:29:30 hostname kernel: [437417.112172] RPC: 57419 call_connect_status (status -110)
> I have no idea what status -110 means?
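
On the -110 question: Linux kernel code reports errors as negative errno values, and errno 110 on Linux is ETIMEDOUT ("Connection timed out"), which fits the "connect attempt timed out" line right before it. Here is a minimal Python sketch for decoding such status values. The table is hard-coded with the Linux numbers on purpose, since errno numbering differs between systems (on FreeBSD ETIMEDOUT is 60), and decode_rpc_status is just an illustrative name, not anything from sunrpc.

```python
# Linux errno numbers (asm-generic values). Hard-coded on purpose: the
# numbering differs on other systems such as FreeBSD, so looking these
# up on the machine running the script could give the wrong answer.
LINUX_ERRNO = {
    104: ("ECONNRESET", "Connection reset by peer"),
    110: ("ETIMEDOUT", "Connection timed out"),
    111: ("ECONNREFUSED", "Connection refused"),
    113: ("EHOSTUNREACH", "No route to host"),
}


def decode_rpc_status(status):
    """Translate a status value (0 or a negative errno) into text."""
    if status == 0:
        return "OK"
    name, text = LINUX_ERRNO.get(-status, ("unknown", "not in this table"))
    return f"{name} ({text})"


print(decode_rpc_status(-110))  # ETIMEDOUT (Connection timed out)
```

So the client's reconnect attempts are timing out rather than being refused or reset.
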
>> Mar 4 10:29:30 hostname kernel: [437417.113337] RPC: 57419 call_timeout (major)
>> Mar 4 10:29:30 hostname kernel: [437417.114385] RPC: 57419 call_bind (status 0)
>> Mar 4 10:29:30 hostname kernel: [437417.115402] RPC: 57419 call_connect xprt 00000000e061831b is not connected
>> Mar 4 10:29:30 hostname kernel: [437417.116547] RPC: 57419 xprt_connect xprt 00000000e061831b is not connected
>> Mar 4 10:30:31 hostname kernel: [437478.551090] RPC: 57419 xprt_connect_status: connect attempt timed out
>> Mar 4 10:30:31 hostname kernel: [437478.552396] RPC: 57419 call_connect_status (status -110)
>> Mar 4 10:30:31 hostname kernel: [437478.553417] RPC: 57419 call_timeout (minor)
>> Mar 4 10:30:31 hostname kernel: [437478.554327] RPC: 57419 call_bind (status 0)
>> Mar 4 10:30:31 hostname kernel: [437478.555220] RPC: 57419 call_connect xprt 00000000e061831b is not connected
>> Mar 4 10:30:31 hostname kernel: [437478.556254] RPC: 57419 xprt_connect xprt 00000000e061831b is not connected
> Is it possible that the client is trying to (re)connect using the same client port#?
> I would normally expect the client to create a new TCP connection using a
> different client port# and then retry the outstanding RPCs.
> --> Capturing packets when this happens would show us what is going on.
>
> If there is a problem on the FreeBSD end, it is most likely a broken
> network device driver.
> --> Try disabling TSO , LRO.
> --> Try a different driver for the net hardware on the server.
> --> Try a different net chip on the server.
> If you can capture packets when (not after) the hang
> occurs, then you can look at them in wireshark and see
> what is actually happening. (Ideally on both client and
> server, to check that your network hasn't dropped anything.)
> --> I know, if the hangs aren't easily reproducible, this isn't
> easily done.
> --> Try a newer Linux kernel and see if the problem persists.
> The Linux folk will get more interested if you can reproduce
> the problem on 5.12. (Recent bakeathon testing of the 5.12
> kernel against the FreeBSD server did not find any issues.)
>
> Hopefully the network folk have some insight w.r.t. why
> the TCP connection is sitting in FIN_WAIT2.
>
> rick
>
> Jason Breitman
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
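
To put the FIN_WAIT2 / CLOSE_WAIT pairing from this thread in concrete terms, here is a small Python sketch of the normal TCP close handshake as two lookup tables. It is a simplified model only (no simultaneous close, no resets, no timers beyond TIME_WAIT), and the event names (app_close, recv_fin, recv_ack, timeout) are made up for illustration.

```python
# Simplified TCP close handshake: (state, event) -> next state.
# The side that closes first (the NFS client in this thread):
ACTIVE_CLOSE = {
    ("ESTABLISHED", "app_close"): "FIN_WAIT1",  # sends FIN
    ("FIN_WAIT1", "recv_ack"): "FIN_WAIT2",     # peer ACKed our FIN
    ("FIN_WAIT2", "recv_fin"): "TIME_WAIT",     # peer's FIN arrives, we ACK
    ("TIME_WAIT", "timeout"): "CLOSED",
}
# The side that receives the first FIN (the NFS server in this thread):
PASSIVE_CLOSE = {
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",  # kernel ACKs the FIN
    ("CLOSE_WAIT", "app_close"): "LAST_ACK",    # application closes, FIN sent
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def run(table, events, state="ESTABLISHED"):
    """Feed a list of events through a transition table."""
    for event in events:
        state = table[(state, event)]
    return state


# Client has closed and been ACKed; server has not closed its socket yet.
client = run(ACTIVE_CLOSE, ["app_close", "recv_ack"])
server = run(PASSIVE_CLOSE, ["recv_fin"])
print(client, server)  # FIN_WAIT2 CLOSE_WAIT
```

In this model the only way out of CLOSE_WAIT is the application (here the NFS server side) closing the socket, which is why a client sitting in FIN_WAIT2 and a server sitting in CLOSE_WAIT are two views of the same half-closed connection.
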