From: Jason Breitman
Date: Mon, 22 Mar 2021 09:24:26 -0400
Subject: Re: NFS Mount Hangs
To: Youssef GHORBAL
Cc: "freebsd-net@freebsd.org"
List-Id: Networking and TCP/IP with FreeBSD

Agreed. I had made the changes on the FreeBSD Server side and was suggesting that a new TCP connection needed to be established between the client and server for the settings to take effect.
I rebooted all of my Debian clients on Sunday to achieve that goal, establishing a new NFSv4 TCP connection with the file server, and will let the group know if I see another hang.

Jason Breitman

On Mar 22, 2021, at 7:27 AM, Youssef GHORBAL wrote:

> On 21 Mar 2021, at 14:41, Jason Breitman wrote:
>
> Thanks for sharing as this sounds exactly like my issue.
>
> I had implemented the change below on 3/8/2021 and have experienced the NFS hang after that.
> Do I need to reboot or umount / mount all of the clients and then I will be ok?
>
> I had not rebooted the clients, but would do so to get out of this situation.
> It is logical that a new TCP session over 2049 needs to be reestablished for the changes to take effect.
>
> net.inet.tcp.fast_finwait2_recycle=1
> net.inet.tcp.finwait2_timeout=1000

In my case, those were implemented on the server (FreeBSD side) since the BSD box was the one closing the connection and the FIN_WAIT_2 state was on its side.
In your case the FIN_WAIT_2 is on the client side. I don’t know if these sysctls are even available on Linux.

> I can also confirm that the iptables solution that you use on the client to get out of the hung mount without a reboot works for me.
> #!/bin/sh
>
> progName="nfsClientFix"
> delay=15
> nfs_ip=NFS.Server.IP.X
>
> nfs_fin_wait2_state() {
>     /usr/bin/netstat -an | /usr/bin/grep ${nfs_ip}:2049 | /usr/bin/grep FIN_WAIT2 > /dev/null 2>&1
>     return $?
> }
>
>
> nfs_fin_wait2_state
> result=$?
> if [ ${result} -eq 0 ] ; then
>     /usr/bin/logger -s -i -p local7.error -t ${progName} "NFS Connection is in FIN_WAIT2!"
>     /usr/bin/logger -s -i -p local7.error -t ${progName} "Enabling firewall to block ${nfs_ip}!"
>     /usr/sbin/iptables -A INPUT -s ${nfs_ip} -j DROP
>
>     while true
>     do
>         /usr/bin/sleep ${delay}
>         nfs_fin_wait2_state
>         result=$?
>         if [ ${result} -ne 0 ] ; then
>             /usr/bin/logger -s -i -p local7.notice -t ${progName} "NFS Connection is OK."
>             /usr/bin/logger -s -i -p local7.error -t ${progName} "Disabling firewall to allow access to ${nfs_ip}!"
>             /usr/sbin/iptables -D INPUT -s ${nfs_ip} -j DROP
>             break
>         fi
>     done
> fi
>
>
> Jason Breitman
>
>
> On Mar 19, 2021, at 8:40 PM, Youssef GHORBAL wrote:
>
> Hi Jason,
>
>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>
>> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue.
>> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
>>
>> Issue
>> NFSv4 mounts periodically hang on the NFS Client.
>>
>> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
>> Also, other NFS Clients are successfully mounting from the NFS Server in question.
>> Rebooting the NFS Client appears to be the only solution.
>
> I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based but they seem to have their own nfsd).
> We’ve had better luck and we did manage to have packet captures on both sides during the issue. The gist of it goes as follows:
>
> - Data flows correctly between SERVER and the CLIENT.
> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0.
> - The client (eager to send data) can only ACK data sent by SERVER.
> - When SERVER is done sending data, the client starts sending TCP Window Probes, hoping that the TCP Window opens again so it can flush its buffers.
> - SERVER responds with a TCP Zero Window to those probes.
> - After 6 minutes (the NFS server default idle timeout) SERVER gracefully closes the TCP connection by sending a FIN packet (still with a TCP Window of 0).
> - CLIENT ACKs that FIN.
> - SERVER goes into FIN_WAIT_2 state.
> - CLIENT closes its half of the socket and goes into LAST_ACK state.
> - The FIN is never sent by the client since there is still data in its SendQ and the receiver's TCP Window is still 0. At this stage the client starts sending TCP Window Probes again and again, hoping that the server opens its TCP Window so it can flush its buffers and terminate its side of the socket.
> - SERVER keeps responding with a TCP Zero Window to those probes.
> => The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.
>
> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanted to send data, we’d end up in the same state as you, I think.
>
> We never found the root cause of why the SERVER decided to close the TCP Window and accept no more data. The fix on the Isilon side was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP Window probe, SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding).
>
> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side) we added a check script on the client that detects LAST_ACK sockets and, through an iptables rule, enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears).
>
> The bottom line would be to get a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.
>
> Youssef
>
>
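
The LAST_ACK check script described in the thread was not posted, so the following is only a minimal sketch of how such a check might look, not the script actually used. The function names (last_ack_ports, reset_rule), the IPTABLES_CMD variable, the sample addresses, and the assumption that the input looks like Linux `netstat -tan` output (state in the sixth column) are all hypothetical. By default it is a dry run that prints the iptables command instead of executing it.

```shell
#!/bin/sh
# Sketch only: helper names and sample addresses are illustrative assumptions.

# Print the local port of every socket to <server>:2049 that is in LAST_ACK.
# $1 = NFS server address; stdin = `netstat -tan` style lines
# (columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State).
last_ack_ports() {
    awk -v peer="$1:2049" '$6 == "LAST_ACK" && $5 == peer {
        n = split($4, a, ":")   # the port is the last colon-separated field
        print a[n]
    }'
}

# Add (-A) or delete (-D) the tcp-reset rule quoted in the thread.
# $1 = -A or -D, $2 = NFS server address, $3 = local port.
# Dry run unless IPTABLES_CMD is set, e.g. IPTABLES_CMD=/usr/sbin/iptables.
reset_rule() {
    ${IPTABLES_CMD:-echo iptables} "$1" OUTPUT -p tcp -d "$2" --sport "$3" \
        -j REJECT --reject-with tcp-reset
}

# Demonstration on a canned netstat line (prints the rule it would add).
sample='tcp        0      1 192.168.1.10:51234      10.0.0.5:2049           LAST_ACK'
for port in $(printf '%s\n' "$sample" | last_ack_ports 10.0.0.5); do
    reset_rule -A 10.0.0.5 "$port"
done
```

In real use the input would come from `netstat -tan | last_ack_ports "$nfs_server_addr"`, and the same port would later be passed to `reset_rule -D` once the LAST_ACK entry is gone. The peer is compared as an exact string rather than a grep pattern, so that 10.0.0.5 does not also match 10.0.0.50.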
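
For the packet capture suggested at the end of the thread, one possible approach is a rotating tcpdump capture limited to the NFS port, left running until the next hang. This is a sketch under assumptions: the helper name capture_cmd, the interface name, the output path, the address, and the size limits are placeholders, and the function only prints the command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: interface, address, path and limits are placeholder assumptions.

# Print a tcpdump command line for a rotating capture of NFS traffic.
# $1 = interface, $2 = NFS peer address, $3 = output file name.
#   -n      no name resolution
#   -s 128  keep only the first 128 bytes of each packet
#   -C 100  start a new file after roughly 100 million bytes
#   -W 10   keep at most 10 files, overwriting the oldest
capture_cmd() {
    echo tcpdump -n -i "$1" -s 128 -C 100 -W 10 -w "$3" host "$2" and port 2049
}

# Example: show the command for interface eth0 and server 10.0.0.5.
capture_cmd eth0 10.0.0.5 /var/tmp/nfs-hang.pcap
```

The short snap length is a deliberate trade-off: it truncates NFS payloads but keeps the TCP headers, and the window sizes and FIN/RST flags in those headers are what the analysis in this thread relies on.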