From owner-freebsd-net@freebsd.org Sun Mar 21 13:41:20 2021
Subject: Re: NFS Mount Hangs
From: Jason Breitman <jbreitman@tildenparkcapital.com>
To: Youssef GHORBAL
Cc: "freebsd-net@freebsd.org"
Date: Sun, 21 Mar 2021 09:41:15 -0400
Thanks for sharing, as this sounds exactly like my issue.
I implemented the change below on 3/8/2021 and have experienced the NFS hang after that.
Do I need to reboot or umount/mount all of the clients, and then I will be OK?
I had not rebooted the clients, but would do so to get out of this situation.
It is logical that a new TCP session over port 2049 needs to be re-established for the changes to take effect.

net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.finwait2_timeout=1000
(A sketch of one way to apply and persist these on the server appears at the end of this message.)

I can also confirm that the iptables solution that you use on the client to get out of the hung mount without a reboot works for me.

#!/bin/sh
progName="nfsClientFix"
delay=15
nfs_ip=NFS.Server.IP.X

# Returns 0 if a connection to the NFS server's port 2049 is in FIN_WAIT2.
nfs_fin_wait2_state() {
    /usr/bin/netstat -an | /usr/bin/grep ${nfs_ip}:2049 | /usr/bin/grep FIN_WAIT2 > /dev/null 2>&1
    return $?
}

nfs_fin_wait2_state
result=$?
if [ ${result} -eq 0 ] ; then
    /usr/bin/logger -s -i -p local7.error -t ${progName} "NFS Connection is in FIN_WAIT2!"
    /usr/bin/logger -s -i -p local7.error -t ${progName} "Enabling firewall to block ${nfs_ip}!"
    /usr/sbin/iptables -A INPUT -s ${nfs_ip} -j DROP

    # Poll until the FIN_WAIT2 connection is gone, then remove the block.
    while true
    do
        /usr/bin/sleep ${delay}
        nfs_fin_wait2_state
        result=$?
        if [ ${result} -ne 0 ] ; then
            /usr/bin/logger -s -i -p local7.notice -t ${progName} "NFS Connection is OK."
            /usr/bin/logger -s -i -p local7.error -t ${progName} "Disabling firewall to allow access to ${nfs_ip}!"
            /usr/sbin/iptables -D INPUT -s ${nfs_ip} -j DROP
            break
        fi
    done
fi

Jason Breitman


On Mar 19, 2021, at 8:40 PM, Youssef GHORBAL wrote:

Hi Jason,

> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>
> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue.
> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
>
> Issue
> NFSv4 mounts periodically hang on the NFS Client.
>
> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
> Also, other NFS Clients are successfully mounting from the NFS Server in question.
> Rebooting the NFS Client appears to be the only solution.

I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD-based, but they seem to have their own nfsd).
We had better luck and did manage to get packet captures on both sides during the issue. The gist of it goes as follows:

- Data flows correctly between SERVER and CLIENT.
- At some point SERVER starts decreasing its TCP receive window until it reaches 0.
- The client (eager to send data) can only ACK data sent by SERVER.
- Once SERVER is done sending data, the client starts sending TCP window probes, hoping that the TCP window opens again so it can flush its buffers.
- SERVER responds with a TCP zero window to those probes.
- After 6 minutes (the NFS server default idle timeout) SERVER gracefully closes the TCP connection, sending a FIN packet (with the TCP window still at 0).
- CLIENT ACKs that FIN.
- SERVER goes into the FIN_WAIT_2 state.
- CLIENT closes its half of the socket and goes into the LAST_ACK state.
- A FIN is never sent by the client, since there is still data in its SendQ and the receiver's TCP window is still 0.
  At this stage the client starts sending TCP window probes again and again, hoping that the server opens its TCP window so it can flush its buffers and terminate its side of the socket.
- SERVER keeps responding with a TCP zero window to those probes.

=> The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.

If we had a situation where CLIENT was responsible for closing the TCP window (and initiating the TCP FIN first) and the server wanted to send data, we would end up in the same state as you, I think.

We never found the root cause of why SERVER decided to close the TCP window and no longer accept data. The fix on the Isilon side was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP window probe SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding).

To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side) we added a check script on the client that detects LAST_ACK sockets and, through an iptables rule, enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears). A minimal sketch along these lines is included at the end of this message.

The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.

Youssef
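
For reference, a minimal sketch of one way to apply and persist the two sysctls discussed in this thread on the FreeBSD NFS server. The values shown are the ones Jason used (Youssef mentions finwait2_timeout=5000); pick whichever fits your environment:

# Apply at runtime on the FreeBSD NFS server.
sysctl net.inet.tcp.fast_finwait2_recycle=1
sysctl net.inet.tcp.finwait2_timeout=1000

# Persist across reboots.
printf 'net.inet.tcp.fast_finwait2_recycle=1\nnet.inet.tcp.finwait2_timeout=1000\n' >> /etc/sysctl.conf

As discussed above, already-stuck sessions may still need the TCP connection to be re-established before the change has any visible effect.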
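
Below is a minimal sketch of the client-side LAST_ACK workaround Youssef describes, in the same spirit as Jason's FIN_WAIT2 script above. It is an illustration under assumptions, not the exact script Youssef's team used: the variable and function names are hypothetical, it reuses the NFS.Server.IP.X placeholder, and for simplicity it matches on the NFS destination port 2049 rather than the specific local source port used in Youssef's example rule.

#!/bin/sh
# Hypothetical sketch: watch for a connection to the NFS server stuck in
# LAST_ACK, add an iptables REJECT rule so the client's outgoing probes are
# answered locally with a TCP RST (tearing down the stuck session, as
# described above), then remove the rule once the socket is gone.
nfs_ip=NFS.Server.IP.X   # placeholder, as in the script above
delay=15

in_last_ack() {
    /usr/bin/netstat -an | /usr/bin/grep "${nfs_ip}:2049" | /usr/bin/grep LAST_ACK > /dev/null 2>&1
}

if in_last_ack ; then
    /usr/sbin/iptables -A OUTPUT -p tcp -d ${nfs_ip} --dport 2049 -j REJECT --reject-with tcp-reset
    while in_last_ack ; do
        /usr/bin/sleep ${delay}
    done
    /usr/sbin/iptables -D OUTPUT -p tcp -d ${nfs_ip} --dport 2049 -j REJECT --reject-with tcp-reset
fi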
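
Finally, on Youssef's bottom-line suggestion, an example of grabbing a packet capture of the NFS traffic during an outage. The interface name and output path are placeholders; run it on the client and/or the server:

# Capture full packets exchanged with the NFS server on port 2049.
tcpdump -i em0 -s 0 -w /var/tmp/nfs-hang.pcap host NFS.Server.IP.X and port 2049

Reading the capture back (tcpdump -r, or Wireshark) should show whether you see the pattern described above: zero-window advertisements from the server and repeated window probes from the client.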