From owner-freebsd-net@freebsd.org Sun Mar 21 13:41:20 2021
Subject: Re: NFS Mount Hangs
From: Jason Breitman <jbreitman@tildenparkcapital.com>
To: Youssef GHORBAL
Cc: "freebsd-net@freebsd.org"
Date: Sun, 21 Mar 2021 09:41:15 -0400
Thanks for sharing, as this sounds exactly like my issue.
I implemented the change below on 3/8/2021 and have experienced the NFS hang after that.
Do I need to reboot or umount/mount all of the clients, and then I will be OK?
I had not rebooted the clients, but would do so to get out of this situation.
It is logical that a new TCP session over port 2049 needs to be re-established for the changes to take effect.

net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.finwait2_timeout=1000
(A sketch of one way to apply and persist these on the server appears at the end of this message.)

I can also confirm that the iptables solution that you use on the client to get out of the hung mount without a reboot works for me.

#!/bin/sh
progName="nfsClientFix"
delay=15
nfs_ip=NFS.Server.IP.X

# Returns 0 if a connection to the NFS server's port 2049 is in FIN_WAIT2.
nfs_fin_wait2_state() {
    /usr/bin/netstat -an | /usr/bin/grep ${nfs_ip}:2049 | /usr/bin/grep FIN_WAIT2 > /dev/null 2>&1
    return $?
}

nfs_fin_wait2_state
result=$?
if [ ${result} -eq 0 ] ; then
    /usr/bin/logger -s -i -p local7.error -t ${progName} "NFS Connection is in FIN_WAIT2!"
    /usr/bin/logger -s -i -p local7.error -t ${progName} "Enabling firewall to block ${nfs_ip}!"
    /usr/sbin/iptables -A INPUT -s ${nfs_ip} -j DROP

    # Poll until the FIN_WAIT2 connection is gone, then remove the block.
    while true
    do
        /usr/bin/sleep ${delay}
        nfs_fin_wait2_state
        result=$?
        if [ ${result} -ne 0 ] ; then
            /usr/bin/logger -s -i -p local7.notice -t ${progName} "NFS Connection is OK."
            /usr/bin/logger -s -i -p local7.error -t ${progName} "Disabling firewall to allow access to ${nfs_ip}!"
            /usr/sbin/iptables -D INPUT -s ${nfs_ip} -j DROP
            break
        fi
    done
fi

Jason Breitman


On Mar 19, 2021, at 8:40 PM, Youssef GHORBAL wrote:

Hi Jason,

> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>
> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue.
> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
>
> Issue
> NFSv4 mounts periodically hang on the NFS Client.
>
> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
> Also, other NFS Clients are successfully mounting from the NFS Server in question.
> Rebooting the NFS Client appears to be the only solution.

I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD-based, but they seem to have their own nfsd).
We had better luck and did manage to get packet captures on both sides during the issue. The gist of it goes as follows:

- Data flows correctly between SERVER and CLIENT.
- At some point SERVER starts decreasing its TCP receive window until it reaches 0.
- The client (eager to send data) can only ACK data sent by SERVER.
- Once SERVER is done sending data, the client starts sending TCP window probes, hoping that the TCP window opens again so it can flush its buffers.
- SERVER responds with a TCP zero window to those probes.
- After 6 minutes (the NFS server default idle timeout) SERVER gracefully closes the TCP connection, sending a FIN packet (with the TCP window still at 0).
- CLIENT ACKs that FIN.
- SERVER goes into the FIN_WAIT_2 state.
- CLIENT closes its half of the socket and goes into the LAST_ACK state.
- A FIN is never sent by the client, since there is still data in its SendQ and the receiver's TCP window is still 0.
  At this stage the client starts sending TCP window probes again and again, hoping that the server opens its TCP window so it can flush its buffers and terminate its side of the socket.
- SERVER keeps responding with a TCP zero window to those probes.

=> The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.

If we had a situation where CLIENT was responsible for closing the TCP window (and initiating the TCP FIN first) and the server wanted to send data, we would end up in the same state as you, I think.

We never found the root cause of why SERVER decided to close the TCP window and no longer accept data. The fix on the Isilon side was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP window probe SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding).

To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side) we added a check script on the client that detects LAST_ACK sockets and, through an iptables rule, enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears). A minimal sketch along these lines is included at the end of this message.

The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.

Youssef
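
For reference, a minimal sketch of one way to apply and persist the two sysctls discussed in this thread on the FreeBSD NFS server. The values shown are the ones Jason used (Youssef mentions finwait2_timeout=5000); pick whichever fits your environment:

# Apply at runtime on the FreeBSD NFS server.
sysctl net.inet.tcp.fast_finwait2_recycle=1
sysctl net.inet.tcp.finwait2_timeout=1000

# Persist across reboots.
printf 'net.inet.tcp.fast_finwait2_recycle=1\nnet.inet.tcp.finwait2_timeout=1000\n' >> /etc/sysctl.conf

As discussed above, already-stuck sessions may still need the TCP connection to be re-established before the change has any visible effect.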
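
Below is a minimal sketch of the client-side LAST_ACK workaround Youssef describes, in the same spirit as Jason's FIN_WAIT2 script above. It is an illustration under assumptions, not the exact script Youssef's team used: the variable and function names are hypothetical, it reuses the NFS.Server.IP.X placeholder, and for simplicity it matches on the NFS destination port 2049 rather than the specific local source port used in Youssef's example rule.

#!/bin/sh
# Hypothetical sketch: watch for a connection to the NFS server stuck in
# LAST_ACK, add an iptables REJECT rule so the client's outgoing probes are
# answered locally with a TCP RST (tearing down the stuck session, as
# described above), then remove the rule once the socket is gone.
nfs_ip=NFS.Server.IP.X   # placeholder, as in the script above
delay=15

in_last_ack() {
    /usr/bin/netstat -an | /usr/bin/grep "${nfs_ip}:2049" | /usr/bin/grep LAST_ACK > /dev/null 2>&1
}

if in_last_ack ; then
    /usr/sbin/iptables -A OUTPUT -p tcp -d ${nfs_ip} --dport 2049 -j REJECT --reject-with tcp-reset
    while in_last_ack ; do
        /usr/bin/sleep ${delay}
    done
    /usr/sbin/iptables -D OUTPUT -p tcp -d ${nfs_ip} --dport 2049 -j REJECT --reject-with tcp-reset
fi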
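
Finally, on Youssef's bottom-line suggestion, an example of grabbing a packet capture of the NFS traffic during an outage. The interface name and output path are placeholders; run it on the client and/or the server:

# Capture full packets exchanged with the NFS server on port 2049.
tcpdump -i em0 -s 0 -w /var/tmp/nfs-hang.pcap host NFS.Server.IP.X and port 2049

Reading the capture back (tcpdump -r, or Wireshark) should show whether you see the pattern described above: zero-window advertisements from the server and repeated window probes from the client.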