Date:      Thu, 11 Feb 2021 21:07:37 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Alan Somers <asomers@freebsd.org>, freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: NFS delegations don't expire after unmounting client
Message-ID:  <YQXPR0101MB0968EC580D4F4006E155AC9DDD8C9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <CAOtMX2h_2zCNpyzOs=SzuohRvLgga=Eip-LJ-7QjJBvwmueLXg@mail.gmail.com>
References:  <CAOtMX2h_2zCNpyzOs=SzuohRvLgga=Eip-LJ-7QjJBvwmueLXg@mail.gmail.com>

Alan Somers wrote:
>I have several Linux 5.9.15 clients mounting NFS 4.1 served from a FreeBSD
>12.2-RELEASE server.  Today, most of those clients' mounts hung, and their
>dmesg displayed "nfs: server XXX not responding, still trying".  But one
>client kept running fine.  nfsdumpstate on the server showed that that
>client, and that one only, had 2 delegations.  It also had 1 OpenOwner, 1
>Open, and the CB flags set.  It was the only client that had CB set.  On
>the theory that its delegation callbacks weren't working, I tried
>unmounting all of its NFS shares.  That worked, but to my surprise
>nfsdumpstate showed no change!  I could see that the lease time recorded in
>/var/run/nfs-stablerestart was 120s, and I must've waited about 30m in all
>before disabling delegations, unmounting everything, and returning to NFS
>v3.  So my questions are, what can cause a delegation to linger around long
>after it should've expired, and what else can I do to debug this problem if
>it recurs?
The FreeBSD NFSv4 server implements "courtesy locks" (my idea, but someone
else coined the term for it), where a lock is not thrown away until both the
lease has expired and a conflicting lock request is received from another client.
--> In this case, that would be an Open of the file from another client.
The idea is to avoid loss of lock state when there is a network partition
that exceeds the lease duration.

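To make the rule above concrete, here is a minimal sketch in Python of the
"both conditions" test. It is only an illustration, not the server's code:
ClientState, may_discard and their fields are hypothetical names, and time is
plain seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientState:
    """Hypothetical stand-in for one client's open/lock/delegation state."""
    client_id: str
    lease_expiry: float  # time in seconds at which the lease runs out


def may_discard(state: ClientState, now: float,
                conflicting_client: Optional[str]) -> bool:
    """Return True only when both courtesy-lock conditions hold.

    conflicting_client is the id of a client that has just made a
    conflicting request (for a delegation, an Open of the same file),
    or None if no such request has arrived.
    """
    lease_expired = now > state.lease_expiry
    conflict = (conflicting_client is not None
                and conflicting_client != state.client_id)
    return lease_expired and conflict


# A 120s lease, checked 30 minutes (1800s) after it was last renewed.
state = ClientState(client_id="linux-client", lease_expiry=120.0)
idle_after_expiry = may_discard(state, now=1800.0, conflicting_client=None)
conflict_after_expiry = may_discard(state, now=1800.0,
                                    conflicting_client="other-client")
conflict_before_expiry = may_discard(state, now=60.0,
                                     conflicting_client="other-client")
```

Under these assumptions the state is kept in the first and third cases and
dropped only in the second, which matches delegations still showing up in
nfsdumpstate long after the 120s lease when no other client opens the file.
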
When a client dismounts, it should tell the server it is done with the open/lock
state by doing a DestroyClientID operation. (SetClientID/SetClientIDConfirm for 4.0)
--> If the Linux client did this, then it sounds like something is broken in the server,
      but my hunch is that the Linux client did not do this.
If you can capture packets during a dismount, you should be able to look
at them in wireshark and see if the DestroyClientID happened.

There is also the nfsrevoke command, which is supposed to be able to
get rid of client lock state, but I'll admit I haven't tested it in like a decade;-)

Maybe courtesy locks should be made optional, but they have never
caused problems and I'd have to look at the code to see if that can
easily be done. Might need to do so as another "make the broken Linux
client work" sysctl;-). They are now the de facto standard, you know;-)

rick

-Alan
_______________________________________________
freebsd-fs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"


