Date: Mon, 3 May 2021 00:27:42 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: freebsd-stable <freebsd-stable@freebsd.org>
Cc: Peter Eriksson <pen@lysator.liu.se>, Ryan Moeller <freqlabs@FreeBSD.org>, Garrett Wollman <wollman@hergotha.csail.mit.edu>, Alan Somers <asomers@freebsd.org>, Juraj Lutter <otis@FreeBSD.org>
Subject: Re: wanna solve the Linux NFSv4 client puzzle?
Message-ID: <YQXPR0101MB09685D1285AF7DE46E5D7738DD5C9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YQXPR0101MB09682E0EEF2995E3FBC20BB8DD409@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
References: <YQXPR0101MB09682E0EEF2995E3FBC20BB8DD409@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
Rick Macklem wrote:
>Hi,
>
>I posted recently that enabling delegations should be avoided at this time,
>especially if your FreeBSD NFS server has Linux client mounts...
>
>I thought some of you might be curious why, and I thought it would be
>more fun if you look for yourselves.
>To play the game, you need to download a packet capture:
>fetch https://people.freebsd.org/~rmacklem/twoclientdeleg.pcap
>and then load it into wireshark.
>
>192.168.1.5 - FreeBSD server with all recent patches
>192.168.1.6 - FedoraCore 30 (Linux 5.2 kernel) client
>192.168.1.13 - FreeBSD client
>
>A few hints buried in RFC5661:
>- A fore channel is used for normal client->server RPCs and a back channel
>  is used for server->client callback RPCs.
>- After a new TCP connection is created, neither the fore nor back channels
>  are bound to the connection.
>- Binding channel(s) to a connection is done by BindConnectionToSession,
>  but an implicit binding for the fore channel is created when the first RPC
>  request with a Sequence operation in it is sent on the new TCP connection.
>- A server->client callback cannot be done until the back channel is bound
>  via BindConnectionToSession.
>
>Ok, so we are ready...
>- Look at packet #s 3518->3605.
>  - What is going on here?
Ok, so here's my solution...
packet #3518, 3520 and 3521 are delegation recalls (CB_RECALL)
for 3 different delegations on three different session slots.
Time: 137.5

Expected response from the Linux client--> 3 replies to the CB_RECALLs.
What does it actually do?
--> Creates a new TCP connection using the same port#.
    You can see it send
    a FIN (packet# 3523) and a SYN (packet# 3527).
    This means that the client is no longer obliged to reply to the CB_RECALLs
    and the FreeBSD server will probably need to retry them.
    --> It also means that no back channel is bound to the session, so the
        server cannot do callbacks (ie. cannot retry the CB_RECALLs yet).

packet# 3530 is a Setattr RPC, which has a Sequence operation in it.
--> This means the fore channel is implicitly bound to the new TCP
    connection, but no back channel, so the server cannot retry the CB_RECALLs.

You will notice a bunch of Setattr RPCs getting NFS4ERR_DELAY replies.
This tells the Linux client to "try again later".
--> It happens because the FreeBSD server cannot perform the Setattr
    until the client returns a delegation.
    --> That requires a CB_RECALL.

packet# 3582 is a Setattr RPC reply. If you look in the Sequence operation
reply, you will see the flag SEQ4_STATUS_CB_PATH_DOWN is set.
--> This is the FreeBSD server telling the Linux client that the callback path
    is down (the back channel is not bound to the new TCP connection).
Time: 137.6 (took about 0.1sec for the server to notice that the callback
path/back channel is not working).

packet# 3604 Linux client does a BindConnectionToSession to bind the
back channel.
--> This is not permitted by RFC5661, since it is required to be done on
    the new TCP connection before the implicit binding of the fore
    channel only, already done by packet# 3530.
packet# 3605 FreeBSD server violates RFC5661 and allows the binding
to be done, so that CB_RECALLs can again be done.
Time: 152.7

- How long does this take?
152.7 - 137.5 = 15.2 seconds

>--> One more hint.
>    Starting with #3605, things are working again.
--> Things start working again because the FreeBSD server
    cheats and allows the BindConnectionToSession to be done.
    RFC5661 specifies a reply of NFS4ERR_INVAL for this.

>There are actually 3 other examples of this in the packet capture.
Every time multiple concurrent callbacks are attempted, the Linux
client "bails out" by creating a new TCP connection.
--> This is said to be fixed in Linux 5.3, but I haven't tested a newer
    kernel than 5.2 yet.

>Btw, one of the weirdnesses is said to be fixed in Linux 5.3 and the other
>in Linux 5.7, although I have not yet upgraded my kernel and tested this.
The "do BindConnectionToSession after an implicit binding" is said to be fixed
in Linux 5.7, however the fix is not exactly what I would have expected.
--> I would have expected a BindConnectionToSession to be done right
    away when a new TCP connection is created.
--> Linux 5.7 and newer is said to still wait (15sec?) to do the
    BindConnectionToSession, but fixes the bug by creating yet
    another new TCP connection just before doing the
    BindConnectionToSession RPC.
    --> A SEQ4_STATUS_CB_PATH_DOWN flag set in a Sequence operation
        reply is what triggers the BindConnectionToSession and that is still
        required for 5.7 or newer, but I'll need to test to see how long it
        takes for newer kernels.

The old "cheat", which is still in the released server code (recently removed
by a patch in main, stable/12 and stable/13), implicitly bound both the fore
and back channels. Look for this comment in sys/fs/nfsserver/nfs_nfsdstate.c
in unpatched code...
/*
 * If this session handles the backchannel, save the nd_xprt for this
 * RPC, since this is the one being used.
 * RFC-5661 specifies that the fore channel will be implicitly
 * bound by a Sequence operation.
 * However, since some NFSv4.1 clients
 * erroneously assumed that the back channel would be implicitly
 * bound as well, do the implicit binding unless a
 * BindConnectiontoSession has already been done on the session.
 */

--> This worked fine and avoided most of the above craziness, but...
    (A) It violated RFC5661.
    and
    (B) It broke the Linux client badly when the "nconnect" mount
        option (added fairly recently) was used.
    --> So I felt I had to get rid of it. (The non-conformance with
        RFC5661 was reported by Red Hat.)

Bottom line...unless all your Linux clients are kernel version 5.3 or newer,
avoid enabling delegations in the FreeBSD NFSv4.1/4.2 server.
--> Even with a completely patched server, you will still get 15 second pauses
    every time the server attempts multiple concurrent callbacks.

>Have fun with it, rick
At least you can now see why I have "fun with it";-) rick

_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
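
For anyone who wants to poke at the binding rules without opening wireshark,
here is a rough toy model of the sequence walked through above. It is only a
sketch under stated assumptions: the Session class, the "lenient" switch and
replay() are names made up for illustration and are not the code in
nfs_nfsdstate.c. It also simplifies BindConnectionToSession to always bind
both channels, and "strict" mode just encodes the rule as described in this
message (a bind after the implicit fore channel binding gets NFS4ERR_INVAL).

```python
from dataclasses import dataclass, field

NFS4_OK = "NFS4_OK"
NFS4ERR_INVAL = "NFS4ERR_INVAL"
SEQ4_STATUS_CB_PATH_DOWN = "SEQ4_STATUS_CB_PATH_DOWN"


@dataclass
class Session:
    """Toy model of one NFSv4.1 session on the server (hypothetical names)."""
    lenient: bool = False                    # True: accept a late bind anyway
    fore: set = field(default_factory=set)   # connections bound to fore channel
    back: set = field(default_factory=set)   # connections bound to back channel

    def drop_connection(self, conn):
        """Client closed the TCP connection: its bindings go away."""
        self.fore.discard(conn)
        self.back.discard(conn)

    def sequence(self, conn):
        """RPC with a Sequence op: implicitly binds the fore channel only.

        Returns the status flags that would go in the Sequence reply.
        """
        self.fore.add(conn)
        return set() if self.back else {SEQ4_STATUS_CB_PATH_DOWN}

    def bind_conn_to_session(self, conn):
        """Explicit bind (simplified here to always bind both channels).

        Strict mode rejects it once the connection already carries the
        implicit fore channel binding, per the rule described in the post.
        """
        if conn in self.fore and not self.lenient:
            return NFS4ERR_INVAL
        self.fore.add(conn)
        self.back.add(conn)
        return NFS4_OK

    def can_callback(self):
        """A CB_RECALL needs some connection bound to the back channel."""
        return bool(self.back)


def replay(lenient):
    """Replay the capture's sequence of events against the toy session."""
    s = Session(lenient=lenient)
    s.bind_conn_to_session("tcp1")           # healthy starting state
    s.drop_connection("tcp1")                # client's FIN after the CB_RECALLs
    flags = s.sequence("tcp2")               # Setattr on the new connection
    status = s.bind_conn_to_session("tcp2")  # the late bind (packet# 3604)
    return flags, status, s.can_callback()
```

Calling replay(lenient=False) models a strictly conforming server, where the
late bind is refused and callbacks stay impossible. replay(lenient=True)
models the server accepting the bind anyway, which is why things start
working again at packet# 3605.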