From: Alan Somers
Date: Sun, 2 May 2021 18:35:22 -0600
Subject: Re: wanna solve the Linux NFSv4 client puzzle?
To: Rick Macklem
Cc: freebsd-stable, Peter Eriksson, Ryan Moeller, Garrett Wollman, Juraj Lutter

On Sun, May 2, 2021 at 6:27 PM Rick Macklem wrote:

> Rick Macklem wrote:
> >Hi,
> >
> >I posted recently that enabling delegations should be avoided at this time,
> >especially if your FreeBSD NFS server has Linux client mounts...
> >
> >I thought some of you might be curious why, and I thought it would be
> >more fun if you look for yourselves.
> >To play the game, you need to download a packet capture:
> >fetch https://people.freebsd.org/~rmacklem/twoclientdeleg.pcap
> >and then load it into wireshark.
> >
> >192.168.1.5 - FreeBSD server with all recent patches
> >192.168.1.6 - FedoraCore 30 (Linux 5.2 kernel) client
> >192.168.1.13 - FreeBSD client
> >
> >A few hints buried in RFC5661:
> >- A fore channel is used for normal client->server RPCs and a back channel
> >  is used for server->client callback RPCs.
> >- After a new TCP connection is created, neither the fore nor back channels
> >  are bound to the connection.
> >- Binding channel(s) to a connection is done by BindConnectionToSession,
> >  but an implicit binding for the fore channel is created when the first RPC
> >  request with a Sequence operation in it is sent on the new TCP connection.
> >- A server->client callback cannot be done until the back channel is bound
> >  via BindConnectionToSession.
> >
> >Ok, so we are ready...
> >- Look at packet #s 3518->3605.
> >  - What is going on here?
> Ok, so here's my solution...
> packets #3518, 3520 and 3521 are delegation recalls (CB_RECALL)
> for 3 different delegations on three different session slots.
> time: 137.5
>
> Expected response from the Linux client--> 3 replies to the CB_RECALLs.
> What does it actually do?
> --> Creates a new TCP connection using the same port#. You can see it send
>     a FIN (packet# 3523) and a SYN (packet# 3527).
>     This means that the client is no longer obliged to reply to the CB_RECALLs
>     and the FreeBSD server will probably need to retry them.
> --> It also means that no back channel is bound to the session, so the
>     server cannot do callbacks (i.e. cannot retry the CB_RECALLs yet).
>
> packet# 3530 is a Setattr RPC, which has a Sequence operation in it.
> --> This means the fore channel is implicitly bound to the new TCP
>     connection, but no back channel, so the server cannot retry the
>     CB_RECALLs.
>
> You will notice a bunch of Setattr RPCs getting NFS4ERR_DELAY replies.
> This tells the Linux client to "try again later".
> --> It happens because the FreeBSD server cannot perform the Setattr
>     until the client returns a delegation.
> --> That requires a CB_RECALL.
>
> packet# 3582 is a Setattr RPC reply. If you look in the Sequence operation
> reply, you will see the flag SEQ4_STATUS_CB_PATH_DOWN is set.
> --> This is the FreeBSD server telling the Linux client that the callback path
>     is down (the back channel is not bound to the new TCP connection).
> Time: 137.6 (took about 0.1 sec for the server to notice that the callback
> path/back channel is not working).
>
> packet# 3604 Linux client does a BindConnectionToSession to bind the
> back channel.
> --> This is not permitted by RFC5661, which requires it to be done on
>     the new TCP connection before any implicit binding of the fore
>     channel only, and packet# 3530 has already done that binding.
> packet# 3605 FreeBSD server violates RFC5661 and allows the binding
> to be done, so that CB_RECALLs can again be done.
> Time: 152.7
> >  - How long does this take?
> 152.7 - 137.5 = 15.2 seconds
>
> >--> One more hint. Starting with #3605, things are working again.
> --> Things start working again because the FreeBSD server
>     cheats and allows the BindConnectionToSession to be done.
>     RFC5661 specifies a reply of NFS4ERR_INVAL for this.
>
> >There are actually 3 other examples of this in the packet capture.
> Every time multiple concurrent callbacks are attempted, the Linux
> client "bails out" by creating a new TCP connection.
> --> This is said to be fixed in Linux 5.3, but I haven't tested a newer
>     kernel than 5.2 yet.
>
> >Btw, one of the weirdnesses is said to be fixed in Linux 5.3 and the other
> >in Linux 5.7, although I have not yet upgraded my kernel and tested this.
> The "do BindConnectionToSession after an implicit binding" is said to be fixed
> in Linux 5.7, however the fix is not exactly what I would have expected.
> --> I would have expected a BindConnectionToSession to be done right
>     away when a new TCP connection is created.
> --> Linux 5.7 and newer are said to still wait (15sec?) to do the
>     BindConnectionToSession, but fix the bug by creating yet
>     another new TCP connection just before doing the
>     BindConnectionToSession RPC.
> --> A SEQ4_STATUS_CB_PATH_DOWN flag set in a Sequence operation
>     reply is what triggers the BindConnectionToSession and that is still
>     required for 5.7 or newer, but I'll need to test to see how long it
>     takes for newer kernels.
>
> The old "cheat", which is still in the released server code (recently removed
> by a patch in main, stable/12 and stable/13) implicitly bound both the fore
> and back channels. Look for this comment in sys/fs/nfsserver/nfs_nfsdstate.c
> in unpatched code...
>         /*
>          * If this session handles the backchannel, save the nd_xprt for this
>          * RPC, since this is the one being used.
>          * RFC-5661 specifies that the fore channel will be implicitly
>          * bound by a Sequence operation.  However, since some NFSv4.1 clients
>          * erroneously assumed that the back channel would be implicitly
>          * bound as well, do the implicit binding unless a
>          * BindConnectiontoSession has already been done on the session.
>          */
>
> --> This worked fine and avoided most of the above craziness, but...
>     (A) It violated RFC5661.
>     and
>     (B) It broke the Linux client badly when the "nconnect" mount
>         option (added fairly recently) was used.
> --> So I felt I had to get rid of it. (The non-conformance with
>     RFC5661 was reported by Red Hat.)
>
> Bottom line...unless all your Linux clients are kernel version 5.3 or newer,
> avoid enabling delegations in the FreeBSD NFSv4.1/4.2 server.
> --> Even with a completely patched server, you will still get 15 second pauses
>     every time the server attempts multiple concurrent callbacks.
>
> >Have fun with it, rick
> At least you can now see why I have "fun with it";-) rick
>

Ughh.  I'm glad you figured it out so I didn't have to.  Thanks for all the
hard work, Rick.
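
For anyone who wants the binding rules from the walkthrough in a more mechanical form, here is a minimal Python sketch of the per-connection channel state. It is only an illustration, not FreeBSD or Linux code: the names `Connection`, `sequence`, `bind_conn_to_session`, `can_callback` and `stall_seconds` are made up, the `strict` flag is a hypothetical switch between the NFS4ERR_INVAL reply described above and the server's "cheat", and the direction argument of BindConnectionToSession is not modeled (the client is assumed to ask for the back channel, as in packet# 3604).

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Hypothetical model of one TCP connection's NFSv4.1 channel bindings."""
    fore_bound: bool = False   # a new connection starts with nothing bound
    back_bound: bool = False

    def sequence(self) -> None:
        # The first RPC with a Sequence operation implicitly binds the
        # fore channel only; the back channel stays unbound.
        self.fore_bound = True

    def bind_conn_to_session(self, strict: bool) -> str:
        # As described in the thread: once the fore channel has been
        # implicitly bound, a strict server replies NFS4ERR_INVAL.
        # With strict=False the server "cheats" and allows the binding.
        if self.fore_bound and strict:
            return "NFS4ERR_INVAL"
        self.back_bound = True
        return "NFS4_OK"

    def can_callback(self) -> bool:
        # A CB_RECALL can only be sent over a bound back channel.
        return self.back_bound


def stall_seconds(recall_time: float, rebind_time: float) -> float:
    """Seconds between the first CB_RECALL and the back channel rebinding."""
    return round(rebind_time - recall_time, 1)


conn = Connection()                              # SYN, packet# 3527
conn.sequence()                                  # Setattr, packet# 3530
print(conn.can_callback())                       # callbacks not possible yet
print(conn.bind_conn_to_session(strict=True))    # what a strict server would reply
print(conn.bind_conn_to_session(strict=False))   # what the capture shows (packet# 3605)
print(conn.can_callback())
print(stall_seconds(137.5, 152.7))               # times taken from the capture
```

The point of keeping `strict` as a parameter is that the same client sequence gives two different outcomes: with the cheat the callbacks resume after the pause, while a strictly conforming server would leave the back channel unbound on that connection.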