From: Steve Wills
Date: Mon, 18 Jun 2018 17:21:10 -0400
Subject: Re: ESXi NFSv4.1 client id is nasty
To: Rick Macklem, "freebsd-current@freebsd.org"
Cc: "andreas.nagy@frequentis.com"

Would it be possible or reasonable to use the client ID to log a message
telling the admin to set a sysctl that enables the hacks?

Steve

On 06/17/18 08:35, Rick Macklem wrote:
> Hi,
>
> Andreas Nagy has been doing a lot of testing of the NFSv4.1 client in
> ESXi 6.5u1 (VMware) against the FreeBSD server. I have given him a bunch
> of hackish patches to try and some of them do help. However not all
> issues are resolved.
> The problem is that these hacks pretty obviously violate the NFSv4.1
> RFC (5661).
> (Details on these come later, for those interested in such things.)
>
> I can think of three ways to deal with this:
> 1 - Just leave the server as is and point people to the issues that
>     should be addressed in the ESXi client.
> 2 - Put the hacks in, but only enable them based on a sysctl not enabled
>     by default.
>     (The main problem with this is when the server also has non-ESXi
>     mounts.)
> 3 - Enable the hacks for ESXi client mounts only, using the
>     implementation ID it presents at mount time in its ExchangeID
>     arguments.
>     - This is my preferred solution, but the RFC says:
>    An example use for implementation identifiers would be diagnostic
>    software that extracts this information in an attempt to identify
>    interoperability problems, performance workload behaviors, or general
>    usage statistics. Since the intent of having access to this
>    information is for planning or general diagnosis only, the client and
>    server MUST NOT interpret this implementation identity information in
>    a way that affects interoperational behavior of the implementation.
>    The reason is that if clients and servers did such a thing, they
>    might use fewer capabilities of the protocol than the peer can
>    support, or the client and server might refuse to interoperate.
>
> Note the "MUST NOT" w.r.t. doing this.
> Of course, I could argue that, since the hacks violate the RFC, then
> why not enable them in a way that violates the RFC.
>
> Anyhow, I would like to hear from others w.r.t. how they think this
> should be handled?
>
> Here are details on the breakage and workarounds for those interested,
> from looking at packet traces in wireshark:
> Fairly benign ones:
> - The client does a ReclaimComplete with one_fs == false and then does a
>   ReclaimComplete with one_fs == true. The server returns
>   NFS4ERR_COMPLETE_ALREADY for the second one, which the ESXi client
>   doesn't like.
>   Workaround: Don't return an error for the one_fs == true case and just
>       treat it the same as "one_fs == false".
>   There is also a case where the client only does the
>   ReclaimComplete with one_fs == true. Since FreeBSD exports a hierarchy
>   of file systems, this doesn't indicate to the server that all reclaims
>   are done.
>   (Other extant clients never do the "one_fs == true" variant of
>   ReclaimComplete.)
>   This case of just doing the "one_fs == true" variant is actually a
>   limitation of the server which I don't know how to fix. However the
>   same workaround as listed above gets around it.
>
> - The client puts random garbage in the delegate_type argument for
>   Open/ClaimPrevious.
>   Workaround: Since the client sets OPEN4_SHARE_ACCESS_WANT_NO_DELEG, it
>       doesn't want a delegation, so assume OPEN_DELEGATE_NONE or
>       OPEN_DELEGATE_NONE_EXT instead of garbage. (Not sure which of the
>       two values makes it happier.)
>
> Serious ones:
> - The client does an OpenDowngrade with arguments set to
>   OPEN_SHARE_ACCESS_BOTH and OPEN_SHARE_DENY_BOTH.
>   Since OpenDowngrade is supposed to decrease share_access and
>   share_deny, the server returns NFS4ERR_INVAL. OpenDowngrade is not
>   supposed to ever conflict with another Open. (A conflict happens when
>   another Open has set an OPEN_SHARE_DENY that denies the result of the
>   OpenDowngrade.)
>   with NFS4ERR_SHARE_DENIED.
>   I believe this one is done by the client for something it calls a
>   "device lock" and really doesn't like this failing.
>   Workaround: All I can think of is to ignore the check for new bits not
>       being set and reply NFS4_OK when no conflicting Open exists.
>       When there is a conflicting Open, returning NFS4ERR_INVAL seems to
>       be the only option, since NFS4ERR_SHARE_DENIED isn't listed for
>       OpenDowngrade.
>
> - When a server reboots, the client does not serialize
>   ExchangeID/CreateSession.
>   When the server reboots, a client needs to do a serialized set of RPCs
>   with ExchangeID followed by CreateSession to confirm it. The reply to
>   ExchangeID has a sequence number (eir_sequenceid) in it and the
>   CreateSession needs to have the same value in its csa_sequence
>   argument to confirm the clientid issued by the ExchangeID.
>   The client sends many ExchangeIDs and CreateSessions, so they end up
>   failing many times due to the sequence number not matching the last
>   ExchangeID.
>   (This might only happen in the trunked case.)
>   Workaround: Nothing that I can think of.
>
> - ExchangeID sometimes sends the eia_clientowner.co_verifier argument as
>   all zeros.
>   Sometimes the client bogusly fills in the eia_clientowner.co_verifier
>   argument to ExchangeID with all 0s instead of the correct value.
>   This indicates to the server that the client has rebooted (it has not)
>   and results in the server discarding any state for the client and
>   re-initializing the clientid.
>   Workaround: The server can ignore the verifier changing and make the
>       recovery work better. This clearly violates RFC5661 and can only
>       be done for ESXi clients, since ignoring this breaks a Linux
>       client hard reboot.
>
> - The client doesn't seem to handle NFS4ERR_GRACE errors correctly.
>   These occur when any non-reclaim operations are done during the grace
>   period after a server boot.
>   (A client needs to delay a while and then retry the operation,
>   repeating for as long as NFS4ERR_GRACE is received from the server.
>   This client does not do this.)
>   Workaround: Nothing that I can think of.
>
> Thanks in advance for any comments, rick
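
For anyone who wants to see the OpenDowngrade workaround from the quoted message spelled out, here is a minimal sketch in Python. It is not the FreeBSD server code. The function name open_downgrade, the esxi_hack flag and the (access, deny) tuple layout are made up for illustration. The strict check is simplified to a plain bitmask subset test, which is an assumption. The numeric constants are the RFC 5661 values.

```python
# RFC 5661 constant values.
OPEN4_SHARE_ACCESS_READ = 1
OPEN4_SHARE_ACCESS_WRITE = 2
OPEN4_SHARE_ACCESS_BOTH = 3
OPEN4_SHARE_DENY_NONE = 0
OPEN4_SHARE_DENY_BOTH = 3
NFS4_OK = 0
NFS4ERR_INVAL = 22


def open_downgrade(cur_access, cur_deny, new_access, new_deny,
                   other_opens, esxi_hack=False):
    """Return (status, access, deny) after an OpenDowngrade request.

    other_opens is a list of (access, deny) pairs held by other Opens
    on the same file. esxi_hack is a hypothetical switch standing in
    for whatever sysctl or client check would enable the workaround.
    """
    # Normal rule (simplified): the request may only clear bits.
    adds_bits = (new_access & ~cur_access) or (new_deny & ~cur_deny)
    if not adds_bits:
        return NFS4_OK, new_access, new_deny
    if not esxi_hack:
        return NFS4ERR_INVAL, cur_access, cur_deny

    # Workaround: allow the "upgrade" unless it conflicts with another
    # Open. NFS4ERR_SHARE_DENIED is not listed for OpenDowngrade, so a
    # conflict is still reported as NFS4ERR_INVAL.
    for o_access, o_deny in other_opens:
        if (o_deny & new_access) or (o_access & new_deny):
            return NFS4ERR_INVAL, cur_access, cur_deny
    return NFS4_OK, new_access, new_deny
```

With the flag off, a request for ACCESS_BOTH/DENY_BOTH on an Open that holds only read access gets NFS4ERR_INVAL, as the server does today. With the flag on, the same request succeeds when no other Open exists and still fails when another Open holds read access.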