From: Rick Macklem <rmacklem@uoguelph.ca>
To: "freebsd-current@freebsd.org"
CC: "andreas.nagy@frequentis.com"
Subject: ESXi NFSv4.1 client id is nasty
Date: Sun, 17 Jun 2018 12:35:12 +0000

Hi,

Andreas Nagy has been doing a lot of testing of the NFSv4.1 client in ESXi 6.5u1
(VMware) against the FreeBSD server. I have given him a bunch of hackish patches
to try and some of them do help. However not all issues are resolved.
The problem is that these hacks pretty obviously violate the NFSv4.1 RFC (5661).
(Details on these come later, for those interested in such things.)

I can think of three ways to deal with this:
1 - Just leave the server as is and point people to the issues that should be
    addressed in the ESXi client.
2 - Put the hacks in, but only enable them based on a sysctl not enabled by
    default. (The main problem with this is when the server also has non-ESXi
    mounts.)
3 - Enable the hacks for ESXi client mounts only, using the implementation ID
    it presents at mount time in its ExchangeID arguments.
    - This is my preferred solution, but the RFC says:

   An example use for implementation identifiers would be diagnostic
   software that extracts this information in an attempt to identify
   interoperability problems, performance workload behaviors, or general
   usage statistics. Since the intent of having access to this
   information is for planning or general diagnosis only, the client and
   server MUST NOT interpret this implementation identity information in
   a way that affects interoperational behavior of the implementation.
   The reason is that if clients and servers did such a thing, they
   might use fewer capabilities of the protocol than the peer can
   support, or the client and server might refuse to interoperate.

Note the "MUST NOT" w.r.t. doing this. Of course, I could argue that, since the
hacks violate the RFC, then why not enable them in a way that violates the RFC.

Anyhow, I would like to hear from others w.r.t. how they think this should be
handled?

Here are details on the breakage and workarounds for those interested, from
looking at packet traces in Wireshark:

Fairly benign ones:
- The client does a ReclaimComplete with one_fs == false and then does a
  ReclaimComplete with one_fs == true. The server returns
  NFS4ERR_COMPLETE_ALREADY for the second one, which the ESXi client
  doesn't like.
  Workaround: Don't return an error for the one_fs == true case and just
      treat it the same as "one_fs == false".
  There is also a case where the client only does the
  ReclaimComplete with one_fs == true. Since FreeBSD exports a hierarchy of
  file systems, this doesn't indicate to the server that all reclaims are done.
  (Other extant clients never do the "one_fs == true" variant of
  ReclaimComplete.)
  This case of just doing the "one_fs == true" variant is actually a limitation
  of the server which I don't know how to fix. However the same workaround
  as listed above gets around it.
- The client puts random garbage in the delegate_type argument for
  Open/ClaimPrevious.
  Workaround: Since the client sets OPEN4_SHARE_ACCESS_WANT_NO_DELEG, it doesn't
      want a delegation, so assume OPEN_DELEGATE_NONE or OPEN_DELEGATE_NONE_EXT
      instead of garbage. (Not sure which of the two values makes it happier.)

Serious ones:
- The client does an OpenDowngrade with arguments set to OPEN4_SHARE_ACCESS_BOTH
  and OPEN4_SHARE_DENY_BOTH.
  Since OpenDowngrade is supposed to decrease share_access and share_deny,
  the server returns NFS4ERR_INVAL. OpenDowngrade is not supposed to ever
  conflict with another Open and fail with NFS4ERR_SHARE_DENIED. (A conflict
  happens when another Open has set an OPEN4_SHARE_DENY value that denies the
  result of the OpenDowngrade.)
  I believe this one is done by the client for something it calls a
  "device lock" and it really doesn't like this failing.
  Workaround: All I can think of is to ignore the check for new bits not being
      set and reply NFS4_OK when no conflicting Open exists.
      When there is a conflicting Open, returning NFS4ERR_INVAL seems to be the
      only option, since NFS4ERR_SHARE_DENIED isn't listed for OpenDowngrade.
- When a server reboots, the client does not serialize ExchangeID/CreateSession.
  When the server reboots, a client needs to do a serialized set of RPCs
  with ExchangeID followed by CreateSession to confirm it. The reply to
  ExchangeID has a sequence number (eir_sequenceid) in it and the
  CreateSession needs to have the same value in its csa_sequence argument
  to confirm the clientid issued by the ExchangeID.
  The client sends many ExchangeIDs and CreateSessions, so they end up failing
  many times due to the sequence number not matching the last ExchangeID.
  (This might only happen in the trunked case.)
  Workaround: Nothing that I can think of.
- ExchangeID sometimes sends eia_clientowner.co_verifier argument as all zeros.
  Sometimes the client bogusly fills in the eia_clientowner.co_verifier
  argument to ExchangeID with all 0s instead of the correct value.
  This indicates to the server that the client has rebooted (it has not)
  and results in the server discarding any state for the client and
  re-initializing the clientid.
  Workaround: The server can ignore the verifier changing and make the recovery
      work better. This clearly violates RFC5661 and can only be done for
      ESXi clients, since ignoring this breaks a Linux client hard reboot.
- The client doesn't seem to handle NFS4ERR_GRACE errors correctly.
  These occur when any non-reclaim operations are done during the grace
  period after a server boot.
  (A client needs to delay a while and then retry the operation, repeating
  for as long as NFS4ERR_GRACE is received from the server. This client
  does not do this.)
  Workaround: Nothing that I can think of.

Thanks in advance for any comments, rick
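
For illustration only, the OpenDowngrade workaround described above can be
sketched roughly as follows. This is not the actual server patch: the function
open_downgrade(), its parameters and the esxi_quirk flag are hypothetical
names, and the rule that a new deny mode also conflicts with another Open's
access bits is an assumption based on ordinary share reservation semantics.
Only the numeric constants are taken from RFC 5661.

```python
# Constants as defined in RFC 5661.
OPEN4_SHARE_ACCESS_READ = 0x1
OPEN4_SHARE_ACCESS_WRITE = 0x2
OPEN4_SHARE_ACCESS_BOTH = 0x3
OPEN4_SHARE_DENY_NONE = 0x0
OPEN4_SHARE_DENY_READ = 0x1
OPEN4_SHARE_DENY_WRITE = 0x2
OPEN4_SHARE_DENY_BOTH = 0x3
NFS4_OK = 0
NFS4ERR_INVAL = 22


def open_downgrade(cur_access, cur_deny, new_access, new_deny,
                   other_opens, esxi_quirk=False):
    """Hypothetical sketch of the relaxed OpenDowngrade check.

    other_opens is a list of (access, deny) pairs for the other Opens
    on the same file.
    """
    # A real downgrade never sets a bit that the Open does not already hold.
    adds_bits = bool((new_access & ~cur_access) or (new_deny & ~cur_deny))
    if not adds_bits:
        return NFS4_OK
    # Strict RFC behavior: reject any attempt to add access or deny bits.
    if not esxi_quirk:
        return NFS4ERR_INVAL
    # Workaround: allow the "upgrade" unless it conflicts with another Open.
    for o_access, o_deny in other_opens:
        # Another Open denies the requested access, or (assumption) the
        # requested deny mode would deny access another Open already has.
        if (new_access & o_deny) or (new_deny & o_access):
            # NFS4ERR_SHARE_DENIED is not listed for OpenDowngrade.
            return NFS4ERR_INVAL
    return NFS4_OK
```

With esxi_quirk left at False the function behaves like the strict check, so
non-ESXi mounts would be unaffected whichever of the three options is used to
turn the flag on.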