From: Rick Macklem
To: Konstantin Belousov
CC: "freebsd-current@freebsd.org", Alan Somers, Kirk McKusick, Mark Johnston
Subject: Re: r367672 broke the NFS server
Date: Wed, 30 Dec 2020 16:48:27 +0000

Kostik wrote:
>On Wed, Dec 30, 2020 at 02:02:48AM +0000, Rick Macklem wrote:
>> Hi,
>>
>> Post r367671...
>> When multiple files are being created by an NFS client in the same
>> directory, the VOP_CREATE()/ufs_create() can fail with ERELOOKUP.
>> This results in a EIO return to the NFS client.
>> --> This causes "nfsv4 client/server protocol prob err=10026"
>>       on the client for NFSv4.0 mounts.
>> --> This explains why this error has been reported by
>>       several people lately, although it should "never happen".
>>
>> Unfortunately, for the NFS server, the Lookup call is done separately
>> and it will not be easy to redo it, given the current NFS code structure.
>>
>> Is there another way to deal with the problem r367672 was fixing that
>> avoids ufs_create() returning ERELOOKUP?
>
>Idea of the change is to restart the syscall at top level.
>So for NFS server the right approach is to not send a response and also to
>not free the request mbuf chain, but to restart processing.
Yes. I took a look and I think restarting the operation by rolling the
working position in the mbuf lists back and redoing the operation
is feasible and easier than fixing the individual operations.

For NFSv4, you cannot redo the entire compound, since non-idempotent
operations like exclusive open may have already been completed.
However, rolling back to the beginning of the operation should be
doable.
--> It will serve as a good test, in that it may expose bugs in the
      RPC/operation code where failure (ERELOOKUP) doesn't clean
      things up correctly.
--> In NFSv4, there is the open/lock state that cannot be updated
      for this error case. (The seqid stuff in NFSv4.0 Open can be fun.
      It's used to serialize the operations and the number must be
      incremented for some errors, but not for others. The 10026
      error occurs when you don't get this right.)

I'll start working on this today, but I have no idea how long it might
take.

>I am sorry I forgot about NFS server when designing this fix, the only
>mild excuse I can provide is that the change was quite complicated as is.
>I will start looking at the fix.
No problem. Sometimes I'd like to forget about NFS too;-).

For the rollback/redo of the RPC/operation, it's probably easier for me
to do it. As above, I'll start on it, but...

My main concern is how long it will take, given the FreeBSD 13 release
starts soon.

rick
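
The rollback-and-redo idea discussed in this message can be modelled
roughly as follows. This is only a userland toy in Python, not the
FreeBSD NFS server code: Request, Reply, run_compound, do_open and
make_flaky_create are all hypothetical names, the list-plus-cursor
stands in for the request mbuf chain and its working position, and the
ERELOOKUP sentinel here is a placeholder rather than the kernel's value.
The point it tries to show is that only the failing operation is redone,
while operations completed earlier in the compound are left alone.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

OK = 0
ERELOOKUP = object()  # placeholder sentinel for this sketch, not the kernel constant


@dataclass
class Request:
    """Toy stand-in for the request mbuf chain: operations plus a cursor."""
    ops: List[str]
    pos: int = 0


@dataclass
class Reply:
    """Toy stand-in for the reply that is built up operation by operation."""
    results: List[Tuple[str, int]] = field(default_factory=list)


def run_compound(req: Request,
                 handlers: Dict[str, Callable[[Reply], object]],
                 max_retries: int = 8) -> Reply:
    reply = Reply()
    while req.pos < len(req.ops):
        saved_pos = req.pos             # working position before this operation
        saved_len = len(reply.results)  # reply length before this operation
        for _ in range(max_retries):
            op = req.ops[req.pos]
            req.pos += 1                # "parsing" the operation advances the cursor
            status = handlers[op](reply)
            if status is not ERELOOKUP:
                break
            # Roll back to the start of this operation only. Earlier
            # operations in the compound are not redone.
            req.pos = saved_pos
            del reply.results[saved_len:]
        else:
            raise RuntimeError("operation kept returning ERELOOKUP")
        if status != OK:
            break                       # a compound stops at the first failing operation
    return reply


calls = {"open": 0, "create": 0}


def do_open(reply: Reply) -> object:
    """Non-idempotent operation: must run exactly once."""
    calls["open"] += 1
    reply.results.append(("open", OK))
    return OK


def make_flaky_create(failures: int) -> Callable[[Reply], object]:
    """Create handler that returns ERELOOKUP 'failures' times, then succeeds."""
    remaining = [failures]

    def do_create(reply: Reply) -> object:
        calls["create"] += 1
        if remaining[0] > 0:
            remaining[0] -= 1
            reply.results.append(("create-partial", OK))  # partial reply to discard
            return ERELOOKUP
        reply.results.append(("create", OK))
        return OK

    return do_create


request = Request(ops=["open", "create"])
final_reply = run_compound(request, {"open": do_open,
                                     "create": make_flaky_create(2)})
```

In this model the open handler runs once even though create is retried,
which mirrors the constraint above that the whole compound cannot be
redone. State such as the NFSv4.0 seqid is deliberately not modelled;
per the message, it must not be updated for the ERELOOKUP case.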