From nobody Thu May 5 00:56:05 2022 X-Original-To: stable@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 9C4841ACD4E3 for ; Thu, 5 May 2022 00:56:13 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-YT3-obe.outbound.protection.outlook.com (mail-yt3can01on2043.outbound.protection.outlook.com [40.107.115.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "DigiCert Cloud Services CA-1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4KtwHc3DYGz3N9T; Thu, 5 May 2022 00:56:12 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=oJvHMOwqWwTgEk2kQz/rVHnQTx5iT3D3AdqjJNyfHJA34BNzNsKa85fAhPrWggY2bwKqTqO/glVHjrwUf0yWnxWA5QSw1+V2RpXcpopbi6c4vId3BJC+MpI344GZ3grBCW+aaNdKBD6Cg8+0ss3TDD9LF5kOnoMWA5a0KD975E8j1Ftkydz43wDjfOELys5uHtX/7GreEP1xRBaUWmdDUPA7oc1/PJSF4EbwRlX4e5WOcyrM2o9JH3VuM+7kUY2z6bl5NmqORCfMB8/StmWCkrnjuE2+HaN6T4uo/pbN/YpYNwvy5WsFuUnzBGf+8FmWPKXTPpgyRyWE9nnPKcVbUg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=bYul4C3u5jG04YgO6/jnosXcRWw1T2ZQEGNUyQUkYW4=; b=RPXwxQ7633eLm0gK07GJsc50sLr2CWI7r6WXuaKs93mqwVGWq8rAsL+cHcj5pJ9QqadOrhI0RDcb3o6clUIo+J6dNB9Z5JONZ2QKBvkljYY+WNW6+zOVzaZn7KrMdSS3phlWqZyFhtHksvcF/n4nf8SBV8vv+wc4QaNhRoxlPXVDVfLS3XEupwt+hySCJgBXoMf7vgJ2qVQMg6FVXuvFpvO8eKuEvr6L8XW/PFxhw2Lx6QwqnKSQeUp6qhCofgiklDsp1lGssKhCRMFOgkepqvXf0RhK4LKwt/MA5EV8UV4w5cEkAhasw+/Ynj/fMfyNUKzQywM1l85TjVH31tFrWw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=uoguelph.ca; dmarc=pass action=none header.from=uoguelph.ca; dkim=pass header.d=uoguelph.ca; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uoguelph.ca; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=bYul4C3u5jG04YgO6/jnosXcRWw1T2ZQEGNUyQUkYW4=; b=TINLVqxZys9ydZU9HJYG9TfhapeWNIw63IcFAAuk6LXJ0yxYKlDwaStVS67dzISsUpeNeq7dEjKEIjsSZyw6Mhz7vt5PB/3rI/pUaAePCzNzw5K56Su4ugkoTR+JNRqxA3l3Yw226GE0Pi28r9oQ+etiRmV05kfAVVMvHkqdOInlEvwiEm5a3yMt2HcOAb0UGMI00lh7XhijF5IABKvoWXyrzNxXMXuE4JRmL5yQKppBj8Up4hFka/m30RHLiBxVNliRNUsXPaQAgmc6xVsuR4TLu06yEhEVQKvZbYqyAABR8yL6fZbs9a68a/XtQ+Tn4ObVhblNfDKJ2lQlBBnQQA== Received: from YT2PR01MB9730.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:b01:de::14) by YT2PR01MB5744.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:b01:58::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5206.13; Thu, 5 May 2022 00:56:05 +0000 Received: from YT2PR01MB9730.CANPRD01.PROD.OUTLOOK.COM ([fe80::fdb1:ada8:7af0:3003]) by YT2PR01MB9730.CANPRD01.PROD.OUTLOOK.COM ([fe80::fdb1:ada8:7af0:3003%6]) with mapi id 15.20.5206.027; Thu, 5 May 2022 00:56:05 +0000 From: Rick Macklem To: Alan Somers CC: FreeBSD Stable ML Subject: Re: nfs client's OpenOwner count increases without bounds Thread-Topic: nfs client's OpenOwner count increases without bounds Thread-Index: AQHYX/oEvZae9laYBUaMKYABkmfK960PSuzXgAAZa4CAAA4SAg== Date: Thu, 5 May 2022 00:56:05 +0000 Message-ID: References: In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: suggested_attachment_session_id: 9ef1dce4-2e5b-c502-4f5f-015eaf696f7d x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 48bd0516-beec-44cb-d94b-08da2e32088b x-ms-traffictypediagnostic: YT2PR01MB5744:EE_ x-microsoft-antispam-prvs: x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: cv26n3SBbz4cwGyLTelVFsGDRU5vSGoDFEGoxgsBcHA0UnMZxj8ibJDmwDlxo2mvOBLInpXPohsndekjs1jZ/6RtIxnDX6/Ta4T6SUmWZOU0MVtziGx9niw+s8BrQs4D3IkFX7j8hgmyb+wY1Iua1jJT785GyPBKPWUqERiymmDt1pEjxh5/SNxwgI1Ty0rvxxY4EpyqpcJvHj3TCyOqvJrvacKW5qm+yTwS/uPNHQ0whSwApXLUi7Ny7xnBDgfwQ91UOx5BlPjI99Qfa3CYTS+rd+RZ0E+3AUmJiFBMBX28HPJ6moYQiPHzZPckzYpFMozog5uvzLQXr2VxdTAWejwAvxFcqRwcV16lrBlVk5zUvSBNNaDr0WCTC0+1hp2hIn9VeRxzzRpMF9NLjhEHpHhxcg4u6U9XJt0XJKlEdqUIAigb3psWRCgLxjnPYl5lXdJsyRw+Luw4ECLSLF/z8z/qw5sRgPIeSmOxlld+0A7F2c+lvf11UV6XiXFEzm4R0AFrBDAFnDfx0fkCr3wES4KQkV2n4eKuQ3wK2kgyB662dW5c6ivjOz8xI/4uCg0ePXKSG7Qp0L4Gu9h28VOpPLovFCRWYV8oMbxH4ybVBo5ly/wFGwZObn7CMnXHBCBtIKgeozYWwmw03vW9Mb0MIXwY+KsxRePTUf0KraEEaEnxSzmC0NY3nS96apVLQSD8OGs9MIzUv628tsuTjxv/Gg== x-forefront-antispam-report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:YT2PR01MB9730.CANPRD01.PROD.OUTLOOK.COM;PTR:;CAT:NONE;SFS:(13230001)(4636009)(366004)(786003)(66946007)(316002)(450100002)(91956017)(76116006)(66556008)(66476007)(66446008)(64756008)(508600001)(6916009)(83380400001)(86362001)(2906002)(33656002)(4326008)(8676002)(5660300002)(8936002)(38070700005)(38100700002)(122000001)(52536014)(6506007)(71200400001)(9686003)(53546011)(7696005)(55016003)(186003);DIR:OUT;SFP:1101; x-ms-exchange-antispam-messagedata-chunkcount: 2 x-ms-exchange-antispam-messagedata-0: =?iso-8859-1?Q?ZHREQmDWMjRW3R6nta5CTH64EoOCO/QUfamFPLuWoQo9TorjBWxVd4ukbo?= =?iso-8859-1?Q?miiidgAOpgaCunBjaAxWIjlcub6GLn+hK+ivSRKsQcQ6mjT1q+fUNGcW3T?= =?iso-8859-1?Q?DFl3cVNCHFSAycxMtfpfoYOzcQWBfVfxm26OI2kJmXmlkQzgANbEuSKiWA?= =?iso-8859-1?Q?zrmV1K22xiFhszFOugyWfew0HG7Hi6GXtJRZEIhuqAFYein5BTmB+zHO8W?= =?iso-8859-1?Q?mWjigIdZolDNREf68uLCQLJM5i8Ud+FmUvKbXM/Z3Qcnsdofk3CSDBIEMM?= =?iso-8859-1?Q?qA+IHwdEe8ht7CZWJcVFvRK9Yz/DW39m/xiyJq+kdzhL+P7YI3xy6wgqCD?= =?iso-8859-1?Q?zcUeDQYaluAqGIjB0ch6W7QNL1Q1qo87Xb1jOVOUmfTSN9XSZHZ7W/ut8e?= =?iso-8859-1?Q?KIfENRN7PdjgdMCevLO8Ygtcl8B39JljLJ2shMyIrU0izaEGQJPgRJDDNB?= =?iso-8859-1?Q?qZWJr0LvmFZDscHPEVjfsPrfzsutyYhkdh0buVSJgq1tavQnRJyjl2jDuw?= =?iso-8859-1?Q?Fi5rX2WFi2DfDPOTYL7nU4BayAgZnTGS+ERwe2BHVrxe81oSUCyFPmTczo?= =?iso-8859-1?Q?HXBbUpc7nF7yTFd4XrsfvN0TN8u+E9j7JBnpwZ73RsZsx6BSh4ga7k2dlq?= =?iso-8859-1?Q?7VRuwgch3txyVxLYIBFTsg48gRz3ua3GCdFMPpINRRUIq2Gr3L/bw0MKtI?= =?iso-8859-1?Q?eBMrxvpDo4oPc0RjX7GiEn33kHdMAnndUa+cV+0yyuoHI0DIsXh/V/1AZR?= =?iso-8859-1?Q?0kO6HST5zcc09oUd+8CTBsvA24wh+rqCS2H0asZqcRTRGWcmPcSYZ9heia?= =?iso-8859-1?Q?5/SL6TCioGLTZ5+hcL/rJTK3LlUT2lmkv25atFkgOHWsUYQnHmVxr7H8eY?= =?iso-8859-1?Q?K4M1kPtBEv8lB1D6C1cthvTWOEG2SCWg8Qlycmlu/Cvws0EDw3uCtlwRZy?= =?iso-8859-1?Q?vNtZjcQ8E8VCNHAeJJ47sAY+0QjhnffBmqep67B4a9D9rt5dE7e0ylpCxq?= =?iso-8859-1?Q?U+ARFk02PT7gNV+tlZOt/JMO6AFeeNP112PguD1GbcXI0La23MK8Gc0jk/?= =?iso-8859-1?Q?oXdhycK8IsrAvymLBw2wFuvXa46wlqjYdGyOBs/F3O5utqd4ZHSBX1Zo9L?= =?iso-8859-1?Q?Vlq9BlowiGjlR7psOfyFcbFo96bbeiKVtzjEdl1EBPb/hWVym8c5U/EsjB?= =?iso-8859-1?Q?lUixhwRKmsw3swlrMD50feebcuN34j4y3USA99Fia+yzilAh4+6N7vZY8f?= =?iso-8859-1?Q?YYcIwmvESV43IHMsDz+DBc7aoQy3urP9B7XjcqL2M7TetxjRHKS37EmJpz?= =?iso-8859-1?Q?lrAfbcNykpbLOR6gmwuD5DX1hO0c3gZ4b9UOsdDMtdNXhi+/ObLRP0FUSE?= =?iso-8859-1?Q?2ZrFmAA5pXSVxnuGP0cFypMGLHIb8GIizHUbkhdZCZlQmCfDC5J64Nxbe/?= =?iso-8859-1?Q?ioiinPBSkbLitlD7VR8N95jF+oYFG2UvQbl7i6N2ehurvQheoxPPv7Ue1U?= =?iso-8859-1?Q?RomtWzxKKDQq7stk506daCdcNfWsRgX/yD2RdNbA60Bc1JsiFT+Tfpakjn?= =?iso-8859-1?Q?7rDumCBmLoufsHcPmRseFAEXQVFzbqhiwCwu6Hx2fpZO9CXiibWHSPXqv/?= =?iso-8859-1?Q?1+fnSv/KT+6b3oxMPXAF+aMiMtgOiP+OKTM8YVCJow0kk2TguWolaX8+Bl?= =?iso-8859-1?Q?3YR6pDZQDOi2rJe11TicQhAWvPTRoUOXCNSaR9JycegGwqvHTzejcmOhKJ?= =?iso-8859-1?Q?ybqpJINmdBiqDiQJT7NxonQq1M0161Sy8CveexfuXIq/OvGDFRRJ+cQB32?= =?iso-8859-1?Q?ZEXTjTC2T3x3SqAX3T2CSorZRJLfLvoF8YtGBHaGu5szccUGIXZ+AL4P/6?= =?iso-8859-1?Q?qh?= x-ms-exchange-antispam-messagedata-1: KWpGY/z+5Z8f1Q== Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable List-Id: Production branch of FreeBSD source code List-Archive: https://lists.freebsd.org/archives/freebsd-stable List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: YT2PR01MB9730.CANPRD01.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: 48bd0516-beec-44cb-d94b-08da2e32088b X-MS-Exchange-CrossTenant-originalarrivaltime: 05 May 2022 00:56:05.2346 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: rusvrN22rJK5mEaPzpj+NAY63mrN2Fy2008GRN/AsMu6tGgV9pLf2VCZohMUEl3xRk4uz9x050evzZ//X5D+Aw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: YT2PR01MB5744 X-Rspamd-Queue-Id: 4KtwHc3DYGz3N9T X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector2 header.b=TINLVqxZ; arc=pass ("microsoft.com:s=arcselector9901:i=1"); dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 40.107.115.43 as permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-6.00 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector2]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:40.107.0.0/16]; MIME_GOOD(-0.10)[text/plain]; NEURAL_HAM_LONG(-1.00)[-1.000]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[uoguelph.ca:+]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; RCVD_IN_DNSWL_NONE(0.00)[40.107.115.43:from]; MLMMJ_DEST(0.00)[stable]; NEURAL_HAM_SHORT(-1.00)[-0.998]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_LAST(0.00)[]; ASN(0.00)[asn:8075, ipnet:40.104.0.0/14, country:US]; ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; RWL_MAILSPIKE_POSSIBLE(0.00)[40.107.115.43:from] X-ThisMailContainsUnwantedMimeParts: N Alan Somers wrote:=0A= > On Wed, May 4, 2022 at 5:23 PM Rick Macklem wrote:= =0A= > >=0A= > > Alan Somers wrote:=0A= > > > I have a FreeBSD 13 (tested on both 13.0-RELEASE and 13.1-RC5) deskto= p=0A= > > > mounting /usr/home over NFS 4.2 from an 13.0-RELEASE server. It=0A= > > > worked fine until a few weeks ago. Now, the desktop's performance=0A= > > > slowly degrades. It becomes less and less responsive until I restart= =0A= > > > X after 2-3 days. /var/log/Xorg.0.log shows plenty of entries like= =0A= > > > "AT keyboard: client bug: event processing lagging behind by 112ms,= =0A= > > > your system is too slow". "top -S" shows that the busiest process is= =0A= > > > nfscl. A dtrace profile shows that nfscl is spending most of its tim= e=0A= > > > in nfscl_cleanup_common, in the loop over all nfsclowner objects.=0A= > > > Running "nfsdumpstate" on the server shows thousands of OpenOwners fo= r=0A= > > > that client, and < 10 for any other NFS client. The OpenOwners=0A= > > > increases by about 3000 per day. And yet, "fstat" shows only a coupl= e=0A= > > > hundred open files on the NFS file system. Why are OpenOwners so=0A= > > > high? Killing most of my desktop processes doesn't seem to make a=0A= > > > difference. Restarting X does improve the perceived responsiveness,= =0A= > > > though it does not change the number of OpenOwners.=0A= > > >=0A= > > > How can I figure out which process(es) are responsible for the=0A= > > > excessive OpenOwners?=0A= > > An OpenOwner represents a process on the client. The OpenOwner=0A= > > name is an encoding of pid + process startup time.=0A= > > However, I can't think of an easy way to get at the OpenOwner name.=0A= > >=0A= > > Now, why aren't they going away, hmm..=0A= > >=0A= > > I'm assuming the # of Opens is not large?=0A= > > (Openowners cannot go away until all associated opens=0A= > > are closed.)=0A= > =0A= > Oh, I didn't mention that yes the number of Opens is large. Right=0A= > now, for example, I have 7950 OpenOwner and 8277 Open.=0A= Well, the openowners cannot go away until the opens go away,=0A= so the problem is that the opens are not getting closed.=0A= =0A= Close happens when the v_usecount on the vnode goes to zero.=0A= Something is retaining the v_usecount. One possibility is that most=0A= of the opens are for the same file, but with different openowners.=0A= If that is the case, the "oneopenown" mount option will deal with it.=0A= =0A= Another possibility is that something is retaining a v_usecount=0A= reference on a lot of the vnodes. (This used to happen when a nullfs=0A= mount with caching enabled was on top of the nfs mount.)=0A= I don't know what other things might do that?=0A= =0A= > >=0A= > > Commit 1cedb4ea1a79 in main changed the semantics of this=0A= > > a little, to avoid a use-after-free bug. However, it is dated=0A= > > Feb. 25, 2022 and is not in 13.0, so I don't think it could=0A= > > be the culprit.=0A= > >=0A= > > Essentially, the function called nfscl_cleanupkext() should call=0A= > > nfscl_procdoesntexist(), which returns true after the process has=0A= > > exited and when that is the case, calls nfscl_cleanup_common().=0A= > > --> nfscl_cleanup_common() will either get rid of the openowner or,=0A= > > if there are still children with open file descriptors, mark it "= defunct"=0A= > > so it can be free'd once the children close the file.=0A= > >=0A= > > It could be that X is now somehow creating a long chain of processes=0A= > > where the children inherit a file descriptor and that delays the cleanu= p=0A= > > indefinitely?=0A= > > Even then, everything should get cleaned up once you kill off X?=0A= > > (It might take a couple of seconds after killing all the processes off.= )=0A= > >=0A= > > Another possibility is that the "nfscl" thread is wedged somehow.=0A= > > It is the one that will call nfscl_cleanupkext() once/sec. If it never= =0A= > > gets called, the openowners will never go away.=0A= > >=0A= > > Being old fashioned, I'd probably try to figure this out by adding=0A= > > some printf()s to nfscl_cleanupkext() and nfscl_cleanup_common().=0A= > =0A= > dtrace shows that nfscl_cleanupkext() is getting called at about 0.6 hz.= =0A= That sounds ok. Since there are a lot of opens/openowners, it probably=0A= is getting behind.=0A= =0A= > >=0A= > > To avoid the problem, you can probably just use the "oneopenown"=0A= > > mount option. With that option, only one openowner is used for=0A= > > all opens. (Having separate openowners for each process was needed=0A= > > for NFSv4.0, but not NFSv4.1/4.2.)=0A= > >=0A= > > > Or is it just a red herring and I shouldn't=0A= > > > worry?=0A= > > Well, you can probably avoid the problem by using the "oneopenown"=0A= > > mount option.=0A= > =0A= > Ok, I'm trying that now. After unmounting and remounting NFS,=0A= > "nfsstat -cE" reports 1 OpenOwner and 11 Opens". But on the server,=0A= > "nfsdumpstate" still reports thousands. Will those go away=0A= > eventually?=0A= If the opens are gone then, yes, they will go away. They are retained for= =0A= a little while so that another Open against the openowner does not need=0A= to recreate the openowner (which also implied an extra RPC to confirm=0A= the openowner in NFSv4.0).=0A= =0A= I think they go away after a few minutes, if I recall correctly.=0A= If the server thinks there are still Opens, then they will not go away.=0A= =0A= rick=0A= =0A= >=0A= > Thanks for reporting this, rick=0A= > ps: And, yes, large numbers of openowners will slow things down,=0A= > since the code ends up doing linear scans of them all in a linked= =0A= > list in various places.=0A= >=0A= > -Alan=0A= >=0A=