From: Rick Macklem
To: Kurt Jaeger
CC: "freebsd-fs@freebsd.org"
Subject: Re: FreeBSD 12.3/13.1 NFS client hang
Date: Sat, 28 May 2022 21:27:45 +0000
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

Rick Macklem wrote:
> Kurt Jaeger wrote:
> > > > I'm having issues with the NFS clients on FreeBSD 12.3 and 13.1
> >
> > I have it with an 13.0p7 client against an 13.1 server with
> > a hanging soft-mount (I tried unmount to change it to a hard mount).
> >
> > 61585 93- D+ 0:00.00 umount /office/serv
> > 61635 133 D 0:00.00 umount -f /office/serv
> > 7784 138 D 0:00.00 umount -N /office/serv
> The first umount must be "-N". Once you've hung a non "-N" umount,
> rebooting is the only option.
> (I have thought of doing a "umount -N -A" (for all NFS mounts), which
> would allow it to kill off all NFS activity without even finding the pathname
> for the mountpoint, but I have not done so.)
I take this back. I just did a fairly trivial test of this and it worked.
Looking at the "ps" output, I don't think your case is a "NFS protocol hang".
When I look at the "ps" output, there are no threads waiting on NFS RPCs to complete.
(umount -N kills off outstanding RPCs, so the VFS/VOP ops can complete with error, which should
dismount a hang caused by an unresponsive NFS server or similar.)

The only threads sleeping in the nfs code are waiting for an NFS vnode lock.
I suspect that some process/thread is hung for something non-NFS while holding a lock
on a NFS vnode. "umount -N" won't know how to unhang this process/thread.
Just a hunch, but I'd suspect one of the threads sleeping on "vmopar", although I'm
not a vm guy.
What I don't know how to do is figure out what thread(s) are holding vnode locks?

This also implies that switching from soft->hard won't fix the problem.

It would be nice if "umount -N" could handle this case.
I'll look at the VFS code and
maybe talk to kib@ to see if there is a way to mark all NFS vnodes "dead" so that
vn_lock() will either return an error or a locked but VI_DOOMED vnode (if LK_RETRY is
specified).

In summary, I don't think your hang is anything like Andreas's, rick

> and procstat:
>
> # procstat -kk 7784
> PID TID COMM TDNAME KSTACK
> 7784 107226 umount - mi_switch+0xc1 sleeplk+0xec lockmgr_xlock_hard+0x345 _vn_lock+0x48 vget_finish+0x21 cache_lookup+0x299 vfs_cache_lookup+0x7b lookup+0x68c namei+0x487 kern_unmount+0x164 amd64_syscall+0x10c fast_syscall_common+0xf8
> # procstat -kk 61635
> PID TID COMM TDNAME KSTACK
> 61635 775458 umount - mi_switch+0xc1 sleeplk+0xec lockmgr_slock_hard+0x382 _vn_lock+0x48 vget_finish+0x21 cache_lookup+0x299 vfs_cache_lookup+0x7b lookup+0x68c namei+0x487 sys_statfs+0xc3 amd64_syscall+0x10c fast_syscall_common+0xf8
> # procstat -kk 61585
> PID TID COMM TDNAME KSTACK
> 61585 516164 umount - mi_switch+0xc1 sleeplk+0xec lockmgr_xlock_hard+0x345 nfs_lock+0x2c vop_sigdefer+0x2b _vn_lock+0x48 vflush+0x151 nfs_unmount+0xc3 vfs_unmount_sigdefer+0x2e dounmount+0x437 kern_unmount+0x332 amd64_syscall+0x10c fast_syscall_common+0xf8
These just show that they are waiting for NFS vnodes. In the "ps" there are
threads waiting on zfs vnodes as well.

> ps-axHl can be found at
>
> https://people.freebsd.org/~pi/logs/ps-axHl.txt
I suspect your problem might be related to wired pages. Note that
several threads are sleeping on "vmopar". I'm no vm guy, but I
think that might mean too many pages have become wired.

rick

> > systems hanging when using a CentOS 7 server.
> First, make sure you are using hard mounts. "soft" or "intr" mounts won't
> work and will mess up the session sooner or later.
> (A messed up session could
> result in no free slots on the session and that will wedge threads in
> nfsv4_sequencelookup() as you describe.
> (This is briefly described in the BUGS section of "man mount_nfs".)
>
> Do a:
> # nfsstat -m
> on the clients and look for "hard".

No output at all for that 8-(

--
pi@FreeBSD.org +49 171 3101372 Now what ?
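
As a rough aid for the "ps" reading discussed above (spotting how many threads sleep on a wait channel such as "vmopar"), here is a minimal Python sketch. It assumes the input is text in the shape of "ps -axHl" output whose header row contains an MWCHAN column; the function name count_wait_channels and the SAMPLE data are hypothetical and invented for illustration, not taken from the ps-axHl.txt log linked in this thread.

```python
from collections import Counter

# Hypothetical sample shaped like FreeBSD "ps -axHl" output.
# All values below are invented for illustration only.
SAMPLE = """\
UID   PID  PPID C PRI NI   VSZ  RSS MWCHAN STAT TT     TIME COMMAND
  0  1001     1 0  20  0 12345 2345 vmopar D     -  0:00.10 exampled
  0  1002     1 0  20  0 12345 2345 vmopar D     -  0:00.20 exampled
  0  1003     1 0  20  0 11111 1111 nfs    D     0  0:00.00 umount -N /mnt
  0  1004     1 0  20  0 11111 1111 -      R     0  0:00.01 ps -axHl
"""


def count_wait_channels(ps_output: str) -> Counter:
    """Tally threads per wait channel, using the MWCHAN column of the header."""
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    if not lines:
        return Counter()
    header = lines[0].split()
    if "MWCHAN" not in header:
        raise ValueError("no MWCHAN column in header; was this produced with -l?")
    idx = header.index("MWCHAN")
    counts = Counter()
    for line in lines[1:]:
        # Columns before MWCHAN are numeric, so whitespace splitting is safe
        # up to and including the wait-channel field.
        fields = line.split()
        if len(fields) > idx:
            counts[fields[idx]] += 1
    return counts


# Print wait channels from most to least common.
for channel, n in count_wait_channels(SAMPLE).most_common():
    print(f"{n:4d} {channel}")
```

Feeding it the saved ps text instead of SAMPLE gives a quick count per wait channel; it does not identify which thread holds a given vnode lock, which is the open question above.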