Date: Mon, 6 Jun 2022 14:22:25 -0400
From: Mark Johnston <markj@freebsd.org>
To: Jan Mikkelsen <janm@transactionware.com>
Cc: Paul Floyd <paulf2718@gmail.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: Hang ast / pipelk / piperd
Message-ID: <Yp5F4acZq%2B4kOM15@nuc>
In-Reply-To: <6DBE2C6F-7F8B-457A-AB10-1912965C3376@transactionware.com>
References: <84015bf9-8504-1c3c-0ba5-58d0d7824843@gmail.com>
 <dca6a5b4-6f0c-98c0-2f2d-6e5da7405af4@gmail.com>
 <YpTRj7jVE0jfbxPO@nuc>
 <b598c89e-f11a-eed1-3d74-c3ef37bc400a@gmail.com>
 <Ypd0ziZKcZ2pgm0P@nuc>
 <bf595064-e03c-e69d-9d93-3f7de52360c0@gmail.com>
 <6DBE2C6F-7F8B-457A-AB10-1912965C3376@transactionware.com>

On Thu, Jun 02, 2022 at 12:49:45PM +0200, Jan Mikkelsen wrote:
> All these mi_switch+0xc2 hangs reminded me of something I saw once on 13.1-RC2 back in April. The machine was running five concurrent “make -j32 installworld” processes.
>
> The machine hung, disk activity stopped. Results of ^T on various running commands:
>
> ^T on a “tail -F” command:
>
> load: 1.93 cmd: tail 27541 [zfs teardown inactive] 393.65r 0.06u 0.10s 0% 2548k
> mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 zfs_freebsd_reclaim+0x26 VOP_RECLAIM_APV+0x1f vgonel+0x342 vnlru_free_impl+0x2f7 vn_alloc_hard+0xc8 getnewvnode_reserve+0x93 zfs_zget+0x22 zfs_dirent_lookup+0x16b zfs_dirlook+0x7a zfs_lookup+0x3d0 zfs_cache_lookup+0xa9 VOP_LOOKUP+0x30 cache_fplookup_noentry+0x1a3 cache_fplookup+0x366 namei+0x12a
>
> ^T on a zsh doing a cd to a UFS directory:
>
> load: 0.48 cmd: zsh 86937 [zfs teardown inactive] 84663.01r 0.06u 0.01s 0% 6412k
> mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 zfs_freebsd_reclaim+0x26 VOP_RECLAIM_APV+0x1f vgonel+0x342 vnlru_free_impl+0x2f7 vn_alloc_hard+0xc8 getnewvnode_reserve+0x93 zfs_zget+0x22 zfs_dirent_lookup+0x16b zfs_dirlook+0x7a zfs_lookup+0x3d0 zfs_cache_lookup+0xa9 lookup+0x45c namei+0x259 kern_statat+0xf3 sys_fstatat+0x2f

This looks very similar to the problem described here:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261448
Though, in my case I did not see any deadlocks. In other words, the hang always ended after some time (typically a few seconds).

> ^T on an attempt to start gstat
>
> load: 0.17 cmd: gstat 63307 [ufs] 298.29r 0.00u 0.00s 0% 228k
> mi_switch+0xc2 sleeplk+0xf6 lockmgr_slock_hard+0x3e7 ffs_lock+0x6c _vn_lock+0x48 vget_finish+0x21 cache_lookup+0x26c vfs_cache_lookup+0x7b lookup+0x45c namei+0x259 vn_open_cred+0x533 kern_openat+0x283 amd64_syscall+0x10c fast_syscall_common+0xf8
>
> A short press of the system power button did nothing.
>
> The installworld target directories were on a ZFS filesystem with a single mirror of two SATA SSDs.
>
> Unsure if it’s related because the rest of the stack traces are different. However, the mi_switch+0xc2 triggered a memory.

mi_switch() is the main entry point into the CPU scheduler, so pretty much any thread which isn't on a CPU will have mi_switch() appear in its backtrace.
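
To compare hangs like these, it can help to skip the frames that only show a thread is off-CPU and look at the first frame above them. Here is a minimal sketch in Python, not an existing tool. The SCHED_FRAMES set and the helper name are my own illustrative assumptions, and the sample traces are shortened from the ^T output quoted above.

```python
# Frames treated here as "generic sleep/scheduler" frames. This set is an
# assumption for illustration and is not exhaustive.
SCHED_FRAMES = {"mi_switch", "_sleep", "sleeplk", "sleepq_wait"}


def first_interesting_frame(trace):
    """Return the first frame name that is not a generic sleep frame.

    `trace` is a whitespace-separated list of "symbol+0xoffset" entries,
    innermost frame first, as printed by ^T.
    """
    for frame in trace.split():
        name = frame.split("+", 1)[0]  # drop the "+0x..." offset
        if name not in SCHED_FRAMES:
            return name
    return None


# Shortened copies of the traces quoted above.
traces = {
    "tail": "mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 zfs_freebsd_reclaim+0x26",
    "gstat": "mi_switch+0xc2 sleeplk+0xf6 lockmgr_slock_hard+0x3e7 ffs_lock+0x6c",
}

for cmd, trace in traces.items():
    print(cmd, first_interesting_frame(trace))
```

With the two samples above, tail maps to rms_rlock_fallback and gstat to lockmgr_slock_hard, which is where the traces actually differ.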