Date:      Mon, 6 Jun 2022 14:22:25 -0400
From:      Mark Johnston <markj@freebsd.org>
To:        Jan Mikkelsen <janm@transactionware.com>
Cc:        Paul Floyd <paulf2718@gmail.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: Hang ast / pipelk / piperd
Message-ID:  <Yp5F4acZq%2B4kOM15@nuc>
In-Reply-To: <6DBE2C6F-7F8B-457A-AB10-1912965C3376@transactionware.com>
References:  <84015bf9-8504-1c3c-0ba5-58d0d7824843@gmail.com> <dca6a5b4-6f0c-98c0-2f2d-6e5da7405af4@gmail.com> <YpTRj7jVE0jfbxPO@nuc> <b598c89e-f11a-eed1-3d74-c3ef37bc400a@gmail.com> <Ypd0ziZKcZ2pgm0P@nuc> <bf595064-e03c-e69d-9d93-3f7de52360c0@gmail.com> <6DBE2C6F-7F8B-457A-AB10-1912965C3376@transactionware.com>

On Thu, Jun 02, 2022 at 12:49:45PM +0200, Jan Mikkelsen wrote:
> All these mi_switch+0xc2 hangs reminded me of something I saw once on 13.1-RC2 back in April. The machine was running five concurrent “make -j32 installworld” processes.
> 
> The machine hung, disk activity stopped. Results of ^T on various running commands:
> 
> ^T on a “tail -F” command:
> 
> load: 1.93  cmd: tail 27541 [zfs teardown inactive] 393.65r 0.06u 0.10s 0% 2548k
> mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 zfs_freebsd_reclaim+0x26 VOP_RECLAIM_APV+0x1f vgonel+0x342 vnlru_free_impl+0x2f7 vn_alloc_hard+0xc8 getnewvnode_reserve+0x93 zfs_zget+0x22 zfs_dirent_lookup+0x16b zfs_dirlook+0x7a zfs_lookup+0x3d0 zfs_cache_lookup+0xa9 VOP_LOOKUP+0x30 cache_fplookup_noentry+0x1a3 cache_fplookup+0x366 namei+0x12a 
> 
> ^T on a zsh doing a cd to a UFS directory:
> 
> load: 0.48  cmd: zsh 86937 [zfs teardown inactive] 84663.01r 0.06u 0.01s 0% 6412k
> mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 zfs_freebsd_reclaim+0x26 VOP_RECLAIM_APV+0x1f vgonel+0x342 vnlru_free_impl+0x2f7 vn_alloc_hard+0xc8 getnewvnode_reserve+0x93 zfs_zget+0x22 zfs_dirent_lookup+0x16b zfs_dirlook+0x7a zfs_lookup+0x3d0 zfs_cache_lookup+0xa9 lookup+0x45c namei+0x259 kern_statat+0xf3 sys_fstatat+0x2f 

This looks very similar to the problem described here:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261448
In my case, though, I did not see any deadlocks: the hang always ended
after some time (typically a few seconds).

> ^T on an attempt to start gstat
> 
> load: 0.17  cmd: gstat 63307 [ufs] 298.29r 0.00u 0.00s 0% 228k
> mi_switch+0xc2 sleeplk+0xf6 lockmgr_slock_hard+0x3e7 ffs_lock+0x6c _vn_lock+0x48 vget_finish+0x21 cache_lookup+0x26c vfs_cache_lookup+0x7b lookup+0x45c namei+0x259 vn_open_cred+0x533 kern_openat+0x283 amd64_syscall+0x10c fast_syscall_common+0xf8 
> 
> A short press of the system power button did nothing.
> 
> The installworld target directories were on a ZFS filesystem with a single mirror of two SATA SSDs.
> 
> Unsure if it’s related because the rest of the stack traces are different. However, the mi_switch+0xc2 triggered a memory.

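mi_switch() is the main entry point into the CPU scheduler, so pretty
much any thread which isn't on a CPU will have mi_switch() appear in its
backtrace.  Seeing it in several hangs does not by itself mean they are
related; the frames that follow it are what distinguish them.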
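
To illustrate, here is a rough Python sketch (not a tool from the tree) that takes a ^T stack line like the ones quoted above, skips the frames common to most sleeping threads, and reports the first frame that says something about what the thread is waiting for. The helper names and the contents of GENERIC_FRAMES are my own assumptions for illustration; the set is certainly not exhaustive.

```python
import re

# Frames that tend to appear in any sleeping thread's stack and so carry
# little diagnostic value on their own. Assumption for illustration only;
# this is not an exhaustive or authoritative list.
GENERIC_FRAMES = {"mi_switch", "_sleep", "sleepq_wait", "sleeplk"}

# Matches tokens of the form "function+0xOFFSET" as printed by ^T.
FRAME_RE = re.compile(r"^(?P<func>[A-Za-z_][A-Za-z0-9_]*)\+0x[0-9a-fA-F]+$")


def parse_frames(stack_line):
    """Split a ^T kernel stack line into function names, dropping offsets."""
    frames = []
    for token in stack_line.split():
        match = FRAME_RE.match(token)
        if match:
            frames.append(match.group("func"))
    return frames


def first_interesting_frame(stack_line):
    """Return the first frame not in GENERIC_FRAMES, or None if there is none."""
    for func in parse_frames(stack_line):
        if func not in GENERIC_FRAMES:
            return func
    return None


# Leading frames copied from the traces quoted in this thread.
TAIL_STACK = ("mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 "
              "zfs_freebsd_reclaim+0x26 VOP_RECLAIM_APV+0x1f vgonel+0x342")
GSTAT_STACK = ("mi_switch+0xc2 sleeplk+0xf6 lockmgr_slock_hard+0x3e7 "
               "ffs_lock+0x6c _vn_lock+0x48 vget_finish+0x21")

if __name__ == "__main__":
    for name, stack in (("tail", TAIL_STACK), ("gstat", GSTAT_STACK)):
        print(name, "->", first_interesting_frame(stack))
```

Run on the two traces above, this picks out rms_rlock_fallback for tail and lockmgr_slock_hard for gstat, i.e. two different locks, even though both stacks begin with mi_switch+0xc2.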
