Date:      Fri, 12 Nov 2021 15:10:50 -0500
From:      Mark Johnston <markj@freebsd.org>
To:        Chris Ross <cross+freebsd@distal.com>
Cc:        ronald-lists@klop.ws, freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: swap_pager: cannot allocate bio
Message-ID:  <YY7KSgGZY9ehdjzu@nuc>
In-Reply-To: <4E5511DF-B163-4928-9CC3-22755683999E@distal.com>
References:  <9FE99EEF-37C5-43D1-AC9D-17F3EDA19606@distal.com> <09989390-FED9-45A6-A866-4605D3766DFE@distal.com> <op.1cpimpsmkndu52@joepie> <4E5511DF-B163-4928-9CC3-22755683999E@distal.com>

On Thu, Nov 11, 2021 at 04:49:21PM -0500, Chris Ross wrote:
> 
> 
> > On Nov 11, 2021, at 13:50, Ronald Klop via freebsd-fs <freebsd-fs@freebsd.org> wrote:
> > 
> > Can you press Ctrl-T on the hanging process? That should print a stack trace showing where it is waiting.
> 
> So, I rebooted the machine this morning, but now have [tried to] log into it to check on it and find that an ssh connection doesn’t result in a shell.  I logged into the console, tried to start a “screen” to get more prompts, and it hung.  Ctrl-T on that shows (after running a console screen-capture through OCR, and hand correction, so may not be 100%):
> 
> root@host:~ # screen
> load: 0.07 cmd: csh 56116 [vmwait] 35.00r 0.00u 0.01s 0% 3984k
> mi_switch+0xc1 _sleep+0x1cb vm_wait_doms+0xe2 vm_wait_domain+0x51 vm_domain_alloc_fail+0x86 vm_page_alloc_domain_after+0x7e uma_small_alloc+0x58 keg_alloc_slab+0xba zone_import+0xee zone_alloc_item+0x6f malloc+0x5d sigacts_alloc+0x1c fork1+0x9fb sys_fork+0x54 amd64_syscall+0x10c fast_syscall_common+0xf8 
> 
> As before, ps and even mount and df work here on console.  But, a “zpool status tank” will hang as before.  A Ctrl+T on it shows:
> 
> root@host:~ # screen
> load: 0.00 cmd: zpool 62829 [aw.aew_cv] 37.89r 0.00u 0.00s 0% 6976k
> mi_switch+0xc1 _cv_wait+0xf2 arc_wait_for_eviction+0x14a arc_get_data_impl+0xdb arc_hdr_alloc_abd+0xa6 arc_hdr_alloc+0x11e arc_read+0x4f4 dbuf_read+0xc08 dmu_buf_hold+0x46 zap_lookup_norm+0x35 zap_contains+0x26 vdev_rebuild_get_stats+0xac vdev_config_generate+0x3e9 vdev_config_generate+0x74f spa_config_generate+0x2a2 spa_open_common+0x25c spa_get_stats+0x4e zfs_ioc_pool_stats+0x22
> 
> 
> 
> > On Nov 11, 2021, at 14:10, Dave Cottlehuber <> wrote:
> > 
> > Grab output of ‘procstat -kk’ and see if this is similar to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258208 a few more prods might get this one addressed!
> 
> procstat -kk 62829 yields the same as above, which I presume is expected; I’d just never used procstat -kk before.
> 
> Unfortunately, I can’t tell if this is sufficiently similar to bug 258208.  A different ZFS operation is happening here, so the calls behind my zpool status are different.  The other non-zfs stat above (screen in my case) doesn’t seem to be hitting zfs at all, but I may be missing something.  Andriy, Mark J, let me know if you think this is relevant, I can build a 13-STABLE with D32931 if you think it will be of use.

No, this looks like a different problem.

If it's possible to reproduce this and procstat -kka is usable, it would
be helpful to see the full output.  In particular, I am wondering if the
page daemon is getting blocked waiting for the arc evict handler to
successfully allocate memory.
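
A minimal sketch of how that output could be captured and narrowed
down.  The function names and the output path are hypothetical, the
symbol names in the pattern are taken from the Ctrl-T output quoted
above, and the page daemon's process name is assumed to be
"pagedaemon".  The full dump is still what is most useful; the filter
is only a quick first look.

```shell
#!/bin/sh
# Sketch only: function names and file paths here are hypothetical.

# Dump the kernel stacks of all threads into the file named by $1.
capture_stacks() {
    procstat -kka > "$1"
}

# From a saved dump ($1), print the lines that mention the page daemon
# (process name assumed to be "pagedaemon") or the wait functions seen
# in the Ctrl-T output earlier in this thread.
filter_stacks() {
    grep -E 'pagedaemon|arc_wait_for_eviction|vm_wait_doms' "$1"
}

# Example use on the affected machine:
#   capture_stacks /var/tmp/procstat-kka.txt
#   filter_stacks /var/tmp/procstat-kka.txt
```

Writing the dump to a file rather than paging it avoids starting
another process, which may itself block in fork as the "screen"
attempt did.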

https://cgit.freebsd.org/src/commit/?id=97ed4babb51636d8a4b11bc7b207c3219ffcd0e3
is an example of a fix for such a problem, and is not present in 13.0.
I would also suggest trying to apply that patch, though I'm fairly sure
there are other such problems still lurking.

> Thanks.  Let me know any thoughts you have.
> 
>           - Chris
> 


