Date:      Tue, 23 Nov 2021 12:18:58 +0100
From:      Ronald Klop via freebsd-fs <>
To:        "Andriy Gapon" <>, "Chris Ross" <>
Cc:        "Mark Johnston" <>, freebsd-fs <>
Subject:   Re: swap_pager: cannot allocate bio
Message-ID:  <op.1da5pwr2kndu52@joepie>
In-Reply-To: <>
References:  <> <> <op.1cpimpsmkndu52@joepie> <> <YY7KSgGZY9ehdjzu@nuc> <> <> <> <YZJzy%2ByI40wXFYjd@nuc> <> <> <>

On Sat, 20 Nov 2021 04:35:52 +0100, Chris Ross <> wrote:

> (Sorry that the subject on this thread may not be relevant any more, but
> I don’t want to disconnect the thread.)
>> On Nov 15, 2021, at 13:17, Chris Ross <> wrote:
>>> On Nov 15, 2021, at 10:08, Andriy Gapon <> wrote:
>>> Yes, I propose to remove the wait for ARC evictions from arc_lowmem().
>>> Another thing that may help a bit is having a greater "slack" between
>>> a threshold where the page daemon starts paging out and a threshold
>>> where memory allocations start to wait (via vm_wait_domain).
>>> Also, I think that for a long time we had a problem (but not sure if
>>> it's still present) where allocations succeeded without waiting until
>>> the free memory went below certain threshold M, but once a thread
>>> started waiting in vm_wait it would not be woken up until the free
>>> memory went above another threshold N.  And the problem was that
>>> N >> M.  In other words, a lot of memory had to be freed (and not
>>> grabbed by other threads) before the waiting thread would be woken up.
>> Thank you both for your inputs.  Let me know if you’d like me to try
>> anything, and I’ll kick (reboot) the system and can build a new kernel
>> when you’d like.  I did get another procstat -kka out of it this
>> morning, and the system has since gone less responsive, but I assume
>> that new procstat won’t show anything last night’s didn’t.
> I’m still having this issue.  I rebooted the machine, fsck’d the disks,
> and got it running again.  Again, it ran for ~50 hours before getting
> stuck.  I got another procstat -kka off of it, let me know if you’d like
> a copy of it.  But, it looks like the active processes are all in
> arc_wait_for_eviction.  A pagedaemon is in an arc_wait_for_eviction under
> an arc_lowmem, but the python processes that were doing the real work
> don’t have arc_lowmem in their stacks, just the arc_wait_for_eviction.
> Please let me know if there’s anything I can do to assist in finding a
> remedy for this.  Thank you.
>              - Chris

Just a wild guess. Would it help if you set a limit in vfs.zfs.arc_max?
Maybe that will help lower the memory pressure and gain some stability.
You can use the zfs-stats package to see the current ARC size.

My RPI4 gives:
# zfs-stats -A
ARC Size:                               28.19%  1.93    GiB
         Target Size: (Adaptive)         30.47%  2.08    GiB
         Min Size (Hard Limit):          3.58%   250.80  MiB
         Max Size (High Water):          27:1    6.84    GiB
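
As a rough sketch of what capping the ARC could look like: the snippet
below only prints a line for /boot/loader.conf and does not touch the
system. The helper name arc_max_line and the 4 GiB figure are made up for
illustration; pick a value that fits your RAM and workload. Whether
vfs.zfs.arc_max can also be changed at runtime on your release is an
assumption to verify.

```shell
#!/bin/sh
# Sketch only: print a loader.conf line that caps the ZFS ARC.
# The 4 GiB figure used below is a made-up example, not a recommendation.

# arc_max_line GIB: hypothetical helper that converts GiB to bytes and
# formats the tunable the way /boot/loader.conf expects it.
arc_max_line() {
    gib=$1
    bytes=$((gib * 1024 * 1024 * 1024))
    printf 'vfs.zfs.arc_max="%s"\n' "$bytes"
}

arc_max_line 4

# To apply it (as root), one would append the printed line to
# /boot/loader.conf and reboot, or possibly try it at runtime:
#   sysctl vfs.zfs.arc_max=4294967296
```

With the example value this prints vfs.zfs.arc_max="4294967296".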

You can use your stats to tune it to not use too much memory for ARC and
leave more for the running applications so swapping might also be reduced.
You can check zfs-stats -E to see if the ARC cache hit ratio is still ok
with limited ARC.
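
If you prefer to watch the hit ratio without zfs-stats, something like
the following might do as a minimal sketch. The helper name hit_ratio and
the sample counters are made up; I am assuming the ratio of interest is
simply hits / (hits + misses) and that the counters are exposed under
kstat.zfs.misc.arcstats on your system.

```shell
#!/bin/sh
# Sketch only: compute an ARC hit ratio from raw hit and miss counters.

# hit_ratio HITS MISSES: hypothetical helper, prints hits/(hits+misses)
# as a percentage, or "n/a" when both counters are zero.
hit_ratio() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        if (h + m == 0) { print "n/a"; exit }
        printf "%.2f%%\n", 100 * h / (h + m)
    }'
}

# Made-up counters, for illustration only.
hit_ratio 900 100

# With the real counters (assumed sysctl names, please verify):
#   hit_ratio "$(sysctl -n kstat.zfs.misc.arcstats.hits)" \
#             "$(sysctl -n kstat.zfs.misc.arcstats.misses)"
```

The made-up counters above give 90.00%. Comparing the figure before and
after lowering the limit should show whether the smaller ARC hurts.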

