Date:      Tue, 23 Nov 2021 12:18:58 +0100
From:      Ronald Klop via freebsd-fs <freebsd-fs@freebsd.org>
To:        "Andriy Gapon" <avg@freebsd.org>, "Chris Ross" <cross+freebsd@distal.com>
Cc:        "Mark Johnston" <markj@freebsd.org>, freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: swap_pager: cannot allocate bio
Message-ID:  <op.1da5pwr2kndu52@joepie>
In-Reply-To: <A945F841-3EFD-46EE-91B2-38287D2F2F8D@distal.com>
References:  <9FE99EEF-37C5-43D1-AC9D-17F3EDA19606@distal.com> <09989390-FED9-45A6-A866-4605D3766DFE@distal.com> <op.1cpimpsmkndu52@joepie> <4E5511DF-B163-4928-9CC3-22755683999E@distal.com> <YY7KSgGZY9ehdjzu@nuc> <19A3AAF6-149B-4A3C-8C27-4CFF22382014@distal.com> <6DA63618-F0E9-48EC-AB57-3C3C102BC0C0@distal.com> <35c14795-3b1c-9315-8e9b-a8dfad575a04@FreeBSD.org> <YZJzy+yI40wXFYjd@nuc> <b2121d25-0782-5cc3-2b55-33ba11c41995@FreeBSD.org> <471B80F4-B8F4-4D5A-9DEB-3F1E00F42A68@distal.com> <A945F841-3EFD-46EE-91B2-38287D2F2F8D@distal.com>

On Sat, 20 Nov 2021 04:35:52 +0100, Chris Ross <cross+freebsd@distal.com> wrote:

> (Sorry that the subject on this thread may not be relevant any more,
> but I don’t want to disconnect the thread.)
>
>> On Nov 15, 2021, at 13:17, Chris Ross <cross+freebsd@distal.com> wrote:
>>> On Nov 15, 2021, at 10:08, Andriy Gapon <avg@freebsd.org> wrote:
>>
>>> Yes, I propose to remove the wait for ARC evictions from arc_lowmem().
>>>
>>> Another thing that may help a bit is having a greater "slack" between
>>> a threshold where the page daemon starts paging out and a threshold
>>> where memory allocations start to wait (via vm_wait_domain).
>>>
>>> Also, I think that for a long time we had a problem (but not sure if
>>> it's still present) where allocations succeeded without waiting until
>>> the free memory went below a certain threshold M, but once a thread
>>> started waiting in vm_wait it would not be woken up until the free
>>> memory went above another threshold N.  And the problem was that
>>> N >> M.  In other words, a lot of memory had to be freed (and not
>>> grabbed by other threads) before the waiting thread would be woken up.
>>
>> Thank you both for your inputs.  Let me know if you’d like me to try
>> anything, and I’ll kick (reboot) the system and can build a new kernel
>> when you’d like.  I did get another procstat -kka out of it this
>> morning, and the system has since gone less responsive, but I assume
>> that new procstat won’t show anything last night’s didn’t.
>
> I’m still having this issue.  I rebooted the machine, fsck’d the disks,
> and got it running again.  Again, it ran for ~50 hours before getting
> stuck.  I got another procstat -kka off of it; let me know if you’d
> like a copy of it.  But it looks like the active processes are all in
> arc_wait_for_eviction.  The pagedaemon is in arc_wait_for_eviction
> under arc_lowmem, but the python processes that were doing the real
> work don’t have arc_lowmem in their stacks, just arc_wait_for_eviction.
>
> Please let me know if there’s anything I can do to assist in finding a
> remedy for this.  Thank you.
>
>              - Chris


Just a wild guess. Would it help if you set a limit in the vfs.zfs.arc_max variable?
Maybe that will help lower the memory pressure and gain some stability.
You can use the zfs-stats package to see the current ARC size.

My RPI4 gives:
# zfs-stats -A
...
ARC Size:                               28.19%  1.93    GiB
        Target Size: (Adaptive)         30.47%  2.08    GiB
        Min Size (Hard Limit):          3.58%   250.80  MiB
        Max Size (High Water):          27:1    6.84    GiB
...

You can use your stats to tune it so the ARC does not take too much memory and more is left for the running applications; swapping might then also be reduced.
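
For example, to cap the ARC at 8 GiB (an illustrative number only; I
don't know how much RAM your machine has, so pick a value that fits
your workload):

# set at runtime (works on FreeBSD 13 with OpenZFS; on older releases
# arc_max may be a boot-time tunable only)
sysctl vfs.zfs.arc_max=8589934592

# to keep the cap across reboots, add this line to /boot/loader.conf
vfs.zfs.arc_max="8589934592"

After lowering the limit the ARC may take a while to shrink down to it,
as eviction has to catch up.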

You can check zfs-stats -E to see if the ARC cache hit ratio is still OK with the limited ARC.
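
The relevant lines look roughly like this (a sketch from memory; the
exact layout can differ between zfs-stats versions):

# zfs-stats -E
...
        Cache Hit Ratio:                xx.xx%
...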

Regards,
Ronald.


