Date: Tue, 23 Nov 2021 12:18:58 +0100
From: Ronald Klop via freebsd-fs <freebsd-fs@freebsd.org>
To: "Andriy Gapon" <avg@freebsd.org>, "Chris Ross" <cross+freebsd@distal.com>
Cc: "Mark Johnston" <markj@freebsd.org>, freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: swap_pager: cannot allocate bio
Message-ID: <op.1da5pwr2kndu52@joepie>
In-Reply-To: <A945F841-3EFD-46EE-91B2-38287D2F2F8D@distal.com>
References: <9FE99EEF-37C5-43D1-AC9D-17F3EDA19606@distal.com> <09989390-FED9-45A6-A866-4605D3766DFE@distal.com> <op.1cpimpsmkndu52@joepie> <4E5511DF-B163-4928-9CC3-22755683999E@distal.com> <YY7KSgGZY9ehdjzu@nuc> <19A3AAF6-149B-4A3C-8C27-4CFF22382014@distal.com> <6DA63618-F0E9-48EC-AB57-3C3C102BC0C0@distal.com> <35c14795-3b1c-9315-8e9b-a8dfad575a04@FreeBSD.org> <YZJzy%2ByI40wXFYjd@nuc> <b2121d25-0782-5cc3-2b55-33ba11c41995@FreeBSD.org> <471B80F4-B8F4-4D5A-9DEB-3F1E00F42A68@distal.com> <A945F841-3EFD-46EE-91B2-38287D2F2F8D@distal.com>
On Sat, 20 Nov 2021 04:35:52 +0100, Chris Ross <cross+freebsd@distal.com> wrote:

> (Sorry that the subject on this thread may not be relevant any more, but
> I don't want to disconnect the thread.)
>
>> On Nov 15, 2021, at 13:17, Chris Ross <cross+freebsd@distal.com> wrote:
>>
>>> On Nov 15, 2021, at 10:08, Andriy Gapon <avg@freebsd.org> wrote:
>>>
>>> Yes, I propose to remove the wait for ARC evictions from arc_lowmem().
>>>
>>> Another thing that may help a bit is having a greater "slack" between
>>> a threshold where the page daemon starts paging out and a threshold
>>> where memory allocations start to wait (via vm_wait_domain).
>>>
>>> Also, I think that for a long time we had a problem (but not sure if
>>> it's still present) where allocations succeeded without waiting until
>>> the free memory went below a certain threshold M, but once a thread
>>> started waiting in vm_wait it would not be woken up until the free
>>> memory went above another threshold N. And the problem was that N >>
>>> M. In other words, a lot of memory had to be freed (and not grabbed
>>> by other threads) before the waiting thread would be woken up.
>>
>> Thank you both for your inputs. Let me know if you'd like me to try
>> anything, and I'll kick (reboot) the system and can build a new kernel
>> when you'd like. I did get another procstat -kka out of it this
>> morning, and the system has since gone less responsive, but I assume
>> that new procstat won't show anything last night's didn't.
>
> I'm still having this issue. I rebooted the machine, fsck'd the disks,
> and got it running again. Again, it ran for ~50 hours before getting
> stuck. I got another procstat -kka off of it; let me know if you'd like
> a copy of it. But it looks like the active processes are all in
> arc_wait_for_eviction.
> The pagedaemon is in arc_wait_for_eviction under arc_lowmem, but the
> python processes that were doing the real work don't have arc_lowmem in
> their stacks, just arc_wait_for_eviction.
>
> Please let me know if there's anything I can do to assist in finding a
> remedy for this. Thank you.
>
> - Chris

Just a wild guess: would it help if you set a limit in the vfs.zfs.arc_max variable? Maybe that will lower the memory pressure and gain some stability.

You can use the zfs-stats package to see the current ARC size. My RPI4 gives:

# zfs-stats -A
...
ARC Size:                               28.19%  1.93   GiB
        Target Size: (Adaptive)         30.47%  2.08   GiB
        Min Size (Hard Limit):          3.58%   250.80 MiB
        Max Size (High Water):          27:1    6.84   GiB
...

You can use your own stats to tune the ARC so it does not use too much memory and leaves more for the running applications; swapping might also be reduced. You can check zfs-stats -E to see if the ARC cache hit ratio is still OK with a limited ARC.

Regards,
Ronald.
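For what it's worth, a minimal sketch of how the cap could be applied. The 4 GiB value is only an example (pick something suited to your RAM and workload); whether the runtime sysctl is writable depends on the FreeBSD/OpenZFS version, so the loader.conf tunable is the conservative route:

```shell
# Example only: cap the ARC at 4 GiB (value is in bytes).
# On recent FreeBSD with OpenZFS this sysctl can often be set at runtime:
sysctl vfs.zfs.arc_max=4294967296

# To make the limit survive a reboot, set it as a loader tunable:
echo 'vfs.zfs.arc_max="4294967296"' >> /boot/loader.conf

# Verify the current value:
sysctl vfs.zfs.arc_max
```

After changing it, watch zfs-stats -A (ARC size) and zfs-stats -E (hit ratio) for a while to confirm the smaller ARC still serves your working set.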