Date:      Tue, 10 May 2022 17:47:06 +0200
From:      Jan Mikkelsen <janm@transactionware.com>
To:        Mark Millard <marklmi@yahoo.com>
Cc:        Pete Wright <pete@nomadlogic.org>, freebsd-current <freebsd-current@freebsd.org>
Subject:   Re: Chasing OOM Issues - good sysctl metrics to use?
Message-ID:  <75C02C8C-6A5E-4E19-AC7D-B5DB704E8F16@transactionware.com>
In-Reply-To: <3C5C183F-1471-4139-A53C-0B3815CFC25E@yahoo.com>
References:  <83A713B9-A973-4C97-ACD6-830DF6A50B76.ref@yahoo.com> <83A713B9-A973-4C97-ACD6-830DF6A50B76@yahoo.com> <a5b2e248-3298-80e3-4bb6-742c8431f064@nomadlogic.org> <94B2E2FD-2371-4FEA-8E01-F37103F63CC0@yahoo.com> <0fcb5a4a-5517-e57b-2b69-4f3b3b10589a@nomadlogic.org> <DD98C932-A07F-4097-AE7F-D9CEF0BB6AEE@yahoo.com> <f43d7276-3718-df89-cbf0-5c1ef3d67e77@nomadlogic.org> <f00ccd1f-b6f6-bb00-f0a7-2f760c8953a0@nomadlogic.org> <464ED220-0DE4-4D2F-9DA2-AFD00D8D42B7@yahoo.com> <446d5913-a8c2-7dd0-860b-792fa9fe7c5b@nomadlogic.org> <33B740AA-A431-49CB-9F27-50B8C49734A2@yahoo.com> <3C5C183F-1471-4139-A53C-0B3815CFC25E@yahoo.com>

On 10 May 2022, at 10:01, Mark Millard <marklmi@yahoo.com> wrote:
>
> On 2022-Apr-29, at 13:57, Mark Millard <marklmi@yahoo.com> wrote:
>
>> On 2022-Apr-29, at 13:41, Pete Wright <pete@nomadlogic.org> wrote:
>>>
>>>> . . .
>>>
>>> d'oh - went out for lunch and workstation locked up.  i *knew* i
>>> shouldn't have said anything lol.
>>
>> Any interesting console messages ( or dmesg -a or /var/log/messages )?
>>
>
> I've been doing some testing of a patch by tijl at FreeBSD.org
> and have reproduced both hang-ups (ZFS/ARC context) and kills
> (UFS/noARC and ZFS/ARC) for "was killed: failed to reclaim
> memory", both with and without the patch. This is with only a
> tiny fraction of the swap partition(s) enabled being put to
> use. So far, the testing was deliberately with
> vm.pageout_oom_seq=12 (the default value). My testing has been
> with main [so: 14].
>
> But I also learned how to avoid the hang-ups that I got --but
> it costs making kills more likely/quicker, other things being
> equal.
>
> I discovered that the hang-ups that I got were from all the
> processes through which I interact with the system ending up
> with their kernel threads swapped out and not being swapped
> back in (including sshd, so no new ssh connections). In
> some contexts I only had escaping into the kernel debugger
> available, not even ^T would work. Other times ^T did work.
>
> So, when I'm willing to risk kills in order to maintain
> the ability to interact normally, I now use in
> /etc/sysctl.conf :
>
> vm.swap_enabled=0

I have been looking at an OOM related issue. Ignoring the actual leak,
the problem leads to a process being killed because the system was out
of memory. This is fine. After that, however, the system console was
black with a single block cursor and the console keyboard was
unresponsive. Caps lock and num lock didn’t toggle their lights
when pressed.

Using an ssh session, the system looked fine. USB events for the
keyboard being disconnected and reconnected appeared but the keyboard
stayed unresponsive.

Setting vm.swap_enabled=0, as you did above, resolved this problem.
After the process was killed a perfectly normal console returned.
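
For anyone wanting to try the same setting, here is a minimal sketch of
applying it at runtime and keeping it across reboots. The helper name
persist_sysctl is hypothetical and not something from this thread or
from FreeBSD; it only rewrites a sysctl.conf-style file so that the key
appears exactly once. The runtime step is shown as a comment because it
needs root on the affected machine.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of FreeBSD): set KEY=VALUE
# in a sysctl.conf-style file, dropping any earlier assignment of KEY.
persist_sysctl() {
    conf=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Copy every line whose name (text before the first '=') is not KEY.
    # Commented-out lines such as "#vm.swap_enabled=1" are kept as they are.
    if [ -f "$conf" ]; then
        awk -F= -v k="$key" '{ n = $1; gsub(/[ \t]/, "", n); if (n != k) print }' \
            "$conf" > "$tmp"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through the original path so its ownership and mode stay put.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Typical use, as root:
#   sysctl vm.swap_enabled=0                               # running system
#   persist_sysctl /etc/sysctl.conf vm.swap_enabled 0      # next boot
```

Going through /etc/sysctl.conf rather than the loader configuration
matches the point made in the quoted message: the setting is writable at
runtime but is not a loader-time tunable.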

The interesting thing is that this test system is configured with no
swap space.

This is on 13.1-RC5.

> This disables swapping out of process kernel stacks. It
> is just that, with that option removed for gaining free RAM,
> there are fewer options tried before a kill is initiated. It is
> not a loader-time tunable but is writable, thus the
> /etc/sysctl.conf placement.

Is that really what it does? From a quick look at the code in
vm/vm_swapout.c, it seems a little more complex.

Regards,

Jan M.





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?75C02C8C-6A5E-4E19-AC7D-B5DB704E8F16>