Date:      Sun, 2 Jun 2024 20:05:06 -0400
From:      Warner Losh <imp@bsdimp.com>
To:        Mark Johnston <markj@freebsd.org>
Cc:        "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject:   Re: removing support for kernel stack swapping
Message-ID:  <CANCZdfqkCW=RYNKwR6deqqdpNCA-ygCO8yqAyRJmDHQRkHPWDw@mail.gmail.com>
In-Reply-To: <Zl0G0FNquRSQi6aU@nuc>


On Sun, Jun 2, 2024, 5:57 PM Mark Johnston <markj@freebsd.org> wrote:

> FreeBSD will, when free pages are scarce, try to swap out the kernel
> stacks (typically 16KB per thread) of sleeping user threads.  I'm told
> that this mechanism was first implemented in BSD for the VAX port and
> that stabilizing it was quite an endeavour.
>
> This feature has wide-ranging implications for code in the kernel.  For
> instance, if a thread allocates a structure on its stack, links it into
> some data structure visible to other threads, and goes to sleep, it must
> use PHOLD to ensure that the stack doesn't get swapped out while
> sleeping.  A missing PHOLD can thus result in a kernel panic, but this
> kind of mistake is very easy to make and hard to catch without thorough
> stress testing.  The kernel stack allocator also requires a fair bit of
> code to implement this feature, and we've had multiple bugs in that
> area, especially in relation to NUMA support.  Moreover, this feature
> will leave threads swapped out after the system has recovered, resulting
> in high scheduling latency once they're ready to run again.
>
> In a very stressed system, it's possible that we can free up something
> like 1MB of RAM using this mechanism.  I argue that this mechanism is
> not worth it on modern systems: it isn't going to make the difference
> between a graceful recovery from memory pressure and a catatonic state
> which forces a reboot.  The complexity and resulting bugs it induces are
> not worth it.
>


+1.

The smallest bootable system for me is around 256MB, and on a system like
that it might save 256KB, given the number of threads typical on a system
that size...

Warner

> At the BSDCan devsummit I proposed removing support for kernel stack
> swapping and got only positive feedback.  Does anyone here have any
> comments or objections?
>
>

