Date:      Sun, 8 Sep 2024 11:30:20 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        David Chisnall <theraven@freebsd.org>
Cc:        Kristof Provost <kp@freebsd.org>, Poul-Henning Kamp <phk@phk.freebsd.dk>,  Alan Somers <asomers@freebsd.org>, Dmitry Salychev <dsl@freebsd.org>,  Jan Knepper <jan@digitaldaemon.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: The Case for Rust (in any system)
Message-ID:  <CANCZdfo322v_z3NdmB%2Bgci-e33Yek9cvoD0n22Virr5yOptmTA@mail.gmail.com>
In-Reply-To: <0BC57127-5CF9-45C5-9BE6-7E21D2313291@FreeBSD.org>
References:  <202409060725.4867P3ul040678@critter.freebsd.dk> <4E4FB8CC-A974-42C4-95D5-2E1E4BF681AD@freebsd.org> <B355DB3E-82A2-407A-9D70-2A40C953DEB2@FreeBSD.org> <CANCZdfr03sUOz7AEZhTDmWPCPcA%2BqjRf4ZuUxs1FMi2xjnomWA@mail.gmail.com> <0BC57127-5CF9-45C5-9BE6-7E21D2313291@FreeBSD.org>

On Sun, Sep 8, 2024, 9:05 AM David Chisnall <theraven@freebsd.org> wrote:

> On 8 Sep 2024, at 14:50, Warner Losh <imp@bsdimp.com> wrote:
>
>
> So there's four big issues with C++ in the kernel, all surmountable if we
> wanted.
>
>
> There are two missing from your list, which I encountered when I wrote a
> kernel module for FreeBSD in C++ a few years ago:
>
> C++ relies on COMDATs quite a lot.  Each inline function and each function
> that’s instantiated as a template is a separate section with some flags
> indicating that the linker / loader should keep one and discard the rest.
> If you have a single C++ module, this is fine, but for two it’s harder.  I
> did a small ‘libkxx’ module that provided a subset of libc++ for use by
> different modules, but the kernel loader code didn’t have enough comments
> for me to understand how to fix it. I would be tempted to approach this
> with a userspace tool that runs over a set of kernel modules and pulls out
> duplicated COMDATs into separate modules that other things can depend on.
> Alternatively, the kernel loader could be modified to load only referenced
> COMDATs, reference count them, and not load unused things from each kernel
> module.  The latter is a cleaner approach but is more work.
>
> Second, between 11 and 12, someone decided to replace a load of static
> inline functions in kernel headers with macros.  These conflict a lot.
>
> There's the low-level allocation issues. Right now we know what memory is
> used by what because we have malloc enhanced to track this (oversimplifying
> a lot I know). So we'd need some framework to make it easy to have 'custom
> allocators' that could track it as well. At a bare minimum, we need the
> runtime support for new and delete...
>
>
> This is not technically required, but it is a good idea to think about
> what the right strategy is.  A C++ class can implement its own `operator
> new` and `operator delete` wrapping `malloc(9)` and then subclasses will
> allocate with that.  Similarly, things like `std::unique_ptr` can take an
> explicit deleter.
>
> This can be a bit clunky and it’s probably a good idea to have some
> sensible defaults.
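
A minimal userspace sketch of the two techniques mentioned above: a base class that supplies its own `operator new` / `operator delete`, and a `std::unique_ptr` with an explicit deleter. The names `tracked_alloc`, `tracked_free`, `TrackedObject`, `Session` and `TrackedBuffer` are hypothetical, and `std::malloc` stands in for `malloc(9)`; a real module would pass its own malloc type and flags instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Userspace stand-ins (hypothetical names) for the kernel allocator.
// The counter plays the role of the kernel's per-type accounting.
static std::size_t live_allocations = 0;

static void *tracked_alloc(std::size_t size)
{
    void *p = std::malloc(size);
    if (p != nullptr)
        ++live_allocations;
    return p;
}

static void tracked_free(void *p)
{
    if (p != nullptr) {
        --live_allocations;
        std::free(p);
    }
}

// Base class whose operators route through the tracked allocator.
// Subclasses inherit the operators, so they allocate the same way.
// Because operator new is noexcept, a failed allocation makes the
// new-expression yield nullptr instead of throwing.
struct TrackedObject {
    static void *operator new(std::size_t size) noexcept { return tracked_alloc(size); }
    static void operator delete(void *p) noexcept { tracked_free(p); }
    virtual ~TrackedObject() = default;
};

struct Session : TrackedObject {
    int id = 7;
};

// Explicit deleter for raw buffers obtained from tracked_alloc().
struct TrackedDeleter {
    void operator()(void *p) const noexcept { tracked_free(p); }
};
using TrackedBuffer = std::unique_ptr<char, TrackedDeleter>;

static void demo_tracked_allocation()
{
    Session *s = new Session;            // uses TrackedObject::operator new
    assert(s != nullptr && live_allocations == 1);
    {
        TrackedBuffer buf(static_cast<char *>(tracked_alloc(64)));
        assert(live_allocations == 2);
    }                                    // TrackedDeleter runs here
    assert(live_allocations == 1);
    delete s;                            // uses TrackedObject::operator delete
    assert(live_allocations == 0);
}
```

Spelling out the deleter on every `unique_ptr` is the clunky part; a type alias such as `TrackedBuffer` is one way to provide a default.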
>
> Next, there's all the other run-time support that's provided by
> compiler-rt.
>
>
> Nothing in compiler-rt is needed for C++ except the unwinder if you want
> exceptions (no one else except NT uses exceptions in a kernel).  The one
> bit of libcxxrt that you would probably want is the support for guard
> variables, which would need modifying to use kernel locks.  This is fairly
> small, I wrote a custom one for CHERIoT RTOS which uses our futex APIs.
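
For context, a guard variable is what protects the one-time initialisation of a function-local `static`; under the Itanium C++ ABI the compiler emits calls to `__cxa_guard_acquire` and `__cxa_guard_release` around the initialiser. The sketch below shows the general shape of such a guard in userspace, using hypothetical names (`Guard`, `guard_acquire`, `guard_release`) and a yield loop where a kernel version would presumably sleep on a kernel lock. It is not the libcxxrt or CHERIoT implementation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical guard word. States: 0 = uninitialised,
// 1 = initialisation in progress, 2 = initialised.
struct Guard {
    std::atomic<int> state{0};
};

// Returns true if the caller must run the initialiser, false if the
// object is already initialised.
static bool guard_acquire(Guard &g)
{
    for (;;) {
        int expected = 0;
        if (g.state.compare_exchange_strong(expected, 1,
                                            std::memory_order_acquire))
            return true;             // we won the race; caller initialises
        if (expected == 2)
            return false;            // someone else already finished
        std::this_thread::yield();   // in progress elsewhere; wait
    }
}

// Publishes the initialised object to other threads.
static void guard_release(Guard &g)
{
    g.state.store(2, std::memory_order_release);
}

// What the compiler-generated code for `static int value = 42;`
// inside a function roughly looks like, written out by hand.
static Guard value_guard;
static int value;
static int init_calls = 0;

static int &get_value()
{
    if (guard_acquire(value_guard)) {
        value = 42;
        ++init_calls;
        guard_release(value_guard);
    }
    return value;
}
```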
>
> Next, there's the issues of exceptions. They are quite useful for RAII
> (since you know dtors will get run when an error happens), and there'd need
> to be some sane plan for these (or we'd have to forego them).
>
>
> Most kernels disable exceptions.  You absolutely do not want Itanium-style
> exceptions in a kernel because they need to allocate to throw exceptions
> and so you would only be able to throw from places where allocation is
> safe.  Given that the most common place you’d want to throw an exception
> (if you had them) is if `malloc` with `M_NOWAIT` failed, this could be a
> problem.
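
One common way to cope without exceptions is to keep constructors trivial and move anything that can fail into a separate step that returns an error code. The sketch below illustrates that pattern in userspace; `alloc_nowait`, `simulate_failure` and `Buffer` are hypothetical names, with `alloc_nowait` standing in for a `malloc(9)` call made with `M_NOWAIT`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for malloc(9) with M_NOWAIT: it may return nullptr.
// The flag lets the example force a failure.
static bool simulate_failure = false;

static void *alloc_nowait(std::size_t size)
{
    return simulate_failure ? nullptr : std::malloc(size);
}

struct Buffer {
    char *data = nullptr;
    std::size_t len = 0;

    Buffer() = default;                       // cannot fail
    Buffer(const Buffer &) = delete;
    Buffer &operator=(const Buffer &) = delete;
    ~Buffer() { std::free(data); }            // free(nullptr) is a no-op

    // Fallible setup reports an errno value instead of throwing.
    int init(std::size_t n)
    {
        void *p = alloc_nowait(n);
        if (p == nullptr)
            return ENOMEM;                    // object stays empty but valid
        std::free(data);                      // drop any previous buffer
        data = static_cast<char *>(p);
        len = n;
        return 0;
    }
};
```

The destructor still runs on every return path, so cleanup stays automatic even though errors are propagated by hand.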
>
> NT uses SEH exceptions, which allocate all of the state on the stack and
> then run funclets for cleanup.  It would be possible to support this in the
> kernel (the relevant patents expired over ten years ago), but a non-trivial
> amount of work.  If someone wanted to do the work, it would be great: SEH
> is one of the very few things I really liked about the NT kernel.
>
> Finally, there's getting the subset of the standard library that's useful
> into the kernel. There's a lot of templates for facilitating RAII that are
> needed, for example, and some subset of STL, etc.
>
>
> You don’t need templates for RAII, RAII just depends on destructors.
> Templates are useful, but largely orthogonal.  I’d personally recommend
> against using much of the standard library in the kernel because it does
> not have good ways of handling allocation failure without exceptions.  The
> C++ standard defines a Freestanding profile (similar to C) that includes
> things like the type traits that are useful for compile-time
> metaprogramming.  There are a few bits you might want to pull in but a lot
> more that you’d want to avoid (I actually have iostream working with the
> kernel’s printf family, but it was a terrible idea and no one should ever
> use that code).
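
To illustrate the point that RAII needs only destructors, here is a sketch of a plain, non-template lock guard. `FakeMutex`, `fake_lock`, `fake_unlock`, `LockGuard` and `update_counter` are hypothetical names; in a kernel module the guard would presumably wrap `mtx_lock()` / `mtx_unlock()` from `mutex(9)` instead.

```cpp
#include <cassert>

// Hypothetical stand-in for a kernel mutex.
struct FakeMutex {
    bool held = false;
};

static void fake_lock(FakeMutex &m)
{
    assert(!m.held);
    m.held = true;
}

static void fake_unlock(FakeMutex &m)
{
    assert(m.held);
    m.held = false;
}

// A non-template RAII guard: acquire in the constructor, release in the
// destructor. Copying is disabled so the lock is released exactly once.
class LockGuard {
    FakeMutex &mtx;

public:
    explicit LockGuard(FakeMutex &m) : mtx(m) { fake_lock(mtx); }
    ~LockGuard() { fake_unlock(mtx); }
    LockGuard(const LockGuard &) = delete;
    LockGuard &operator=(const LockGuard &) = delete;
};

// Every return path releases the lock, including the early error return.
static int update_counter(FakeMutex &m, int &counter, int delta)
{
    LockGuard guard(m);
    if (delta < 0)
        return -1;       // destructor still runs here
    counter += delta;
    return 0;
}
```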
>
> For example, `std::shared_ptr` is probably not a good idea (it allocates a
> separate control block to hold the ref count), but something that wraps
> things that are intrusively reference counted with `refcount(9)` in smart
> pointers would be valuable.  Using member pointers, it’s easy to build a
> smart-pointer template that takes a C type that contains a refcount and a
> pointer to the field and automatically manipulates the reference count when
> you copy the pointer.
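
A rough userspace sketch of such a smart pointer follows. The template takes the C type, a pointer to its refcount member, and a destroy function. All names here (`ref_acquire`, `ref_release`, `widget`, `widget_destroy`, `IntrusivePtr`, `WidgetPtr`) are hypothetical, and the two non-atomic helper functions merely stand in for `refcount_acquire()` and `refcount_release()` from `refcount(9)`.

```cpp
#include <cassert>
#include <utility>

// Userspace stand-ins for the refcount(9) operations; the real ones
// are atomic.
static void ref_acquire(unsigned *count) { ++*count; }
static bool ref_release(unsigned *count) { return --*count == 0; }

// A C-style object with an embedded reference count.
struct widget {
    unsigned refs;
    int value;
};

static int widgets_destroyed = 0;
static void widget_destroy(widget *w)
{
    ++widgets_destroyed;
    delete w;
}

// Smart pointer parameterised on the C type, a pointer to its refcount
// member, and the function to call when the last reference is dropped.
template <typename T, unsigned T::*Count, void (*Destroy)(T *)>
class IntrusivePtr {
    T *ptr = nullptr;

    void acquire()
    {
        if (ptr != nullptr)
            ref_acquire(&(ptr->*Count));
    }

    void release()
    {
        if (ptr != nullptr && ref_release(&(ptr->*Count)))
            Destroy(ptr);
        ptr = nullptr;
    }

public:
    IntrusivePtr() = default;
    // Adopts a reference the caller already holds (no increment).
    explicit IntrusivePtr(T *p) : ptr(p) {}
    IntrusivePtr(const IntrusivePtr &o) : ptr(o.ptr) { acquire(); }
    IntrusivePtr(IntrusivePtr &&o) noexcept
        : ptr(std::exchange(o.ptr, nullptr)) {}
    // Copy-and-swap: the by-value parameter releases the old reference.
    IntrusivePtr &operator=(IntrusivePtr o) noexcept
    {
        std::swap(ptr, o.ptr);
        return *this;
    }
    ~IntrusivePtr() { release(); }

    T *operator->() const { return ptr; }
    T *get() const { return ptr; }
};

using WidgetPtr = IntrusivePtr<widget, &widget::refs, widget_destroy>;
```

Unlike `std::shared_ptr`, nothing is allocated beyond the object itself, because the count lives inside the C structure.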
>
> Once you have those 'table stakes' issues out of the way, you'll need to
> see how it performs, and work out a few dozen logistical issues surrounding
> what compiler flags to use, what language features to enable/disable, how
> to optimize and what set of warnings are sensible.
>
>
> -fno-exceptions and -fno-rtti is what most people use for kernel
> programming (there are a few dozen kernels written in C++, it’s not like
> we’d be the first to try).
>
> You could start using C++ with just the second of these items.
>
>
> You can use it within a single kernel module now, as long as you resolve
> COMDATs prior to linking and #undef a bunch of things.  I was doing so five
> years ago.  The build system actually supports it already, though possibly
> not deliberately.
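
The `#undef` step can be illustrated with a small sketch. The macro `queue_len` and the types below are invented for the example and do not come from any FreeBSD header; they only show how a function-like macro from a C header can collide with a C++ member name, and one way to work around it.

```cpp
#include <cassert>

// Hypothetical stand-in for a C header that defines a function-like macro
// where a static inline function used to be.
#define queue_len(q) ((q)->count)
struct c_queue {
    int count;
};

// Capture the macro's behaviour in a real function before removing the
// macro, so C++ code can still reach it under a distinct name.
static inline int c_queue_len(const c_queue *q)
{
    return queue_len(q);
}
#undef queue_len

// Without the #undef above, the preprocessor would rewrite the member
// declaration below and the class would fail to compile.
class Queue {
    c_queue raw{0};

public:
    void push() { ++raw.count; }
    int queue_len() const { return c_queue_len(&raw); }
};
```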
>

I did C++ in the kernel in the 4.x->7-current time frame. I stopped because
g++ was still evolving a lot in that time period and things kept
breaking... and our use of C++ wasn't so big that rewriting in C was hard
when one of the compiler updates in base broke something. A lot has changed
since then and I'm just extrapolating from that... your experience is much
more recent.

Warner

David
>
>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfo322v_z3NdmB%2Bgci-e33Yek9cvoD0n22Virr5yOptmTA>