Date: Sun, 8 Sep 2024 11:30:20 -0600
From: Warner Losh <imp@bsdimp.com>
To: David Chisnall <theraven@freebsd.org>
Cc: Kristof Provost <kp@freebsd.org>, Poul-Henning Kamp <phk@phk.freebsd.dk>, Alan Somers <asomers@freebsd.org>, Dmitry Salychev <dsl@freebsd.org>, Jan Knepper <jan@digitaldaemon.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: The Case for Rust (in any system)
Message-ID: <CANCZdfo322v_z3NdmB%2Bgci-e33Yek9cvoD0n22Virr5yOptmTA@mail.gmail.com>
In-Reply-To: <0BC57127-5CF9-45C5-9BE6-7E21D2313291@FreeBSD.org>
References: <202409060725.4867P3ul040678@critter.freebsd.dk> <4E4FB8CC-A974-42C4-95D5-2E1E4BF681AD@freebsd.org> <B355DB3E-82A2-407A-9D70-2A40C953DEB2@FreeBSD.org> <CANCZdfr03sUOz7AEZhTDmWPCPcA%2BqjRf4ZuUxs1FMi2xjnomWA@mail.gmail.com> <0BC57127-5CF9-45C5-9BE6-7E21D2313291@FreeBSD.org>

On Sun, Sep 8, 2024, 9:05 AM David Chisnall <theraven@freebsd.org> wrote:

> On 8 Sep 2024, at 14:50, Warner Losh <imp@bsdimp.com> wrote:
>
> > So there's four big issues with C++ in the kernel, all surmountable if
> > we wanted.
>
> There are two missing from your list, which I encountered when I wrote a
> kernel module for FreeBSD in C++ a few years ago:
>
> C++ relies on COMDATs quite a lot. Each inline function and each function
> that’s instantiated as a template is a separate section with some flags
> indicating that the linker / loader should keep one and discard the rest.
> If you have a single C++ module, this is fine, but for two it’s harder. I
> did a small ‘libkxx’ module that provided a subset of libc++ for use by
> different modules, but the kernel loader code didn’t have enough comments
> for me to understand how to fix it. I would be tempted to approach this
> with a userspace tool that runs over a set of kernel modules and pulls out
> duplicated COMDATs into separate modules that other things can depend on.
> Alternatively, the kernel loader could be modified to load only referenced
> COMDATs, reference count them, and not load unused things from each kernel
> module. The latter is a cleaner approach but is more work.
>
> Second, between 11 and 12, someone decided to replace a load of static
> inline functions in kernel headers with macros. These conflict a lot.
>
> > There's the low-level allocation issues. Right now we know what memory
> > is used by what because we have malloc enhanced to track this
> > (oversimplifying a lot I know). So we'd need some framework to make it
> > easy to have 'custom allocators' that could track it as well. At a bare
> > minimum, we need the runtime support for new and delete...
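
On the macro conflicts mentioned above: a minimal userspace sketch of why a function-like macro in a C header can break C++ code that reuses the name, and how `#undef` works around it. The macro `clamp_low` and the `Range` type are hypothetical stand-ins, not names from any FreeBSD header.

```cpp
#include <cassert>

// Hypothetical stand-in for a C header that provides a helper as a
// function-like macro instead of a static inline function.
#define clamp_low(a, b) ((a) > (b) ? (a) : (b))

// Without this #undef, the member function declaration below would be
// treated as a macro invocation with the wrong number of arguments and
// would fail to compile.
#undef clamp_low

struct Range {
    int floor;
    // Same name as the macro; legal only because the macro was removed.
    int clamp_low(int v) const { return v > floor ? v : floor; }
};

// Small helper so the behaviour can be checked from a test.
int demo_clamp() {
    Range r{10};
    return r.clamp_low(3);
}
```

A static inline function with the same name would not have this problem, because it obeys C++ scoping rules, whereas the preprocessor rewrites every matching token regardless of scope.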
>
> This is not technically required, but it is a good idea to think about
> what the right strategy is. A C++ class can implement its own
> `operator new` and `operator delete` wrapping `malloc(9)` and then
> subclasses will allocate with that. Similarly, things like
> `std::unique_ptr` can take an explicit deleter.
>
> This can be a bit clunky and it’s probably a good idea to have some
> sensible defaults.
>
> > Next, there's all the other run-time support that's provided by
> > compiler-rt.
>
> Nothing in compiler-rt is needed for C++ except the unwinder if you want
> exceptions (no one else except NT uses exceptions in a kernel). The one
> bit of libcxxrt that you would probably want is the support for guard
> variables, which would need modifying to use kernel locks. This is fairly
> small, I wrote a custom one for CHERIoT RTOS which uses our futex APIs.
>
> > Next, there's the issues of exceptions. They are quite useful for RAII
> > (since you know dtors will get run when an error happens), and there'd
> > need to be some sane plan for these (or we'd have to forgo them).
>
> Most kernels disable exceptions. You absolutely do not want Itanium-style
> exceptions in a kernel because they need to allocate to throw exceptions
> and so you would only be able to throw from places where allocation is
> safe. Given that the most common place you’d want to throw an exception
> (if you had them) is if `malloc` with `M_NOWAIT` failed, this could be a
> problem.
>
> NT uses SEH exceptions, which allocate all of the state on the stack and
> then run funclets for cleanup. It would be possible to support this in the
> kernel (the relevant patents expired over ten years ago), but a non-trivial
> amount of work. If someone wanted to do the work, it would be great: SEH
> is one of the very few things I really liked about the NT kernel.
>
> > Finally, there's getting the subset of the standard library that's
> > useful into the kernel.
> > There's a lot of templates for facilitating RAII that are needed, for
> > example, and some subset of STL, etc.
>
> You don’t need templates for RAII, RAII just depends on destructors.
> Templates are useful, but largely orthogonal. I’d personally recommend
> against using much of the standard library in the kernel because it does
> not have good ways of handling allocation failure without exceptions. The
> C++ standard defines a Freestanding profile (similar to C) that includes
> things like the type traits that are useful for compile-time
> metaprogramming. There are a few bits you might want to pull in but a lot
> more that you’d want to avoid (I actually have iostream working with the
> kernel’s printf family, but it was a terrible idea and no one should ever
> use that code).
>
> For example, `std::shared_ptr` is probably not a good idea (it allocates a
> separate control block to hold the ref count), but something that wraps
> things that are intrusively reference counted with `refcount(9)` in smart
> pointers would be valuable. Using member pointers, it’s easy to build a
> smart-pointer template that takes a C type that contains a refcount and a
> pointer to the field and automatically manipulates the reference count when
> you copy the pointer.
>
> > Once you have those 'table stakes' issues out of the way, you'll need to
> > see how it performs, and work out a few dozen logistical issues
> > surrounding what compiler flags to use, what language features to
> > enable/disable, how to optimize and what set of warnings are sensible.
>
> -fno-exceptions and -fno-rtti is what most people use for kernel
> programming (there are a few dozen kernels written in C++, it’s not like
> we’d be the first to try).
>
> > You could start using C++ with just the second of these items.
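
The intrusive smart pointer described above could be sketched roughly as follows. This is a userspace illustration only: `demo_acquire` and `demo_release` are non-atomic stand-ins for `refcount_acquire()` and `refcount_release()` from `refcount(9)`, and `demo_obj`, `demo_obj_free`, and `RefPtr` are hypothetical names, not existing kernel code.

```cpp
#include <cassert>
#include <utility>

// Userspace stand-ins for refcount_acquire()/refcount_release() from
// refcount(9); these are NOT atomic and exist only so the sketch runs.
static void demo_acquire(unsigned *c) { ++*c; }
static bool demo_release(unsigned *c) { return --*c == 0; }

// A C-style object with an embedded (intrusive) reference count.
struct demo_obj {
    unsigned refs;
    int value;
};

// Counts destructions so the behaviour can be observed.
static int demo_freed = 0;
static void demo_obj_free(demo_obj *o) { ++demo_freed; delete o; }

// Smart pointer parameterised on the C type, a member pointer to its
// count field, and the function that destroys the object.
template <typename T, unsigned T::*Count, void (*Free)(T *)>
class RefPtr {
    T *p = nullptr;
    // Drops our reference and frees the object if it was the last one.
    void drop() {
        if (p && demo_release(&(p->*Count)))
            Free(p);
        p = nullptr;
    }
public:
    RefPtr() = default;
    // Adopts a reference the caller already owns (no increment).
    explicit RefPtr(T *adopted) : p(adopted) {}
    // Copying takes an additional reference.
    RefPtr(const RefPtr &o) : p(o.p) { if (p) demo_acquire(&(p->*Count)); }
    // Moving transfers the reference without touching the count.
    RefPtr(RefPtr &&o) noexcept : p(std::exchange(o.p, nullptr)) {}
    // Copy-and-swap: the by-value parameter releases the old reference.
    RefPtr &operator=(RefPtr o) noexcept { std::swap(p, o.p); return *this; }
    ~RefPtr() { drop(); }
    T *operator->() const { return p; }
    T *get() const { return p; }
};

using ObjPtr = RefPtr<demo_obj, &demo_obj::refs, demo_obj_free>;
```

Because the count lives inside the C structure, no separate control block is allocated, which is the difference from `std::shared_ptr` noted above.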
>
> You can use it within a single kernel module now, as long as you resolve
> COMDATs prior to linking and #undef a bunch of things. I was doing so five
> years ago. The build system actually supports it already, though possibly
> not deliberately.

I did C++ in the kernel in the 4.x->7-current time frame. I stopped because
g++ was still evolving a lot in that time period and things kept breaking...
and our use of C++ wasn't so big that rewriting in C was hard when one of the
compiler updates in base broke something. A lot has changed since then and
I'm just extrapolating from that... your experience is much more recent.

Warner

> David
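
For reference, the class-level `operator new` / `operator delete` approach and the explicit `std::unique_ptr` deleter mentioned in the quoted text might look something like this. It is a userspace sketch under stated assumptions: `demo_malloc` and `demo_free` stand in for `malloc(9)` and `free(9)` (the real calls also take a malloc type and flags), and `Tracked`, `Session`, `DemoFree`, and `Buffer` are hypothetical names. In a kernel, `<memory>` would have to come from whatever subset of the standard library is made available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Userspace stand-ins for the kernel allocator. The counter imitates the
// per-type tracking that malloc(9) provides.
static int demo_live = 0;
static void *demo_malloc(std::size_t n) {
    void *p = std::malloc(n);
    if (p)
        ++demo_live;
    return p;
}
static void demo_free(void *p) {
    if (p) {
        --demo_live;
        std::free(p);
    }
}

// Base class whose operator new/delete route through the tracked
// allocator. Subclasses inherit them, so they allocate the same way.
// Because operator new is noexcept, a failed allocation makes the
// new-expression yield nullptr instead of throwing.
struct Tracked {
    static void *operator new(std::size_t n) noexcept { return demo_malloc(n); }
    static void operator delete(void *p) noexcept { demo_free(p); }
    virtual ~Tracked() = default;
};

struct Session : Tracked {
    int id;
    explicit Session(int i) : id(i) {}
};

// Explicit deleter so unique_ptr releases raw buffers via demo_free.
struct DemoFree {
    void operator()(void *p) const noexcept { demo_free(p); }
};
using Buffer = std::unique_ptr<char, DemoFree>;
```

Callers would still need to check the result of `new Session(...)` for `nullptr`, since with exceptions disabled there is no other way to report allocation failure.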
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfo322v_z3NdmB%2Bgci-e33Yek9cvoD0n22Virr5yOptmTA>