Date: Sun, 8 Sep 2024 16:05:00 +0100 From: David Chisnall <theraven@FreeBSD.org> To: Warner Losh <imp@bsdimp.com> Cc: Kristof Provost <kp@freebsd.org>, Poul-Henning Kamp <phk@phk.freebsd.dk>, Alan Somers <asomers@freebsd.org>, Dmitry Salychev <dsl@freebsd.org>, Jan Knepper <jan@digitaldaemon.com>, freebsd-hackers@freebsd.org Subject: Re: The Case for Rust (in any system) Message-ID: <0BC57127-5CF9-45C5-9BE6-7E21D2313291@FreeBSD.org> In-Reply-To: <CANCZdfr03sUOz7AEZhTDmWPCPcA%2BqjRf4ZuUxs1FMi2xjnomWA@mail.gmail.com> References: <202409060725.4867P3ul040678@critter.freebsd.dk> <4E4FB8CC-A974-42C4-95D5-2E1E4BF681AD@freebsd.org> <B355DB3E-82A2-407A-9D70-2A40C953DEB2@FreeBSD.org> <CANCZdfr03sUOz7AEZhTDmWPCPcA%2BqjRf4ZuUxs1FMi2xjnomWA@mail.gmail.com>
On 8 Sep 2024, at 14:50, Warner Losh <imp@bsdimp.com> wrote:
>
> So there's four big issues with C++ in the kernel, all surmountable if we wanted.

There are two missing from your list, which I encountered when I wrote a kernel module for FreeBSD in C++ a few years ago:

C++ relies on COMDATs quite a lot. Each inline function and each function that’s instantiated as a template is a separate section with some flags indicating that the linker / loader should keep one and discard the rest. If you have a single C++ module, this is fine, but for two it’s harder. I did a small ‘libkxx’ module that provided a subset of libc++ for use by different modules, but the kernel loader code didn’t have enough comments for me to understand how to fix it. I would be tempted to approach this with a userspace tool that runs over a set of kernel modules and pulls out duplicated COMDATs into separate modules that other things can depend on. Alternatively, the kernel loader could be modified to load only referenced COMDATs, reference count them, and not load unused things from each kernel module. The latter is a cleaner approach but is more work.

Second, between FreeBSD 11 and 12, someone decided to replace a load of static inline functions in kernel headers with macros. These conflict a lot.

> There's the low-level allocation issues. Right now we know what memory is used by what because we have malloc enhanced to track this (oversimplifying a lot I know). So we'd need some framework to make it easy to have 'custom allocators' that could track it as well. At a bare minimum, we need the runtime support for new and delete...

This is not technically required, but it is a good idea to think about what the right strategy is. A C++ class can implement its own `operator new` and `operator delete` wrapping `malloc(9)` and then subclasses will allocate with that. Similarly, things like `std::unique_ptr` can take an explicit deleter.

This can be a bit clunky and it’s probably a good idea to have some sensible defaults.

> Next, there's all the other run-time support that's provided by compiler-rt.

Nothing in compiler-rt is needed for C++ except the unwinder if you want exceptions (no one except NT uses exceptions in a kernel). The one bit of libcxxrt that you would probably want is the support for guard variables, which would need modifying to use kernel locks. This is fairly small: I wrote a custom one for CHERIoT RTOS which uses our futex APIs.

> Next, there's the issues of exceptions. They are quite useful for RAII (since you know dtors will get run when an error happens), and there'd need to be some sane plan for these (or we'd have to forego them).

Most kernels disable exceptions. You absolutely do not want Itanium-style exceptions in a kernel because they need to allocate to throw exceptions and so you would only be able to throw from places where allocation is safe. Given that the most common place you’d want to throw an exception (if you had them) is if `malloc` with `M_NOWAIT` failed, this could be a problem.

NT uses SEH exceptions, which allocate all of the state on the stack and then run funclets for cleanup. It would be possible to support this in the kernel (the relevant patents expired over ten years ago), but it would be a non-trivial amount of work. If someone wanted to do the work, it would be great: SEH is one of the very few things I really liked about the NT kernel.

> Finally, there's getting the subset of the standard library that's useful into the kernel. There's a lot of templates for facilitating RAII that are needed, for example, and some subset of STL, etc.

You don’t need templates for RAII; RAII just depends on destructors. Templates are useful, but largely orthogonal. I’d personally recommend against using much of the standard library in the kernel because it does not have good ways of handling allocation failure without exceptions. The C++ standard defines a Freestanding profile (similar to C) that includes things like the type traits that are useful for compile-time metaprogramming. There are a few bits you might want to pull in but a lot more that you’d want to avoid (I actually have iostream working with the kernel’s printf family, but it was a terrible idea and no one should ever use that code).

For example, `std::shared_ptr` is probably not a good idea (it allocates a separate control block to hold the ref count), but something that wraps things that are intrusively reference counted with `refcount(9)` in smart pointers would be valuable. Using member pointers, it’s easy to build a smart-pointer template that takes a C type that contains a refcount and a pointer to the field and automatically manipulates the reference count when you copy the pointer.

> Once you have those 'table stakes' issues out of the way, you'll need to see how it performs, and work out a few dozen logistical issues surrounding what compiler flags to use, what language features to enable/disable, how to optimize and what set of warnings are sensible.

-fno-exceptions and -fno-rtti are what most people use for kernel programming (there are a few dozen kernels written in C++, it’s not like we’d be the first to try).

> You could start using C++ with just the second of these items.

You can use it within a single kernel module now, as long as you resolve COMDATs prior to linking and #undef a bunch of things. I was doing so five years ago. The build system actually supports it already, though possibly not deliberately.

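
As an illustration of the intrusive smart pointer mentioned above (a template that takes a C type, a member pointer to its refcount field, and adjusts the count on copy), a minimal userspace sketch might look like the following. This is not the code from the module discussed in this message. The names `RefPtr`, `widget`, `widget_free` and `refptr_demo` are hypothetical, and the `refcount_*` functions are simplified stand-ins written with compiler atomics so that the example builds outside the kernel; in a kernel module the real ones from `refcount(9)` (`<sys/refcount.h>`) would be used instead.

```cpp
#include <cassert>
#include <utility>

// Simplified userspace stand-ins for refcount(9). Assumption: in a kernel
// module these would come from <sys/refcount.h> rather than being defined here.
static inline void refcount_init(volatile unsigned *count, unsigned value)
{
	*count = value;
}
static inline void refcount_acquire(volatile unsigned *count)
{
	__atomic_fetch_add(count, 1u, __ATOMIC_RELAXED);
}
// Returns true when the caller just dropped the last reference.
static inline bool refcount_release(volatile unsigned *count)
{
	return __atomic_fetch_sub(count, 1u, __ATOMIC_ACQ_REL) == 1u;
}

// Hypothetical C structure with an embedded (intrusive) reference count.
struct widget {
	unsigned refs;
	int value;
};

static int widgets_freed = 0; // only here so the example can be checked

static void widget_free(widget *w)
{
	++widgets_freed;
	delete w;
}

// Smart pointer parameterised on the C type, a member pointer to its
// refcount field, and the function that frees the object.
template <typename T, unsigned T::*Count, void (*Free)(T *)>
class RefPtr {
	T *ptr = nullptr;

	void acquire()
	{
		if (ptr)
			refcount_acquire(&(ptr->*Count));
	}
	void release()
	{
		if (ptr && refcount_release(&(ptr->*Count)))
			Free(ptr);
		ptr = nullptr;
	}

public:
	RefPtr() = default;
	// Adopts a reference that the caller already owns.
	explicit RefPtr(T *adopted) : ptr(adopted) {}
	// Copying takes a new reference.
	RefPtr(const RefPtr &other) : ptr(other.ptr) { acquire(); }
	// Moving transfers the reference without touching the count.
	RefPtr(RefPtr &&other) noexcept : ptr(std::exchange(other.ptr, nullptr)) {}
	// Copy-and-swap: the old reference is dropped when 'other' is destroyed.
	RefPtr &operator=(RefPtr other) noexcept
	{
		std::swap(ptr, other.ptr);
		return *this;
	}
	~RefPtr() { release(); }

	T *operator->() const { return ptr; }
	T *get() const { return ptr; }
};

using WidgetRef = RefPtr<widget, &widget::refs, widget_free>;

// Usage: the count follows the lifetime of the smart pointers.
static void refptr_demo()
{
	widget *raw = new widget{};
	refcount_init(&raw->refs, 1);
	WidgetRef a(raw); // adopts the initial reference
	{
		WidgetRef b = a; // the copy takes a second reference
		assert(raw->refs == 2);
	} // b is destroyed and drops its reference
	assert(raw->refs == 1);
	assert(widgets_freed == 0);
} // a is destroyed here, which frees the widget
```

Because the count lives inside the C structure, no separate control block is allocated, which is the property that makes this shape more attractive than `std::shared_ptr` in a kernel. Nothing here needs exceptions or RTTI, so it should be compatible with -fno-exceptions and -fno-rtti.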

David