From nobody Wed Jan 31 15:07:34 2024
Date: Thu, 1 Feb 2024 00:07:34 +0900
From: Tomoaki AOKI
To: David Chisnall
Cc: Wojciech Puchar, Antranig Vartanian, Alan Somers, FreeBSD Hackers, Warner Losh, Scott Long, Goran Mekić
Subject: Re: The Case for Rust (in the base system)
Message-Id: <20240201000734.83a86f486691276e533530e4@dec.sakura.ne.jp>
In-Reply-To: <3DCF4236-4DFA-448E-A378-DE04EC147B50@FreeBSD.org>
References: <3DCF4236-4DFA-448E-A378-DE04EC147B50@FreeBSD.org>
Organization: Junchoon corps
X-Mailer: Sylpheed 3.7.0 (GTK+ 2.24.33; amd64-portbld-freebsd14.0)
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
Sender: owner-freebsd-hackers@freebsd.org
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On Wed, 31 Jan 2024 11:14:55 +0000
David Chisnall wrote:

> On 31 Jan 2024, at 10:15, Wojciech Puchar wrote:
> >
> > There is no such thing as a secure system programming language.
>
> While true in the absolute, that’s an unhelpful framing. Different languages can eliminate different bug classes by construction. The same is true of different APIs. For example, SQL injection attacks are completely eliminated by APIs that present format strings, whereas they are easy to introduce in ones that require you to construct an SQL expression by string concatenation.
>
> For the languages under discussion, the key properties are memory safety. Rust and modern C++ prevent a lot of bounds errors by carrying bounds either in the type or as properties of a value and performing explicit checks. More importantly, they provide abstractions such as iterators, ranges, and views, which make bounds errors impossible by construction because they use subset-like operators on valid ranges derived from a collection and so ensure that every range that you iterate over *must* be a valid subrange of the underlying collection. They provide ownership semantics for pointers (in the language for Rust, in the library via RAII and smart pointers) that ensure that owning a pointer prolongs its lifetime and so avoid temporal safety errors.
>
> Can a programmer get these things right without language support? Absolutely, but each programmer has a finite budget for cognitive load. In addition to thinking about memory management, they need to think about good data structure design, efficient algorithms, and higher-level security properties. The more attention that they have for these things, the better their code is.
> We’ve seen this in some of the Apple Silicon drivers for Linux, where writing them in Rust with strong ownership types made it easy to implement some concurrent algorithms that would be hard to get right in C and led to fast and correct code.
>
> In terms of safe *systems* programming, I would regard the definition of *systems* programming as programming that needs to step outside of a language’s abstract machine. Memory allocators and schedulers, for example, are in this category. These still benefit from richer type systems (in snmalloc, as I mentioned previously in the thread, we model the allocation state machine in C++ templates to ensure that memory state transitions are all valid as memory moves between allocated and the various states of deallocation), but the language cannot enforce strong properties, it can at best provide the programmer with tools to ensure that certain properties are always true in code that compiles.
>
> It’s up to you where you want to invest your cognitive budget, but for code that runs in the TCB I’d rather we have as many properties correct by construction as possible.
>
> David

First of all, NO MEMORY-SAFE language can, by itself, be used to write code that touches volatile memory objects, most notably memory-mapped I/O and/or DMA drivers.

I formerly thought "Rust should NOT be able to write such code by itself, as Rust people state that it is memory-safe". But I happened to learn that the "unsafe" keyword allows exactly such a thing. So I have come to consider Rust non-memory-safe.

An evil programmer CAN ABUSE THE MECHANISM. Recall what happened to the Linux kernel with UMN. [1] [2]
Even when using Rust, memory safety is ON THE HUMAN.

Rust should NOT have introduced such functionality; that kind of code should have been kept in C and/or assembler, to be called from Rust. This way, memory-safety review could have focused only on the remaining C/assembler code.

Yes, there can be memory-mapped I/O devices that completely separate the addresses they use for input and output.
In such devices, the "unsafe" keyword should not be needed at all IN THEORY, but hardware/firmware bugs and intentional backdoors could still overrun the buffer on write.

Keeping these in mind, it would still be EASIER to write memory-safe code in Rust than in C/assembler.

The problems of Rust-in-base, I think, are the following.

*Differences of philosophy.
Unix and its derivatives, of course including the FreeBSD base, are based on simplicity and memory/disk efficiency, notably the use of shared objects (*.so). Reuse / share everything immutable everywhere.
In contrast, Rust people prefer "sharing source code as crates but NOT sharing compiled objects".
IIUC, if crate A is used by crates B and C, and crate B is used by crate D, usually crate D is built with crates A and B included. Then, if crates C and D are required by crate E as dynamically loadable modules (the API/ABI changes when Rust itself is updated, so linking them as native Rust objects basically requires a full rebuild), crates A and B are duplicated when the object created from crate E is loaded.
In these cases, all of crates A, B, C, D and E should be built as dynamically linked objects to avoid the duplicates.

*API/ABI instability between Rust versions.
But IIUC, this would be avoided if everything is built with --crate-type=cdylib. We should force this unless the whole code of FreeBSD is rewritten in Rust only. A moving target is not wanted for projects that are not large enough.

*crates.
All crates needed for the FreeBSD base SHALL be BSD-compatibly licensed and allowed to be forked and committed to the FreeBSD src tree. Every other crate needs to be REIMPLEMENTED BY OURSELVES WITH CLEAN-ROOM DEVELOPMENT.

*CPU consumption of rustc.
For anyone who, like me, does not have a dedicated build computer: if any poudriere jail starts building Rust code, the computer at hand becomes unresponsive/unusable, as ALL EXISTING CORES ARE EATEN UP. This could possibly be fixed/relaxed once sched_ule is overhauled or a new scheduler that avoids this is implemented.
At least for now, a 6C12T processor is not at all enough for Rust.

[1] https://www.bleepingcomputer.com/news/security/linux-bans-university-of-minnesota-for-committing-malicious-code/

[2] https://www.reddit.com/r/HobbyDrama/comments/nku6bt/kernel_development_that_time_linux_banned_the/

-- 
Tomoaki AOKI
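
For reference, the "unsafe" mechanism discussed above can be sketched roughly as follows. This is only a minimal illustration, not taken from any real driver: the register layout (Regs), the Device wrapper and its method names are all hypothetical, and for testing the "device" can simply be an ordinary struct in RAM standing in for memory-mapped registers. Only std::ptr::read_volatile, std::ptr::write_volatile and the addr_of!/addr_of_mut! macros are real standard-library items.

```rust
use std::ptr::{addr_of, addr_of_mut, read_volatile, write_volatile};

/// Hypothetical register block. The layout is invented for illustration
/// and does not describe any real device.
#[repr(C)]
pub struct Regs {
    pub status: u32,
    pub data: u32,
}

/// Wrapper that confines the raw-pointer handling to one small type.
pub struct Device {
    regs: *mut Regs,
}

impl Device {
    /// # Safety
    /// `regs` must point to a valid, aligned `Regs` for as long as the
    /// `Device` is used, and nothing else may access it meanwhile.
    /// The compiler cannot check this; it is a promise made by the human.
    pub unsafe fn new(regs: *mut Regs) -> Self {
        Device { regs }
    }

    /// Volatile write to the (hypothetical) data register.
    pub fn write_data(&mut self, value: u32) {
        // SAFETY: relies on the contract documented on `new`.
        unsafe { write_volatile(addr_of_mut!((*self.regs).data), value) }
    }

    /// Volatile read from the (hypothetical) data register.
    pub fn read_data(&self) -> u32 {
        // SAFETY: relies on the contract documented on `new`.
        unsafe { read_volatile(addr_of!((*self.regs).data)) }
    }
}
```

Callers of write_data() / read_data() need no "unsafe" themselves. The one place where a human must get it right is the call to Device::new(), and a wrong or malicious pointer passed there defeats every check that follows.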