Date:      Mon, 14 Oct 2024 08:33:33 +0100
From:      David Chisnall <theraven@freebsd.org>
To:        Cy Schubert <Cy.Schubert@cschubert.com>
Cc:        "Kevin P. Neal" <kpn@neutralgood.org>, "Gavin D. Howard" <gavin@gavinhoward.com>, freebsd-arch@freebsd.org, freebsd-hackers@freebsd.org, freebsd-net@freebsd.org, tcpdump-workers@lists.tcpdump.org, tech-net@netbsd.org, Alexander Nasonov <alnsn@netbsd.org>
Subject:   Re: BPF64: proposal of platform-independent hardware-friendly  backwards-compatible eBPF alternative
Message-ID:  <1FFE8A39-0061-4749-B9AD-65BE31CABAE0@freebsd.org>
In-Reply-To: <20241014024912.B82FD289@slippy.cwsent.com>
References:  <20241014024912.B82FD289@slippy.cwsent.com>



> On 14 Oct 2024, at 03:49, Cy Schubert <Cy.Schubert@cschubert.com> wrote:
>
>> It
>> can be solved, I think the DirectX LLVM backend ("DXIL") does this, but I
>> still suggest you not do this.

NaCl and SPIR made this mistake first. WebAssembly and SPIR-V learned the lesson.

>> LLVM is huge. Really huge. A codebase that large has no business being in
>> the kernel.

Many years ago, I wrote a proof-of-concept BPF to LLVM IR compiler. The idea
was that a trusted userspace component could do the BPF compilation and load
binary code into the kernel. BPF would still be BPF and so have the same
guarantees, but compiling it would be faster (on average, each BPF bytecode
was slightly more than one x86 instruction after LLVM optimisations had run).
LLVM was still in the TCB though, even in userspace. I didn’t pursue it
because LLVM is *not* safe in the presence of untrusted inputs.
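
To make "the same guarantees" concrete, here is a minimal sketch in C of the kind of check a BPF-style validator can do: jumps carry unsigned forward offsets, so a program whose jumps all land inside the program and which ends in a return must terminate. The `toy_insn` layout, the opcode names and `toy_validate` are hypothetical and much simpler than the real `struct bpf_insn` and in-kernel validator.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified instruction set; NOT the real struct bpf_insn. */
enum toy_op { TOY_LOAD, TOY_JUMP_IF_EQ, TOY_RET };

struct toy_insn {
    enum toy_op op;
    uint8_t jt;  /* instructions to skip forward if the test is true  */
    uint8_t jf;  /* instructions to skip forward if the test is false */
    uint32_t k;  /* immediate operand */
};

/*
 * Returns 1 if every jump target lies inside the program and the last
 * instruction is a return, 0 otherwise. Offsets are unsigned, so jumps can
 * only go forward: an accepted program cannot loop.
 */
int toy_validate(const struct toy_insn *prog, size_t len)
{
    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++) {
        if (prog[i].op == TOY_JUMP_IF_EQ) {
            size_t remaining = len - (i + 1); /* instructions after this one */
            if (prog[i].jt >= remaining || prog[i].jf >= remaining)
                return 0;
        }
    }
    return prog[len - 1].op == TOY_RET;
}
```

A check like this is a single linear pass over the bytecode, which is only possible because the property is built into the instruction format rather than recovered by analysis.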

More generally, the LLVM IR model is similar to C. It allows arbitrary
pointer casts and arbitrary pointer arithmetic. It is not a good starting
point for anything that you want to analyse for security. LLVM analyses take
advantage of undefined behaviour. An in-bounds address calculation
instruction is an assertion from the front end that the result will be in
bounds. Optimisations are free to rely on this, even when they can’t prove
it, because it is undefined behaviour to claim something is in bounds when it
is not. The same is true of a lot of other properties on the IR. For many of
them, recovering the property post facto is not computable; they rely on
translation from a higher-level language that enforces the properties by
construction.
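
The same effect is visible at the C level, where pointer arithmetic is typically lowered to an in-bounds address calculation. The sketch below is illustrative only: the function names are made up, and whether a given compiler actually deletes the first half of the fragile check depends on the compiler and optimisation level.

```c
#include <stddef.h>

/*
 * Fragile: forming buf + len is undefined behaviour when the result is out
 * of bounds, so an optimiser may assume the wrap-around test
 * (buf + len >= buf) is always true and remove it.
 */
int fits_fragile(const char *buf, const char *end, size_t len)
{
    return buf + len >= buf && buf + len <= end;
}

/*
 * Robust: compares lengths as integers and never forms an out-of-bounds
 * pointer. Assumes buf <= end and that both point into the same object.
 */
int fits_checked(const char *buf, const char *end, size_t len)
{
    return len <= (size_t)(end - buf);
}
```

Both functions agree for in-bounds lengths. Only the second has defined behaviour for a hostile `len`, and nothing in the first tells an analysis after the fact that the author intended a bounds check.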

David




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1FFE8A39-0061-4749-B9AD-65BE31CABAE0>