Date: Tue, 10 Sep 2024 13:59:02 +0100
From: David Chisnall <theraven@FreeBSD.org>
To: Vadim Goncharov <vadimnuclight@gmail.com>
Cc: Poul-Henning Kamp <phk@phk.freebsd.dk>, tcpdump-workers@lists.tcpdump.org,
    "freebsd-arch@freebsd.org" <freebsd-arch@FreeBSD.org>,
    "freebsd-hackers@freebsd.org" <freebsd-hackers@FreeBSD.org>,
    "freebsd-net@freebsd.org" <freebsd-net@FreeBSD.org>,
    "tech-net@netbsd.org" <tech-net@NetBSD.org>,
    Alexander Nasonov <alnsn@NetBSD.org>
Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative
Message-ID: <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org>
In-Reply-To: <20240910144557.4d95052a@nuclight.lan>
References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan>

On 10 Sep 2024, at 12:45, Vadim Goncharov <vadimnuclight@gmail.com> wrote:
>
> It's easy for your Lua code (or whatever) code to hang kernel by
> infinite loop. Or crash it by access on arbitrary pointer. That's why
> original BPF has no backward jumps and memory access, and eBPF's
> nightmare verifier walks all code paths and check pointers.

I’m not convinced by the second: Lua has a GC’d heap; you’d need to expose FFI things to it that did unsafe things, and that’s equally a problem for eBPF.

The first is not a problem. The Lua interpreter has a bytecode limit. You can define a bounded number of bytecodes that it will execute. The problem comes from the standard library. Things like string.gmatch can have high-order polynomial complexity, and so it’s possible for a Lua program that executes a small number of bytecodes to create a string that takes a vast amount of time to match on. Again, this is also a problem for eBPF if you expose a similar function. The solution is to not expose functions with large data-dependent runtimes to untrusted scripts.

More generally, there are a lot of problems with interpreting or JITing untrusted code in the kernel in *any* runtime. Speculative execution makes it easy to use these as primitives to leak kernel secrets, either via timing of the programs themselves, using the JIT to generate gadgets, or by leaking data via cache priming.

Both eBPF and Lua have these problems.

The thing I would like to see for our current use of semi-trusted Lua in the kernel (ZFS channel programs) is a way of exposing them (under /dev/something) as file descriptors and modifying the ioctls that run them to take a file descriptor argument. I would like to separate the two operations:

 - Load a channel program.
 - Run a channel program.

In the post-Spectre world, the former remains a privileged operation. Even though Linux pretends it isn’t, allowing arbitrary (even arbitrary constrained) code to run in the kernel’s address space is a problem. Invoking such code, however, should follow the same rules as everything else. A trusted entity should be able to load a pile of Lua / eBPF / BPF64 / whatever programs into the kernel and then set up permissions so that sandboxed programs (and jails) can use a defined subset of them.

David
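
A minimal sketch of the instruction-budget point above, using a toy interpreter rather than Lua itself. The instruction set (push, jmp, slow), the run function and the BudgetExceeded exception are all hypothetical names invented for illustration; slow stands in for a library call such as string.gmatch whose cost depends on its input, and its cost is only counted here, not actually spent. Under those assumptions, the budget stops a backward-jump loop, but a two-instruction program can still request a large amount of work:

```python
class BudgetExceeded(Exception):
    """Raised when a program uses up its instruction budget."""


def run(program, budget):
    """Execute a toy program, charging one budget unit per instruction.

    Hypothetical instructions, for illustration only:
      ("push", n)  push an integer onto the stack
      ("jmp", i)   jump to instruction index i (backward jumps allowed)
      ("slow",)    pop n and account for n*n units of "library" work

    Returns (instructions_executed, library_work_done).
    """
    stack, pc, executed, work = [], 0, 0, 0
    while pc < len(program):
        if executed >= budget:
            raise BudgetExceeded(f"stopped after {executed} instructions")
        executed += 1
        op = program[pc]
        if op[0] == "push":
            stack.append(op[1])
            pc += 1
        elif op[0] == "jmp":
            pc = op[1]
        elif op[0] == "slow":
            n = stack.pop()
            # Charged as a single instruction, yet the cost is data-dependent.
            work += n * n
            pc += 1
        else:
            raise ValueError(f"unknown instruction {op[0]!r}")
    return executed, work


if __name__ == "__main__":
    # An infinite loop is cut off by the budget.
    try:
        run([("jmp", 0)], budget=1000)
    except BudgetExceeded as exc:
        print(exc)  # stopped after 1000 instructions

    # Two instructions, well inside the budget, but 10**8 units of work.
    print(run([("push", 10000), ("slow",)], budget=1000))  # (2, 100000000)
```

The second call is the case the message warns about: the budget counts instructions, not the work done inside an exposed function, so such functions have to be bounded or withheld separately.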