From nobody Tue Sep 10 01:05:44 2024
Date: Tue, 10 Sep 2024 04:05:44 +0300
From: Vadim Goncharov
To: freebsd-arch@FreeBSD.org, freebsd-hackers@FreeBSD.org,
 freebsd-net@FreeBSD.org, tcpdump-workers@lists.tcpdump.org,
 tech-net@NetBSD.org
Cc: Alexander Nasonov
Subject: BPF64: proposal of platform-independent hardware-friendly
 backwards-compatible eBPF alternative
Message-ID: <20240910040544.125245ad@nuclight.lan>
List-Id: Discussion related to FreeBSD architecture
List-Archive: https://lists.freebsd.org/archives/freebsd-arch

Hello!

    We don't need ELF relocations!
    We want better loop control!
    No so little parameters,
    Verifier! Leave our code alone!
        -- Ping Floyd

I've recently had some experience with Linux's eBPF in its XDP, and it
left quite a negative impression. I was following
https://github.com/xdp-project/xdp-tutorial and, after the 3rd lesson,
tried to write a simple program that searches for the TCP timestamp
option and increments it by one. As you know, the eBPF tool stack
consists of at least clang and the eBPF verifier in the kernel, and
after two dozen tries the verifier still didn't accept my code. I dug
into the verifier sources, and the abysses opened in front of me!
Carefully and boringly going through the disassembler and verifier
output, I found that the clang optimizer ignores a register that has
just been checked - patching one byte in the assembler sources (and the
target .o) did help. I've filed
https://github.com/iovisor/bcc/issues/5062 with details, if anyone is
curious.

So, looking at the eBPF ecosystem, I must say it's a Frankenstein. Sewn
from good, sometimes brilliant parts, it's a monster in outcome. The
verifier is right in its own way, the compiler/optimizer is right in
its own way... But in the end you don't even have a high-level
programming language! You must write in C, relatively low-level C, and
a restricted subset of C at that. This requires very skilled
professionals - it's far from being, not even user-friendly, but at
least sysadmin-friendly, like `ipfw` or `iptables` firewall rules.

Thus I looked at the foundation of the eBPF architecture, and at the
presuppositions it was created with. In fact, it tries to be just usual
programming after checks - that is, with all those pointers. It's too
x86-centric and Linux-centric - the number of registers was raised to
just ten. So if you look at the GitHub ticket above, when I tried to
add debugging to the program - you know, just specific `printf()`s - it
failed the verifier checks again, because the compiler now had to move
some variables between registers and memory, as there is a limit of
just 5 arguments per call due to the limit of 5 registers! And the
verifier, despite being more than 20,000 lines of code, still was not
smart enough to track info between registers and the stack.

So, if we started from the beginning, what should we do? Remember
classic BPF: it has a very simple validator due to its virtual machine
design - only forward jumps, checks for packet boundaries at runtime,
etc. You'd say eBPF aims for performance once the verifier's checks
have passed? But in practice you have to toss in as many packet
boundary checks, as near to the actual access as possible, or the
verifier may "forget" them because of the compiler's optimizer.
So this is not much different from checking whether an access falls
past the end of the packet in classic BPF - it's the same CMP/JUMP in
the JIT if the buffer is linear, and if your OS has put the packet in
several buffers, like *BSD or DPDK `mbuf`s, the runtime check overhead
is negligible in comparison.

Ensuring kernel stability? Just don't allow arbitrary pointers, like
the original BPF.

Guaranteed termination time? It's possible if you place some
restrictions. For example, don't allow backward jumps but allow
function calls - in case of stack overflow, terminate the program.
Really need backward jumps? Let's analyze for what purpose. You'll find
they are needed for loops over packet contents. Solve this by
supporting "hardware"-controlled loops, which can't be infinite.

Finally, platforms. It's the beginning of the sunset of the x86 era
now - RISC is coming. ARM is now not only on mobiles, but on desktops
and servers. Moreover, it's the era of specialized hardware
accelerators - e.g. GPUs, neural processors. Even general-purpose ARM64
has 31 registers, and specialized hardware can implement many more.
Then, don't tie yourself to the Linux kernel - BPF helpers are a very
rigid interface from an ancient era, like syscalls.

So, let's continue the *Berkeley* Packet Filter with the Berkeley RISC
design - with the register window idea, updated by SPARC and then by
Itanium (to not waste registers). Take NetBSD's coprocessor functions,
whose set is passed with a context, instead of hardcoded enums of
functions - for example, BPF maps are not something universal; both
NetBSD and FreeBSD have their own tables in their firewalls. Add more
features actually needed for a *network* processor - e.g. 128-bit
registers for IPv6 (eBPF axed even BPF_MSH!). And do all of this in a
fully backwards-compatible way - the new language should allow older
programs, e.g. from `tcpdump`, to run without any modifications,
binary-compatible (again, eBPF does not do this - it's incompatible
with classic BPF and uses a translator from it).
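
The safety argument made above (jumps that can only go forward, checked
once at load time, plus a packet-bounds check on every load at run
time) can be sketched as a toy VM in C. This is only an illustration
under stated assumptions: the instruction set, `toy_insn`,
`toy_validate()` and `toy_run()` are hypothetical names invented here;
this is neither the classic BPF encoding nor anything taken from the
BPF64 draft.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set, for illustration only. */
enum toy_op { TOY_LDB, TOY_JEQ, TOY_RET };

struct toy_insn {
    enum toy_op op;
    uint32_t    k;   /* LDB: packet offset; JEQ: value; RET: result   */
    uint8_t     jt;  /* JEQ: instructions to skip forward if equal    */
    uint8_t     jf;  /* JEQ: instructions to skip forward if unequal  */
};

/*
 * Load-time check. Jump offsets are unsigned, so a jump can never go
 * backward; it is enough to verify that every target stays inside the
 * program and that the program ends with RET.
 */
static int toy_validate(const struct toy_insn *p, size_t n)
{
    if (n == 0 || p[n - 1].op != TOY_RET)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (p[i].op != TOY_JEQ)
            continue;
        if (i + 1 + p[i].jt >= n || i + 1 + p[i].jf >= n)
            return 0;
    }
    return 1;
}

/*
 * Run-time part. The program counter only grows, so a validated
 * program of n instructions executes at most n steps. Packet bounds
 * are checked on each load instead of being proven in advance.
 */
static uint32_t toy_run(const struct toy_insn *p, size_t n,
                        const uint8_t *pkt, size_t len)
{
    uint32_t a = 0;  /* accumulator */
    size_t pc = 0;

    while (pc < n) {
        const struct toy_insn *i = &p[pc++];

        switch (i->op) {
        case TOY_LDB:
            if (i->k >= len)
                return 0;          /* load past end of packet: reject */
            a = pkt[i->k];
            break;
        case TOY_JEQ:
            pc += (a == i->k) ? i->jt : i->jf;
            break;
        case TOY_RET:
            return i->k;
        }
    }
    return 0;
}
```

A four-instruction program such as "load byte 9; if it equals 6 return
1, else return 0" passes `toy_validate()`, returns 1 for a 20-byte
packet whose byte 9 is 6, and returns 0 for a 4-byte packet because the
load is rejected at run time. A jump whose target lies outside the
program is refused at load time.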

Next, eBPF took the "we are masquerading as usual x86 programming" way
not only in the assembly language. It has a very complex ELF
infrastructure around it, which may not be suitable for every network
card - having pc-addressed literals, as in RISC processors, allows for
a much simpler format: just a BLOB of instructions. BPF64 adds a
BPF_LITERAL "instruction" of varying length (it's interpreted by just
skipping over its contents, as if it were a jump), which, given special
signatures and format, allows this BLOB of instructions to contain some
metadata about itself for loading, much simpler than ELF (esp. with
DWARF).

Then, ecosystem. eBPF defines functions callable from user code like:

> enum bpf_func_id___x { BPF_FUNC_snprintf___x = 42 /* avoid zero */ };

That is, the ancient syscall-like way of a global constant, instead of
a context. A "context" here is the structure passed with the code to
execution, which contains function pointers to what is available to
this user code, in the spirit of NetBSD's `bpf_ctx_t` for their
BPF_COP/BPF_COPX extensions. This not only provides a better way than a
"set in stone" syscall-like number, but BPF64 goes further and defines
"packages" in the running kernel, with namespaces, to allow e.g. a
Foo::Bar::baz() function to call Foo::quux() from another BPF program,
populating ("linking") its context with the needed function without
relocations. These "packages" are expected to be available to the admin
in e.g. the sysctl tree, with descriptions, versioning and other
attributes.

Some other quotes about how restricted eBPF is:

> First, a BPF program using bpf_trace_printk() has to have a
> GPL-compatible license.

> Another hard limitation is that bpf_trace_printk() can accept only up
> to 3 input arguments (in addition to fmt and fmt_size). This is quite
> often pretty limiting and you might need to use multiple
> bpf_trace_printk() invocations to log all the data. This limitation
> stems from the BPF helpers ability to accept only up to 5 input
> arguments in total.

> Previously, bpf_trace_printk() allowed the use of only one string (%s)
> argument, which was quite limiting. Linux 5.13 release lifts this
> restriction and allows multiple string arguments, as long as total
> formatted output doesn't exceed 512 bytes. Another annoying
> restriction was the lack of support for width specifiers, like %10d
> or %-20s. This restriction is gone now as well

> Helper function bpf_snprintf

> Outputs a string into the str buffer of size str_size based on a
> format string stored in a read-only map pointed by fmt.
>
> Each format specifier in fmt corresponds to one u64 element in the
> data array. For strings and pointers where pointees are accessed,
> only the pointer values are stored in the data array. The data_len is
> the size of data in bytes - must be a multiple of 8.
>
> Formats %s and %p{i,I}{4,6} require to read kernel memory. Reading
> kernel memory may fail due to either invalid address or valid address
> but requiring a major memory fault. If reading kernel memory fails,
> the string for %s will be an empty string, and the ip address for
> %p{i,I}{4,6} will be 0. Not returning error to bpf program is
> consistent with what bpf_trace_printk() does for now.
>
> Returns
>
> The strictly positive length of the formatted string, including the
> trailing zero character. If the return value is greater than
> str_size, str contains a truncated string, guaranteed to be
> zero-terminated except when str_size is 0.
>
> Or -EBUSY if the per-CPU memory copy buffer is busy.
>
> static long (* const bpf_snprintf)(char *str, __u32 str_size,
>     const char *fmt, __u64 *data, __u32 data_len) = (void *) 165;

So. In summary: BPF64 covers not only the traditional ("firewall on
network packets") area of use, but is also suitable - and designed with
these goals in mind - for:

* non-network uses, e.g. syscall argument checking

* network protocols passing untrusted code which can be run by the
  other side in a restricted sandbox (e.g. muSCTP custom
  (de)compression rules)

* an alternative to https://capnproto.org/rpc.html for multi-method
  calls on high-latency links (e.g. MQTT5-like, and/or for environments
  where Cap’n Proto itself is not applicable, e.g. no direct
  connections)

I've put a sketch of the design at https://github.com/nuclight/bpf64
with these files:

* `nuc_ts_prog_kern.c` (and its include `nuc_ts_common_kern_user.h`) is
  an XDP/eBPF program (for Linux 6.5) that parses a TCP packet and
  increments its Timestamp option, if any, recording statistics into an
  eBPF map.

* `nuc_ts_incr.baw` (and its include `nuc_ts_incr.bah`) is the
  equivalent program doing the same thing, but in a new BPF64 Assembler
  Wrapper language, not yet written and subject to change. Note this is
  a lower-level language than C, viewed as an intermediate solution
  until BPF64 becomes stable, after which a higher-level language
  (higher than C) should be written, at least as expressive as the
  `tcpdump` (`libpcap`) one.

* `bpf64spec.md` is a draft of the Specification, but currently it's
  more in a "draft of a draft" state :-)

I am requesting comments about this architecture before implementing
it, especially from people knowledgeable in JIT compilers, because,
although I see BPF64 as much simpler than eBPF (probably just several
man-months to implement), my sabbatical has ended - and design mistakes
are much harder to fix when an implementation already exists.
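
The "context" idea described earlier in this message (a table of
function pointers handed to the program, instead of globally numbered
helpers) could look roughly like the following C sketch. Every name in
it (`prog_ctx`, `ctx_func_t`, `ctx_call()` and the two helpers) is
hypothetical and chosen for illustration; it does not reproduce
NetBSD's actual `bpf_ctx_t` layout or the BPF64 draft.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper signature: one argument in, one result out. */
typedef uint32_t (*ctx_func_t)(uint32_t arg);

/*
 * Hypothetical per-program context: the loader fills in exactly the
 * functions this program is allowed to call. The program refers to
 * them by slot index, so no global helper numbering is needed.
 */
struct prog_ctx {
    const ctx_func_t *funcs;
    size_t            nfuncs;
};

/*
 * Call slot `idx` of the context. On a missing slot the call is
 * refused: *ok is set to 0 and 0 is returned, instead of jumping
 * through an invalid pointer.
 */
static uint32_t ctx_call(const struct prog_ctx *ctx, size_t idx,
                         uint32_t arg, int *ok)
{
    if (ctx == NULL || idx >= ctx->nfuncs || ctx->funcs[idx] == NULL) {
        *ok = 0;
        return 0;
    }
    *ok = 1;
    return ctx->funcs[idx](arg);
}

/* Two example helpers a host could choose to expose. */
static uint32_t helper_double(uint32_t x) { return x * 2; }
static uint32_t helper_incr(uint32_t x)   { return x + 1; }
```

With a table `{ helper_double, helper_incr }`, calling slot 1 with 41
yields 42, while calling slot 2 is refused. Two programs loaded with
different tables see different function sets without any change to
their code, which is the property the context approach is after.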

-- 
WBR, @nuclight

From nobody Tue Sep 10 06:38:50 2024
Date: Tue, 10 Sep 2024 06:38:50 +0000
From: "Poul-Henning Kamp"
To: Vadim Goncharov
Cc: freebsd-arch@FreeBSD.org, freebsd-hackers@FreeBSD.org,
 freebsd-net@FreeBSD.org, tcpdump-workers@lists.tcpdump.org,
 tech-net@NetBSD.org, Alexander Nasonov
Subject: Re: BPF64: proposal of platform-independent hardware-friendly
 backwards-compatible eBPF alternative
Message-Id: <202409100638.48A6cor2090591@critter.freebsd.dk>
In-reply-to: <20240910040544.125245ad@nuclight.lan>
References: <20240910040544.125245ad@nuclight.lan>

--------
Vadim Goncharov writes:

> I've put a sketch of design to https://github.com/nuclight/bpf64 with files:

Counter proposal:

1. Define the Lua execution environment in the kernel.

2. Add syscall to submit a precompiled Lua program (as bytecode)

3. Add syscall to execute submitted Lua program

And yes: I'm being 100% serious.

If we are going to reinvent "Channel Programs" 67 years after IBM
came up with them for their 709 vacuum tube computer, at the very
least we should use a sensible language syntax.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From nobody Tue Sep 10 11:45:57 2024
Date: Tue, 10 Sep 2024 14:45:57 +0300
From: Vadim Goncharov
To: "Poul-Henning Kamp", tcpdump-workers@lists.tcpdump.org
Cc: freebsd-arch@FreeBSD.org, freebsd-hackers@FreeBSD.org,
 freebsd-net@FreeBSD.org, tech-net@NetBSD.org, Alexander Nasonov
Subject: Re: BPF64: proposal of platform-independent hardware-friendly
 backwards-compatible eBPF alternative
Message-ID: <20240910144557.4d95052a@nuclight.lan>
In-Reply-To: <202409100638.48A6cor2090591@critter.freebsd.dk>
References: <20240910040544.125245ad@nuclight.lan>
 <202409100638.48A6cor2090591@critter.freebsd.dk>

On Tue, 10 Sep 2024 06:38:50 +0000
"Poul-Henning Kamp" wrote:

> --------
> Vadim Goncharov writes:
>
> > I've put a sketch of design to https://github.com/nuclight/bpf64
> > with files:
>
> Counter proposal:
>
> 1. Define the Lua execution environment in the kernel.
>
> 2. Add syscall to submit a precompiled Lua program (as bytecode)

Anyone who thinks "any generic bytecode" will do misses the main point,
see below.

> 3. Add syscall to execute submitted Lua program
>
> And yes: I'm being 100% serious.

Well, preparing the spec/letter in a rush, I probably forgot to state
the main reason for BPF (and its successors) to exist, thinking it's
obvious: safety. Let's restate: *BPF* allows UNTRUSTED user code to be
executed SAFELY in the kernel.

It's easy for your Lua (or whatever) code to hang the kernel with an
infinite loop. Or crash it by an access through an arbitrary pointer.
That's why the original BPF has no backward jumps and no arbitrary
memory access, and eBPF's nightmare verifier walks all code paths and
checks pointers. And that's why DTrace also has its own VM and bytecode
in the kernel (see
https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-924.pdf Chapter 7).

Your "counter proposal" has essentially been available for all these
decades in the form of "oh, just write a KLD in C instead of that
limited tcpdump".

> If we are going to reinvent "Channel Programs" 67 years after IBM
> came up with them for their 709 vacuum tube computer, at the very
> least we should use a sensible language syntax.

Don't know what that is, quick googling shows something modern on AMQP.
But Lua at least doesn't have a *sensible* syntax; Perl or Tcl are much
better. And I'm surprised that Forth, being available in the loader,
hasn't been ported in all these years.

-- 
WBR, @nuclight

From nobody Tue Sep 10 11:51:08 2024
Date: Tue, 10 Sep 2024 12:51:08 +0100
From: Bob Bishop
To: Poul-Henning Kamp
Cc: Vadim Goncharov, freebsd-arch@freebsd.org,
 freebsd-hackers@freebsd.org, freebsd-net@freebsd.org,
 tcpdump-workers@lists.tcpdump.org, tech-net@netbsd.org,
 Alexander Nasonov
Subject: Re: BPF64: proposal of platform-independent hardware-friendly
 backwards-compatible eBPF alternative
Message-Id: <2F9BBCAE-C0EA-43E7-B371-AE3FF733E72E@gid.co.uk>
In-Reply-To: <202409100638.48A6cor2090591@critter.freebsd.dk>
References: <20240910040544.125245ad@nuclight.lan>
 <202409100638.48A6cor2090591@critter.freebsd.dk>

Hi,

> On 10 Sep 2024, at 07:38, Poul-Henning Kamp wrote:
>
> --------
> Vadim Goncharov writes:
>
>> I've put a sketch of design to https://github.com/nuclight/bpf64 with files:
>
> Counter proposal:
>
> 1. Define the Lua execution environment in the kernel.
>
> 2. Add syscall to submit a precompiled Lua program (as bytecode)
>
> 3. Add syscall to execute submitted Lua program
>
> And yes: I'm being 100% serious.
>
> If we are going to reinvent "Channel Programs" 67 years after IBM
> came up with them for their 709 vacuum tube computer, at the very
> least we should use a sensible language syntax.

+1

We did something like this at $work years ago with FORTH, to do weird
things in a driver.

> Poul-Henning
>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.
>

--
Bob Bishop
rb@gid.co.uk

From nobody Tue Sep 10 12:24:07 2024
Date: Tue, 10 Sep 2024 12:24:07 +0000
From: "Poul-Henning Kamp"
To: Vadim Goncharov
Cc: tcpdump-workers@lists.tcpdump.org, freebsd-arch@FreeBSD.org,
 freebsd-hackers@FreeBSD.org, freebsd-net@FreeBSD.org,
 tech-net@NetBSD.org, Alexander Nasonov
Subject: Re: BPF64: proposal of platform-independent hardware-friendly
 backwards-compatible eBPF alternative
Message-Id: <202409101224.48ACO7oj094058@critter.freebsd.dk>
In-reply-to: <20240910144557.4d95052a@nuclight.lan>
References: <20240910040544.125245ad@nuclight.lan>
 <202409100638.48A6cor2090591@critter.freebsd.dk>
 <20240910144557.4d95052a@nuclight.lan>

--------
Vadim Goncharov writes:

> It's easy for your Lua code (or whatever) code to hang kernel by
> infinite loop. Or crash it by access on arbitrary pointer.

Lua has pointers now ?

> Your "counter proposal" was essentially available for all these decades
> in form "oh, just write KLD in C instead of that limited tcpdump".

You're yelling at the guy who implemented a (very fast!) firewall
where the rules were compiled to C code in a KLD.

> > If we are going to reinvent "Channel Programs" 67 years after IBM
> > came up with them for their 709 vacuum tube computer, at the very
> > least we should use a sensible language syntax.
>
> Don't know what that is, quick googling […]

Well, you probably should do some more research then, because
unawareness of history is /the/ major cause of pointlessly
repeating mistakes.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
From nobody Tue Sep 10 12:59:02 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X33gR2X2Yz5WT26; Tue, 10 Sep 2024 12:59:15 +0000 (UTC) (envelope-from theraven@FreeBSD.org) Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X33gR1bPhz487P; Tue, 10 Sep 2024 12:59:15 +0000 (UTC) (envelope-from theraven@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1725973155; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=52WhLO+OOHHelT2Ulj2UpeO8k0p2xywwOt9PKUFdeUE=; b=Z9leBlhmFM1wChnwg8J+UYDNYI79vTXSLsfTgdJJEXM21aXDvdIedF4jMih1Sc8fZNkLAO jhV7+IsLEg0wk00SVmpRYo4hj5EjlJt8MXXRzg7G4fzPq73nCIeqs5YYYKkh+igF1vZi/5 L804Mq/G8taMah23Et0VIZ8OH9ZiIJT4EkD2enMxWmakB8Yl60PcQHeG5O0KoM6lHVnS3L U36iIZ578yTmEDfaK2R3+FvYTRsHN7NyvqH1Uo7pJ5NRKtlu+2jBWsSiHE4UB/Xrv5JGuH c/zGstZ5ZbIly10xanfW2yAN0X3jA47l/ilTzoA0xMqMxh1uhA9p3VyLkt9BQA== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1725973155; a=rsa-sha256; cv=none; b=EPHvqxYtQCoIHtuF6ELe6d98c6hcy6ZTAffa6LfPF4ZEMaKjVZMM7FS/yaDtPF0U/9Wj/T zEnYrlAs0rsdrRQE4oxgxcknTcqFUe5V893KkgDF2Iid22HDVvC6nn/41n9u6qcr/gix1V 7C5uEzfDERYUr8ymKHvFFsvCqBKdsd6nRJftshmxD5eFcn25dBEW3EJPey7ZxnU9KZm5us 1j1z3rGA2LsBNm8OZtraxZ+/nOd0LeGbp30vj/UM2i/W8mOd/6lEBh8Y/V7UvNh+MxdmLT PsMGCSg8rcMkl5a9uxP+xXZCEomHbYR871vc/onLG8rGWUOz10WyaE5/U8+/mw== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; 
s=dkim; t=1725973155; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=52WhLO+OOHHelT2Ulj2UpeO8k0p2xywwOt9PKUFdeUE=; b=XPXxoJsZW7hScfl6faNRdizk8iNIMDcmwRx20lDNX2XGUlHp7BNFeCsT5OyHfcG+EGsbs3 9LUlhAqs+VDtS85Akd351XwffW8FkkfXjezE+XD9L+PhNGKOjjwldUvvlUMpK4Omz3kRWP PRl9pCTfsKazovqSwBGfUHzjTJaKQonVzi9G9bG5/hK8J1ezTaCzYdXHcWdLEXI/KvpwNy SAe0fogKCY8Q6ZUzFQ0Thkx8I4LJ7q81fFqJcO9mNW90aerrwz57g2K7MtrGRSQQ5be2Uk 9/ntKjaddeVAB1YIJNtDQRrAdtUHdQyBqDQCnCMNQEw9FeNO63xxLUOCJfOC4A== Received: from smtp.theravensnest.org (smtp.theravensnest.org [45.77.103.195]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: theraven) by smtp.freebsd.org (Postfix) with ESMTPSA id 4X33gR10JPzRMq; Tue, 10 Sep 2024 12:59:15 +0000 (UTC) (envelope-from theraven@FreeBSD.org) Received: from smtpclient.apple (host109-155-136-107.range109-155.btcentralplus.com [109.155.136.107]) by smtp.theravensnest.org (Postfix) with ESMTPSA id A6A0065B5; Tue, 10 Sep 2024 13:59:13 +0100 (BST) From: David Chisnall Message-Id: <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> Content-Type: multipart/alternative; boundary="Apple-Mail=_568E13D8-1F5C-410F-B911-1402B36B059B" List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3776.700.51\)) Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative Date: Tue, 10 Sep 2024 13:59:02 +0100 In-Reply-To: <20240910144557.4d95052a@nuclight.lan> Cc: Poul-Henning Kamp , tcpdump-workers@lists.tcpdump.org, "freebsd-arch@freebsd.org" , 
"freebsd-hackers@freebsd.org" , "freebsd-net@freebsd.org" , "tech-net@netbsd.org" , Alexander Nasonov To: Vadim Goncharov References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> X-Mailer: Apple Mail (2.3776.700.51) --Apple-Mail=_568E13D8-1F5C-410F-B911-1402B36B059B Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8

On 10 Sep 2024, at 12:45, Vadim Goncharov wrote:
>
> It's easy for your Lua code (or whatever) code to hang kernel by
> infinite loop. Or crash it by access on arbitrary pointer. That's why
> original BPF has no backward jumps and memory access, and eBPF's
> nightmare verifier walks all code paths and check pointers.

I’m not convinced by the second: Lua has a GC’d heap, so you’d need to expose FFI things to it that did unsafe things, and that’s equally a problem for eBPF.

The first is not a problem. The Lua interpreter has a bytecode limit. You can define a bounded number of bytecodes that it will execute. The problem comes from the standard library. Things like string.gmatch can have high-order polynomial complexity, and so it’s possible for a Lua program that executes a small number of bytecodes to create a string that takes a vast amount of time to match on. Again, this is also a problem for eBPF if you expose a similar function; the solution is to not expose functions with large data-dependent runtimes to untrusted scripts.

More generally, there are a lot of problems with interpreting or JITing untrusted code in the kernel in *any* runtime. Speculative execution makes it easy to use these as primitives to leak kernel secrets, either via timing of the programs themselves, using the JIT to generate gadgets, or by leaking data via cache priming.

Both eBPF and Lua have these problems.

The thing I would like to see for our current use of semi-trusted Lua in the kernel (ZFS channel programs) is a way of exposing them (under /dev/something) as file descriptors and modifying the ioctls that run them to take a file descriptor argument. I would like to separate the two operations:

 - Load a channel program.
 - Run a channel program.

In the post-Spectre world, the former remains a privileged operation. Even though Linux pretends it isn’t, allowing arbitrary (even arbitrary constrained) code to run in the kernel’s address space is a problem. Invoking such code, however, should follow the same rules as everything else. A trusted entity should be able to load a pile of Lua / eBPF / BPF64 / whatever programs into the kernel and then set up permissions so that sandboxed programs (and jails) can use a defined subset of them.

David

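As an illustration of the bytecode-limit point in this message, here is a minimal sketch in C of an interpreter loop that stops after a fixed number of instructions. The opcodes, `struct insn`, and `run_bounded` are hypothetical and belong to neither Lua nor BPF64; stock Lua offers a comparable mechanism through `lua_sethook` with `LUA_MASKCOUNT`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode, not Lua's or BPF64's real instruction set. */
enum op { OP_ADD, OP_JMP, OP_HALT };

struct insn {
    enum op op;
    int32_t arg;   /* OP_ADD: value to add; OP_JMP: absolute target index */
};

enum status { RUN_OK = 0, RUN_BUDGET_EXCEEDED = -1, RUN_BAD_PROGRAM = -2 };

/*
 * Execute at most `budget` instructions. The accumulator is written to
 * *acc even when the run is aborted.
 */
static enum status
run_bounded(const struct insn *prog, size_t len, uint64_t budget, int64_t *acc)
{
    size_t pc = 0;

    *acc = 0;
    for (;;) {
        if (pc >= len)
            return RUN_BAD_PROGRAM;      /* ran past the end of the program */
        if (budget == 0)
            return RUN_BUDGET_EXCEEDED;  /* hard stop, even in an endless loop */
        budget--;

        switch (prog[pc].op) {
        case OP_ADD:
            *acc += prog[pc].arg;
            pc++;
            break;
        case OP_JMP:                     /* backward jumps are allowed here */
            if (prog[pc].arg < 0)
                return RUN_BAD_PROGRAM;
            pc = (size_t)prog[pc].arg;
            break;
        case OP_HALT:
            return RUN_OK;
        default:
            return RUN_BAD_PROGRAM;
        }
    }
}
```

The budget bounds only the interpreter's own steps. A helper that does a large amount of work inside one instruction would not be counted, which is the standard-library concern raised above.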
= --Apple-Mail=_568E13D8-1F5C-410F-B911-1402B36B059B-- From nobody Tue Sep 10 13:09:15 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X33v465Hpz5WTtq; Tue, 10 Sep 2024 13:09:20 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Received: from mail-lf1-x12d.google.com (mail-lf1-x12d.google.com [IPv6:2a00:1450:4864:20::12d]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "WR4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X33v44LG2z4Dl6; Tue, 10 Sep 2024 13:09:20 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-lf1-x12d.google.com with SMTP id 2adb3069b0e04-5365cc68efaso3536809e87.1; Tue, 10 Sep 2024 06:09:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725973759; x=1726578559; darn=freebsd.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=RkKU7OzzhXolcRcGkGudy213qu3KY/KqHoqUnGVQTPQ=; b=TjIKSiuTIbl6+zsrcY5WHuTne3sayhEsaq05ctDTCdwWOYxKF6lAy0HYjJjFX2tnpK NOluUSyHTxtCEGgxuaf9rIuaUSlkgaRmV7KHAVuYiziMa3eO8Gb4SDPAu2S55CTrF0kH uKVqfwSuHVNzK8Q99JfhPa7dao4cMpzcj2J6lCITJ3h4u7TDASUsyeDmZUDF7tghluZr B0IICSZd0FeYH5ECePlPRK4dMpuCA/AT7nHOzXsBUVryealF17IlnaJK+XmHxCice0aQ RUK+5BjVLR0WDYerZMLnj0d4vjb1ojcfEdL38KFldCgyVHzh7l0JmafqYGCkFvFFmrJL wx4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725973759; x=1726578559; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=RkKU7OzzhXolcRcGkGudy213qu3KY/KqHoqUnGVQTPQ=; b=R8jy9qfeNQDxpDKAx52YW0XlvljEN4VblyOX3P32nv2sHS8j2gClezwdpryv3L6y5I KZuBH2vniCjn0R5c6ZO9fRwVJAJyhogJxwE0KnjK3u9UNmDeuU8y2pFoBjmLOuJ/e2G6 Lz82S4Mba9TgMRweggBOGN6lZu4hQrjxm3MdUTyQkJG/XqypOsdqKDddBAZyiZiwk3mj XN8XysnFkwPTd/DUAByG1KcNUvm+EYUYxEEHdhqhcNzvud8FPm8iNUm7zzT5mbnjsYtC HKhT4fdtuZwjPN0ML98Id9m+9l1IhL0KytgeJcH2QA9ZSJOl5QfaePFnTjwsqS86joGy b/7g== X-Forwarded-Encrypted: i=1; AJvYcCVezgU/L3uEP2R0J3jz+xMcVCeVg0GWQv1GO6nTm2h5lc7SUMB0cP7DGQK8Oh6A9Z/1glVwcGPCf4reNUmcszQ=@freebsd.org, AJvYcCXrCqD/Cda2K/UO+KmO4HKjVu/0UM0o93RVDlvwND7ww9aoYsIxZwVRemcJYkC3SdISd6uWfvwWb3qnv0Q=@freebsd.org X-Gm-Message-State: AOJu0YzhHOXgk5vkBqIvh3Jq8jKMhIe5mozxg4Im/YoL6dgLtC8kEM05 3bF+6KL3zMyxsV1Nl9MycWzFMJT/YNC2j5gVib4AENlNxKRiydzE X-Google-Smtp-Source: AGHT+IHCZSlqVTBTlUDxjG0SinNhtLNa+Ng/lDo/GxAOb0zVvoslx8qbRLU0qLFsfO+55WpvVTXp2Q== X-Received: by 2002:a05:6512:15a7:b0:535:6992:f2c3 with SMTP id 2adb3069b0e04-536587f5ce0mr10817602e87.41.1725973758289; Tue, 10 Sep 2024 06:09:18 -0700 (PDT) Received: from nuclight.lan ([37.204.254.214]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-5365f90c306sm1153853e87.245.2024.09.10.06.09.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Sep 2024 06:09:18 -0700 (PDT) Date: Tue, 10 Sep 2024 16:09:15 +0300 From: Vadim Goncharov To: "Poul-Henning Kamp" Cc: freebsd-arch@FreeBSD.org, freebsd-hackers@FreeBSD.org, freebsd-net@FreeBSD.org, tech-net@NetBSD.org, Alexander Nasonov Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative Message-ID: <20240910160915.55ff579b@nuclight.lan> In-Reply-To: <202409101224.48ACO7oj094058@critter.freebsd.dk> References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <202409101224.48ACO7oj094058@critter.freebsd.dk> X-Mailer: Claws Mail 3.19.1 (GTK+ 2.24.33; amd64-portbld-freebsd12.4) List-Id: Discussion related to FreeBSD 
architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] X-Rspamd-Queue-Id: 4X33v44LG2z4Dl6

On Tue, 10 Sep 2024 12:24:07 +0000
"Poul-Henning Kamp" wrote:

> --------
> Vadim Goncharov writes:
>
> > It's easy for your Lua code (or whatever) code to hang kernel by
> > infinite loop. Or crash it by access on arbitrary pointer.
>
> Lua has pointers now ?

Its implementation has. Do you have a mathematical verifier for such loaded bytecode, proving that its C interpreter will have no side effects while it runs?

> > Your "counter proposal" was essentially available for all these
> > decades in form "oh, just write KLD in C instead of that limited
> > tcpdump".
>
> You're yelling at the guy who implemented a (very fast!) firewall
> where the rules were compiled to C code in a KLD.

That's exactly the approach that must be avoided. See section 5.2 of
https://www.usenix.org/legacy/events/bsdcon02/full_papers/lidl/lidl.pdf

> > > If we are going to reinvent "Channel Programs" 67 years after IBM
> > > came up with them for their 709 vacuum tube computer, at the very
> > > least we should use a sensible language syntax.
> >
> > Don't know what that is, quick googling […]
>
> Well, you probably should do some more research then, because
> unawareness of history is /the/ major cause of pointlessly repeating
> mistakes.

You're either trolling or completely misunderstanding the problem domain.

<> (c) https://www.ece.ucdavis.edu/~vojin/CLASSES/EEC272/S2005/Papers/IBM-Architecture-Bashe_sep81.pdf

This has nothing to do with BPF at all. Go and read the original papers on kernel filters and why they are *so* restricted, e.g. Van Jacobson's paper on BPF/tcpdump, the aforementioned paper on BSD/OS's IPFW (esp. section 5.7 on loops), etc.

-- 
WBR, @nuclight

From nobody Tue Sep 10 13:35:11 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X34T438Tyz5WXxb; Tue, 10 Sep 2024 13:35:20 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4X34T418FPz4Jjs; Tue, 10 Sep 2024 13:35:19 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Authentication-Results: mx1.freebsd.org; none Received: from critter.freebsd.dk (unknown [192.168.55.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by phk.freebsd.dk (Postfix) with ESMTPS id 1B79B89284; Tue, 10 Sep 2024 13:35:12 +0000 (UTC) Received: (from phk@localhost) by critter.freebsd.dk (8.18.1/8.16.1/Submit) id 48ADZBhq094507; Tue, 10 Sep 2024 13:35:11 GMT (envelope-from phk) Message-Id: <202409101335.48ADZBhq094507@critter.freebsd.dk> To: David Chisnall cc: Vadim Goncharov , tcpdump-workers@lists.tcpdump.org, "freebsd-arch@freebsd.org" , "freebsd-hackers@freebsd.org" , "freebsd-net@freebsd.org" , "tech-net@netbsd.org" , Alexander Nasonov Subject: Re: BPF64: proposal of platform-independent hardware-friendly
backwards-compatible eBPF alternative In-reply-to: <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> From: "Poul-Henning Kamp" References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <94505.1725975311.1@critter.freebsd.dk> Date: Tue, 10 Sep 2024 13:35:11 +0000 X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:1835, ipnet:130.225.0.0/16, country:EU] X-Rspamd-Queue-Id: 4X34T418FPz4Jjs David Chisnall writes: > The thing I would like to see for our current use of semi-trusted Lua in > the kernel (ZFS channel programs) is a way of exposing them (under > /dev/something) as file descriptors and modifying the ioctls that run > them to take a file descriptor argument. I would like to separate the > two operations: > > - Load a channel program. > - Run a channel program. > > In the post-Spectre world, the former remains a privileged operation. > Even though Linux pretends it isn't, allowing arbitrary (even > arbitrary constrained) code to run in the kernel's address space > is a problem. Invoking such code; however, should follow the same rules > as everything else. A trusted entity should be able to load a pile of > Lua / eBPF / BPF64 / whatever programs into the kernel and then set up > permissions so that sandboxed programs (and jails) can use a defined > subset of them. That would be a great way to do it. 
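The load/run split endorsed here could look roughly like the following sketch. It is a userland model under stated assumptions: an integer handle stands in for the file descriptor, a boolean stands in for the privilege check, and `prog_load`, `prog_grant`, and `prog_run` are invented names, not an existing FreeBSD or ZFS interface.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PROGS 8
#define MAX_JAILS 32

/* Hypothetical table of loaded programs; all names are illustrative. */
struct loaded_prog {
    bool in_use;
    int (*entry)(int arg);      /* stands in for loaded Lua/eBPF/BPF64 code */
    bool allowed[MAX_JAILS];    /* which jails may invoke this program */
};

static struct loaded_prog progs[MAX_PROGS];

/* Loading is the privileged step. Returns a handle (>= 0) or -1. */
static int
prog_load(bool caller_is_privileged, int (*entry)(int))
{
    if (!caller_is_privileged || entry == NULL)
        return -1;
    for (int h = 0; h < MAX_PROGS; h++) {
        if (!progs[h].in_use) {
            /* Compound literal zeroes the allowed[] array. */
            progs[h] = (struct loaded_prog){ .in_use = true, .entry = entry };
            return h;
        }
    }
    return -1;                  /* table full */
}

/* Only the privileged side decides who may run a program. */
static int
prog_grant(bool caller_is_privileged, int handle, int jail_id)
{
    if (!caller_is_privileged || handle < 0 || handle >= MAX_PROGS ||
        !progs[handle].in_use || jail_id < 0 || jail_id >= MAX_JAILS)
        return -1;
    progs[handle].allowed[jail_id] = true;
    return 0;
}

/* Running needs no privilege, only a prior grant for the calling jail. */
static int
prog_run(int handle, int jail_id, int arg, int *result)
{
    if (handle < 0 || handle >= MAX_PROGS || !progs[handle].in_use ||
        jail_id < 0 || jail_id >= MAX_JAILS ||
        !progs[handle].allowed[jail_id])
        return -1;
    *result = progs[handle].entry(arg);
    return 0;
}

/* Example "program" used only for demonstration. */
static int
double_it(int x)
{
    return 2 * x;
}
```

With a real file descriptor in place of the handle, the grant step would presumably reduce to ordinary descriptor passing and permission checks, which seems to be the point of the proposal.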
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From nobody Tue Sep 10 13:44:47 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X34h56SKpz5WYnp; Tue, 10 Sep 2024 13:44:53 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Received: from mail-lf1-x136.google.com (mail-lf1-x136.google.com [IPv6:2a00:1450:4864:20::136]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "WR4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X34h54cG5z4Lj6; Tue, 10 Sep 2024 13:44:53 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-lf1-x136.google.com with SMTP id 2adb3069b0e04-5344ab30508so6086811e87.0; Tue, 10 Sep 2024 06:44:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725975892; x=1726580692; darn=freebsd.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=K+m/jXapYzzXBOdJT5EyQSuIHmGF1QSU2be+dgGMMww=; b=T6GVpJhbQSx6/URyN6PPzxi7ypP0zKtbikTBd0H5C9buGHazTYPowuBPGKO/hFk4XP RmK/tIa7ObKa8Ohx4JGDYXaa+Jz1Y5aoIRWI0rl6eFtvhsQYVVpxYIoA3Iwe98Ut0cgL /5QYuXkpwOJwWPr9WeedzPTRZyY06sISA0YgUrKv4jvxdYHXxkldsD7nv6HKARCGn5Zn s6L5/t8AwBJZRgrJMW+xVatWop+XA9RqEE3VSa1hJN1+3Hj1YmFttByuqAD2w8ZEbzhK aBdqAzDR0msKUEkRr3jGk/JXcJ9rV0Fr2ak77Ze2Lzxz68vg5VGO4fJMGOA8iMBMBu8n jIPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725975892; x=1726580692; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=K+m/jXapYzzXBOdJT5EyQSuIHmGF1QSU2be+dgGMMww=; b=eRnu3sc67XvxB61xInDithI9dlO0j2b92NPLWxkctx8zAPd6xp4RJxDLRVip3RxfP6 uQTKK/PSTJmL2XGULIRODsPb+sI48Ts+O/vwG9VtzwdELuEX81LDBbECeaJUTQ52fABo uq+KdjibZh8/FbVGQMBo87Xqh9cGNf/7bjvnphJPYRCmc8+FH19j9bwM2oyQtNm6Hfff GxtFRAB8nG291wyN/KTUG0EaesxBaK9QLZB8eUbfOKyyjgQhWuCpTjQvqDiDGvm8drP9 GjnC+SL8Cx1lE2ifxIPuTJh9OxkhjXMhWLFcxw86nHIC0RqdFXIPErRd+zuKVNx1Orda 47cg== X-Forwarded-Encrypted: i=1; AJvYcCUh5bIP+W0j+lThDKRqnVFX9x8FIb07Panuy+C3XdokX/T8X2q/mJq7DKNphg2I9rIqiEIPGYGEDPhOnEU=@freebsd.org, AJvYcCUy8S3ov+JBxkFjigWHd305wMfVfsuPznmYyOkLh9+PdLwyTSH2OU/BeYpsyoPCYKs3b1w5JnexmYI+TuE5TMDK@freebsd.org, AJvYcCVvzCBXJs5H8FmmzA8hVsotBtIkuFrRJ1aAIyL3COpv40nNt18rEFLXz+gk7C25VunqVD2KncbxTXpy2AA=@freebsd.org X-Gm-Message-State: AOJu0YzZlyIwa40T9zfgx94clsP5S/v0mkdPfW2ce3f2u95Rf6xMdVxO p3s/0aKZVSJMnEx21v4p+YmUiLYYYJ1iheTUU9Fkmx8t6gX+DdTM4C522jRD X-Google-Smtp-Source: AGHT+IF1I+H9IpV3UVoCM8UL/JKt5Oi6jnQ8OUUoZY6/7NCvz13NY+Q89FJXXtqaMIAOsUoRddtaMw== X-Received: by 2002:a05:6512:3e0e:b0:536:628d:20e with SMTP id 2adb3069b0e04-5366bb48633mr1099505e87.29.1725975891203; Tue, 10 Sep 2024 06:44:51 -0700 (PDT) Received: from nuclight.lan ([37.204.254.214]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-5365f912ee5sm1181543e87.301.2024.09.10.06.44.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Sep 2024 06:44:50 -0700 (PDT) Date: Tue, 10 Sep 2024 16:44:47 +0300 From: Vadim Goncharov To: David Chisnall Cc: Poul-Henning Kamp , tcpdump-workers@lists.tcpdump.org, "freebsd-arch@freebsd.org" , "freebsd-hackers@freebsd.org" , "freebsd-net@freebsd.org" , "tech-net@netbsd.org" , Alexander Nasonov Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative Message-ID: <20240910164447.30039291@nuclight.lan> In-Reply-To: 
<4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> X-Mailer: Claws Mail 3.19.1 (GTK+ 2.24.33; amd64-portbld-freebsd12.4) List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] X-Rspamd-Queue-Id: 4X34h54cG5z4Lj6

On Tue, 10 Sep 2024 13:59:02 +0100
David Chisnall wrote:

> On 10 Sep 2024, at 12:45, Vadim Goncharov
> wrote:
> >
> > It's easy for your Lua code (or whatever) code to hang kernel by
> > infinite loop. Or crash it by access on arbitrary pointer. That's
> > why original BPF has no backward jumps and memory access, and eBPF's
> > nightmare verifier walks all code paths and check pointers.
>
> I’m not convinced by the second: Lua has a GC’d heap, you’d need to
> expose FFI things to it that did unsafe things, and that’s equally a
> problem for eBPF.

Not quite. For eBPF (and BPF64) plain FFI is not enough: there must be special wrappers, or even functions written from scratch with the restricted environment in mind. Lua, of course, has no such thing; its standard library would need to be reimplemented.

> The first is not a problem. The Lua interpreter has a bytecode
> limit. You can define a bounded number of bytecodes that it will
> execute. The problem comes from the standard library. Things like
> string.gmatch can have high-order polynomial complexity and so it’s
> possible for a Lua program that executes a small number of bytecodes
> to create a string that takes a vast amount of time to match on.
> Again, this is also a problem for eBPF if you expose a similar
> function, the solution is to not expose functions with large
> data-dependent runtimes to untrusted script.

In BPF64 some safety belts are planned: e.g. elapsed time is checked on CALL/RET, and if the limit is exceeded, the program is marked unsafe and disabled.

> More generally, there are a lot of problems with interpreting or
> JITing untrusted code in the kernel in *any* runtime. Speculative
> execution makes it easy to use these as primitives to leak kernel
> secrets, either via timing of the programs themselves, using the JIT
> to generate gadgets, or by leaking data via cache priming.
>
> Both eBPF and Lua have these problems.
> [...]
> - Run a channel program.
>
> In the post-Spectre world, the former remains a privileged operation.
> Even though Linux pretends it isn’t, allowing arbitrary (even
> arbitrary constrained) code to run in the kernel’s address space is a
> problem. Invoking such code; however, should follow the same rules
> as everything else. A trusted entity should be able to load a pile
> of Lua / eBPF / BPF64 / whatever programs into the kernel and then
> set up permissions so that sandboxed programs (and jails) can use a
> defined subset of them.

I am not an experienced assembler user and don't understand how Spectre works (that's why I wrote this RFC letter even before the spec was finished), but isn't Spectre an x86-specific thing? BPF64 has more registers and primarily targets RISC architectures if we're speaking of JIT. For BPF64 I made the stack separate, as a register window, exactly to mitigate ROP and its gadgets.

BPF64 is also meant as a backwards-compatible extension of existing BPF: the bytecode interpreter (for(;;) switch/case) is its primary form and JIT comes second, so e.g. JIT can be disabled for non-root users in case of doubt. eBPF can't do this: at execution time it always exists as native machine code, and the bytecode is only for the verifier stage.

That is the fallback if you say "safe JIT is impossible", but maybe you have advice on how to design the architecture so that JIT is still safe? BPF64 looks like a doable improvement for us, at a much lower resource investment than even *porting* eBPF to *BSD.

> The thing I would like to see for our current use of semi-trusted Lua
> in the kernel (ZFS channel programs) is a way of exposing them (under
> /dev/something) as file descriptors and modifying the ioctls that run
> them to take a file descriptor argument. I would like to separate
> the two operations:
>
> - Load a channel program.

I hadn't heard about these. I looked at zfs-program(8) and see no reason why they are called "channel" programs (just to please some old farts?), and even the reason for running them in the kernel seems doubtful, since the same things are achievable with userland utilities.

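The CALL/RET safety belt mentioned earlier in this message might be sketched as follows. This is an assumption-laden toy, not BPF64's design: the opcodes, `struct prog`, and `run_with_deadline` are hypothetical, CALL and RET act only as checkpoints, and the clock is injected so the example stays deterministic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; BPF64's real encoding is not modelled here. */
enum op { OP_NOP, OP_CALL, OP_RET };

struct prog {
    const enum op *code;
    size_t len;
    bool disabled;      /* set once the program overruns its time limit */
};

/* The clock source is injected so the sketch needs no real timer. */
typedef uint64_t (*clock_fn)(void *ctx);

/*
 * Returns 0 on normal completion, -1 if the program is (or becomes)
 * disabled. Time is sampled only at CALL and RET, not per instruction.
 */
static int
run_with_deadline(struct prog *p, clock_fn now, void *ctx, uint64_t limit)
{
    uint64_t start;
    size_t pc;

    if (p->disabled)
        return -1;
    start = now(ctx);
    for (pc = 0; pc < p->len; pc++) {
        switch (p->code[pc]) {
        case OP_CALL:
        case OP_RET:
            if (now(ctx) - start > limit) {
                p->disabled = true;   /* mark unsafe; refuse future runs */
                return -1;
            }
            break;
        case OP_NOP:
        default:
            break;
        }
    }
    return 0;
}

/* Fake clock for demonstration: advances by one tick per reading. */
static uint64_t
fake_clock(void *ctx)
{
    uint64_t *ticks = ctx;

    return (*ticks)++;
}
```

Checking only at call boundaries keeps the common path cheap, at the cost of letting a long straight-line stretch or a slow helper overrun the limit before the next checkpoint.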
--=20 WBR, @nuclight From nobody Tue Sep 10 14:29:19 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X35gT2FPTz5VtlY; Tue, 10 Sep 2024 14:29:25 +0000 (UTC) (envelope-from jhibbits@FreeBSD.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X35gT1Kzyz4S4b; Tue, 10 Sep 2024 14:29:25 +0000 (UTC) (envelope-from jhibbits@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1725978565; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s3GEIp6Pm27U5X0WQLnTgbvAeltQZEvg1PejEPUlfp4=; b=cvzKdpS4y/XM6wOWOoHbMXAfIppBsl9X8BwqQ7AAqXW9pzvE+Mh62dMSLS4zmRQFjqS7TC rJeGxdzDesn8wxJ9WriGDftWQ3/eScH7FuaOtfSyvY1CstweHpu+twpgaBSK+C2z+STb9j PYhDr9+p6ZfHxcw1iCHoGrUkXddxcBjtL62/ZBH7R8ATRpSyB7gHuTcWFYy1IW7JnDM/4d fidKR9umuA/kYDU126QBHKYfHg/n+JXg/G+FFasoY9dHwcGFlcIDmdUH5hJCcFl/ZAMMUM U4OhAeef162Db9/sXDHWiPK9DS7thAOdoDc1VzdvwOKa7U9/OYJ7bHka09AbvQ== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1725978565; a=rsa-sha256; cv=none; b=Fq/+YRBojozG5185gbYrB4r+d2lR0/MQva7scfwLKpaTniOKBoEEozKQosyfIe3k9/fiOg iUfKTPTiC2gc4XKgS6rZcl8777Or5eSd2ogz8WOQWDz/mW466GvmI9LRqJ9v8AQK3VYAgL JameTunSlRMuBT9YYhBa1Mc2Pkp4M3JYtDWJgC05N3hqiPdgwy1wsx8AidE3a8vxjuaM0g 00dUPQmsCdmYI6rrJeRNEFOjFP58tIayCoAu4H/T7nqk+LZVLhj3X0Lq8UcGzJBZC0agCT wLRRp7/KieJOK+CnoTX0pkuGsn2cIv+WlLFbinc+SISv/zfMczKicT1QL28CDw== ARC-Authentication-Results: i=1; 
mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1725978565; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s3GEIp6Pm27U5X0WQLnTgbvAeltQZEvg1PejEPUlfp4=; b=Ra7qVGw7JRrPb3O11lG8B9ZCMgsWRLRnu+hxI9LzTpoMvXWrXHoVaNHjaoKCogDwhCeeox EkKSXQXELolYnR7SnAAhu1hA0ynYmsNTaxFTiDUHlJpmcPsr5LJZiFVloRfW1IOzlRaBqE 82PXpn0ggAO8bPlZF8AY9f+p3qHYIE/W173ykWFViTdNQu8Ayh+QIQHWW20y+ds80B1i/P UsOkNKeHV4pyrsxNNh9U7T0IjzD7otF8lke3h6Hoqt1QDJLr5XQNXCYuC57hig7zUcDwH8 LqzrUhQbcfzifoP/XyKQOD6VLnLplPUz9DBuYk8klPkqbBafjnlfGogNDPE5hg== Received: from ralga.knownspace (ip-163-182-7-56.dynamic.fuse.net [163.182.7.56]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: jhibbits) by smtp.freebsd.org (Postfix) with ESMTPSA id 4X35gS5DtpzSJp; Tue, 10 Sep 2024 14:29:24 +0000 (UTC) (envelope-from jhibbits@FreeBSD.org) Date: Tue, 10 Sep 2024 10:29:19 -0400 From: Justin Hibbits To: David Chisnall Cc: Vadim Goncharov , Poul-Henning Kamp , tcpdump-workers@lists.tcpdump.org, "freebsd-arch@freebsd.org" , "freebsd-hackers@freebsd.org" , "freebsd-net@freebsd.org" , "tech-net@netbsd.org" , Alexander Nasonov Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative Message-ID: <20240910102919.7d8927d0@ralga.knownspace> In-Reply-To: <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> X-Mailer: Claws Mail 4.3.0 (GTK 3.24.43; powerpc64le-unknown-linux-gnu) List-Id: Discussion 
related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

On Tue, 10 Sep 2024 13:59:02 +0100
David Chisnall wrote:

> On 10 Sep 2024, at 12:45, Vadim Goncharov
> wrote:
> >
> > It's easy for your Lua code (or whatever) code to hang kernel by
> > infinite loop. Or crash it by access on arbitrary pointer. That's
> > why original BPF has no backward jumps and memory access, and eBPF's
> > nightmare verifier walks all code paths and check pointers.
>
> I’m not convinced by the second: Lua has a GC’d heap, you’d need to
> expose FFI things to it that did unsafe things, and that’s equally a
> problem for eBPF.
>
> The first is not a problem. The Lua interpreter has a bytecode
> limit. You can define a bounded number of bytecodes that it will
> execute. The problem comes from the standard library. Things like
> string.gmatch can have high-order polynomial complexity and so it’s
> possible for a Lua program that executes a small number of bytecodes
> to create a string that takes a vast amount of time to match on.
> Again, this is also a problem for eBPF if you expose a similar
> function, the solution is to not expose functions with large
> data-dependent runtimes to untrusted script.
>
> More generally, there are a lot of problems with interpreting or
> JITing untrusted code in the kernel in *any* runtime. Speculative
> execution makes it easy to use these as primitives to leak kernel
> secrets, either via timing of the programs themselves, using the JIT
> to generate gadgets, or by leaking data via cache priming.
>
> Both eBPF and Lua have these problems.
>
> The thing I would like to see for our current use of semi-trusted Lua
> in the kernel (ZFS channel programs) is a way of exposing them (under
> /dev/something) as file descriptors and modifying the ioctls that run
> them to take a file descriptor argument. I would like to separate
> the two operations:
>
> - Load a channel program.
> - Run a channel program.
>
> In the post-Spectre world, the former remains a privileged operation.
> Even though Linux pretends it isn’t, allowing arbitrary (even
> arbitrary constrained) code to run in the kernel’s address space is a
> problem. Invoking such code; however, should follow the same rules
> as everything else. A trusted entity should be able to load a pile
> of Lua / eBPF / BPF64 / whatever programs into the kernel and then
> set up permissions so that sandboxed programs (and jails) can use a
> defined subset of them.
>
> David
>

This sounds a lot like IBM i / OS400 / System/360, or even Singularity from Microsoft, which uses a trusted JIT or AOT compiler to convert arbitrary bytecode to constrained machine code, so the machine code translator becomes the only trusted party, to accept or reject the arbitrary source (byte)code from the user.

- Justin From nobody Tue Sep 10 14:29:29 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X35gc3Q5Hz5VttN for ; Tue, 10 Sep 2024 14:29:32 +0000 (UTC) (envelope-from debdrup@freebsd.org) Received: from freefall.freebsd.org (freefall.freebsd.org [96.47.72.132]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "freefall.freebsd.org", Issuer "R11" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X35gc0GBDz4SSb for ; Tue, 10 Sep 2024 14:29:32 +0000 (UTC) (envelope-from debdrup@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1725978572; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=86lOAgWB7sWXpgjnfmA3AtDdYZNCVoTd87YuUY/KpUg=; b=og24pehVPZvgFePlIIVq9u1OfwZvbOEtJ7ACLEjX1BgezA6HAlQP9xUqt5Q8MCqi2IbXOK pOe7x55Ytk898gFoxTUE7yL+lwERUc0UJpd0rnFnUH+t/0LTwS//EWYLZ7Vof3d7+SAlwT tL/MwbLWe8mnAQhMOVPxRSeQPpJ9PL38NcNPiVU1scXGFEnd+T6/LOeTZy+W9jCxj3AY8c fjLYt/Lfmwo14Pyptr2O1Ds0jr4FqJUX39j5wFSurRlTXmUsFN0LZjzM+oWSi9YtLDZG3F CqBiPiQ2Re40mRoc3PJlX53Tn6231QM4LiMy6i0de4QSkUDCqQUEW1o2m2mWOw== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1725978572; a=rsa-sha256; cv=none; b=cJvbzekPp46vTAtX8lR6/glUwbtfgKwAhF8gpPjgtfFWIi+pJJBPFNa45T0YL75p1ukRxE BF1GRR3VIKZfvWcdpNfSpMjBAn3yXVjl+5S6lmhakH49lF+Qf4EEO2pJyBaG2rX7p+Hh5Y HPhAn1lQ9Jkn+Qmu564yhc/kXYt+2A/ihsrFuHQLk3rwQJRXiOxaAhMWpFooA/c8d+lcAh z0+HDYGXgcIL9TQoH7fVfx+mbIoOnK+Y4HN+Q/wgIE4Go9m7qsBJb8B37SLjVjpavLp4/R edYgVNqdZGaDyi84iKl/2YDI7BgvgIlBiApcpXcaxqUfWPEAc+kGIZieOHtoow== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; 
c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1725978572; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=86lOAgWB7sWXpgjnfmA3AtDdYZNCVoTd87YuUY/KpUg=; b=QrQmuUsKP7mTDirlqQB/a0un6EkecO6XXdhRWbfH6/5lcG8UA/lJGaQK+EZzB8iVmgfJXX FNZqviejYxu1p6d2p9qxCW7FxV6hJyjQir1p5fQ5DZ5hgOiCN7JHAyzPqHsowU77pEU6lb cDGp6rbNl6duP4/Khw/Ld6iB4phpUTAbQf13UYeJGS4+llx9WhL65Y3FOxXdcOQHetZ1nQ uG/xE/6qnlt1Xq+wyNOrHI2M/JYNrz+aFTmeb2VbP2XqVUWXLXACHnNYL4RJbXWg6WuGSk oNDVDD9FEXvzYrE/J5Gb5uqATivk028mq6vgAT52kMff3OnRfVBtwiOQWNuJFg== Received: by freefall.freebsd.org (Postfix, from userid 1471) id 0030919D64; Tue, 10 Sep 2024 14:29:31 +0000 (UTC) Date: Tue, 10 Sep 2024 16:29:29 +0200 From: Daniel Ebdrup Jensen To: freebsd-arch@freebsd.org Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative Message-ID: References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="4iioazphuikmpv4g" Content-Disposition: inline In-Reply-To: <202409100638.48A6cor2090591@critter.freebsd.dk>

--4iioazphuikmpv4g Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, Sep 10, 2024 at 06:38:50AM UTC, Poul-Henning Kamp wrote:
>--------
>Vadim Goncharov writes:
>
>> I've put a sketch of design to https://github.com/nuclight/bpf64 with files:
>
>Counter proposal:
>
>1. Define the Lua execution environment in the kernel.
>
>2. Add syscall to submit a precompiled Lua program (as bytecode)
>
>3. Add syscall to execute submitted Lua program
>
>And yes: I'm being 100% serious.
>
>If we are going to reinvent "Channel Programs" 67 years after IBM
>came up with them for their 709 vacuum tube computer, at the very
>least we should use a sensible language syntax.
>
>Poul-Henning
>
>--
>Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
>phk@FreeBSD.ORG | TCP/IP since RFC 956
>FreeBSD committer | BSD since 4.3-tahoe
>Never attribute to malice what can adequately be explained by incompetence.
>

Hi folks,

It might also be worth noting that ZFS implements channel programs as zfs-program(8). Incidentally, they also use Lua.

Similarly, there's lots of Lua in the base system, and it's also what replaced the Forth interpreter in the loader.

Yours,
Daniel Ebdrup Jensen

--4iioazphuikmpv4g Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
iQGTBAABCgB9FiEEDonNJPbg/JLIMoS6Ps5hSHzN87oFAmbgV8lfFIAAAAAALgAo
aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldDBF
ODlDRDI0RjZFMEZDOTJDODMyODRCQTNFQ0U2MTQ4N0NDREYzQkEACgkQPs5hSHzN
87o66QgArnaDCNoqBJ/oEky9LDfPh4VP/DTCAmXHxDe5AgBcpz7sETtMvEAfxwxV
7biwkBj7crAPnL5j05a9CZD1T6A+wdxObnndUeKReY55vJmveCQz8KV0RMrV1x/z
MOdh/+82FdtBl+WEY99HYO3o78tLZgg4CMBGfpxfLSLYwxeibRM/m3zAhHYmvico
003VnMbg699GdSyX0d09fOe71NBYK1dE2HRofCpOkgjOWQMfzHAR3gupJrHXpfhV
uHvBTpx029LOEMNuNz6PLZ9tM3D9HZENrEK0WVaS+eM9GupqM0mqj3lup7QlKBC4
eZudxF51Eb1L7vRL7/bv+D5DmRZlew==
=BsJ0
-----END PGP SIGNATURE-----
--4iioazphuikmpv4g--

From nobody Tue Sep 10 14:32:56 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X35lZ4fQvz5Vvq9; Tue, 10 Sep 2024 14:32:58 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096
bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4X35lZ24jdz4Vq2; Tue, 10 Sep 2024 14:32:58 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Authentication-Results: mx1.freebsd.org; none Received: from critter.freebsd.dk (unknown [192.168.55.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by phk.freebsd.dk (Postfix) with ESMTPS id 9C8EE892DA; Tue, 10 Sep 2024 14:32:56 +0000 (UTC) Received: (from phk@localhost) by critter.freebsd.dk (8.18.1/8.16.1/Submit) id 48AEWu1q094702; Tue, 10 Sep 2024 14:32:56 GMT (envelope-from phk) Message-Id: <202409101432.48AEWu1q094702@critter.freebsd.dk> To: Vadim Goncharov cc: freebsd-arch@FreeBSD.org, freebsd-hackers@FreeBSD.org, freebsd-net@FreeBSD.org, tech-net@NetBSD.org, Alexander Nasonov Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative In-reply-to: <20240910160915.55ff579b@nuclight.lan> From: "Poul-Henning Kamp" References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <202409101224.48ACO7oj094058@critter.freebsd.dk> <20240910160915.55ff579b@nuclight.lan> List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <94700.1725978776.1@critter.freebsd.dk> Date: Tue, 10 Sep 2024 14:32:56 +0000 X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:1835, ipnet:130.225.0.0/16, country:EU] X-Rspamd-Queue-Id: 4X35lZ24jdz4Vq2 -------- Vadim Goncharov 
writes:

> On Tue, 10 Sep 2024 12:24:07 +0000
> "Poul-Henning Kamp" wrote:
>
> > Lua has pointers now ?
>
> It's implementation has. As does (e)BPF's implementation.
> > You're yelling at the guy who implemented a (very fast!) firewall
> > where the rules were compiled to C code in a KLD.
>
> That's exactly the way which must be avoided. See 5.2 of
> https://www.usenix.org/legacy/events/bsdcon02/full_papers/lidl/lidl.pdf

I didn't agree with them then, and I don't agree with them now :-)

In my implementation, the ipfw(8) worked *precisely* the same as it always did, just a little slower because it had to run the C-compiler.

Later I used the same trick in Varnish, where the "VCL" configuration language is compiled into a shared library, a major performance gain.

> You're either trolling or completely misunderstand the problem domain.
> [wiki quote]
> This has nothing to do with BPF at all. Go and read original papers on
> kernel filters and why they're *such* restricted, e.g. Van Jacobson's
> paper on BPF/tcpdump, aforementitioned paper on BSD/OS's IPFW (esp.
> section 5.7 on loops), etc.

Yes, I read those papers when they were fresh, and I still think I know more about this problem domain than you will ever have to learn.

I will admit that I have not run channel programs on a 709 myself, I only got into that game on a 4381.

I also wrote my own prototype eBPF around 1996-97, in an attempt to deal with some obscure protocols, but I gave up on it, precisely because BPF was a "toy" language so it couldn't do what I needed.

There are two fundamental questions:

A) How powerful do you want the downloaded bytecode to be ?

B) What syntax do you express it in ?

There are basically two possible answers to A.

Either the downloaded code is "real", which means it can include loops, function calls etc. or it is a "toy" which relies on Brinch-Hansen's "all arrows point to the right" argument to prove that it will always terminate.
Obviously there is a lot of stuff you cannot do with a "toy", but it is a valid trade-off.

VJ had no trouble making BPF a "toy": He lost no functionality, and it made it easier/possible to convince people that this would not cause the same problems as IBM channel programs did.

I made VCL a "toy", also because no functionality was lost, and it eliminated a very obvious way for webmasters, not the world's best programmers to begin with, to shoot their own feet.

Personally I think it should be a "securelevel"-like setting whether FreeBSD's channel programs are allowed to loop: We supply code, not policy.

But that question is /entirely/ separate from the second question.

The answer to that is that you can write them in /any/ programming language you want, as long as the interpreter for the resulting (byte-)code can spot attempts to jump backwards, and refuse to do so, if so configured.

And that is why I say: If we're going to do this, let us do it with Lua: It's already in our tree and it will do the job nicely.

After all, we don't want to make another mess like ACPI, do we ?

Poul-Henning

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
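
[Editorial illustration, not part of the original message.] A minimal sketch of the kind of load-time check described above, assuming a made-up three-opcode bytecode. It is not BPF, eBPF, BPF64 or Lua bytecode, and every name in it (fwd_op, fwd_insn, fwd_check, allow_loops) is hypothetical. With loops disallowed, each jump has to land strictly after itself, so the "all arrows point to the right" argument applies and a run executes at most len instructions:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode, invented for this sketch only. */
enum fwd_op { FWD_NOP, FWD_JMP, FWD_RET };

struct fwd_insn {
    enum fwd_op op;
    int32_t     off;    /* FWD_JMP only: target is pc + 1 + off */
};

/*
 * Return true if the program may be loaded.
 *
 * allow_loops plays the role of the "if so configured" knob: when it is
 * false, any jump whose target is not strictly after the jump itself is
 * refused, so control can only ever move towards the end of the program.
 */
bool fwd_check(const struct fwd_insn *prog, size_t len, bool allow_loops)
{
    /* The program must not be empty and must not fall off its end. */
    if (len == 0 || prog[len - 1].op != FWD_RET)
        return false;

    for (size_t pc = 0; pc < len; pc++) {
        if (prog[pc].op != FWD_JMP)
            continue;

        int64_t target = (int64_t)pc + 1 + prog[pc].off;

        if (target < 0 || target >= (int64_t)len)
            return false;   /* jump leaves the program */
        if (!allow_loops && target <= (int64_t)pc)
            return false;   /* backward jump or jump-to-self */
    }
    return true;
}
```

A caller would run fwd_check() once when the program is submitted and refuse to load anything for which it returns false. A real implementation would also have to validate opcodes, conditional branches and memory accesses, which this sketch leaves out.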
From nobody Tue Sep 10 14:58:25 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X36KB534yz5Vxnp; Tue, 10 Sep 2024 14:58:38 +0000 (UTC) (envelope-from theraven@FreeBSD.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X36KB4M8tz4bZ8; Tue, 10 Sep 2024 14:58:38 +0000 (UTC) (envelope-from theraven@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1725980318; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=RgfWg2bulIbwtlV/4Gjmiz25NxsSSsRKimR8IwQMrQE=; b=Yqjp1onLc6T66JnXpukvkLGCGByhHTqXQ/326gFljqWs22byt1+e31gu+GCi/qbx1Qe6is 19z0z9mVS66YgQnLUcS8Fz5A2RX2KYusseSh5luBCoLs5aUugArFyS9I70IO7MGUA5Kz6d lWjLGexiEHflu+Vbtgb4qeqaKpr5cLMisHU7Nt47w8Kmuudk3TS1OHJwwx85Ru4oJ156U3 l8R7uKdnkjq0HlcgFFQlrd8u+/TqjRFjvd30cVT72QopUl2IpcAEUxhWyC0YYCJIvhnkP5 +zeQyqQpg+Q9/lmtGm4wXowi5s2CZYfEBp0XtFOFICzaDFaX6K+LnUTLU+uS+A== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1725980318; a=rsa-sha256; cv=none; b=r7J1AAmkrkm88Vv2xWg8U5zAVGa+oX4cpBbaBVKACfFD7x3NRSIVu05KyQwMQL5yG2WZyt Q3mPi9W6FkJr+qTpzJlpVZ6BiGdlc9ekOH8MGH46zWjkNV+CNRQlHBabZvH/r3O4sUHqPs 6c5dUmvpMZEEEBSJvn23ioPLwYjA7g0f7v+EuPSR3yDVoDFZpMFonBC5rg7cI7ceP/yn8s 1NtdgdOcRPVVpI6xEKbTSMOWiFonw7QgqQcRTBkUe3K7Ez/rW+p2WYCZ4BpkwfiQ1bjFmS j/LAwD8cbfSJUr7cvByrmKoDrbTBfDjDrLRVzVeYYxnVZO7f2gk2DsaP+GPMqw== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; 
d=freebsd.org; s=dkim; t=1725980318; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=RgfWg2bulIbwtlV/4Gjmiz25NxsSSsRKimR8IwQMrQE=; b=q7NHQXHrOSUosQwO+DzcksYTdwz4obEKP7ay6HIibVp7dwPFksuNRXj5sP48Tj396u9aAp +nWPswQuT7dFtKGmNEIWLzx/QaAzKaNUQzQO1Q5d+vPgTD7sJgK4u5kTgexTYzKf7h8dus 3DspqJvgkEq+rSS/7k55rXxyOGUICY3pRc0cAuvGsfhelj0rqRi3CTGFuO9+eqT1T9xBJz RSASC+aARtnuWwLKcKyhPz3StD71XHDdzb8aEOzcFcQkv3JnUsQ/7PY/Ya2SZ9yA4fr+6i 7JB2X+kMw0q+nvd1HxmuuP55rhSdwyT28vfUxIZjLQS5ihaUonNv4WbU0fZbiA== Received: from smtp.theravensnest.org (smtp.theravensnest.org [45.77.103.195]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: theraven) by smtp.freebsd.org (Postfix) with ESMTPSA id 4X36KB3VKczSKG; Tue, 10 Sep 2024 14:58:38 +0000 (UTC) (envelope-from theraven@FreeBSD.org) Received: from smtpclient.apple (host109-155-136-107.range109-155.btcentralplus.com [109.155.136.107]) by smtp.theravensnest.org (Postfix) with ESMTPSA id DA1E365B7; Tue, 10 Sep 2024 15:58:36 +0100 (BST) From: David Chisnall Message-Id: <3F3533E4-6059-4B4F-825F-6995745FDE35@FreeBSD.org> Content-Type: multipart/alternative; boundary="Apple-Mail=_C79A00F0-FADC-4B5C-84B5-8912A75C117E" List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3776.700.51\)) Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative Date: Tue, 10 Sep 2024 15:58:25 +0100 In-Reply-To: <20240910164447.30039291@nuclight.lan> Cc: Poul-Henning Kamp , tcpdump-workers@lists.tcpdump.org, "freebsd-arch@freebsd.org" , "freebsd-hackers@freebsd.org" , "freebsd-net@freebsd.org" , "tech-net@netbsd.org" , Alexander Nasonov To: Vadim Goncharov References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> <20240910164447.30039291@nuclight.lan> X-Mailer: Apple Mail (2.3776.700.51)

--Apple-Mail=_C79A00F0-FADC-4B5C-84B5-8912A75C117E Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8

On 10 Sep 2024, at 14:44, Vadim Goncharov wrote:
>
> I am not an experience assembler user and don't understand how Spectre
> works - that's why I've written RFC letter even before spec finished - but
> isn't that (Spectre) an x86-specific thing? BPF64 has more registers
> and primarily target RISC architectures if we're speaking of JIT.

No, speculative execution vulnerabilities are present in any CPU that does speculative execution and does not have explicit mitigations against them (i.e. all that have shipped so far). Cache side channels are present in any system that has caches and does not have explicit mitigations (i.e. all that have shipped so far).

Mitigations around these things are an active research area, but so far everything that’s been proposed has a performance hit and several of them were broken before anyone even implemented them outside a simulator.

> And BPF64 is meant as backwards-compatible extension of existing BPF,
> that is, it has bytecode interpreter (for(;;) switch/case) as primary
> form and JIT only then - thus e.g. JIT can be disabled for non-root
> users in case of doubt. eBPF can't do this - it always exists in native
> machine code form at execution, bytecode is only for verifier stage.

This has absolutely no impact on cache side channels. The JIT makes some attacks harder but prime-and-probe attacks are still possible.

BPF can be loaded only by root, who can also load kernel modules and map /dev/[k]mem, and FreeBSD does not protect the root <-> kernel boundary.

Please read some of the (many) attacks on eBPF to better understand the security landscape here. It’s a *very* hard problem to solve.

David

--Apple-Mail=_C79A00F0-FADC-4B5C-84B5-8912A75C117E Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

On 10 Sep 2024, at 14:44, Vadim Goncharov <vadimnuclight@gmail.com> wrote:

= --Apple-Mail=_C79A00F0-FADC-4B5C-84B5-8912A75C117E-- From nobody Tue Sep 10 15:17:11 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X36kj54Gbz5W0jd; Tue, 10 Sep 2024 15:17:17 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Received: from mail-lf1-x136.google.com (mail-lf1-x136.google.com [IPv6:2a00:1450:4864:20::136]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "WR4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X36kh6wg7z4h7w; Tue, 10 Sep 2024 15:17:16 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-lf1-x136.google.com with SMTP id 2adb3069b0e04-5365b71a6bdso5150035e87.2; Tue, 10 Sep 2024 08:17:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725981434; x=1726586234; darn=freebsd.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=W+EtQAWIcFr+qyUilH/Xt2d7xpL9Fl6f8mgiQAu4olk=; b=O55e2XG+xisoVf3uMiBEdhWfYniaXzAQYrUJHVg1sXj+MGRDanOdQtICsuv2TplTUd 361DPWCVnIWoTon7NGA5ZFKuCwah9rllLz2PgBDZ5irWsyjMNvBGlV4zL1RSEG9AGc1K cqJEb2v3amcci6zXKiGeSsGF7J1tttOMVTCXGB7OTZrHsFKlNamq49nFO52z2aJoVCau qYS/8MbEzTkiVBWd6eN/zj5EMlRnU4OUZQ6wA/LcAIZ2MK3TGfFBx0dy7VSGo9RfDXnu 71K/kQN2Naz+TxI3d0kqtx2jK5+kHCZVZB9UmVSjSVft0Htg1CsizUjN1s2Pis5IiEDa c8ng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725981434; x=1726586234; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=W+EtQAWIcFr+qyUilH/Xt2d7xpL9Fl6f8mgiQAu4olk=; b=nhnhVMjkVhbtgyGerv/WRmZ6ndHTbdBeB1YerOkPrdYsbUp+9pfggG+d5ebzXDupzX SNVNtyxo5DB6XjZ9iaHz441XZchTvtILLfzQqO0ibs3D6iY+Vky+2J7x0gLCqob7xWCA j0V0SwqY42VwSkwQKoTHv6EdCpq0415eqcv++FfBK5F3mUXwmZQBa4VrZU2JbrM9p/Hl pQt/z312SqP3iMMPpIGwj1axb1gYujizwFW2fUWxIAoLRhDXT7uxmoZoHicoHLZzl6tx 9IHWEe8vZl4LaUM8E3CNODNks45tenshhhkLuZeQrVIB4PiK/JyW5Y0MxxHooag5Atnt IsjQ== X-Forwarded-Encrypted: i=1; AJvYcCUBrZXLR5iL++2H5jTPRqk24AzFLTFq+nhBgOtGpW6la2Jo5125JwEVlVHTVydGUunpyTsumveEuwYKSL7ezUE=@freebsd.org, AJvYcCV/6pcIHgaoxKppqu7x19+k1dC7jaMNwXUoVjAyP6Rcwnj44SECl2i57W1dI1As7Q1IuvjU6a2DxYWdVEg=@freebsd.org X-Gm-Message-State: AOJu0YxorJ8tM1QPfCYVH3paHt3Gxxx7OdtWyIfkil/ZctV41tMyvebk B4eWRJ1FE0MGH4j+RKYUW1frAQgw0q8FWJu9Y/Vi+/eN8xnuRPOT X-Google-Smtp-Source: AGHT+IHB+HtNqR5Mhpl+gR9Q78EPAz5C5X6JRviKpNPloEM6ML1p7/WA+anUT0/yPZnqvpaJWeXDyw== X-Received: by 2002:a05:6512:2215:b0:534:5453:ecda with SMTP id 2adb3069b0e04-536587ae6c8mr9702590e87.23.1725981433978; Tue, 10 Sep 2024 08:17:13 -0700 (PDT) Received: from nuclight.lan ([37.204.254.214]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-5365f86925asm1198726e87.31.2024.09.10.08.17.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Sep 2024 08:17:13 -0700 (PDT) Date: Tue, 10 Sep 2024 18:17:11 +0300 From: Vadim Goncharov To: "Gavin D. 
Howard" Cc: freebsd-arch@FreeBSD.org, freebsd-hackers@FreeBSD.org, freebsd-net@FreeBSD.org, tcpdump-workers@lists.tcpdump.org, tech-net@NetBSD.org, Alexander Nasonov Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative Message-ID: <20240910181711.5d324ac5@nuclight.lan> In-Reply-To: References: <20240910040544.125245ad@nuclight.lan> X-Mailer: Claws Mail 3.19.1 (GTK+ 2.24.33; amd64-portbld-freebsd12.4) List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] X-Rspamd-Queue-Id: 4X36kh6wg7z4h7w

On Tue, 10 Sep 2024 14:41:20 +0000 "Gavin D. Howard" wrote:

> Hello,
>
> New user here, not a contributor.
>
> > Ensuring kernel stability? Just don't allow arbitrary pointers,
> > like original BPF. Guaranteed termination time? It's possible if
> > you place some restrictions. For example, don't allow backward
> > jumps but allow function calls - in case of stack overflow,
> > terminate program. Really need backward jumps? Let's analyze for
> > what purpose. You'll find these are needed for loops on packet
> > contents. Solve it but supporting loops in "hardware"-controlled
> > loops, which can't be infinite.
>
> If I understand Turing-completeness correctly, the "no backward jumps
> but allow recursion and quit on stack overflow" is exactly equivalent
> to having non-infinite loops.

Sure, but then look at practical usefulness: suppose you have a stack of 32 frames (current limit of eBPF and BPF64).
Then you can have only 31 iterations of a loop, losing the ability to call more functions from the loop body.

> I'm not entirely sure, but I think the lack of backwards jumps would
> be "simple" to check on LLVM IR: just make sure that a basic block
> does not jump, directly or indirectly, to a basic block that
> dominates it. [1]
>
> [1]: https://en.wikipedia.org/wiki/Dominator_(graph_theory)
>
> And then the stack overflow mechanic would be an entirely runtime
> check, which is safer than pure static checking.
>
> But the good thing about this is that FreeBSD could use LLVM IR as the
> BPF64 language, which means any language that compiles to LLVM is a
> possible target.
>
> As for restricting access, I think it would be possible to check the
> instructions in LLVM IR for any unsafe instructions or calls to
> restricted functions.
>
> The downsides:
>
> * Someone would need to write an LLVM analyze pass or whatever they're
> called. Maybe more than one.
> * The kernel would need the ability to compile LLVM IR, making LLVM
> part of the Ring 0 domain.
> * Either that, or someone builds an LLVM-to-bytecode
> translator.

Well, using LLVM was envisioned for higher-level languages once the bytecode is no longer experimental, to utilize the full power of the optimizer, but as for using it directly...
let's see how it is used in Linux currently:

1) you write .c code, relatively low-level/restricted, with eBPF .h-s
2) clang turns the .c into an LLVM IR file (yes, this step is separate; maybe things have changed since then, but at least it was so)
3) the file with LLVM IR is turned into eBPF bytecode in an ELF file
4) the ELF .o is loaded by ip(8) or a specialized loader into the kernel
5) the verifier checks it and most probably will return an error to you :)
6) the bytecode is JIT-compiled in the kernel and then may be run

Linux people are in the mood of "let's throw more man-months at it instead of thinking of a better design": the eBPF infrastructure consists of hundreds of thousands of lines of code, but it seems that using LLVM IR directly required too many resources even for them, so it was rejected.

> * But the analysis pass(es) must still live in the kernel.
> * There would need to be tutorials or some docs on how to write
> whatever language so that backwards jumps are avoided.

So BPF64 took the simplicity path: even when you have tutorials etc., it's still very hard to write (non-toy) code that passes the verifier. I think a language where you do not need backward jumps but have usable constructs (so you just can't write bad code), even BAW, is a better way to go than trying to train people to fight an unnecessary Cerberus.

> Please take my words with a full dose of salt; I'm not an expert on
> this or on FreeBSD goals.

BPF64 is not FreeBSD-only, you can see several non-FreeBSD mailing lists here. It can be cross-platform and independent enough to be implemented in e.g. a network card or switch, for performance - having more registers allows it to achieve better results than eBPF for the same goal.
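
[Editorial illustration, not part of the original message.] To make the 32-frame arithmetic above concrete, here is a rough sketch of a for(;;)/switch interpreter in which jumps can only go forward and the only way to repeat code is a call. The opcodes, the names (rec_op, rec_insn, rec_run) and the REC_MAX_FRAMES constant are invented for the illustration and are not taken from the BPF64 draft; the entry frame is assumed to count against the limit:

```c
#include <stddef.h>
#include <stdint.h>

#define REC_MAX_FRAMES 32   /* assumed limit, entry frame included */

/* Hypothetical toy bytecode with a single counter register r0. */
enum rec_op { REC_DEC, REC_JZ, REC_CALL, REC_RET };

struct rec_insn {
    enum rec_op op;
    uint32_t    arg;    /* REC_JZ: forward offset; REC_CALL: absolute target */
};

/*
 * Run prog with the given initial r0.
 * Returns 0 when the entry frame returns, -1 when the run is aborted
 * (frame limit reached, bad opcode, or running off the end).
 * *calls_out receives the number of calls that were actually made.
 */
int rec_run(const struct rec_insn *prog, size_t len, uint64_t r0,
            unsigned *calls_out)
{
    size_t   ret_stack[REC_MAX_FRAMES];
    size_t   depth = 1;     /* the entry frame is already in use */
    size_t   pc = 0;
    unsigned calls = 0;
    int      rc = -1;

    for (;;) {
        if (pc >= len)
            break;          /* ran off the end: abort */
        const struct rec_insn *in = &prog[pc];

        switch (in->op) {
        case REC_DEC:
            r0--;
            pc++;
            break;
        case REC_JZ:        /* unsigned offset: can only jump forward */
            pc += (r0 == 0) ? 1 + (size_t)in->arg : 1;
            break;
        case REC_CALL:
            if (depth == REC_MAX_FRAMES)
                goto out;   /* "stack overflow": terminate the program */
            ret_stack[depth++] = pc + 1;
            calls++;
            pc = in->arg;
            break;
        case REC_RET:
            if (--depth == 0) {
                rc = 0;     /* entry frame returned: normal completion */
                goto out;
            }
            pc = ret_stack[depth];
            break;
        default:
            goto out;
        }
    }
out:
    if (calls_out != NULL)
        *calls_out = calls;
    return rc;
}
```

With a recursive body such as { JZ +2; DEC; CALL 0; RET }, this sketch completes for an initial counter of 31 and aborts for 32. Any call made from inside the "loop body" would draw on the same frame budget, which is the practical limitation pointed out above.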
-- WBR, @nuclight From nobody Tue Sep 10 22:12:28 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X3Hxt0Zhpz5TfHl; Tue, 10 Sep 2024 22:12:34 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Received: from mail-lj1-x22e.google.com (mail-lj1-x22e.google.com [IPv6:2a00:1450:4864:20::22e]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "WR4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X3Hxs4cWVz4sWw; Tue, 10 Sep 2024 22:12:33 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-lj1-x22e.google.com with SMTP id 38308e7fff4ca-2f025b94e07so50790821fa.0; Tue, 10 Sep 2024 15:12:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726006351; x=1726611151; darn=freebsd.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=VQklzI5f+UhE/4HXXB6RHKU+J3Qo/ASreqA6+yuuQMA=; b=Gt0Mxk0EpZN59+KWga6Y85o6QJxPxRtDzMbNCHMxOpiO95o/UbDHjCCMYrjGqeIu7I 6EVZDtyxrXexv/aKlQBVKWCfvz+YmoJMuTk20Iuz80tjI7KS3diMT+yDBZghH4SlK6+0 7vNouIrF8WU3PrIOuDzHKfZyudU2vMPdiEyRFBPVtm8NJPkCuww9BZKwlnm3zsL6eTiQ lJHMWVFumb9aqhaHr5oxWd/UKf79Tk4pu+kDoLVi5p2SRacCxoGBztvLqGmCwqVUpQ0f 41p8A6rARrO8fy/7iFLPQ2EX57k+xDRpUh8msj8o9zwdCSa5JvamrHhJDzCM8yAeevZE aG4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726006351; x=1726611151; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=VQklzI5f+UhE/4HXXB6RHKU+J3Qo/ASreqA6+yuuQMA=; b=FrWGWSlI4V3C/Y5L7RHRSZ88lRJqCqQ44c8jPsOjgh/JlTg8/cjjy2vTpr2xwqWqLW 1xqlqDpWV8357RJycWFzS5yTsQEwXDi2pVcXZv3LHcoUeEz4lX/VmTsvaqZFysW9gSTV s5ID/WErJj0fR6yBT6+ZsCoNBG1Rl6/osSwKfp06Ov0alR/GWqNXEKGSeHE+FkipS3pn NMVpl68rsT1wuu+GeEbHVukVud5Ri6Fie5o5U3/97UaoqieywVsiK44zx4jeh4eAVsH1 rNUplXNXShVR7mYYN4qwixMgkpRuBPx2FwxKmLaxHdMsURLY4dH1MIiD3/zjiSj/azNP vkEA== X-Forwarded-Encrypted: i=1; AJvYcCUXyhl85uihZ4xjmNCvxgw17NL/6on20C7C+DJyGBs/HvOzIvL5HVYPlh3vj35o3OzEEpxXp4Omsr9pQeeCSSwe@freebsd.org, AJvYcCX+g+SUU4VBC6hqQRhaoGw/K6YNNGZyKCWdY1y2RTiuZJsKefVZ9xAuO2gIoh7r1lk6tT17vBviGrqpC+c=@freebsd.org, AJvYcCXfLs0EFHjn8KXuPPelq99ALzOXy1TOAcVLhxFuESmpvVFWMxBuEj1iqCkteUYCV+liXHKVGslHkagEidc=@freebsd.org X-Gm-Message-State: AOJu0YzSy9AGJhXKHBVk7pjrgB96buI3IoxqhmPrxrzqoyTyXXWMTQvZ o/O1jYKXV2n9QZHMqWoObZ31hPPniboiPuYefbUctmApXTQPewTDnWqsHZjW X-Google-Smtp-Source: AGHT+IHV0riUgCN7WGyLkuUPPbvyKnE92FLj4+LbmsITqsigNsHw13jPjtefeDp//sUQyk2pW6hqNA== X-Received: by 2002:a05:651c:1a0b:b0:2f7:6653:8044 with SMTP id 38308e7fff4ca-2f76653812dmr52539331fa.20.1726006350849; Tue, 10 Sep 2024 15:12:30 -0700 (PDT) Received: from nuclight.lan ([37.204.254.214]) by smtp.gmail.com with ESMTPSA id 38308e7fff4ca-2f75c07cf5esm13208111fa.86.2024.09.10.15.12.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Sep 2024 15:12:30 -0700 (PDT) Date: Wed, 11 Sep 2024 01:12:28 +0300 From: Vadim Goncharov To: David Chisnall Cc: Poul-Henning Kamp , tcpdump-workers@lists.tcpdump.org, "freebsd-arch@freebsd.org" , "freebsd-hackers@freebsd.org" , "freebsd-net@freebsd.org" , "tech-net@netbsd.org" , Alexander Nasonov Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative Message-ID: <20240911011228.161f94db@nuclight.lan> In-Reply-To: <3F3533E4-6059-4B4F-825F-6995745FDE35@FreeBSD.org> References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> <20240910164447.30039291@nuclight.lan> <3F3533E4-6059-4B4F-825F-6995745FDE35@FreeBSD.org> X-Mailer: Claws Mail 3.19.1 (GTK+ 2.24.33; amd64-portbld-freebsd12.4) List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] X-Rspamd-Queue-Id: 4X3Hxs4cWVz4sWw

On Tue, 10 Sep 2024 15:58:25 +0100 David Chisnall wrote:

> On 10 Sep 2024, at 14:44, Vadim Goncharov wrote:
> >
> > I am not an experience assembler user and don't understand how
> > Spectre works - that's why I've written RFC letter even before spec
> > finished - but isn't that (Spectre) an x86-specific thing? BPF64
> > has more registers and primarily target RISC architectures if we're
> > speaking of JIT.
>
> No, speculative execution vulnerabilities are present in any CPUs
> that do speculative execution that does not have explicit mitigations
> against them (i.e. all that have shipped now). Cache side channels
> are present in any system with caches and do not have explicit
> mitigations (i.e. all that have shipped so far).
>
> Mitigations around these things are an active research area, but so
> far everything that’s been proposed has a performance hit and several
> of them were broken before anyone even implemented them outside a
> simulator.
>
> > And BPF64 is meant as backwards-compatible extension of existing
> > BPF, that is, it has bytecode interpreter (for(;;) switch/case) as
> > primary form and JIT only then - thus e.g. JIT can be disabled for
> > non-root users in case of doubt. eBPF can't do this - it always
> > exists in native machine code form at execution, bytecode is only
> > for verifier stage.
>
> This has absolutely no impact on cache side channels. The JIT makes
> some attacks harder but prime-and-probe attacks are still possible.

Wait, are you saying that the problem is not in the JIT, that is, that the current BPF already present in the kernel (e.g. as used by tcpdump) is also vulnerable?

Also, let's classify the vulnerabilities. Is a speculative execution vulnerability the same thing as a cache side channel?

In any case, what is the impact? E.g. an attacker could leak secrets, but *where* would they leak to? BPF typically returns one 32-bit number as a verdict (often just a boolean); is that really an attack vector? That is, maybe the solution is just "don't give untrusted users read access to BPF-writable memory segments".

Next, if the problem is with timing, then isn't it enough to just deny BPF code access to timers with high enough resolution?

> BPF can be loaded only by root, who can also load kernel modules and
> map /dev/[k]mem, and FreeBSD does not protect the root <-> kernel
> boundary.

Wrong. It has been possible for decades to do `chmod a+r /dev/bpf*` and run tcpdump as non-root, which will load BPF code into the kernel. Is *that* also a vulnerability, and if so, why was it never reported?

> Please read some of the (many) attacks on eBPF to better understand
> the security landscape here. It’s a *very* hard problem to solve.

Finally, the biggest (in effort) question: suppose we limit this to the trusted root user etc., so it's of no concern. Are there now any objections/suggestions/comments on (the rest of) BPF64 ?
--=20 WBR, @nuclight From nobody Tue Sep 10 23:05:21 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X3K6s624kz5Tn5V; Tue, 10 Sep 2024 23:05:25 +0000 (UTC) (envelope-from rob.fx907@gmail.com) Received: from mail-ej1-x629.google.com (mail-ej1-x629.google.com [IPv6:2a00:1450:4864:20::629]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "WR4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X3K6s2thtz4268; Tue, 10 Sep 2024 23:05:25 +0000 (UTC) (envelope-from rob.fx907@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-ej1-x629.google.com with SMTP id a640c23a62f3a-a8d2b4a5bf1so177810666b.2; Tue, 10 Sep 2024 16:05:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726009523; x=1726614323; darn=freebsd.org; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=lxaC1kdp70ZxV3KhRBwEMA5ciML5mPC6AyD8PtVl0sQ=; b=kLq1X+gCCrUQAJJaAkCqbGcSTEZHY9cvmHKdUTUFndDY/T1HTTrRt0u+8Cs93F8M9X uHxW++6AlYWnjXkfVMTnEiT614VeNsZv6wUImzv7unqwbN2WWJtNxBUi4W5Cvi3EkzhV 6NU6ODFJLgJ0FOd7QLE0wyOJNjFASs6bKwXEcN11IDL3YyCvJe1SrynK22nJkOKR7wvF d7pDv9SUG3ryLdpOtcULL8EeZTQ9zeQrlmjrWvC1RDI7Tf3OR84j4flkfI+JrXk8Rxf5 hjl98BABFt44dqDQQ5lWzvKZVzfX+2hfTiMmnXjQzpOY3LTZA+hab5JLNo9taieV5PhY mSsg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726009523; x=1726614323; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=lxaC1kdp70ZxV3KhRBwEMA5ciML5mPC6AyD8PtVl0sQ=; 
b=PY/IgHulPwi3VauhmS+qt1KIZPzkc7Bi6GxdQf83naGM1ODd9kRdZ8l2y/L7Utn4zB DDaHXqjKz3TPQCJBfhl8K76mmXZhiojN+u2q9dhLbVwXOOb2HlzyUqQAEJA29Q9S3klG Ph71rR8GxFk9PrjN523AeXRvbno19mlzOOF/jq93aZ5w6jAUP9nv6fenFc2i0z1B+gmo rgr3ldwm7CTGMBbDgeUpDhODfIbzc2pjfTmsPlcQRuu5d1HzRp0mlgH3878vRkVhvJw5 AitMmk95wTdOENZke7+aC88XkhfCT4LLkNaKGYXruTBFPNQEFqyhSwfJ8h+/zHfMhKnB X+Cg== X-Forwarded-Encrypted: i=1; AJvYcCUg9TU2r0kns0kO8tgvGJxIFxuMmLNVPrpyVp+jZE4BSlRgSGIZ462Cz729fCWoaRi1SptFK3/rXYzEcGk=@freebsd.org, AJvYcCVBh/aO361mgc/x6YZ22lxe+FNANYZy+dgia8i/s9EwWRIeF2UGTw3WMj5Esc2oWm9azDgVtatV/z7x6I8gYjow@freebsd.org, AJvYcCVO32wBtAyJEhvVq7jkboSbUPPcdTOck2vkjH8E4+0K91bxRl5kM/ifW97+No/xWa+xZs6oBphixvU9g6U=@freebsd.org X-Gm-Message-State: AOJu0YzxSISue3ez/7tVjxonI74ZkVjTe14ihc7kuEext4zTjV1qHYY5 92bj5EaBIxpHMBdVnaLDrEBrJfRtBVWXvlBBL7NRniBw9F6Pym8Eqvj5uBFc4YS23bz0unGv/9I tB5x2gMDGpApQVTurRI1WnvDfx4A= X-Google-Smtp-Source: AGHT+IHZii59aDQK4QI8qZz8O0y3eAwMpxF/hv2Me7NMwX/JK67b7rovaAf8S362Hp23Z8LjUTKKIGV9KyCum8Ktx00= X-Received: by 2002:a17:907:e651:b0:a7a:97ca:3056 with SMTP id a640c23a62f3a-a900482f97dmr100131666b.16.1726009522387; Tue, 10 Sep 2024 16:05:22 -0700 (PDT) List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Received: by 2002:a17:907:7b06:b0:a7a:caa2:b01 with HTTP; Tue, 10 Sep 2024 16:05:21 -0700 (PDT) In-Reply-To: <20240911011228.161f94db@nuclight.lan> References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> <20240910164447.30039291@nuclight.lan> <3F3533E4-6059-4B4F-825F-6995745FDE35@FreeBSD.org> <20240911011228.161f94db@nuclight.lan> From: Rob Wing Date: Tue, 10 Sep 2024 15:05:21 -0800 Message-ID: Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF 
alternative To: Vadim Goncharov Cc: David Chisnall , Poul-Henning Kamp , "tcpdump-workers@lists.tcpdump.org" , "freebsd-arch@freebsd.org" , "freebsd-hackers@freebsd.org" , "freebsd-net@freebsd.org" , "tech-net@netbsd.org" , Alexander Nasonov Content-Type: multipart/alternative; boundary="0000000000007602a90621cbe8a7" X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; TAGGED_FROM(0.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] X-Rspamd-Queue-Id: 4X3K6s2thtz4268 --0000000000007602a90621cbe8a7 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Doesn't NetBSD have Lua in the kernel...anyone have any experience with it? re BPF64: looks like hand waving, proposing two unwritten languages that extends an ISA that is still being drafted by IETF...hmm you can make anything look good on paper but the devil is in the details...which there are plenty of On Tuesday, September 10, 2024, Vadim Goncharov wrote: > On Tue, 10 Sep 2024 15:58:25 +0100 > David Chisnall wrote: > > > On 10 Sep 2024, at 14:44, Vadim Goncharov > > wrote: > > > > > > I am not an experience assembler user and don't understand how > > > Spectre works - that's why I've written RFC letter even before spec > > > finished - but isn't that (Spectre) an x86-specific thing? BPF64 > > > has more registers and primarily target RISC architectures if we're > > > speaking of JIT. > > > > No, speculative execution vulnerabilities are present in any CPUs > > that do speculative execution that does not have explicit mitigations > > against them (i.e. all that have shipped now). Cache side channels > > are present in any system with caches and do not have explicit > > mitigations (i.e. all that have shipped so far). 
> >
> > Mitigations around these things are an active research area, but so
> > far everything that’s been proposed has a performance hit and several
> > of them were broken before anyone even implemented them outside a
> > simulator.
> >
> > > And BPF64 is meant as backwards-compatible extension of existing
> > > BPF, that is, it has bytecode interpreter (for(;;) switch/case) as
> > > primary form and JIT only then - thus e.g. JIT can be disabled for
> > > non-root users in case of doubt. eBPF can't do this - it always
> > > exists in native machine code form at execution, bytecode is only
> > > for verifier stage.
> >
> > This has absolutely no impact on cache side channels. The JIT makes
> > some attacks harder but prime-and-probe attacks are still possible.
>
> Wait, do you want to say that problem is not in JIT, that is, that
> current BPF (e.g. tcpdump) present in the kernel - are also vulnerable?
> Also, let's classify vulnerabilities. Is speculative execution
> vulnerability the same as cache side channels? In any case, what impact
> is? E.g. attacker could leak secrets, but *where* would them leak? BPF
> typically returns one 32-bit number as a verdict (often as just
> boolean), is it really attack vector? That is, may be solution is just
> "don't give read access to BPF-writable memory segments to untrusteds".
>
> Next, if problem is with timing, then isn't that enough to just
> restrict BPF code on having access to timers with resolution high
> enough?
>
> > BPF can be loaded only by root, who can also load kernel modules and
> > map /dev/[k]mem, and FreeBSD does not protect the root <-> kernel
> > boundary.
>
> Wrong. It is possible for decades to do `chmod a+r /dev/bpf*` and run
> tcpdump as non-root, which will load BPF code into kernel. Is *that*
> also a vulnerability, and if so, why it was never reported?
>
> > Please read some of the (many) attacks on eBPF to better understand
> > the security landscape here. It’s a *very* hard problem to solve.
>
> Finally, the most big (in effort) question: suppose we limited to
> trusted root user etc. so it's of no concern. Are there now any
> objections/suggestions/comments on (rest of) BPF64 ?
>
> --
> WBR, @nuclight
>

--0000000000007602a90621cbe8a7-- From nobody Tue Sep 10 23:47:35 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X3L3g4wZ9z5TtpP; Tue, 10 Sep 2024 23:47:43 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Received: from mail-lj1-x22c.google.com (mail-lj1-x22c.google.com [IPv6:2a00:1450:4864:20::22c]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "WR4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X3L3f3QfJz476k; Tue, 10 Sep 2024 23:47:42 +0000 (UTC) (envelope-from vadimnuclight@gmail.com) Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20230601 header.b=JQpQoezA; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of vadimnuclight@gmail.com designates 2a00:1450:4864:20::22c as permitted sender) smtp.mailfrom=vadimnuclight@gmail.com Received: by mail-lj1-x22c.google.com with SMTP id 38308e7fff4ca-2f6580c2bbfso2953571fa.1; Tue, 10 Sep 2024 16:47:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726012060; x=1726616860; darn=freebsd.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=pLn0PIT2lwV+Djl805vfqvUiLnwCki2KPcUPimdE1mU=; b=JQpQoezARMrUdMigjLsU/NfW1PkDPD6Aa9gI/PQhObE02NfRyrPdaSQ6vRkXU6R1zh T4Hc2rNqRXGal/sCKKcfjsi+5bEtq/CH5VxpE1dxlIrbVNRCVlTy+sKUQlNGCYDIyHKt DSAbTF2ygoqINrpt0Q7nQjtKIrwbT74stMHfRuMq5bzd8LQc/uyL9TCyxIkXC9b7kBUq odIfyHokp6V8hKB8PA4pVxI8OyBrSO2oDPMx9qdvji/4kMGrYyJgsoesh22JqwDg8pvG VhkJoS0xSKqGEoHzKpQz6ThNEtoX0LniyRwVEOBDVsFNNzD3lnfmSWKjd4k6Mu3YWX8i 2VyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726012060; x=1726616860; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pLn0PIT2lwV+Djl805vfqvUiLnwCki2KPcUPimdE1mU=; b=QPN1tM80O+C4c0E87P9QyQURC4x8kaRGbxu9upCwovByDg6whRVKwoE9CiLvJRzfaq /UNIAbT/2JpQPlmH5hMrvkgbSPoK2H2+49f/Zyx+BK6TAponmTMK2oxsLQgO0Uf1cPTp wHqvoPFieJb3E9EvDsumbsS4Yt0lFlnfK2ltE64jjluE85jZkBmb6popiHyST5ZJjpT0 EzP1K+O/EcQCtdodk74OfVm5jSXvge/jg+fZznB8YxVgPpOJsiWgrSqJDVJkvanCnvoK nGNXwCen6XZzrr9QKyfIs6DKwNIobXfAjqiizk0MWMrgExAPfz021udRvJ363j81jBbg mNrw== X-Forwarded-Encrypted: i=1; AJvYcCUIhjJ1GZRlt1kOYsVRvTaHCzQjlfvbcTtK8BKhqFe3kZkHEiJhKqxtcZCLTJS+aBRfmTIIWkqzTNbgXPw=@freebsd.org, AJvYcCVGNoZBKdqC15RKc8pIdSRYdRHvNZEwXWp/DYG/M7jHLEJ9WAfooxVtC96a7itePRGsqdXi4RZRslml18s=@freebsd.org, AJvYcCWi2SotehQSxWaJ+u/mW/Uy3+8KQqFrcZwNTgqAXdIMErrwUVs/6G1MPBqbA9KD+ggZoDpepqnrCjMQN2jviU1B@freebsd.org X-Gm-Message-State: AOJu0Yxn37B/OcQfcfcOqHSvFDY8bWdY/BnkM1Um1u5lrUdTof1O4iK8 1m7P/Z4gp1qvWrCip2T/qb0aU8xtwa5eBPbpxDP7EwywWd0lrORHEGjD4OjM X-Google-Smtp-Source: AGHT+IFS3vPrdngBZPNcSrE198Tqj5S5aP4CHm4pOYWmA1awgaVQ++udPn+7VFedXQGZ7dqVjpIOBQ== X-Received: by 2002:a05:6512:1387:b0:536:552e:5d36 with SMTP id 2adb3069b0e04-5366b91f158mr1474569e87.12.1726012059924; Tue, 10 Sep 2024 16:47:39 -0700 (PDT) Received: from nuclight.lan ([37.204.254.214]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-5365f90d482sm1362675e87.262.2024.09.10.16.47.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Sep 2024 16:47:39 -0700 (PDT) Date: Wed, 11 Sep 2024 02:47:35 +0300 From: Vadim Goncharov To: Rob Wing Cc: David Chisnall , Poul-Henning Kamp , "freebsd-arch@freebsd.org" , "freebsd-hackers@freebsd.org" , "freebsd-net@freebsd.org" , "tech-net@netbsd.org" Subject: That's how (why) BSD is dying (Was: BPF64) Message-ID: <20240911024735.37522532@nuclight.lan> In-Reply-To: References: 
<20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> <20240910164447.30039291@nuclight.lan> <3F3533E4-6059-4B4F-825F-6995745FDE35@FreeBSD.org> <20240911011228.161f94db@nuclight.lan> X-Mailer: Claws Mail 3.19.1 (GTK+ 2.24.33; amd64-portbld-freebsd12.4) List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spamd-Bar: -- X-Spamd-Result: default: False [-2.49 / 15.00]; SUSPICIOUS_RECIPS(1.50)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.99)[-0.989]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; R_SPF_ALLOW(-0.20)[+ip6:2a00:1450:4000::/36]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20230601]; MIME_GOOD(-0.10)[text/plain]; FREEMAIL_ENVFROM(0.00)[gmail.com]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_LAST(0.00)[]; FREEMAIL_FROM(0.00)[gmail.com]; FREEMAIL_TO(0.00)[gmail.com]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; TO_DN_EQ_ADDR_SOME(0.00)[]; FROM_HAS_DN(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; MLMMJ_DEST(0.00)[freebsd-arch@freebsd.org,freebsd-hackers@freebsd.org,freebsd-net@freebsd.org]; TAGGED_RCPT(0.00)[]; RCPT_COUNT_SEVEN(0.00)[7]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[2a00:1450:4864:20::22c:from] X-Rspamd-Queue-Id: 4X3L3f3QfJz476k On Tue, 10 Sep 2024 15:05:21 -0800 Rob Wing wrote: > Doesn't NetBSD have Lua in the kernel...anyone have any experience > with it? 
Lua is not suitable for the problem domains under discussion; don't
listen to those who did not read the material and did not understand
the short descriptions.

> re BPF64:
>
> looks like hand waving, proposing two unwritten languages that
> extends an ISA that is still being drafted by IETF...hmm

What's the point of wasting resources on writing a thing that is known
from the very beginning not to be accepted into the base system? FreeBSD
already has a precedent where a *ready* code framework was rejected by
some enemy of FreeBSD users, leaving FreeBSD users to suck with
absolutely *no* alternative (yep, sensors) - as ridiculous as saying
"current BSD firewalls are inferior to Linux, so we'd better sit without
that bad code (and firewall) AT ALL". Yes, the situation in networking
has been exactly so for us for many years - and still is.

This is exactly how to lose the market competition and die - just don't
try to innovate and do something better than the competitor; then you
find yourself porting compatibility layers after decades of rot, like
Netlink or the Linuxulator ABI, in an attempt to at least hobble with a
cane instead of lying in a coma.

> you can make anything look good on paper but the devil is in the
> details...which there are plenty of

So do you have something constructive to say on the real subject? Or
even bikesheds are in short supply now?

> On Tuesday, September 10, 2024, Vadim Goncharov
> wrote:
>
> > On Tue, 10 Sep 2024 15:58:25 +0100
> > David Chisnall wrote:
> >
> > > On 10 Sep 2024, at 14:44, Vadim Goncharov
> > > wrote:
> > > >
> > > > I am not an experience assembler user and don't understand how
> > > > Spectre works - that's why I've written RFC letter even before
> > > > spec finished - but isn't that (Spectre) an x86-specific thing?
> > > > BPF64 has more registers and primarily target RISC
> > > > architectures if we're speaking of JIT.
> > >
> > > No, speculative execution vulnerabilities are present in any CPUs
> > > that do speculative execution that does not have explicit
> > > mitigations against them (i.e. all that have shipped now). Cache
> > > side channels are present in any system with caches and do not
> > > have explicit mitigations (i.e. all that have shipped so far).
> > >
> > > Mitigations around these things are an active research area, but
> > > so far everything that’s been proposed has a performance hit and
> > > several of them were broken before anyone even implemented them
> > > outside a simulator.
> > >
> > > > And BPF64 is meant as backwards-compatible extension of existing
> > > > BPF, that is, it has bytecode interpreter (for(;;) switch/case)
> > > > as primary form and JIT only then - thus e.g. JIT can be
> > > > disabled for non-root users in case of doubt. eBPF can't do
> > > > this - it always exists in native machine code form at
> > > > execution, bytecode is only for verifier stage.
> > >
> > > This has absolutely no impact on cache side channels. The JIT
> > > makes some attacks harder but prime-and-probe attacks are still
> > > possible.
> >
> > Wait, do you want to say that problem is not in JIT, that is, that
> > current BPF (e.g. tcpdump) present in the kernel - are also
> > vulnerable? Also, let's classify vulnerabilities. Is speculative
> > execution vulnerability the same as cache side channels? In any
> > case, what impact is? E.g. attacker could leak secrets, but *where*
> > would them leak? BPF typically returns one 32-bit number as a
> > verdict (often as just boolean), is it really attack vector? That
> > is, may be solution is just "don't give read access to BPF-writable
> > memory segments to untrusteds".
> >
> > Next, if problem is with timing, then isn't that enough to just
> > restrict BPF code on having access to timers with resolution high
> > enough?
> >
> > > BPF can be loaded only by root, who can also load kernel modules
> > > and map /dev/[k]mem, and FreeBSD does not protect the root <->
> > > kernel boundary.
> >
> > Wrong. It is possible for decades to do `chmod a+r /dev/bpf*` and
> > run tcpdump as non-root, which will load BPF code into kernel. Is
> > *that* also a vulnerability, and if so, why it was never reported?
> >
> > > Please read some of the (many) attacks on eBPF to better
> > > understand the security landscape here. It’s a *very* hard
> > > problem to solve.
> >
> > Finally, the most big (in effort) question: suppose we limited to
> > trusted root user etc. so it's of no concern. Are there now any
> > objections/suggestions/comments on (rest of) BPF64 ?
> >
> > --
> > WBR, @nuclight
> >
> >

-- 
WBR, @nuclight

From nobody Wed Sep 11 00:55:15 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X3MYg3jnNz5V4q7; Wed, 11 Sep 2024 00:55:19 +0000 (UTC) (envelope-from rob.fx907@gmail.com) Received: from mail-ej1-x630.google.com (mail-ej1-x630.google.com [IPv6:2a00:1450:4864:20::630]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "WR4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X3MYf6729z4KQ4; Wed, 11 Sep 2024 00:55:18 +0000 (UTC) (envelope-from rob.fx907@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-ej1-x630.google.com with SMTP id a640c23a62f3a-a8d2daa2262so386447966b.1; Tue, 10 Sep 2024 17:55:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726016116; x=1726620916; darn=freebsd.org; h=cc:to:subject:message-id:date:from:references:in-reply-to
:mime-version:from:to:cc:subject:date:message-id:reply-to; bh=8OOORre1APQx/atxC+Yz/cGDDA5j4dnQedINsPQZPgY=; b=e/9kOjgBAW3I4UV61UIh0+zCFF57kqptx7lJ2pW+bEnVktLGehRFkqmqYLERHUu+PC i++waLqMkdr/n11GW+CpE+jH+tXOpgOVtbpmjU51GI0Vr0DIoabgfD6VPVIGyqFaEHfz VW4iOhHGfQdweH0ogA5MoVk+CbJVUn8FDLjYvHOl39lOEZfnlGZaqc9moxXJURsWGZUf kFS4ZtL9yWSpuxBqiF4UyOK9yg+1KKmS5S7Ib3uAmWV4OivNi2+gbVfbU7NU1o2DbSy/ WPNen6uZQLZBqY5kcIW/GkzAIHMwP+X/eJF0uddI7EjToX5G55QuHNRjBbFALJ4T9lzM I+NA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726016116; x=1726620916; h=cc:to:subject:message-id:date:from:references:in-reply-to :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=8OOORre1APQx/atxC+Yz/cGDDA5j4dnQedINsPQZPgY=; b=CGnF44KOGXK16puzY1PMIKsC7uTMo4e7KQIk9Of0SRvYxJGF3Oj8Td+VvmbsfCEsjE 2gJrvbWqfryyV0EHMGUbTytNUKpn+jon3DleeEYsZTmePdEGOMKaueV0bSPtsm/9GBMy UdRfOoydtkp1S8VzLtd4nPezYu9OKMJFPskn+W5u/uJTB1IC9ChRyZotg2tSxqzflh/q 605japAzLW9AyYZDYIQOnnzDjIUVEjmm1aRmsXiEUKsIKYeYhmv2FbhQuEsH7/ATCyyd c+4KqK1IVKKNm28eoMw0eUuEBEazwNSvinrF71yS+g0RR9gIS9FKSa5tcvyZHy0M8CRv wzHw== X-Forwarded-Encrypted: i=1; AJvYcCU+wxA7KYH9RV9QIBeEGIy7ECE1u+EjOzMvPpNIge9/WXIrcxyJgvBTYvLDv1mC5/al2mSUyzaEhplxeRg=@freebsd.org, AJvYcCUaO9YA9C/dqpNl1QMixaLMXQ7j4DVddkAEFahIHWYHE94aMVeg72L5RV0gygnNbClLa4vwqudWu+q1spdE9PQ1@freebsd.org, AJvYcCWpBgeD2TQG+G/+GnhKPgwZfMa93zliMA0ODrP0jagEOrOEE4ZJi80U/nWHvYKjAWONvgKj+Nrpob/lMV4=@freebsd.org X-Gm-Message-State: AOJu0YzV3w27Ny9AbzVb6lKPL3jiHAAq76F94GnOAAQkxY666MhHNpS/ eSI8o/+JT26GBMdH7RtZPf2qeJDmCWyronNZDqmhsUKASGtxqVOoKvLtUQkx7QK2s9j3kjgcOIe EaD4NlnqZWE9YrWrRPnTdSXGfoqc= X-Google-Smtp-Source: AGHT+IEljO2MLm6QccSzBgyWmdCPtGh+yfzDLn1w3uiLQkvzazIAbScpIEH5p/CExvXJBgcWdWKJZvI55cF3wAf/f6s= X-Received: by 2002:a17:907:2d1e:b0:a8d:2faf:d341 with SMTP id a640c23a62f3a-a8ffaaafbc7mr233657166b.10.1726016115968; Tue, 10 Sep 2024 17:55:15 -0700 (PDT) List-Id: Discussion related to FreeBSD architecture List-Archive: 
https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org MIME-Version: 1.0 Received: by 2002:a17:907:7b06:b0:a7a:caa2:b01 with HTTP; Tue, 10 Sep 2024 17:55:15 -0700 (PDT) In-Reply-To: <20240911024735.37522532@nuclight.lan> References: <20240910040544.125245ad@nuclight.lan> <202409100638.48A6cor2090591@critter.freebsd.dk> <20240910144557.4d95052a@nuclight.lan> <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org> <20240910164447.30039291@nuclight.lan> <3F3533E4-6059-4B4F-825F-6995745FDE35@FreeBSD.org> <20240911011228.161f94db@nuclight.lan> <20240911024735.37522532@nuclight.lan> From: Rob Wing Date: Tue, 10 Sep 2024 16:55:15 -0800 Message-ID: Subject: Re: That's how (why) BSD is dying (Was: BPF64) To: Vadim Goncharov Cc: David Chisnall , Poul-Henning Kamp , "freebsd-arch@freebsd.org" , "freebsd-hackers@freebsd.org" , "freebsd-net@freebsd.org" , "tech-net@netbsd.org" Content-Type: multipart/alternative; boundary="0000000000007811300621cd713b" X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; TAGGED_FROM(0.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] X-Rspamd-Queue-Id: 4X3MYf6729z4KQ4 --0000000000007811300621cd713b Content-Type: text/plain; charset="UTF-8" On Tuesday, September 10, 2024, Vadim Goncharov wrote: > > > What's the point to waste resources on writing thing that is known to > be not accepted to base system from the very beginning? what's the point of talking about thing you're never going to write/finish even if it was accepted? Or even bikesheds are in short supply now? have you considered calling it eBPF++ or BPF128? 

at any rate, don't claim the sky is falling on the grounds that I think
your proposal is far fetched

if you think you've got a great idea and have a need for it, then write
it and use it - if other people adopt it, even better

all the best to your endeavors and don't let the naysayers hold you back

-Rob

--0000000000007811300621cd713b--

From nobody Wed Sep 11 09:05:18 2024
Date: Wed, 11 Sep 2024 12:05:18 +0300
From: Vadim Goncharov
To: Philip Paeps
Cc: David Chisnall, Poul-Henning Kamp, freebsd-arch@FreeBSD.org,
    freebsd-hackers@FreeBSD.org, freebsd-net@FreeBSD.org,
    tech-net@NetBSD.org
Subject: Re: BPF64: proposal of platform-independent hardware-friendly
    backwards-compatible eBPF alternative
Message-ID: <20240911120518.1ba191b5@nuclight.lan>
References: <20240910040544.125245ad@nuclight.lan>
    <202409100638.48A6cor2090591@critter.freebsd.dk>
    <20240910144557.4d95052a@nuclight.lan>
    <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org>
    <20240910164447.30039291@nuclight.lan>
    <3F3533E4-6059-4B4F-825F-6995745FDE35@FreeBSD.org>
    <20240911011228.161f94db@nuclight.lan>

On Wed, 11 Sep 2024 10:14:44 +0800
Philip Paeps wrote:

> On 2024-09-11 06:12:28 (+0800), Vadim Goncharov wrote:
> > David Chisnall wrote:
> >> BPF can be loaded only by root, who can also load kernel modules
> >> and map /dev/[k]mem, and FreeBSD does not protect the root <->
> >> kernel boundary.
> >
> > Wrong. It is possible for decades to do `chmod a+r /dev/bpf*` and
> > run tcpdump as non-root, which will load BPF code into kernel. Is
> > *that* also a vulnerability, and if so, why it was never reported?
>
> This is equivalent to chmod a+w /dev/mem.
>
> Unwise configuration decisions are not vulnerabilities.

But then the possibility of giving this to non-root is one. And many
things are considered vulnerabilities even if they are available only
to root - for example, when root can be tricked into running malicious
code or taking other unintended actions.

Equating classic BPF with a writable /dev/mem is too strong and
controversial a statement. Demonstrate how it can be done on stock
FreeBSD 13 with /dev/bpf available to the attacker (e.g. `sudo tcpdump`
allowed).

-- 
WBR, @nuclight

From nobody Wed Sep 11 11:21:09 2024
From: David Chisnall
Subject: Re: BPF64: proposal of platform-independent hardware-friendly
    backwards-compatible eBPF alternative
Date: Wed, 11 Sep 2024 12:21:09 +0100
In-Reply-To: <20240911120518.1ba191b5@nuclight.lan>
References: <20240911120518.1ba191b5@nuclight.lan>
To: Vadim Goncharov
Cc: Philip Paeps, Poul-Henning Kamp, freebsd-arch@freebsd.org,
    freebsd-hackers@freebsd.org, freebsd-net@freebsd.org,
    tech-net@netbsd.org

On 11 Sep 2024, at 10:06, Vadim Goncharov wrote:
>
> But then the possibility of giving this to non-root is one. And many
> things are considered vulnerabilities even if they are available only
> to root - for example, when root can be tricked into running malicious
> code or taking other unintended actions.

When the root user intentionally changes something from a secure
default to an insecure setting, it is not a security vulnerability in
the system that shipped the safe defaults.

> Equating classic BPF with a writable /dev/mem is too strong and
> controversial a statement. Demonstrate how it can be done on stock
> FreeBSD 13 with /dev/bpf available to the attacker (e.g. `sudo tcpdump`
> allowed).

Two things to unpick here. First:

Demanding a proof of concept before you accept that something is a
vulnerability is how you build insecure systems. Ask the Matrix team
how well that attitude has worked for them in the last few weeks. You
build secure systems by defining a threat model and then evaluating
primitives against that threat model, not by throwing together a bunch
of primitives and saying ‘well, *I* can’t assemble them into an exploit
and so no one can’.

Second, there are documented attacks on eBPF that give the equivalent
of *read* access to /dev/mem. This is why BPF is restricted to root.
We have a threat model. The threat model says that we do not need to
ensure that BPF cannot leak kernel data indirectly because only the
user who has the ability to leak kernel data directly can use it and
this user has a simpler way of achieving the same result. If you allow
non-root users to run code (native or against any virtual machine) then
you are changing the threat model. You *must* prevent users from
leaking kernel data that they could not leak via existing mechanisms.

The two most common attacks using eBPF are generally in the following
two categories:

 - Use eBPF to mount a speculative execution attack on the kernel.
   Please read up on what these can do. No one should be building a
   thing that runs code in the kernel without understanding
   speculative, cache, and timing side channels.

 - Use eBPF to build a set of gadgets that you can then use to go from
   one memory-safety bug in the kernel to arbitrary-code execution.

This is the threat landscape in which something in the same space as
eBPF must exist. A proposed design should *start* with an explanation
of how it mitigates both of these categories of attack.
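
To make the first category concrete, the following is a minimal sketch
of one well-known defensive idiom against speculative bounds-check
bypass: branchless index masking, similar in spirit to Linux's
array_index_nospec(). It is not taken from this thread or from any BPF
implementation. The names index_mask_nospec() and load_byte_clamped()
are hypothetical, and the code assumes two's-complement longs with an
arithmetic right shift of negative values, which ISO C does not
guarantee.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns an all-ones mask when index < size and
 * zero otherwise, computed without a conditional branch.
 * Assumption: two's-complement long and arithmetic right shift.
 */
static unsigned long
index_mask_nospec(unsigned long index, unsigned long size)
{
    /* The bit trick is only valid while both values fit in a long. */
    assert(index <= (unsigned long)LONG_MAX);
    assert(size <= (unsigned long)LONG_MAX);
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
        (sizeof(long) * CHAR_BIT - 1));
}

/*
 * Hypothetical bounds-checked load. The architectural check rejects
 * out-of-range indices; the mask additionally forces the index to 0
 * on a path where that check was mispredicted.
 */
static uint8_t
load_byte_clamped(const uint8_t *buf, unsigned long len, unsigned long index)
{
    if (index >= len)
        return 0;
    index &= index_mask_nospec(index, len);
    return buf[index];
}
```

A real kernel implementation would also have to keep the compiler from
turning the mask back into a branch, and masking addresses only one
class of speculative leak, so this shows the shape of a mitigation
rather than a complete answer to the threat model above.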
David

From nobody Wed Sep 11 23:57:17 2024
Date: Thu, 12 Sep 2024 02:57:17 +0300
From: Vadim Goncharov
To: David Chisnall
Cc: Philip Paeps, freebsd-arch@freebsd.org, freebsd-hackers@freebsd.org,
    freebsd-net@freebsd.org, tech-net@netbsd.org
Subject: Re: BPF64: proposal of platform-independent hardware-friendly
    backwards-compatible eBPF alternative
Message-ID: <20240912025717.455295f1@nuclight.lan>
References: <20240911120518.1ba191b5@nuclight.lan>

On Wed, 11 Sep 2024 12:21:09 +0100
David Chisnall wrote:

> On 11 Sep 2024, at 10:06, Vadim Goncharov
> wrote:
> >
> > But then the possibility of giving this to non-root is one. And many
> > things are considered vulnerabilities even if they are available
> > only to root - for example, when root can be tricked into running
> > malicious code or taking other unintended actions.
>
> When the root user intentionally changes something from a secure
> default to an insecure setting, it is not a security vulnerability in
> the system that shipped the safe defaults.

This is just not true. See, for example, FreeBSD-SA-17:06.openssh, for
a vulnerability in something disabled by default, where the proposed
workaround is to return to the default (disabled) state.

> > Equating classic BPF with a writable /dev/mem is too strong and
> > controversial a statement. Demonstrate how it can be done on stock
> > FreeBSD 13 with /dev/bpf available to the attacker (e.g. `sudo
> > tcpdump` allowed).
>
> Two things to unpick here. First:
>
> Demanding a proof of concept before you accept that something is a
> vulnerability is how you build insecure systems. Ask the Matrix team
> how well that attitude has worked for them in the last few weeks. You
> build secure systems by defining a threat model and then evaluating
> primitives against that threat model, not by throwing together a
> bunch of primitives and saying ‘well, *I* can’t assemble them into an
> exploit and so no one can’.
>
> Second, there are documented attacks on eBPF that give the equivalent
> of *read* access to /dev/mem. This is why BPF is restricted to root.

Stop. Just stop, and re-read carefully. You (and perhaps Philip) are
confusing two things: BPF and eBPF (with BPF64 a third), all completely
different beasts. In my last two messages in this thread I was talking
about classic BPF, which has existed in *BSD for decades (and on
FreeBSD the permissions on /dev/bpf* may be changed). So you assert
that THIS classic BPF is also vulnerable to the aforementioned attacks,
and thus an SA must be issued, just like FreeBSD-SA-17:06.openssh, with
a fix (at least preventing changes to the default permissions) for a
hole that has existed for *decades*. This is too strong an assertion to
be accepted without proof, and as far as I can deduce from your other
words and from reading about Spectre (see below), it is not true
(classic /dev/bpf is not vulnerable).

> We have a threat model. The threat model says that we do not need to
> ensure that BPF cannot leak kernel data indirectly because only the
> user who has the ability to leak kernel data directly can use it and
> this user has a simpler way of achieving the same result. If you
> allow non-root users to run code (native or against any virtual
> machine) then you are changing the threat model. You *must* prevent
> users from leaking kernel data that they could not leak via existing
> mechanisms.
>
> The two most common attacks using eBPF are generally in the following
> two categories:
>
> - Use eBPF to mount a speculative execution attack on the kernel.
> Please read up on what these can do. No one should be building a
> thing that runs code in the kernel without understanding speculative,
> cache, and timing side channels.
>
> This is the threat landscape in which something in the same space as
> eBPF must exist. A proposed design should *start* with an explanation
> of how it mitigates both of these categories of attack.

Again, you are talking about eBPF here, not classic BPF. So far Spectre
has been mentioned as an example of those speculative, cache, and
timing side channels:
https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)
refers to mitigations, e.g. in Firefox -
https://www.mozilla.org/en-US/security/advisories/mfsa2018-01/
with the key phrase "The precision of performance.now() has been
reduced from 5μs to 20μs,"

So to prevent this class of attacks you need to deprive untrusted code
of (precise) timers. And if we then go to the eBPF sources, we see at
https://elixir.bootlin.com/linux/v6.10/source/include/uapi/linux/bpf.h#L1884

    * u64 bpf_ktime_get_ns(void)
    *     Description
    *         Return the time elapsed since system boot, in nanoseconds.
    *         Does not include time the system was suspended.
    *         See: **clock_gettime**\ (**CLOCK_MONOTONIC**)

This is because eBPF is used in Linux as a catch-all for tracing and
profiling - they do not have DTrace. And we do. And BPF doesn't need to
be DTrace and doesn't need timers.

Thus we can conclude that your eBPF assertion is simply not applicable
to *BSD classic BPF, and consequently not to the current state of
BPF64, which doesn't include any timers or time sources available to
user code at all.

> - Use eBPF to build a set of gadgets that you can then use to go
> from one memory-safety bug in the kernel to arbitrary-code execution.

This requires further explanation (including to ensure we're not mixing
things up again). As far as I can see currently, this is also not
applicable to BPF64 due to its lack of pointers.
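
The precision reduction quoted above from the Firefox advisory can be
sketched as rounding a timestamp down to a coarser granularity. This is
only an illustration of that single step, assuming a nanosecond
timestamp; coarsen_timestamp_ns() and COARSE_GRANULARITY_NS are
hypothetical names that exist in neither BPF, eBPF, nor BPF64, and the
sketch makes no claim about whether precision reduction alone is a
sufficient mitigation.

```c
#include <assert.h>
#include <stdint.h>

/* 20 microseconds in nanoseconds, the reduced precision quoted above. */
#define COARSE_GRANULARITY_NS UINT64_C(20000)

/*
 * Hypothetical helper: round a nanosecond timestamp down to a multiple
 * of granularity_ns, so that events closer together than one
 * granularity step can map to the same value.
 */
static uint64_t
coarsen_timestamp_ns(uint64_t t_ns, uint64_t granularity_ns)
{
    /* A zero granularity would divide by zero below. */
    assert(granularity_ns != 0);
    return t_ns - (t_ns % granularity_ns);
}
```

With a 20 μs granularity, two readings taken 5 μs apart inside the same
20 μs window return the same value, which is the loss of resolution the
advisory describes.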
-- 
WBR, @nuclight

From nobody Thu Sep 12 00:26:14 2024
Date: Thu, 12 Sep 2024 03:26:14 +0300
From: Vadim Goncharov
To: Rob Wing
Cc: "freebsd-arch@freebsd.org", "freebsd-hackers@freebsd.org",
    "freebsd-net@freebsd.org", "tech-net@netbsd.org"
Subject: Re: That's how (why) BSD is dying (Was: BPF64)
Message-ID: <20240912032614.36796b03@nuclight.lan>
References: <20240910040544.125245ad@nuclight.lan>
    <202409100638.48A6cor2090591@critter.freebsd.dk>
    <20240910144557.4d95052a@nuclight.lan>
    <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org>
    <20240910164447.30039291@nuclight.lan>
    <3F3533E4-6059-4B4F-825F-6995745FDE35@FreeBSD.org>
    <20240911011228.161f94db@nuclight.lan>
    <20240911024735.37522532@nuclight.lan>

On Tue, 10 Sep 2024 16:55:15 -0800
Rob Wing wrote:

> On Tuesday, September 10, 2024, Vadim Goncharov
> wrote:
> >
> >
> > What's the point to waste resources on writing thing that is known
> > to be not accepted to base system from the very beginning?
> >
> what's the point of talking about thing you're never going to
> write/finish even if it was accepted?

Because - and I've written that from the beginning - a wrong
architecture costs far more to fix once the code already exists than if
it is fixed before that. So I've received worthwhile feedback describing
problems; not that there were good solutions yet, though...

> Or even bikesheds are in short supply now?
>
> have you considered calling it eBPF++ or BPF128?

Well, first it was called fBPF, as the next letter after eBPF, and for
128 it lacks 128-bit arithmetic. Then I realized that some things will
have different implementations in different kernels, e.g. atomic(9)
operations, so it is better to give the generic common thing the
neutral name BPF64 and let e.g. fBPF be the FreeBSD dialect, nBPF the
NetBSD dialect...

> at any rate, don't claim the sky is falling on the grounds that I
> think your proposal is far fetched

Not you. FreeBSD already has a precedent where an already written and
committed code framework for device sensors was removed:
https://lists.freebsd.org/pipermail/cvs-src/2007-October/082398.html
and that was done by exactly the same person, in a similar tone (not
you).

...well, ok, there were some really technical arguments in that case,
e.g. about sysctl_add_oid(), but that was the most technical of them,
all the others being the same arrogant rant. Anyway, such cases are
very demotivating for contributors.

> if you think you've got a great idea and have a need for it, then
> write it and use it - if other people adopt it, even better
>
> all the best to your endeavors and don't let the naysayers hold you
> back

Well, what I really need is technical help in areas I don't know, as
obviously I won't be able to write the entire ecosystem alone. For
example, I don't know how many registers are available to a function in
the kernel on ARM, MIPS and RISC-V. If I allocate too many, then
somebody will have trouble implementing a JIT on such a machine, and I
don't have such hardware. And e.g.
https://en.wikipedia.org/wiki/MIPS_architecture says 2 registers are
for the kernel - OK, and what if we are the kernel itself? :-) Do we
need the Thread Pointer or the Global Pointer, or can we stash them on
the stack? Etc.

-- 
WBR, @nuclight

From nobody Thu Sep 12 07:13:12 2024
Message-Id: <202409120713.48C7DCc2011000@critter.freebsd.dk>
To: Vadim Goncharov
cc: Rob Wing, "freebsd-arch@freebsd.org", "freebsd-hackers@freebsd.org",
    "freebsd-net@freebsd.org", "tech-net@netbsd.org"
Subject: Re: That's how (why) BSD is dying (Was: BPF64)
In-reply-to: <20240912032614.36796b03@nuclight.lan>
From: "Poul-Henning Kamp"
References: <20240910040544.125245ad@nuclight.lan>
    <202409100638.48A6cor2090591@critter.freebsd.dk>
    <20240910144557.4d95052a@nuclight.lan>
    <4D84AF55-51C7-4C2B-94F7-D486A29E8821@FreeBSD.org>
    <20240910164447.30039291@nuclight.lan>
    <3F3533E4-6059-4B4F-825F-6995745FDE35@FreeBSD.org>
    <20240911011228.161f94db@nuclight.lan>
    <20240911024735.37522532@nuclight.lan>
    <20240912032614.36796b03@nuclight.lan>
Date: Thu, 12 Sep 2024 07:13:12 +0000

Vadim Goncharov writes:

> Because - and I've written that from the beginning - a wrong
> architecture costs far more to fix once the code already exists than
> if it is fixed before that. So I've received worthwhile feedback
> describing problems; not that there were good solutions yet, though...

And when people do not agree with your proposed architecture "FreeBSD
is dying!!!1!!" ?

> Not you. FreeBSD already has a precedent where an already written and
> committed code framework for device sensors was removed:
> https://lists.freebsd.org/pipermail/cvs-src/2007-October/082398.html
> and that was done by exactly the same person, in a similar tone (not
> you).

Yes, I will resist all bad architecture.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From nobody Thu Sep 12 11:04:35 2024 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4X4F2S2Z0Jz5WBpH; Thu, 12 Sep 2024 11:04:48 +0000 (UTC) (envelope-from theraven@FreeBSD.org) Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4X4F2S22NSz4vpr; Thu, 12 Sep 2024 11:04:48 +0000 (UTC) (envelope-from theraven@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1726139088; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=5XCP+OtTCjv0BrnUPkhgNamjxRyHSjEizsK/4AyPSBw=; b=ULGygp69dhp5iepNCNIxJwvU8JhuN2XttxgIcksnENWn/EkpWlZMBN1/bsv2rLJTwHnLPf 5MImRTxEq5If68X23Td4Ou+Tft+r3y4Xx28CqvaZmYQQ9FMGDRKMjOdDUlKflh1PAobTle cLViEG004vf50NwrD1E7BA5cIEE9z5YDp9Ht/y9b+fjc1RC2Ncwi9rcRR6toD3wkJZarwL 3grzszg0kMJs2HNJycX5ftumlBwT+KeTRlfdOgYaC0lhp/b3/X86iETKGnSuKzFO9oJvxg IBgHq9vkJ3IHOROxGGQQmhe6rKU3Yv9pDuZ/N4r5FTT9O32j6/XMYR6uujvqQA== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1726139088; a=rsa-sha256; cv=none; b=wxPxwDj8dWbkf2i7is2g5+KNMe+VnjkUC1Zuy28Y8p1vZH+eUoM43beUdzU9wAGTavdA3a LaSn0+8q9rYBoHa/6uoFfZJyZW3B0f0Wh7fr1/hXWuAUpM5+YyXlE8Z8NqNS8Xaekp/9LS An2P7ZA2UUnhcZCS0sAHLVmcLuYwcuutD3/Xa1p3ynRBMvYoBntkjqWmOESESk/sKhRjLy LMZYeOeGo//hqCcldzRfTYZBF+O3bll+mpfhVlIRY8s+u/9Jm6l8p6jczAhgXPOA3P4W5k Cz5ykBolBFp6rE/rZuXKZ+4YvkjcuTmdOuUjwcKB/Y34W2hxhs0f4GRN7FKmkA== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; 
s=dkim; t=1726139088; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=5XCP+OtTCjv0BrnUPkhgNamjxRyHSjEizsK/4AyPSBw=; b=Q5F34EXluziTC0xEWXA/U7vv5rQteg2SuSGd/bAu9eC0RWEQImbmQIvw8YSk3YMA7bV8uG 4/1W3Rqxjj4+7edt26NTqWFJlThUXjjNL+Bi9hBp1gJvQIdOyP23F9x9B0AOxHp3gusPkh iZPdXa+dIvaoOJyHF0PTCvvTbEE2cwsQ0Fw37Lp5ZVi40my7cvEIYrnaLm1IVHv4nfM3gk ULPUhz9GQBFaMxn+fSWGdzbIagta91PkI3NRZoiZF0Vkq/wODgbj3IOrc6uS7nc0hY915P Mh7LF7d/YeKlYvsxlC2woNOIKEiqqUn03nHgNnC6vjvUBytj0GSi7ebwg/uo1g== Received: from smtp.theravensnest.org (smtp.theravensnest.org [45.77.103.195]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: theraven) by smtp.freebsd.org (Postfix) with ESMTPSA id 4X4F2S1RrKzPkW; Thu, 12 Sep 2024 11:04:48 +0000 (UTC) (envelope-from theraven@FreeBSD.org) Received: from smtpclient.apple (host109-155-136-107.range109-155.btcentralplus.com [109.155.136.107]) by smtp.theravensnest.org (Postfix) with ESMTPSA id E3E7A6603; Thu, 12 Sep 2024 12:04:46 +0100 (BST) From: David Chisnall Message-Id: <746547C3-DF15-435D-AECE-9B2D195703B5@FreeBSD.org> Content-Type: multipart/alternative; boundary="Apple-Mail=_10FE2401-9F6E-4915-9D75-86D8B0F37D7C" List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@FreeBSD.org Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3776.700.51\)) Subject: Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative Date: Thu, 12 Sep 2024 12:04:35 +0100 In-Reply-To: <20240912025717.455295f1@nuclight.lan> Cc: Philip Paeps , freebsd-arch@freebsd.org, freebsd-hackers@freebsd.org, freebsd-net@freebsd.org, 
tech-net@netbsd.org To: Vadim Goncharov References: <20240911120518.1ba191b5@nuclight.lan> <20240912025717.455295f1@nuclight.lan> X-Mailer: Apple Mail (2.3776.700.51) --Apple-Mail=_10FE2401-9F6E-4915-9D75-86D8B0F37D7C Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8

On 12 Sep 2024, at 00:57, Vadim Goncharov <vadimnuclight@gmail.com> wrote:
> 
> This is just not true. See, for example, FreeBSD-SA-17:06.openssh for
> vulnerability disabled by default, and workaround proposed to return
> to default (disabled) state.

No, this was changing from one supported expected-secure setting to another. If you make a device directly usable by non-root users, you are expected to understand the security implications.

> Stop. Just stop, and re-read carefully.

I have read carefully. I am not sure at this point whether you are intentionally failing to engage in good faith or if you simply do not understand the security landscape that you are operating in and have no interest in learning. Either way, it is not productive to keep having this conversation so this will be my last comment in this thread.

> You (and perhaps Philip)
> confusing two things: BPF and eBPF (and BPF64 third), all completely
> different beasts.

They are all mechanisms for running semi-trusted / untrusted code in the kernel.

> Last two letters in this thread, I was talking about
> classic BPF existing in *BSD for decades (on FreeBSD allowed to have
> permissions on dev/bpf*).

FreeBSD allows you to change the permissions on anything in /dev/. It is up to you to understand the security implications of doing this. Allowing non-root access to /dev/bpf has security implications. I do it for my user on a couple of single-user FreeBSD dev boxes, because I also have root on these systems and so anything Wireshark can do on my behalf, I can also do via su. There is no security issue because my threat model *for this system* is such that I can accept a weaker posture than the default.
There are legitimate reasons for broadening the permissions on these systems.

> So you assert that THIS classic BPF also
> vulnerable to aforementioned attacks, and thus SA must be issued, just
> like that FreeBSD-SA-17:06.openssh, with a fix (at least preventing
> changing default permissions) of hole existing for *decades*.

No, for the same reason that there’s no security advisory if you `chmod +r /dev/mem`. There’s a reason that both /dev/mem and /dev/bpf are restricted to root by default. Anyone who relaxes these permissions must understand what they’re doing and have a threat model that can justify why it’s acceptable.

By analogy with FreeBSD-SA-17:06.openssh: this SA applied where password login was enabled. Enabling password authentication weakens the security of OpenSSH and so is not done by default, but that was not the problem that merited the SA. The SA was issued because it had the additional effect of allowing remote attackers to mount a denial of service attack. We would not issue an OpenSSH SA saying ‘enabling password authentication weakens security because people can log in with passwords, which are less secure than SSH keys’. The expectation is that anyone changing this setting knows what they’re doing. If it is not a problem for their use case, they can do it. Precisely the same logic applies to allowing non-root access to /dev/bpf or /dev/mem.

> This is
> too strong assertion to be accepted without proofs, and as I can deduce
> from your other words and readings about Spectre (see below), this
> statement is not true (classic /dev/bpf is not vulnerable).

I have not built a PoC, but I would fully expect that it’s possible to build an attack that first primes the cache and trains the branch predictor and then runs a crafted BPF program that has an out-of-bounds read (which executes only in speculation) to leak kernel memory (including contents of the direct map, so anything owned by another process on the same system), and then inspects the contents of the cache to see the value that was observed in speculation. All of the necessary primitives are there.

If you are designing a system that expects non-root users to be able to run code in the kernel, the onus is on you to explain why it is safe. The default assumption must be that it is unsafe.

David

--Apple-Mail=_10FE2401-9F6E-4915-9D75-86D8B0F37D7C--