Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 12 Sep 2024 06:29:38 +0000
From:      "Gavin D. Howard" <gavin@gavinhoward.com>
To:        "freebsd-hackers@FreeBSD.org" <freebsd-hackers@FreeBSD.org>
Subject:   Re: BPF64: proposal of platform-independent hardware-friendly backwards-compatible eBPF alternative
Message-ID:  <f9KxenmdwQZ8oQrcH6ePy30goR2nOPYCULfkQnFaugR0mFpoL_THbqWwt4Jkt4FKpe03xXULUwRCSc8rUdIsnCSeG2X03iSpfNczBy-vks8=@gavinhoward.com>
In-Reply-To: <20240910181711.5d324ac5@nuclight.lan>
References:  <20240910040544.125245ad@nuclight.lan> <wLzD36W8VSXSlBByVmK745ezszNVGM-hfWOobdrCb1vmx9snihwW_gBgeAlvtKWL55fnAZgr9G5ztIO3UjD3Wou3-YPxmLkMp9AFuGHwXsA=@gavinhoward.com> <20240910181711.5d324ac5@nuclight.lan>

next in thread | previous in thread | raw e-mail | index | archive | help
> > If I understand Turing-completeness correctly, the "no backward jumps
> > but allow recursion and quit on stack overflow" is exactly equivalent
> > to having non-infinite loops.
>
> Sure, but then look at practical usefulness: suppose you have stack of
> 32 frames (current limit of eBPF and BPF64). Then you can have only 31
> iterations on a loop, loosing ability to call more functions from loop
> body.

That is true.

However, I wonder if everyone is going at this the wrong way. More
specifically, I wonder if we are targeting the entirely wrong level.
Maybe verifying bytecode or some other low-level code form is just too
hard. What if it were easier just to provide a higher level language
that had enough restrictions to be just barely not Turing-complete.

To test that idea, I did a quick experiment.

I have been working on a language called Yao for the past three years.
One feature I planned for the future was the ability to restrict what
packages and keywords are available at compile time. I had already
put in the plumbing, but I just hadn't implemented it.

But I decided to quickly implement it as an experiment.

First, I needed a stack limit:

https://git.yzena.com/Yzena/Yc/commit/99c822bf7b

Now, you can run this:

```
$ ./release/yc yao --max-stack-depth=3D32 <script>
```

and you get the same max stack depth as eBPF.

Next, restricting keywords. One unique thing about Yao is that *every*
keyword is imported like you would import functions, even standard ones
like `if` and `while`. This means that if a keyword is not allowed, I
can just skip importing it, and it's like it doesn't even exist.

So I built a way to skip importing the `while` keyword and any other
keyword that would make the language Turing-complete. Crucially, though,
it still includes a `foreach` loop, so backwards jumps are still
possible.

A script with a `while` loop is `tools/rig_slam.yao` in the same repo.
If you run:

```
$ make # This bootstraps the repo.
$ ./release/yc yao tools/rig_slam.yao
```

you will get a panic when it tries to parse the `while` loop because I
haven't implemented the parsing:

```
Panic: Unimplemented
    Source:    /home/gavin/Yzena/Code/yc-git/yc/src/yao/keywords.c:2310
    Function:  yao_parse_while()
```

But if you run this:

```
$ ./release/yc yao --max-stack-depth=3D32 --lang-mode=3Diterative \
  tools/rig_slam.yao
```

you get a compile error, and it doesn't even try to parse the `while`:

```
yc: tools/rig_slam.yao[145:2]
    Parse error: Incomplete variable declaration for name: while

yc: tools/rig_slam.yao[145:8]
    Invalid token: Expected semicolon (';')
```

(You also get a different panic after that because Yao has bugs.)

Anyway, essentially, the "iterative" language mode with a max stack
depth makes Yao non-Turing-complete, but with a way to loop.

You can try that with `tools/format.yao`, which has a `foreach` loop:

```
$ ./release/yc yao --max-stack-depth=3D32 --lang-mode=3Diterative \
  tools/format.yao
```

and it still works.

I'm not saying Yao as an eBPF replacement is a good idea; it's not. But
I think this demonstrates that a simple high-level language with a
simple compiler and a simple VM might be a viable method of implementing
an eBPF-like thing without a complex verifier.

Yes, the compiler and VM would need to be in the kernel, but that might
already be the case with the JIT in Linux.

Anyway, my point is that before FreeBSD accepts BPF64, it might be a
good idea to sit down and consider all of the options, brainstorm ideas
together, and make an excellent design.

And it could very well be that BPF64 *is* the best design! I don't know;
I'm not a kernel guy, nor am I an expert compiler guy.

> > * But the analysis pass(es) must still live in the kernel.
> > * There would need to be tutorials or some docs on how to write
> > whatever language so that backwards jumps are avoided.
>
> So BPF64 took simplicity pass: while you have tutorials etc. it's still
> very hard to write (non-toy) code that passes verifier. I think a
> language where you do not need backward jumps but have usable
> constructs (so you just can't write bad code), even BAW, is a better
> way to go than try to train people to fight with unnecessary Cerber.

Yes, I think you are right on this, which IMO just points to a
high-level language. Such a language can have whatever constructs are
safe, but by its very nature, it will be restricted such that dangerous
code is impossible.

It would also be easier to write code that passes the "verifier" if the
verifier is actually the compiler.

Anyway, I hope I don't sound too critical; I don't have the chops to
judge BPF64. Plus, I can tell you put in a lot of work.

Gavin Howard



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?f9KxenmdwQZ8oQrcH6ePy30goR2nOPYCULfkQnFaugR0mFpoL_THbqWwt4Jkt4FKpe03xXULUwRCSc8rUdIsnCSeG2X03iSpfNczBy-vks8=>