From: Warner Losh
Date: Thu, 16 Mar 2023 09:41:17 -0600
Subject: Re: Blocks runtime in the kernel
To: Justin Hibbits
Cc: "freebsd-arch@freebsd.org"
In-Reply-To: <20230316112222.31b1620e@gonegalt.net>
List-Id: Discussion related to FreeBSD architecture
List-Archive: https://lists.freebsd.org/archives/freebsd-arch

On Thu, Mar 16, 2023 at 9:22 AM Justin Hibbits <jhibbits@freebsd.org> wrote:

> On Thu, 16 Mar 2023 09:04:29 -0600
> Warner Losh <imp@bsdimp.com> wrote:
>
> > On Thu, Mar 16, 2023, 8:06 AM Justin Hibbits <jhibbits@freebsd.org>
> > wrote:
> >
> > > Most probably know I've been working on the IfAPI conversion of all
> > > network drivers in order to hide the contents of `struct ifnet`.
> > > I'm pretty much done with the development, and it's all in review.
> > > However, there's one bit that I've thought is very clunky since I
> > > added it, the if_foreach() iterator function, which iterates over
> > > all interfaces in the current VNET, and calls a callback to operate
> > > on each interface.  I've noticed that oftentimes I end up with a 2
> > > line callback, which just calls if_foreach_addr_type(), so I end up
> > > with just trivial callback functions, which seems like a waste.
> > >
> > > All that backstory to say, would it be beneficial to anyone else to
> > > add a (very basic) blocks runtime to the kernel for doing things
> > > like this?  The rough change to the IfAPI becomes:
> > >
> > > int if_foreach_b(int (^)(if_t));
> > >
> > > __block int foo = 0;
> > >
> > > if_foreach_b(^(if_t ifp) {
> > >   if (if_getlinkstate(ifp) == LINK_STATE_UP)
> > >     foo++;
> > > });
> > >
> > > The same could be done for other *_foreach KPIs as well, if this
> > > proves out.  I think I could have something working in the next
> > > several days.
> > >
> > > The only technical snag I see with this would be other compilers.
> > > I'm not sure if GCC still supports blocks, it did at one point.
> > >
> > > What do you think?
> > >
> >
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78352
> >
> > Suggests that there were issues upstreaming the apple code. So
> > there's that.  The gcc12 port I have can't cope with the sample blocks
> > code I found on Wikipedia:
> > /* blocks-test.c */
> > #include <stdio.h>
> > #include <Block.h>
> > /* Type of block taking nothing returning an int */
> > typedef int (^IntBlock)();
> >
> > IntBlock MakeCounter(int start, int increment) {
> >         __block int i = start;
> >
> >         return Block_copy( ^(void) {
> >                 int ret = i;
> >                 i += increment;
> >                 return ret;
> >         });
> >
> > }
> >
> > int main(void) {
> >         IntBlock mycounter = MakeCounter(5, 2);
> >         printf("First call: %d\n", mycounter());
> >         printf("Second call: %d\n", mycounter());
> >         printf("Third call: %d\n", mycounter());
> >
> >         /* because it was copied, it must also be released */
> >         Block_release(mycounter);
> >
> >         return 0;
> > }
> >
> > Our current clang is OK:
> > % clang -fblocks a.c -o a -lBlocksRuntime
> > %
> >
> > But there's no current users of __block in the kernel. There's no
> > kernel-specific Block.h file,
> > there's no references to BlockRuntime anywhere in the kernel tree and
> > the code in
> > contrib/llvm-project/compiler-rt/lib/BlocksRuntime is completely
> > userland specific. There
> > is no kernel support that I could see, since we don't have a
> > libkern/OSAtomic.h. I'm happy
> > to be corrected on this though: I've never tried to use blocks in the
> > kernel and this is grep
> > level confidence.
> >
> > Clang also doesn't enable blocks unless you pass it -fblock, so you'd
> > need to change a fair
> > portion of the kernel build system to enable that.
> >
> > So I'm thinking regardless of whether or not the project should do
> > this, you'll have a fair amount
> > of choppy waves ahead of you before you could get to the point of
> > starting the ifnet work.
> >
> > Warner
>
> Hi Warner,
>
> I did a very very simple test to see what is required link-wise for
> blocks in kernel.  This was done by changing
> https://reviews.freebsd.org/D38962 to use a block instead of a callback
> for the "bootpc_init_count_if_cb".  I didn't include Block.h or
> anything, and simply added "-fblocks" to kern.mk.  The result is it
> compiles fine, then fails to link (expected) with the following missing
> symbols:

As a basic test, that's fine. But I'm not sure we want to globally add -fblocks
to the kernel. I don't know if that changes anything else. People will want
to know if there's global performance or size impact from doing this and
whether or not the compiler inserts other code because blocks are possible.

> _Block_object_dispose
> _Block_object_assign
> _NSConcreteStackBlock
>
> Reading through
> contrib/llvm-project/compiler-rt/lib/BlocksRuntime/runtime.c these
> missing symbols look straightforward to implement for the basic case.
> I'm not thinking of working within the Clang runtime.c, I'm thinking of
> reimplementing the needed functions for this constrained use case (no
> need for GC, etc).
>
> This testing was only marginally more than you did, so I'm probably
> missing some things as well for more complex use cases.
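
For comparison, the callback shape that a test like this replaces can be
sketched in plain C that builds with either compiler. This is only an
illustration under stated assumptions: struct mock_if, mock_if_foreach(),
count_up_cb() and count_link_up() are made-up stand-ins, not the real IfAPI.
The counter travels through an explicit void *arg, which is the part a block
would capture for you.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for if_t and the link-state constants. */
enum mock_link_state { MOCK_LINK_STATE_DOWN, MOCK_LINK_STATE_UP };

struct mock_if {
	enum mock_link_state link_state;
};

typedef int (*mock_if_foreach_cb)(struct mock_if *ifp, void *arg);

/*
 * Call cb on each interface and stop early if it returns non-zero.
 * Returns the last callback return value (0 if every call returned 0).
 */
int
mock_if_foreach(struct mock_if *ifs, size_t n, mock_if_foreach_cb cb,
    void *arg)
{
	int error = 0;

	assert(cb != NULL);
	for (size_t i = 0; i < n && error == 0; i++)
		error = cb(&ifs[i], arg);
	return (error);
}

/* The trivial two-line callback that a block would make unnecessary. */
static int
count_up_cb(struct mock_if *ifp, void *arg)
{
	int *count = arg;

	if (ifp->link_state == MOCK_LINK_STATE_UP)
		(*count)++;
	return (0);
}

/* Count interfaces whose link is up, using the callback iterator. */
int
count_link_up(struct mock_if *ifs, size_t n)
{
	int count = 0;

	(void)mock_if_foreach(ifs, n, count_up_cb, &count);
	return (count);
}
```

The cost is one small named function per use; the benefit is that nothing
beyond standard C is needed at compile or link time.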

I was worried about two things when I looked at the code: reference counting
(which I think is kinda required, even though objc uses it for GC) and memory
allocation / handling. The former is well understood, and we can adapt things
(though knowing which subset is required here might be tricky; there are a lot
of flags). The latter, though, would limit the use of these APIs to situations
where you can call malloc/free with M_WAITOK, or you'd need to deal with malloc
failures better than runtime.c does.

So while the number of routines is small, I think they are the tip of the iceberg
and may be more work than you're suggesting.
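
To make the allocation concern concrete, here is a userland sketch, in plain C
without -fblocks, of roughly what a capturing block costs once it has to
outlive its stack frame: a heap copy plus a reference count. Everything here
is invented for illustration (struct fake_block, fake_block_copy(),
fake_block_release(), make_counter()); it is a deliberately simplified model
and not the BlocksRuntime ABI. The point is only that the copy step can fail,
and that some caller has to decide what to do when it does.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented layout: an invoke pointer, a refcount, and captured state. */
struct fake_block {
	int		(*invoke)(struct fake_block *);
	unsigned	refs;		/* 0 means "still on the stack" */
	int		i;		/* captured, mutable (think __block) */
	int		increment;	/* captured by value */
};

static int
counter_invoke(struct fake_block *b)
{
	int ret = b->i;

	b->i += b->increment;
	return (ret);
}

/*
 * Move a stack block to the heap, or retain one that is already there.
 * Returns NULL on allocation failure, which a caller that cannot sleep
 * would have to cope with.
 */
struct fake_block *
fake_block_copy(struct fake_block *src)
{
	struct fake_block *b;

	if (src->refs > 0) {		/* already on the heap: just retain */
		src->refs++;
		return (src);
	}
	b = malloc(sizeof(*b));
	if (b == NULL)
		return (NULL);
	*b = *src;
	b->refs = 1;
	return (b);
}

/* Drop one reference and free the heap copy when the last one goes. */
void
fake_block_release(struct fake_block *b)
{
	assert(b != NULL && b->refs > 0);
	if (--b->refs == 0)
		free(b);
}

/* Mirrors MakeCounter() from the quoted sample. */
struct fake_block *
make_counter(int start, int increment)
{
	struct fake_block on_stack = {
		.invoke = counter_invoke,
		.refs = 0,
		.i = start,
		.increment = increment,
	};

	return (fake_block_copy(&on_stack));
}
```

A block that is only passed down to an iterator and never copied stays on the
stack and avoids all of this, which may be the constrained case worth
targeting first.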
> I'm guessing from GCC's issues that this is a nonstarter anyway?

In the past we've said that it's OK to use clang specific code to get better
performance, but that depending on it entirely would require a careful
discussion. gcc produces code that's easier to debug than clang (though the
gcc12 build is currently still broken), and that's not nothing and has been
useful for me in the past. jhb can likely speak to other benefits for gcc12
since he did the last round of updates.
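
One way to stay on the "optional, not required" side of that line is to keep
the callback interface as the baseline and offer a block wrapper only when the
compiler was run with -fblocks, in which case clang defines __BLOCKS__. The
sketch below assumes that arrangement; sketch_foreach(), sketch_foreach_b()
and count_positive() are hypothetical names, and it iterates over plain ints
rather than interfaces to stay self-contained. In userland the block path also
needs -lBlocksRuntime at link time.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*visit_cb)(int value, void *arg);

/* Baseline iterator: plain callback plus context, builds everywhere. */
int
sketch_foreach(const int *vals, size_t n, visit_cb cb, void *arg)
{
	int error = 0;

	assert(cb != NULL);
	for (size_t i = 0; i < n && error == 0; i++)
		error = cb(vals[i], arg);
	return (error);
}

#ifdef __BLOCKS__
/* Adapter: the block itself is passed through the void * context. */
static int
block_trampoline(int value, void *arg)
{
	int (^b)(int) = (int (^)(int))arg;

	return (b(value));
}

/* Block flavour, layered on the callback one; clang -fblocks only. */
int
sketch_foreach_b(const int *vals, size_t n, int (^b)(int))
{
	return (sketch_foreach(vals, n, block_trampoline, (void *)b));
}
#else
/* Fallback callback used when blocks are unavailable (e.g. gcc). */
static int
count_pos_cb(int value, void *arg)
{
	if (value > 0)
		(*(int *)arg)++;
	return (0);
}
#endif

/* Same result either way; only the spelling of the loop body differs. */
int
count_positive(const int *vals, size_t n)
{
#ifdef __BLOCKS__
	__block int count = 0;

	(void)sketch_foreach_b(vals, n, ^(int value) {
		if (value > 0)
			count++;
		return (0);
	});
	return (count);
#else
	int count = 0;

	(void)sketch_foreach(vals, n, count_pos_cb, &count);
	return (count);
#endif
}
```

The catch is visible in count_positive(): every consumer that wants to build
with both compilers ends up writing the callback version anyway.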
So given the difficulties on multiple fronts, I'm not sure it's a great idea.
But maybe I'm wrong about how difficult things will be and maybe it would
work out in the end. But selecting the network stack as the first user of what
will be an unproven, or at least immature, technology is ambitious. We've had a
mixed bag with that stuff (see epoch and smr for examples).

I'm trying hard not to say a flat out "no," because I know that sometimes
things that look hard like this pay off. But no gcc support does make it
really hard to say yes. I've had my say, and I'll let others weigh in from
here.

Warner