Date: Sun, 31 Dec 2023 09:19:23 -0700
From: Warner Losh <imp@bsdimp.com>
To: David Chisnall <theraven@freebsd.org>
Cc: Alan Somers <asomers@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: Is anyone working on VirtFS (FUSE over VirtIO)
Message-ID: <CANCZdfr7kKxTgBJ_LSKxAGsMUN9%2B=fiw1Fwy7Oxrc4G2mdSdYQ@mail.gmail.com>
In-Reply-To: <A14C40DA-15EE-4777-B47F-2B342CE787EA@freebsd.org>
References: <CAOtMX2gmc6L4H8L9107D84xofmd-idDgtVg8nkFkXPaPX1E8wg@mail.gmail.com> <A14C40DA-15EE-4777-B47F-2B342CE787EA@freebsd.org>

Top posting: I think you mean VirtioFS, not VirtFS. The latter is the 9p
thing that dfr is doing, the former is FUSE over VirtIO. I'll assume you
mean that.

On Sun, Dec 31, 2023 at 8:25 AM David Chisnall <theraven@freebsd.org> wrote:

> On 31 Dec 2023, at 14:36, Alan Somers <asomers@freebsd.org> wrote:
> >
> > On Sun, Dec 31, 2023 at 5:46 AM David Chisnall <theraven@freebsd.org> wrote:
> >>
> >> Hi,
> >>
> >> For running FreeBSD containers on macOS, I’m using dfr’s update of the
> >> 9pfs client code. This seems to work fine but Podman is in the process
> >> of moving from using QEMU to using Apple’s native hypervisor frameworks.
> >> These don’t provide 9pfs servers and instead provide a native VirtFS
> >> server (macOS now ships with a native VirtFS client, as does Linux).
> >>
> >> I believe the component bits for at least a functional implementation
> >> already exist (FUSE and a VirtIO transport), though I’m not sure about
> >> the parts for sharing buffer cache pages with the host. Is anyone
> >> working on connecting these together?
> >>
> >> David
> >
> > Nobody that I know of. And while I understand the FUSE stuff well,
> > I'm shakier on VirtIO and the buffer cache. Do you think that this is
> > something that a GSoC student could accomplish?
>
> I’m not familiar enough with either part of the kernel to know. A
> competent student with two mentors each familiar with one of the parts
> might, but this is increasingly strategically important. The newer cloud
> container-hosting platforms are moving to lightweight VMs with VirtFS
> because it lets them get the same sharing of container image contents
> between hosts but with full kernel isolation. It would be easy to plug
> FreeBSD in as an alternative to Linux with this support.

We shouldn't pin our hopes on GSoC for this. If it is important, it needs
to be funded.

> The VirtFS protocol is less well documented than I’d like, but it appears
> to primarily be a different transport for FUSE messages and so may be
> quite easy to add if the FUSE code is sufficiently abstracted.

Yea. The FUSE protocol is going to be the challenge here. For this to be
useful, the VirtioFS support on the FreeBSD side needs to be 100% in the
kernel, since you can't have userland in the loop. This isn't so terrible,
though, since our VFS interface provides a natural breaking point for
converting the requests into FUSE requests. The trouble, I fear, is that a
mismatch between FreeBSD's VFS abstraction layer and Linux's will cause
issues (many years ago, the weakness of FreeBSD VFS caused problems for a
company doing caching, though things have no doubt improved from those
days). Second, there's a KVM tie-in for the direct mapped pages between the
VM and the hypervisor. I'm not sure how that works on the client (FreeBSD)
side (though the description also says it's mapped via a PCI BAR, so maybe
the VM OS doesn't care).

Now, my saying that it's a challenge shouldn't be taken as discouragement.
I think it's going to take advice from a lot of different people to be
successful. It sounds like a fun project, but I'm already over-subscribed
to fun projects for $WORK. I cast no doubt on its importance.

Warner

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfr7kKxTgBJ_LSKxAGsMUN9%2B=fiw1Fwy7Oxrc4G2mdSdYQ>