Date: Mon, 15 Jul 2024 08:47:33 +0100
From: David Chisnall <theraven@freebsd.org>
To: Emil Tsalapatis <freebsd-lists@etsalapatis.com>
Cc: Warner Losh <imp@bsdimp.com>, Alan Somers <asomers@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: Is anyone working on VirtFS (FUSE over VirtIO)
Message-ID: <75944503-8599-43CF-84C5-0C10CA325761@freebsd.org>
In-Reply-To: <CABFh=a4t=73NLyJFqBOs1pRuo8B_d8wOH_mavnD-Da9dU-3k8Q@mail.gmail.com>
References: <CABFh=a4t=73NLyJFqBOs1pRuo8B_d8wOH_mavnD-Da9dU-3k8Q@mail.gmail.com>
<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div dir="ltr"></div><div dir="ltr">
<div dir="ltr">Hi,</div>
<div dir="ltr"><br></div>
<div dir="ltr">This looks great! Are there infrastructure problems with supporting the DAX or is it ‘just work’? I had hoped that the extensions to the buffer cache that allow ARC to own pages that are delegated to the buffer cache would be sufficient.</div>
<div dir="ltr"><br></div>
<div dir="ltr">If I understand the protocol correctly, the DAX mode is the same as the direct mmap mode in FUSE (not sure if FreeBSD’s kernel FUSE bits support this?).</div>
<div dir="ltr"><br></div>
<div dir="ltr">David</div>
</div>
<div dir="ltr"><br><blockquote type="cite">On 14 Jul 2024, at 15:07, Emil Tsalapatis &lt;freebsd-lists@etsalapatis.com&gt; wrote:<br><br></blockquote></div>
<blockquote type="cite"><div dir="ltr"><div dir="ltr">
<div>Hi David, Warner,</div>
<div><br></div>
<div>I'm glad you find this approach interesting! I've been meaning to update the virtio-dbg patch for a while but unfortunately haven't found the time in the last month since I uploaded it... I'll update it soon to address the reviews and split the userspace device emulation code out of the patch to make reviewing easier (thanks Alan for the suggestion). If you have any questions or feedback please let me know.<br></div>
<div><br></div>
<div>WRT virtiofs itself, I've been working on it too but I haven't found the time to clean it up and upload it. I have a messy but working implementation <a href="https://github.com/etsal/freebsd-src/tree/virtiofs-head">here</a>. The changes to FUSE itself are indeed minimal because it is enough to redirect the messages into a virtiofs device instead of sending them to a local FUSE device. The virtiofs device and the FUSE device are both simple bidirectional queues. Not sure how to deal with directly mapping files between host and guest just yet, because the Linux driver uses their DAX interface for that, but it should be possible.<br></div>
<div><br></div>
<div>Emil<br></div>
</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Jul 14, 2024 at 3:11 AM David Chisnall &lt;<a href="mailto:theraven@freebsd.org">theraven@freebsd.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>Wow, that looks incredibly useful. Not needing bhyve / qemu (nested, if your main development is a VM) to test virtio drivers would be a huge productivity win.
<div><br></div>
<div>David<br id="m_2432313125591762966lineBreakAtBeginningOfMessage"><div><br><blockquote type="cite"><div>On 13 Jul 2024, at 23:06, Warner Losh &lt;<a href="mailto:imp@bsdimp.com" target="_blank">imp@bsdimp.com</a>&gt; wrote:</div><br><div><div dir="ltr">
<div>Hey David,</div>
<div><br></div>
<div>You might want to check out <a href="https://reviews.freebsd.org/D45370" target="_blank">https://reviews.freebsd.org/D45370</a> which has the testing framework as well as hints at other work that's been done for virtiofs by Emil Tsalapatis. It looks quite interesting. Anything he's done that's at odds with what I've said just shows where my analysis was flawed :) This looks quite promising, but I've not had the time to look at it in detail yet.</div>
<div><br></div>
<div>Warner</div>
</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Jul 13, 2024 at 2:44 AM David Chisnall &lt;<a href="mailto:theraven@freebsd.org" target="_blank">theraven@freebsd.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>On 31 Dec 2023, at 16:19, Warner Losh &lt;<a href="mailto:imp@bsdimp.com" target="_blank">imp@bsdimp.com</a>&gt; wrote:<br><div><blockquote type="cite"><br><div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">Yea. The FUSE protocol is going to be the challenge here. For this to be useful, the VirtioFS support on FreeBSD needs to be 100% in the kernel, since you can't have userland in the loop. This isn't so terrible, though, since our VFS interface provides a natural breaking point for converting the requests into FUSE requests. The trouble, I fear, is a mismatch between FreeBSD's VFS abstraction layer and Linux's will cause issues (many years ago, the weakness of FreeBSD VFS caused problems for a company doing caching, though things have no doubt improved from those days). Second, there's a KVM tie-in for the direct mapped pages between the VM and the hypervisor. I'm not sure how that works on the client (FreeBSD) side (though the description also says it's mapped via a PCI bar, so maybe the VM OS doesn't care).</div></div></blockquote><br></div>
<div>From what I can tell from a little bit of looking at the code, our FUSE implementation has a fairly cleanly abstracted layer (in fuse_ipc.c) for handling the message queue. For VirtioFS, it would 'just' be necessary to factor out the bits here that do uio into something that talked to a VirtIO ring. I don’t know what the VFS limitations are, but since the protocol for VirtioFS is the kernel &lt;-&gt; userspace protocol for FUSE, it seems that any functionality that works with FUSE filesystems in userspace would work with VirtioFS filesystems.</div>
<div><br></div>
<div>The shared buffer cache bits are nice, but are optional, so could be done in a later version once the basic functionality worked.</div>
<div><br></div>
<div>David</div>
<div><br></div>
</div></blockquote></div>
</div></blockquote></div><br></div></div></blockquote></div>
</div></blockquote></body></html>
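The point made in the thread, that the virtiofs device and the local FUSE device are both simple bidirectional queues and so the filesystem layer only needs its messages redirected, can be illustrated with a small model. This is only a sketch in Python, not kernel code: `FuseMessage`, `QueueTransport`, `LocalFuseDevice`, `VirtioFsDevice`, `make_server` and `lookup` are hypothetical names invented for illustration and do not correspond to anything in fuse_ipc.c or the virtiofs branch, and the message layout is not the real FUSE wire format.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class FuseMessage:
    """Simplified stand-in for a FUSE request or reply (hypothetical layout)."""
    unique: int          # request ID; a reply carries the ID of its request
    opcode: str          # e.g. "LOOKUP"; the real protocol uses numeric opcodes
    payload: bytes = b""


class QueueTransport:
    """A bidirectional queue: requests go one way, replies come back."""

    def __init__(self, server):
        self._server = server      # callable standing in for the other side
        self._requests = deque()
        self._replies = deque()

    def send_request(self, msg):
        self._requests.append(msg)

    def recv_reply(self):
        # Single-threaded model: let the other side handle one request now.
        request = self._requests.popleft()
        self._replies.append(self._server(request))
        return self._replies.popleft()


class LocalFuseDevice(QueueTransport):
    """Models the local FUSE device, drained by a userspace daemon."""


class VirtioFsDevice(QueueTransport):
    """Models a virtiofs device, drained by the host over a VirtIO ring."""


def make_server(label):
    """Build a toy server that tags each reply with who answered it."""
    def handle(request):
        return FuseMessage(request.unique, "REPLY", label + b":" + request.payload)
    return handle


def lookup(transport, unique, name):
    """Filesystem-side code: identical regardless of the transport used."""
    transport.send_request(FuseMessage(unique, "LOOKUP", name.encode()))
    reply = transport.recv_reply()
    if reply.unique != unique:
        raise RuntimeError("reply does not match request")
    return reply.payload
```

In this model `lookup` never learns which device it is talking to; swapping `LocalFuseDevice(make_server(b"daemon"))` for `VirtioFsDevice(make_server(b"host"))` changes only who answers. Direct mapping of file pages (the DAX question raised above) is deliberately left out, since the thread itself treats it as unresolved.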