Date:      Mon, 15 Jul 2024 08:47:33 +0100
From:      David Chisnall <theraven@freebsd.org>
To:        Emil Tsalapatis <freebsd-lists@etsalapatis.com>
Cc:        Warner Losh <imp@bsdimp.com>, Alan Somers <asomers@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: Is anyone working on VirtFS (FUSE over VirtIO)
Message-ID:  <75944503-8599-43CF-84C5-0C10CA325761@freebsd.org>
In-Reply-To: <CABFh=a4t=73NLyJFqBOs1pRuo8B_d8wOH_mavnD-Da9dU-3k8Q@mail.gmail.com>
References:  <CABFh=a4t=73NLyJFqBOs1pRuo8B_d8wOH_mavnD-Da9dU-3k8Q@mail.gmail.com>

--Apple-Mail-3C1EEEF7-4CE1-4CB9-AF0C-CE83DB211443
Content-Type: text/html;
	charset=utf-8
Content-Transfer-Encoding: quoted-printable

<html><head><meta http-equiv=3D"content-type" content=3D"text/html; charset=3D=
utf-8"></head><body dir=3D"auto"><div dir=3D"ltr"></div><div dir=3D"ltr"><di=
v dir=3D"ltr">Hi,</div><div dir=3D"ltr"><br></div><div dir=3D"ltr">This look=
s great! Are there infrastructure problems with supporting DAX, or is it =E2=
=80=98just work=E2=80=99? I had hoped that the extensions to the buffer cach=
e that allow ARC to own pages that are delegated to the buffer cache would b=
e sufficient.</div><div dir=3D"ltr"><br></div><div dir=3D"ltr">If I understa=
nd the protocol correctly, the DAX mode is the same as the direct mmap mode i=
n FUSE (not sure if FreeBSD=E2=80=99s kernel FUSE bits support this?).</div=
><div dir=3D"ltr"><br></div><div dir=3D"ltr">David</div></div><div dir=3D"lt=
r"><br><blockquote type=3D"cite">On 14 Jul 2024, at 15:07, Emil Tsalapatis &=
lt;freebsd-lists@etsalapatis.com&gt; wrote:<br><br></blockquote></div><block=
quote type=3D"cite"><div dir=3D"ltr"><div dir=3D"ltr"><div>Hi David=
, Warner,</div><div><br></div><div>&nbsp;&nbsp;&nbsp; I'm glad you find this=
 approach interesting! I've been meaning to update the virtio-dbg patch for a=
 while but unfortunately haven't found the time in the last month since I up=
loaded it... I'll update it soon to address the reviews and split off the=20=

userspace device emulation code out of the patch to make reviewing=20
easier (thanks Alan for the suggestion). If you have any questions or feedba=
ck please let me know.<br></div><div><br></div><div>WRT virtiofs itself, I'v=
e been working on it too but I haven't found the time to clean it up and upl=
oad it. I have a messy but working implementation <a href=3D"https://github.=
com/etsal/freebsd-src/tree/virtiofs-head">here</a>. The changes to FUSE itse=
lf are indeed minimal because it is enough to redirect the messages into a v=
irtiofs device instead of sending them to a local FUSE device. The virtiofs d=
evice and the FUSE device are both simple bidirectional queues. Not sure h=
ow to deal with directly mapping files between host and guest just yet, beca=
use the Linux driver uses their DAX interface for that, but it should be pos=
sible.<br></div><div><br></div><div>Emil<br></div></div><br><div class=3D"gm=
ail_quote"><div dir=3D"ltr" class=3D"gmail_attr">On Sun, Jul 14, 2024 at 3:1=
1=E2=80=AFAM David Chisnall &lt;<a href=3D"mailto:theraven@freebsd.org">ther=
aven@freebsd.org</a>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" s=
tyle=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padd=
ing-left:1ex"><div>Wow, that looks incredibly useful.&nbsp; Not needing bhyv=
e / qemu (nested, if your main development is a VM) to test virtio drivers w=
ould be a huge productivity win. &nbsp;<div><br></div><div>David<br id=3D"m_=
2432313125591762966lineBreakAtBeginningOfMessage"><div><br><blockquote type=3D=
"cite"><div>On 13 Jul 2024, at 23:06, Warner Losh &lt;<a href=3D"mailto:imp@=
bsdimp.com" target=3D"_blank">imp@bsdimp.com</a>&gt; wrote:</div><br><div><d=
iv dir=3D"ltr"><div>Hey David,</div><div><br></div><div>You might want to ch=
eck out&nbsp; <a href=3D"https://reviews.freebsd.org/D45370" target=3D"_blan=
k">https://reviews.freebsd.org/D45370</a>, which has the testing framework as=
 well as hints at other work that's been done for virtiofs&nbsp;by Emil&nbsp=
;Tsalapatis. It looks quite interesting. Anything he's done that's at odds w=
ith what I've said just shows where my analysis was flawed :) This looks qui=
te promising, but I've not had the time to look at it in detail yet.</div><d=
iv><br></div><div>Warner</div></div><br><div class=3D"gmail_quote"><div dir=3D=
"ltr" class=3D"gmail_attr">On Sat, Jul 13, 2024 at 2:44=E2=80=AFAM David Chi=
snall &lt;<a href=3D"mailto:theraven@freebsd.org" target=3D"_blank">theraven=
@freebsd.org</a>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" style=
=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-=
left:1ex"><div>On 31 Dec 2023, at 16:19, Warner Losh &lt;<a href=3D"mailto:i=
mp@bsdimp.com" target=3D"_blank">imp@bsdimp.com</a>&gt; wrote:<br><div><bloc=
kquote type=3D"cite"><br><div><div style=3D"font-family:Helvetica;font-size:=
12px;font-style:normal;font-variant-caps:normal;font-weight:400;letter-spaci=
ng:normal;text-align:start;text-indent:0px;text-transform:none;white-space:n=
ormal;word-spacing:0px;text-decoration:none">Yea. The FUSE protocol is going=
 to be the challenge here. For this to be useful, the VirtioFS&nbsp;support o=
n&nbsp;the FreeBSD side needs to be 100% in the kernel, since you can't hav=
e userland in the loop. This isn't so terrible, though, since our VFS interf=
ace provides a natural breaking point for converting the requests into FUSE r=
equests. The trouble, I fear, is that a mismatch between FreeBSD's VFS abs=
traction layer and Linux's will cause issues (many years ago, the weakness=
 of FreeB=
SD VFS caused problems for a company doing caching, though things have no do=
ubt improved from those days). Second, there's a KVM tie-in for the direct m=
apped pages between the VM and the hypervisor. I'm not sure how that works o=
n the client (FreeBSD) side (though the description also says it's mapped vi=
a a PCI bar, so maybe the VM OS doesn't care).</div></div></blockquote><br><=
/div><div>=46rom what I can tell from a little bit of looking at the code, o=
ur FUSE implementation has a fairly cleanly abstracted layer (in fuse_ipc.c)=
 for handling the message queue.&nbsp; For VirtioFS, it would 'just' be nece=
ssary to factor out the bits here that do uio into something that talked to a=
 VirtIO ring.&nbsp; I don=E2=80=99t know what the VFS limitations are, but s=
ince the protocol for VirtioFS is the kernel &lt;-&gt; userspace protocol fo=
r FUSE, it seems that any functionality that works with FUSE filesystems in u=
serspace would work with VirtioFS filesystems.</div><div><br></div><div>The s=
hared buffer cache bits are nice, but are optional, so could be done in a la=
ter version once the basic functionality worked. &nbsp;</div><div><br></div>=
<div>David</div><div><br></div></div></blockquote></div>
</div></blockquote></div><br></div></div></blockquote></div>
</div></blockquote></body></html>=

--Apple-Mail-3C1EEEF7-4CE1-4CB9-AF0C-CE83DB211443--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?75944503-8599-43CF-84C5-0C10CA325761>