Date:      Wed, 17 Jul 2024 09:31:26 +0100
From:      David Chisnall <theraven@freebsd.org>
To:        Emil Tsalapatis <freebsd-lists@etsalapatis.com>
Cc:        Warner Losh <imp@bsdimp.com>, Alan Somers <asomers@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: Is anyone working on VirtFS (FUSE over VirtIO)
Message-ID:  <9F249E56-4053-45A3-96FC-179C01AFB084@freebsd.org>
In-Reply-To: <CABFh=a6Tm=2JJdrk9LDQ%2BM96Wndr8%2Br=C4c17K3RQ0mb4%2BN0KQ@mail.gmail.com>
References:  <CABFh=a6Tm=2JJdrk9LDQ%2BM96Wndr8%2Br=C4c17K3RQ0mb4%2BN0KQ@mail.gmail.com>


> On 16 Jul 2024, at 21:20, Emil Tsalapatis <freebsd-lists@etsalapatis.com> wrote:
>
> After going over the Linux code, I think adding direct mapping doesn't require any changes outside of FUSE and virtio code. Direct mapping mainly requires code to manage the virtiofs device's memory region in the driver. This is a shared memory region between guest and host with which the driver backs FUSE inodes. The driver then includes an allocator used to map parts of an inode into the region.

That’s how I understood the spec too.

> It should be possible to pass host-guest shared pages to ARC, with the caveat that the virtiofs driver should be able to reclaim them at any time. Does the code currently allow this? Virtiofs needs this because it maps region pages to inodes, and must reuse cold region pages during an allocation if there aren't any available. Basically, the region is a separate pool of device pages that's managed directly by virtiofs.
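
For illustration only, the allocation-with-reclaim behaviour described in the quoted text could be sketched in user-space C roughly as below. This is not taken from the Linux or FreeBSD code: the names (`region_map`, `struct region_slot`, `REGION_SLOTS`), the fixed slot count, and the least-recently-used choice of the "cold" slot are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REGION_SLOTS 4  /* hypothetical: a tiny shared region, for illustration */

/* One fixed-size chunk of the host-guest shared region. */
struct region_slot {
    bool     in_use;
    uint64_t ino;       /* inode currently backed by this slot */
    uint64_t chunk;     /* chunk index within that inode */
    uint64_t last_use;  /* logical clock value of the most recent access */
};

struct region {
    struct region_slot slots[REGION_SLOTS];
    uint64_t clock;     /* advances on every lookup */
    unsigned reclaims;  /* number of cold slots that were reused */
};

void region_init(struct region *r)
{
    for (int i = 0; i < REGION_SLOTS; i++)
        r->slots[i].in_use = false;
    r->clock = 0;
    r->reclaims = 0;
}

/*
 * Return the slot backing (ino, chunk). If no slot backs it yet, take a
 * free slot; if none is free, reuse the least recently used ("cold") one.
 */
int region_map(struct region *r, uint64_t ino, uint64_t chunk)
{
    int free_idx = -1, cold_idx = -1;

    r->clock++;
    for (int i = 0; i < REGION_SLOTS; i++) {
        struct region_slot *s = &r->slots[i];

        if (!s->in_use) {
            if (free_idx < 0)
                free_idx = i;
        } else if (s->ino == ino && s->chunk == chunk) {
            s->last_use = r->clock;  /* already mapped: refresh and return */
            return i;
        } else if (cold_idx < 0 ||
                   s->last_use < r->slots[cold_idx].last_use) {
            cold_idx = i;
        }
    }

    int idx = free_idx;
    if (idx < 0) {
        /* Region is full. A real driver would first ask the host to tear
         * down the old mapping and invalidate any guest mappings of it. */
        idx = cold_idx;
        r->reclaims++;
    }
    assert(idx >= 0);

    r->slots[idx] = (struct region_slot){
        .in_use = true, .ino = ino, .chunk = chunk, .last_use = r->clock,
    };
    return idx;
}
```

The point of the sketch is that the region is a pool owned by the driver: any consumer holding one of these slots (ARC, the buffer cache, a pager) has to tolerate the slot being taken back at the step marked "Region is full".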

I am not overly familiar with the buffer cache code, but I believe the code that was added to support ARC had similar requirements. The first ZFS port had pages in ARC and then exactly the same data in the buffer cache. The buffer cache was extended with a notion of pages that it didn’t own so that it could just use the pages in ARC directly.

I don’t remember if there’s existing support for ARC to remove those pages from the buffer cache. They are both kernel pages so it would be possible to just treat removing them from ARC as an accounting operation. There is, I believe, support for the pager to remove arbitrary pages and so it might be simple to just add a new kind of pager for these pages (which just tells the host to flush the pages).

>> If I understand the protocol correctly, the DAX mode is the same as the direct mmap mode in FUSE (not sure if FreeBSD’s kernel fuse bits support this?).
>
> Yeah, virtiofs DAX seems like it's similar to FUSE direct mmap, but with FUSE inodes being backed by the shared region instead. I don't think FreeBSD has direct mmap but I may be wrong there.

It would be a nice feature to have if not!

David





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?9F249E56-4053-45A3-96FC-179C01AFB084>