From: Emil Tsalapatis
Date: Sun, 14 Jul 2024 10:02:48 -0400
Subject: Re: Is anyone working on VirtFS (FUSE over VirtIO)
To: David Chisnall
Cc: Warner Losh, Alan Somers, FreeBSD Hackers
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

Hi David, Warner,

I'm glad you find this approach interesting!
I've been meaning to update the virtio-dbg patch for a while, but
unfortunately I haven't found the time in the month since I uploaded it.
I'll update it soon to address the reviews and to split the userspace
device emulation code out of the patch to make reviewing easier (thanks,
Alan, for the suggestion). If you have any questions or feedback, please
let me know.

WRT virtiofs itself, I've been working on it too, but I haven't found the
time to clean it up and upload it. I have a messy but working
implementation here. The changes to FUSE itself are indeed minimal, because
it is enough to redirect the messages into a virtiofs device instead of
sending them to a local FUSE device. The virtiofs device and the FUSE
device are both simple bidirectional queues. I'm not sure yet how to deal
with directly mapping files between host and guest, because the Linux
driver uses its DAX interface for that, but it should be possible.

Emil

On Sun, Jul 14, 2024 at 3:11 AM David Chisnall wrote:

> Wow, that looks incredibly useful. Not needing bhyve / qemu (nested, if
> your main development is a VM) to test virtio drivers would be a huge
> productivity win.
>
> David
>
> On 13 Jul 2024, at 23:06, Warner Losh wrote:
>
> Hey David,
>
> You might want to check out https://reviews.freebsd.org/D45370 which has
> the testing framework as well as hints at other work that's been done for
> virtiofs by Emil Tsalapatis. It looks quite interesting. Anything he's done
> that's at odds with what I've said just shows where my analysis was flawed
> :) This looks quite promising, but I've not had the time to look at it in
> detail yet.
>
> Warner
>
> On Sat, Jul 13, 2024 at 2:44 AM David Chisnall wrote:
>
>> On 31 Dec 2023, at 16:19, Warner Losh wrote:
>>
>> Yea. The FUSE protocol is going to be the challenge here. For this to be
>> useful, the VirtioFS support on the FreeBSD needs to be 100% in the
>> kernel, since you can't have userland in the loop.
>> This isn't so terrible, though, since our VFS interface provides a
>> natural breaking point for converting the requests into FUSE requests.
>> The trouble, I fear, is a mismatch between FreeBSD's VFS abstraction
>> layer and Linux's will cause issues (many years ago, the weakness of
>> FreeBSD VFS caused problems for a company doing caching, though things
>> have no doubt improved from those days). Second, there's a KVM tie-in for
>> the direct mapped pages between the VM and the hypervisor. I'm not sure
>> how that works on the client (FreeBSD) side (though the description also
>> says it's mapped via a PCI bar, so maybe the VM OS doesn't care).
>>
>> From what I can tell from a little bit of looking at the code, our FUSE
>> implementation has a fairly cleanly abstracted layer (in fuse_ipc.c) for
>> handling the message queue. For VirtioFS, it would 'just' be necessary to
>> factor out the bits here that do uio into something that talked to a
>> VirtIO ring. I don’t know what the VFS limitations are, but since the
>> protocol for VirtioFS is the kernel <-> userspace protocol for FUSE, it
>> seems that any functionality that works with FUSE filesystems in
>> userspace would work with VirtioFS filesystems.
>>
>> The shared buffer cache bits are nice, but are optional, so could be done
>> in a later version once the basic functionality worked.
>>
>> David
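
The point above that the virtiofs device and the local FUSE device are both
simple bidirectional queues can be illustrated with a minimal userspace
sketch in C. Everything in it is an assumption made for illustration: the
names (struct transport, fs_send, fs_recv, backend_serve_one), the fixed
sizes, and the echo reply are hypothetical and are not taken from
fuse_ipc.c, the virtio-dbg patch, or the virtiofs work mentioned here. It
only shows the shape of the idea: the filesystem side queues a request and
later collects a reply, and nothing in that code depends on whether the far
end is a local daemon or a host-side device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names here are hypothetical; this is not the fuse_ipc.c interface. */
#define MSG_MAX   64    /* illustrative payload limit */
#define QUEUE_LEN 8     /* illustrative slots per direction */

struct msg {
    uint64_t unique;        /* request id, echoed back in the reply */
    size_t   len;           /* bytes used in data[] */
    uint8_t  data[MSG_MAX];
};

/* Fixed-size FIFO: one direction of a queue. */
struct fifo {
    struct msg slot[QUEUE_LEN];
    size_t head, count;
};

/* Bidirectional queue: requests flow one way, replies the other. */
struct biqueue {
    struct fifo req, rep;
};

static int
fifo_push(struct fifo *f, const struct msg *m)
{
    if (f->count == QUEUE_LEN)
        return (-1);            /* full */
    f->slot[(f->head + f->count) % QUEUE_LEN] = *m;
    f->count++;
    return (0);
}

static int
fifo_pop(struct fifo *f, struct msg *out)
{
    if (f->count == 0)
        return (-1);            /* empty */
    *out = f->slot[f->head];
    f->head = (f->head + 1) % QUEUE_LEN;
    f->count--;
    return (0);
}

/* Transport seen by the filesystem layer; the backend is opaque to it. */
struct transport {
    const char *name;
    struct biqueue *q;
};

/* Filesystem side: queue a request without knowing who will answer it. */
static int
fs_send(struct transport *t, uint64_t unique, const void *buf, size_t len)
{
    struct msg m;

    assert(t != NULL && t->q != NULL);
    if (len > MSG_MAX)
        return (-1);
    memset(&m, 0, sizeof(m));
    m.unique = unique;
    m.len = len;
    if (len > 0)
        memcpy(m.data, buf, len);
    return (fifo_push(&t->q->req, &m));
}

/* Filesystem side: collect the next reply, if any. */
static int
fs_recv(struct transport *t, struct msg *reply)
{
    return (fifo_pop(&t->q->rep, reply));
}

/*
 * Far side (a local daemon or the host, depending on the backend):
 * take one request and answer it, here by echoing the payload.
 */
static int
backend_serve_one(struct transport *t)
{
    struct msg m;

    if (fifo_pop(&t->q->req, &m) != 0)
        return (-1);
    return (fifo_push(&t->q->rep, &m));
}
```

With two struct transport values, one imagined as the local FUSE device and
one as a virtiofs queue, the same fs_send/fs_recv calls work against either.
A real virtiofs transport would replace the in-memory FIFOs with virtqueue
descriptors, which this sketch does not attempt.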