From nobody Mon Jul 15 07:47:33 2024
From: David Chisnall
Date: Mon, 15 Jul 2024 08:47:33 +0100
Subject: Re: Is anyone working on VirtFS (FUSE over VirtIO)
To: Emil Tsalapatis
Cc: Warner Losh, Alan Somers, FreeBSD Hackers
Message-Id: <75944503-8599-43CF-84C5-0C10CA325761@freebsd.org>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
Sender: owner-freebsd-hackers@FreeBSD.org
X-Mailer: iPad Mail (21F90)
Mime-Version: 1.0 (1.0)
Content-Type: multipart/alternative; boundary=Apple-Mail-3C1EEEF7-4CE1-4CB9-AF0C-CE83DB211443

--Apple-Mail-3C1EEEF7-4CE1-4CB9-AF0C-CE83DB211443
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hi,

This looks great! Are there infrastructure problems with supporting the DAX or is it ‘just work’? I had hoped that the extensions to the buffer cache that allow ARC to own pages that are delegated to the buffer cache would be sufficient.

If I understand the protocol correctly, the DAX mode is the same as the direct mmap mode in FUSE (not sure if FreeBSD’s kernel fuse bits support this?).

David

On 14 Jul 2024, at 15:07, Emil Tsalapatis <freebsd-lists@etsalapatis.com> wrote:

Hi David, Warner,

    I'm glad you find this approach interesting! I've been meaning to update the virtio-dbg patch for a while but unfortunately haven't found the time in the last month since I uploaded it... I'll update it soon to address the reviews and split off the userspace device emulation code out of the patch to make reviewing easier (thanks Alan for the suggestion). If you have any questions or feedback please let me know.

WRT virtiofs itself, I've been working on it too but I haven't found the time to clean it up and upload it. I have a messy but working implementation here. The changes to FUSE itself are indeed minimal because it is enough to redirect the messages into a virtiofs device instead of sending them to a local FUSE device. The virtiofs device and the FUSE device are both simple bidirectional queues. Not sure on how to deal with directly mapping files between host and guest just yet, because the Linux driver uses their DAX interface for that, but it should be possible.
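
The point that the local FUSE device and the virtiofs device are both bidirectional queues can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the FreeBSD code: `QueueTransport`, `build_request`, `echo_server` and `roundtrip` are hypothetical names, and the only protocol facts used are the field layouts of `struct fuse_in_header` and `struct fuse_out_header` and the `FUSE_GETATTR` opcode value. A real `FUSE_GETATTR` request also carries a payload, which is omitted here.

```python
import struct
from collections import deque

# struct fuse_in_header: len, opcode (u32), unique, nodeid (u64),
# uid, gid, pid, padding (u32).
FUSE_IN_HEADER = struct.Struct("<IIQQIIII")
# struct fuse_out_header: len (u32), error (s32), unique (u64).
FUSE_OUT_HEADER = struct.Struct("<IiQ")
FUSE_GETATTR = 3  # opcode value from the FUSE protocol


class QueueTransport:
    """Hypothetical stand-in for either backend: a pair of message queues."""

    def __init__(self, name):
        self.name = name
        self.requests = deque()  # filesystem client -> server
        self.replies = deque()   # server -> filesystem client

    def send_request(self, msg):
        self.requests.append(msg)

    def recv_reply(self):
        return self.replies.popleft()


def build_request(opcode, unique, nodeid, payload=b""):
    """Pack a FUSE request: header followed by an opaque payload."""
    length = FUSE_IN_HEADER.size + len(payload)
    header = FUSE_IN_HEADER.pack(length, opcode, unique, nodeid, 0, 0, 0, 0)
    return header + payload


def echo_server(transport):
    """Toy server: acknowledge every queued request with error == 0."""
    while transport.requests:
        request = transport.requests.popleft()
        _, _, unique, *_ = FUSE_IN_HEADER.unpack_from(request)
        reply = FUSE_OUT_HEADER.pack(FUSE_OUT_HEADER.size, 0, unique)
        transport.replies.append(reply)


def roundtrip(transport, opcode, nodeid, unique):
    """Send one request and return (error, unique) from the reply."""
    transport.send_request(build_request(opcode, unique, nodeid))
    echo_server(transport)
    _, error, reply_unique = FUSE_OUT_HEADER.unpack(transport.recv_reply())
    return error, reply_unique


# The message bytes are identical; only the queue they travel on differs.
local = QueueTransport("local FUSE device")
virtio = QueueTransport("virtiofs request queue")
```

In this model the code that builds and parses messages never looks at which transport it was handed, which is the property that would keep the changes to FUSE itself small.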

Emil

On Sun, Jul 14, 2024 at 3:11 AM David Chisnall <theraven@freebsd.org> wrote:
Wow, that looks incredibly useful.  Not needing bhyve / qemu (nested, if your main development is a VM) to test virtio drivers would be a huge productivity win.

David

On 13 Jul 2024, at 23:06, Warner Losh <imp@bsdimp.com> wrote:

Hey David,

You might want to check out https://reviews.freebsd.org/D45370 which has the testing framework as well as hints at other work that's been done for virtiofs by Emil Tsalapatis. It looks quite interesting. Anything he's done that's at odds with what I've said just shows where my analysis was flawed :) This looks quite promising, but I've not had the time to look at it in detail yet.

Warner

On Sat, Jul 13, 2024 at 2:44 AM David Chisnall <theraven@freebsd.org> wrote:
On 31 Dec 2023, at 16:19, Warner Losh <imp@bsdimp.com> wrote:

Yea. The FUSE protocol is going to be the challenge here. For this to be useful, the VirtioFS support on FreeBSD needs to be 100% in the kernel, since you can't have userland in the loop. This isn't so terrible, though, since our VFS interface provides a natural breaking point for converting the requests into FUSE requests. The trouble, I fear, is a mismatch between FreeBSD's VFS abstraction layer and Linux's will cause issues (many years ago, the weakness of FreeBSD VFS caused problems for a company doing caching, though things have no doubt improved from those days). Second, there's a KVM tie-in for the direct mapped pages between the VM and the hypervisor. I'm not sure how that works on the client (FreeBSD) side (though the description also says it's mapped via a PCI bar, so maybe the VM OS doesn't care).

From what I can tell from a little bit of looking at the code, our FUSE implementation has a fairly cleanly abstracted layer (in fuse_ipc.c) for handling the message queue.  For VirtioFS, it would 'just' be necessary to factor out the bits here that do uio into something that talked to a VirtIO ring.  I don’t know what the VFS limitations are, but since the protocol for VirtioFS is the kernel <-> userspace protocol for FUSE, it seems that any functionality that works with FUSE filesystems in userspace would work with VirtioFS filesystems.

The shared buffer cache bits are nice, but are optional, so could be done in a later version once the basic functionality worked.

David


--Apple-Mail-3C1EEEF7-4CE1-4CB9-AF0C-CE83DB211443--

From nobody Mon Jul 15 14:13:44 2024
From: BERTRAND Joël
Date: Mon, 15 Jul 2024 16:13:44 +0200
Subject: Strange FreeBSD lock (related to iSCSI initiator connected to NetBSD istgt ?)
To: freebsd-hackers@freebsd.org, NetBSD User Maillist
Message-ID: <4560d935-429a-ff65-aa5c-6a086a83f65f@systella.fr>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

	Hello,

	On my network, all workstations are diskless (Linux, FreeBSD, NetBSD, even OpenVMS). The main server runs NetBSD 10.0 and acts as:
- boot server
- nfs server (/ and /home)
- iSCSI (for swap)

	Pythagore runs FreeBSD 14.0 and is a diskless workstation. Legendre runs NetBSD 10.0 and is the network's main server.

	The FreeBSD workstation randomly hangs. On the server side, I can see in this workstation's logfile:

Jul 15 15:54:16 pythagore kernel: WARNING: 192.168.10.128 (iqn.2020-02.fr.systella.legendre.istgt:pythagore): no ping reply (NOP-In) after 5 seconds; reconnecting
Jul 15 15:54:31 pythagore kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3242023, size: 32768
Jul 15 15:54:32 pythagore kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3295147, size: 8192
Jul 15 15:54:32 pythagore kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3294054, size: 24576
Jul 15 15:54:32 pythagore kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2691989, size: 4096
Jul 15 15:54:33 pythagore kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3185488, size: 4096
Jul 15 15:54:33 pythagore kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3244176, size: 4096
Jul 15 15:54:33 pythagore kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1477832, size: 12288
...
Jul 15 15:55:17 pythagore kernel: WARNING: 192.168.10.128 (iqn.2020-02.fr.systella.legendre.istgt:pythagore): login timed out after 61 seconds; reconnecting
...

	OK, I understand: the FreeBSD iSCSI client was disconnected and, as FreeBSD tries to do some operation on swap, the kernel hangs.

	On the server side, I have:

Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c: 765:istgt_iscsi_read_pdu: ***ERROR*** readv() failed (-1,errno=54,iqn.1994-09.org.freebsd:pythagore,time=0)
Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:5685:worker: ***ERROR*** conn->state = 1
Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:5702:worker: ***ERROR*** iscsi_read_pdu() failed
Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:1260:istgt_iscsi_write_pdu_internal: ***ERROR*** writev() failed (errno=32,iqn.1994-09.org.freebsd:pythagore,time=0)
Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:3484:istgt_iscsi_transfer_in_internal: ***ERROR*** iscsi_write_pdu() failed
Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:3853:istgt_iscsi_task_response: ***ERROR*** iscsi_transfer_in() failed
Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:5389:sender: ***ERROR*** iscsi_task_response() CmdSN=552123 failed on iqn.2020-02.fr.systella.legendre.istgt:pythagore,t,0x0001(iqn.1994-09.org.freebsd:pythagore,i,0x80f5c96f41e6)

	All other workstations (mainly Linux and NetBSD) run without trouble. Only FreeBSD triggers this issue. I suppose this bug is related to FreeBSD, but I'm not sure.

	Any help fixing it would be welcome.

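
For reading the istgt lines above, it may help to turn the numeric errno values into names. The sketch below is only an illustration: `BSD_ERRNO`, `ERRNO_RE` and `decode` are hypothetical helpers, and the table is hard-coded with the values NetBSD and FreeBSD use (32 is EPIPE, 54 is ECONNRESET), since the numbers differ on other systems such as Linux. Read this way, the target saw its connection reset one second after the initiator logged the NOP-In timeout, which would be consistent with the initiator dropping the session itself, though the logs alone do not prove that.

```python
import re

# errno values as defined on NetBSD and FreeBSD (Linux uses other numbers).
BSD_ERRNO = {32: "EPIPE", 54: "ECONNRESET"}

# Matches "readv() failed (-1,errno=54," and "writev() failed (errno=32,".
ERRNO_RE = re.compile(r"(\w+)\(\) failed \((?:-?\d+,)?errno=(\d+)")


def decode(line):
    """Return (syscall, errno name) for an istgt error line, else None."""
    match = ERRNO_RE.search(line)
    if match is None:
        return None
    call, number = match.group(1), int(match.group(2))
    return call, BSD_ERRNO.get(number, f"errno {number}")


# Two of the target-side lines quoted in the report.
SAMPLE = [
    "istgt_iscsi.c: 765:istgt_iscsi_read_pdu: ***ERROR*** readv() failed "
    "(-1,errno=54,iqn.1994-09.org.freebsd:pythagore,time=0)",
    "istgt_iscsi.c:1260:istgt_iscsi_write_pdu_internal: ***ERROR*** writev() "
    "failed (errno=32,iqn.1994-09.org.freebsd:pythagore,time=0)",
]

for entry in SAMPLE:
    print(decode(entry))
```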
	Best regards,

	JB

From nobody Tue Jul 16 20:15:24 2024
From: Emil Tsalapatis
Date: Tue, 16 Jul 2024 16:15:24 -0400
Subject: Re: Is anyone working on VirtFS (FUSE over VirtIO)
To: David Chisnall
Cc: Warner Losh, Alan Somers, FreeBSD Hackers
References: <75944503-8599-43CF-84C5-0C10CA325761@freebsd.org>
In-Reply-To: <75944503-8599-43CF-84C5-0C10CA325761@freebsd.org>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="00000000000038270d061d631223"

--00000000000038270d061d631223
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi,

On Mon, Jul 15, 2024 at 3:47 AM David Chisnall wrote:

> Hi,
>
> This looks great! Are there infrastructure problems with supporting the
> DAX or is it ‘just work’? I had hoped that the extensions to the buffer
> cache that allow ARC to own pages that are delegated to the buffer cache
> would be sufficient.
>
>
After going over the Linux code, I think adding direct mapping doesn't
require any changes outside of FUSE and virtio code. Direct mapping mainly
requires code to manage the virtiofs device's memory region in the driver.
This is a shared memory region between guest and host with which the driver
backs FUSE inodes. The driver then includes an allocator used to map parts
of an inode into the region.

It should be possible to pass host-guest shared pages to ARC, with the
caveat that the virtiofs driver should be able to reclaim them at any time.
Does the code currently allow this? Virtiofs needs this because it maps
region pages to inodes, and must reuse cold region pages during an
allocation if there aren't any available. Basically, the region is a
separate pool of device pages that's managed directly by virtiofs.

> If I understand the protocol correctly, the DAX mode is the same as the
> direct mmap mode in FUSE (not sure if FreeBSD’s kernel fuse bits support
> this?).
>
>
Yeah, virtiofs DAX seems like it's similar to FUSE direct mmap, but with
FUSE inodes being backed by the shared region instead. I don't think
FreeBSD has direct mmap but I may be wrong there.

Emil

> David
>
> On 14 Jul 2024, at 15:07, Emil Tsalapatis wrote:
>
> Hi David, Warner,
>
>     I'm glad you find this approach interesting! I've been meaning to
> update the virtio-dbg patch for a while but unfortunately haven't found the
> time in the last month since I uploaded it... I'll update it soon to
> address the reviews and split off the userspace device emulation code out
> of the patch to make reviewing easier (thanks Alan for the suggestion). If
> you have any questions or feedback please let me know.
>
> WRT virtiofs itself, I've been working on it too but I haven't found the
> time to clean it up and upload it. I have a messy but working
> implementation here. The changes to
> FUSE itself are indeed minimal because it is enough to redirect the
> messages into a virtiofs device instead of sending them to a local FUSE
> device. The virtiofs device and the FUSE device are both simple
> bidirectional queues. Not sure on how to deal with directly mapping files
> between host and guest just yet, because the Linux driver uses their DAX
> interface for that, but it should be possible.
>
> Emil
>
> On Sun, Jul 14, 2024 at 3:11 AM David Chisnall wrote:
>
>> Wow, that looks incredibly useful.  Not needing bhyve / qemu (nested, if
>> your main development is a VM) to test virtio drivers would be a huge
>> productivity win.
>>
>> David
>>
>> On 13 Jul 2024, at 23:06, Warner Losh wrote:
>>
>> Hey David,
>>
>> You might want to check out https://reviews.freebsd.org/D45370 which
>> has the testing framework as well as hints at other work that's been done
>> for virtiofs by Emil Tsalapatis. It looks quite interesting. Anything he's
>> done that's at odds with what I've said just shows where my analysis was
>> flawed :) This looks quite promising, but I've not had the time to look at
>> it in detail yet.
>>
>> Warner
>>
>> On Sat, Jul 13, 2024 at 2:44 AM David Chisnall wrote:
>>
>>> On 31 Dec 2023, at 16:19, Warner Losh wrote:
>>>
>>>
>>> Yea. The FUSE protocol is going to be the challenge here. For this to be
>>> useful, the VirtioFS support on FreeBSD needs to be 100% in the
>>> kernel, since you can't have userland in the loop. This isn't so terrible,
>>> though, since our VFS interface provides a natural breaking point for
>>> converting the requests into FUSE requests. The trouble, I fear, is a
>>> mismatch between FreeBSD's VFS abstraction layer and Linux's will cause
>>> issues (many years ago, the weakness of FreeBSD VFS caused problems for a
>>> company doing caching, though things have no doubt improved from those
>>> days). Second, there's a KVM tie-in for the direct mapped pages between the
>>> VM and the hypervisor. I'm not sure how that works on the client (FreeBSD)
>>> side (though the description also says it's mapped via a PCI bar, so maybe
>>> the VM OS doesn't care).
>>>
>>>
>>> From what I can tell from a little bit of looking at the code, our FUSE
>>> implementation has a fairly cleanly abstracted layer (in fuse_ipc.c) for
>>> handling the message queue.  For VirtioFS, it would 'just' be necessary to
>>> factor out the bits here that do uio into something that talked to a VirtIO
>>> ring.  I don’t know what the VFS limitations are, but since the protocol
>>> for VirtioFS is the kernel <-> userspace protocol for FUSE, it seems that
>>> any functionality that works with FUSE filesystems in userspace would work
>>> with VirtioFS filesystems.
>>>
>>> The shared buffer cache bits are nice, but are optional, so could be
>>> done in a later version once the basic functionality worked.
>>>
>>> David
>>>
>>>
>>
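
The allocator described in this message, a fixed pool of region pages that are mapped to inodes and reclaimed when cold, can be modelled in a few lines. The following is a toy sketch rather than the Linux or FreeBSD driver: `DaxWindow` and its `map` method are hypothetical names, the pool is counted in abstract slots instead of pages, and least-recently-used order is an assumed stand-in for whatever notion of "cold" a real driver would use.

```python
from collections import OrderedDict


class DaxWindow:
    """Toy model: a fixed pool of window slots backing (inode, chunk) ranges."""

    def __init__(self, nslots):
        self.free = list(range(nslots))
        # (inode, chunk) -> slot, ordered from coldest to most recently used.
        self.mapped = OrderedDict()

    def map(self, inode, chunk):
        """Return the slot backing this chunk, reclaiming a cold slot if needed."""
        key = (inode, chunk)
        if key in self.mapped:
            self.mapped.move_to_end(key)  # mark as recently used
            return self.mapped[key]
        if self.free:
            slot = self.free.pop()
        else:
            # Pool exhausted: drop the coldest mapping and reuse its slot.
            _, slot = self.mapped.popitem(last=False)
        self.mapped[key] = slot
        return slot


window = DaxWindow(nslots=2)
first = window.map(inode=1, chunk=0)
second = window.map(inode=1, chunk=1)
window.map(inode=1, chunk=0)           # touch the first mapping again
third = window.map(inode=2, chunk=0)   # reuses the slot of (1, 1)
```

The reclaim branch is the part that matters for the ARC question: any page handed to another cache out of this pool has to be revocable at that point, because the window cannot grow.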
--00000000000038270d061d631223-- From nobody Wed Jul 17 03:02:49 2024 X-Original-To: freebsd-hackers@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4WP12j1vWFz5Qpw4 for ; Wed, 17 Jul 2024 03:02:53 +0000 (UTC) (envelope-from linimon@portsmon.org) Received: from MTA-06-4.privateemail.com (mta-06-4.privateemail.com [198.54.122.146]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4WP12h1ncDz4pS8 for ; Wed, 17 Jul 2024 03:02:52 +0000 (UTC) (envelope-from linimon@portsmon.org) Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of linimon@portsmon.org designates 198.54.122.146 as permitted sender) smtp.mailfrom=linimon@portsmon.org Received: from mta-06.privateemail.com (localhost [127.0.0.1]) by mta-06.privateemail.com (Postfix) with ESMTP id D26D31800051 for ; Tue, 16 Jul 2024 23:02:49 -0400 (EDT) Received: from APP-19 (unknown [10.50.14.243]) by mta-06.privateemail.com (Postfix) with ESMTPA for ; Tue, 16 Jul 2024 23:02:49 -0400 (EDT) Date: Tue, 16 Jul 2024 22:02:49 -0500 (CDT) From: Mark Linimon To: FreeBSD Hackers Message-ID: <1339320103.251016.1721185369763@privateemail.com> In-Reply-To: <1156689809.250425.1721184799123@privateemail.com> References: <1156689809.250425.1721184799123@privateemail.com> Subject: requesting volunteers to help with Bugzilla List-Id: Technical discussions relating to FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-hackers List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-hackers@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Priority: 3 Importance: Normal X-Mailer: Open-Xchange Mailer v7.10.6-Rev67 
X-Originating-Client: open-xchange-appsuite X-Virus-Scanned: ClamAV using ClamSMTP X-Spamd-Bar: --- X-Spamd-Result: default: False [-3.40 / 15.00]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_SHORT(-1.00)[-1.000]; R_SPF_ALLOW(-0.20)[+ip4:198.54.122.128/27]; MIME_GOOD(-0.10)[text/plain]; RWL_MAILSPIKE_GOOD(-0.10)[198.54.122.146:from]; RCVD_VIA_SMTP_AUTH(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_LAST(0.00)[]; FREEFALL_USER(0.00)[linimon]; ASN(0.00)[asn:22612, ipnet:198.54.122.0/24, country:US]; RCPT_COUNT_ONE(0.00)[1]; FROM_HAS_DN(0.00)[]; R_DKIM_NA(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; HAS_X_PRIO_THREE(0.00)[3]; FROM_EQ_ENVFROM(0.00)[]; ARC_NA(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; TO_DN_ALL(0.00)[]; PREVIOUSLY_DELIVERED(0.00)[freebsd-hackers@freebsd.org]; MLMMJ_DEST(0.00)[freebsd-hackers@freebsd.org]; DMARC_NA(0.00)[portsmon.org]; RCVD_IN_DNSWL_NONE(0.00)[198.54.122.146:from] X-Rspamd-Queue-Id: 4WP12h1ncDz4pS8

(attempting a resend)

FreeBSD maintains its own patches to upstream Bugzilla. Some are done in the port, but others are done in a git repo.

I have been trying to update my forked copy of the git repo with all the changes that have been made on production over the last N years. Now it's time for me to upstream those changes.

I need to work with someone to look over the changes (mostly in Perl, with which I am not fluent) and someone at least familiar with git, to walk me through this process. It is too important for me to mess this up.

Of course, help with Bugzilla in general would be greatly appreciated. There are more bugs filed against our instance of Bugzilla itself than I will ever be able to handle.

The overall wiki page is:

https://wiki.freebsd.org/Bugzilla

but I can point interested people into the gory details on the subpages.

Yes, this would involve a non-zero time commitment.

TIA.

mcl From nobody Wed Jul 17 08:31:26 2024 X-Original-To: freebsd-hackers@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4WP8L412W6z5QrL1 for ; Wed, 17 Jul 2024 08:31:40 +0000 (UTC) (envelope-from theraven@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4WP8L3695Gz4NHp; Wed, 17 Jul 2024 08:31:39 +0000 (UTC) (envelope-from theraven@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1721205099; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pFZDwUX51MadY6EzqtXM7v7tcF0ywvYOWWrDzT/Sl0o=; b=bmgC4nBYmo3jJ6gBCR5qjDP5yYZ/AJC/gUbTEKv967UZlcXJf1B4NzMWDAQA46xm+Gd+pQ o/fN593OFh9AMkWLQllW6FbX2qE09kR8OLwbBAX5XaFkeNcg5uH1pfL9YtaWEmazd/WqSh TvfTUH0yNkK/8Xty1KFCZtbUv5/j/ko5YLsFkRW19BprSxqpkrZXK9bkigBNLNuNQX1KMy 5J7lmEp28r7HT9UNFzMwqgteD91OI6bJKH4flmewzvtLKDlrvP2mLfpQoprPDH9V+Mw+od 6ZAVhLH/FNRZRXSPiW+PDJ/fIaAXAd0HR7lgTlvNERxnd2y9uDY/ToNu4CPO4A== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1721205099; a=rsa-sha256; cv=none; b=tsCEq36UkpLRO0a2/lAOch60bPlXM1fjifeQlWiGHrHJ908ZMou9caYyv7/hom/DAmDG9S sKYux6p1bhxSHAtPHaD7/z3BmwRe2uQYdnpkvkYoaYAUnGZrIm+V1TFNt3ekx0Q9bUF6ch cldMVuThWtClNjaf5IXPw89FSxNCs5GwskADeLMomXI9lfZ/HJWswpgsNijHK7ChdhN9oT E2S7Sl7JNNOLY5W+goPzXco+1Dv+YfJ6JtKxTlaXrpZbds3gwQcOV2zQfITlwAO/QmnTWD daEgJwTxlXTRmA6IvNyFFm/abKJQsoUL4u+QRT/v1HgXPwFN/tbxLpq6KcGoOQ== ARC-Authentication-Results: i=1; mx1.freebsd.org; 
none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1721205099; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pFZDwUX51MadY6EzqtXM7v7tcF0ywvYOWWrDzT/Sl0o=; b=Q0OWl9WgHHxK7nZt3IT0nafzYCc9OHc2MXvKh4zOqkVkb7ZccuShc8BsJwVzn/S1W2TWA6 ENZD7B99do1tIdQ8skmcfKAqFY+hUfE9S4rZlU/S0HbPKsFrfjDcByFaKDfaRgl8MyZW0k TGMRuAmDcmRnSe+SbOa0My9nIIcLeUEgj57yHP+kFuSZBn0d9NFwVhYmD4YwaSNw6teBrg z3PpfdYWiH8ap6U0b/0B6GRDTG1txoWUUaEUBivlJuUTwpjg5Ey9L/bN1PqQo7Dg+hXK2s Tu4TS4DB/yzqiOkBqYeUA7mc3StEJUbVCZAY1546+tAf/oMPKZLSUM7xDa5awg== Received: from smtp.theravensnest.org (smtp.theravensnest.org [45.77.103.195]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: theraven) by smtp.freebsd.org (Postfix) with ESMTPSA id 4WP8L35YzHz1CbL; Wed, 17 Jul 2024 08:31:39 +0000 (UTC) (envelope-from theraven@freebsd.org) Received: from smtpclient.apple (host86-138-165-11.range86-138.btcentralplus.com [86.138.165.11]) by smtp.theravensnest.org (Postfix) with ESMTPSA id 361E0450E; Wed, 17 Jul 2024 09:31:39 +0100 (BST) Content-Type: multipart/alternative; boundary=Apple-Mail-768DFAE2-A24A-4D17-AE5A-3111C6A48B63 Content-Transfer-Encoding: 7bit From: David Chisnall List-Id: Technical discussions relating to FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-hackers List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-hackers@FreeBSD.org Mime-Version: 1.0 (1.0) Subject: Re: Is anyone working on VirtFS (FUSE over VirtIO) Date: Wed, 17 Jul 2024 09:31:26 +0100 Message-Id: <9F249E56-4053-45A3-96FC-179C01AFB084@freebsd.org> References: Cc: Warner Losh , Alan Somers , FreeBSD Hackers In-Reply-To: To: Emil 
Tsalapatis X-Mailer: iPad Mail (21F90) --Apple-Mail-768DFAE2-A24A-4D17-AE5A-3111C6A48B63 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable

> On 16 Jul 2024, at 21:20, Emil Tsalapatis <freebsd-lists@etsalapatis.com> wrote:
>
> After going over the Linux code, I think adding direct mapping doesn't require any changes outside of FUSE and virtio code. Direct mapping mainly requires code to manage the virtiofs device's memory region in the driver. This is a shared memory region between guest and host with which the driver backs FUSE inodes. The driver then includes an allocator used to map parts of an inode into the region.

That’s how I understood the spec too.

> It should be possible to pass host-guest shared pages to ARC, with the caveat that the virtiofs driver should be able to reclaim them at any time. Does the code currently allow this? Virtiofs needs this because it maps region pages to inodes, and must reuse cold region pages during an allocation if there aren't any available. Basically, the region is a separate pool of device pages that's managed directly by virtiofs.

I am not overly familiar with the buffer cache code, but I believe the code that was added to support ARC had similar requirements. The first ZFS port had pages in ARC and then exactly the same data in the buffer cache. The buffer cache was extended with a notion of pages that it didn’t own so that it could just use the pages in ARC directly.

I don’t remember if there’s existing support for ARC to remove those pages from the buffer cache. They are both kernel pages so it would be possible to just treat removing them from ARC as an accounting operation. There is, I believe, support for the pager to remove arbitrary pages and so it might be simple to just add a new kind of pager for these pages (which just tells the host to flush the pages).

>> If I understand the protocol correctly, the DAX mode is the same as the direct mmap mode in FUSE (not sure if FreeBSD’s kernel fuse bits support this?).
>
> Yeah, virtiofs DAX seems like it's similar to FUSE direct mmap, but with FUSE inodes being backed by the shared region instead. I don't think FreeBSD has direct mmap but I may be wrong there.

It would be a nice feature to have if not!

David

--Apple-Mail-768DFAE2-A24A-4D17-AE5A-3111C6A48B63 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: quoted-printable


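The allocator discussed above (mapping parts of an inode into the shared region, and reusing cold region pages when none are free) can be sketched roughly as follows. This is a minimal sketch, assuming fixed-size chunks and least-recently-used reclaim; the names dax_window, dax_slot, dax_map and DAX_SLOTS are hypothetical and are not taken from the Linux or FreeBSD code, and the host-side map/unmap requests are only indicated by comments.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: number of fixed-size chunks in the shared region. */
#define DAX_SLOTS 4

struct dax_slot {
	uint64_t inode;     /* owning inode, valid only when in_use != 0 */
	uint64_t chunk;     /* chunk index within that inode */
	uint64_t last_use;  /* logical timestamp used for LRU reclaim */
	int in_use;
};

struct dax_window {
	struct dax_slot slots[DAX_SLOTS];
	uint64_t clock;     /* logical clock, bumped on every lookup */
	unsigned reclaims;  /* how many times a cold slot was reused */
};

static void
dax_init(struct dax_window *w)
{
	memset(w, 0, sizeof(*w));
}

/*
 * Return the region slot backing (inode, chunk), mapping it first if
 * needed. If the region is full, the least recently used slot is reused.
 */
static int
dax_map(struct dax_window *w, uint64_t inode, uint64_t chunk)
{
	int i, free_slot = -1, victim = 0;

	assert(w != NULL);
	w->clock++;
	for (i = 0; i < DAX_SLOTS; i++) {
		struct dax_slot *s = &w->slots[i];

		if (!s->in_use) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (s->inode == inode && s->chunk == chunk) {
			s->last_use = w->clock;  /* already mapped: refresh */
			return (i);
		}
		if (s->last_use < w->slots[victim].last_use)
			victim = i;              /* coldest slot seen so far */
	}
	if (free_slot < 0) {
		/*
		 * Region is full. A real driver would first invalidate guest
		 * mappings of the victim and ask the host to unmap it.
		 */
		free_slot = victim;
		w->reclaims++;
	}
	/* A real driver would ask the host to map the file range here. */
	w->slots[free_slot] = (struct dax_slot){ inode, chunk, w->clock, 1 };
	return (free_slot);
}
```

With four slots, mapping four distinct chunks fills the region, touching the first one again keeps it warm, and a fifth mapping then reuses the coldest slot rather than failing. The reclaim step is the part that would have to cooperate with the buffer cache or pager, since the driver must be able to take a page back at any time (the Linux driver appears to use FUSE_SETUPMAPPING and FUSE_REMOVEMAPPING requests for the host side of this).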
= --Apple-Mail-768DFAE2-A24A-4D17-AE5A-3111C6A48B63-- From nobody Fri Jul 19 16:08:54 2024 X-Original-To: freebsd-hackers@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4WQZP75JB5z5RtR6 for ; Fri, 19 Jul 2024 16:09:15 +0000 (UTC) (envelope-from ararslan@comcast.net) Received: from resqmta-h2p-567062.sys.comcast.net (resqmta-h2p-567062.sys.comcast.net [IPv6:2001:558:fd02:2446::a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4WQZP63sj8z52Xc for ; Fri, 19 Jul 2024 16:09:14 +0000 (UTC) (envelope-from ararslan@comcast.net) Authentication-Results: mx1.freebsd.org; dkim=pass header.d=comcast.net header.s=20190202a header.b=0W5y4cOu; dmarc=pass (policy=quarantine) header.from=comcast.net; spf=pass (mx1.freebsd.org: domain of ararslan@comcast.net designates 2001:558:fd02:2446::a as permitted sender) smtp.mailfrom=ararslan@comcast.net Received: from resomta-h2p-540626.sys.comcast.net ([96.102.179.210]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 256/256 bits) (Client did not present a certificate) by resqmta-h2p-567062.sys.comcast.net with ESMTPS id UmqJsm6kjYAjuUqAMsELkZ; Fri, 19 Jul 2024 16:09:06 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=20190202a; t=1721405346; bh=WI6IJnWQbylSIKVaftEkQG5zaXIqfJoseu/Wobk0bJ8=; h=Received:Received:Content-Type:Mime-Version:Subject:From:Date: Message-Id:To:Xfinity-Spam-Result; b=0W5y4cOuT00Rz8i5441B+sqJpb8A4Z2v964Xw4A4D5tnMnMCREXgQftqYEloOCGP6 X3gGutZZrfSKpHyggITQJIV0DnJfAm9IMyZkbeoQe1x7tHJW56rA17ISh2Tx5dZXc1 qS2COvifjfuzmDI+lIhMX6gq2xveVQjnweOspw0dDxMtIpPUcCX1Odse8gTMbCuAvt D9QnhWFi3qyjO+B7amDvu8NAgoJFCArzmUOsGdTqbfbTDOXkq7LjQh19pRKdO/ivY8 Fo4ErBsKzRQiEQHj+fXtB/xBWj13Q5Pdyawpu8iIcbGx4ub4rHypDlrDLUWDin+GhN uDD9VMoGwikew== 
Received: from smtpclient.apple ([67.160.29.205]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 256/256 bits) (Client did not present a certificate) by resomta-h2p-540626.sys.comcast.net with ESMTPSA id UqAKs2H5ZGEJIUqALsOXnu; Fri, 19 Jul 2024 16:09:06 +0000 Content-Type: text/plain; charset=us-ascii List-Id: Technical discussions relating to FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-hackers List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-hackers@FreeBSD.org Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.600.62\)) Subject: Re: Diagnosing virtual machine network issues From: Alex Arslan In-Reply-To: <4a5a177a-5356-453c-8a09-f1d63d5d2e16@sentex.net> Date: Fri, 19 Jul 2024 09:08:54 -0700 Cc: freebsd-hackers@freebsd.org Content-Transfer-Encoding: quoted-printable Message-Id: <4AB1C33B-DD93-4484-B63A-9FF8FE612B15@comcast.net> References: <4a5a177a-5356-453c-8a09-f1d63d5d2e16@sentex.net> To: mike tancsa X-Mailer: Apple Mail (2.3774.600.62) X-CMAE-Envelope: MS4xfOVb3c66GXFXJoKbc5vj3GoDRxoMFVffEV2w8LvC0xx/eeC++nhn9d5OCer7gXoUlfn1hJFtW41pZPBbA6xD13Nm7JM+llS+9zICRQywIGFaZhbePuVS CCBuruz3Xzy4HQQsI7Q18Nb+i+nPG0udM8Vh0KT+TFC9jFFgVD+V9ms+5l8T4Kcs190k7Hwps7mijzARiECS87Oj9kD/tbJG3AuX3eXSfcujUQKRYW8qBn7B X-Spamd-Bar: / X-Spamd-Result: default: False [-1.00 / 15.00]; HFILTER_HELO_5(3.00)[resqmta-h2p-567062.sys.comcast.net]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_SHORT(-1.00)[-0.998]; DMARC_POLICY_ALLOW(-0.50)[comcast.net,quarantine]; R_SPF_ALLOW(-0.20)[+ip6:2001:558:fd02:2446::/64]; R_DKIM_ALLOW(-0.20)[comcast.net:s=20190202a]; MIME_GOOD(-0.10)[text/plain]; FROM_HAS_DN(0.00)[]; ARC_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; RCPT_COUNT_TWO(0.00)[2]; FREEMAIL_FROM(0.00)[comcast.net]; TO_DN_SOME(0.00)[]; RCVD_TLS_ALL(0.00)[]; ASN(0.00)[asn:7922, ipnet:2001:558::/29, country:US]; FREEMAIL_ENVFROM(0.00)[comcast.net]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; 
DKIM_TRACE(0.00)[comcast.net:+]; MLMMJ_DEST(0.00)[freebsd-hackers@freebsd.org]; APPLE_MAILER_COMMON(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; DWL_DNSWL_NONE(0.00)[comcast.net:dkim] X-Rspamd-Queue-Id: 4WQZP63sj8z52Xc

> I would start a pcap inside and outside of the VM for all udp port 53 traffic as a start to see if its a network issue going out of the box. If it happens frequently and you think it might be the network, perhaps try with the Intel em driver instead of the virtio network driver ?

Thanks so much for your help! The way I implemented your pcap suggestion was to use tcpdump, hopefully that's correct. I ran tcpdump simultaneously on the host and VM then ran the code where libcurl gives a timeout rather than the expected domain resolution failure. The output is below. I'm pretty well outside of my depth here; what is it I'm looking for that would be indicative of a network issue going out of the VM?

Linux host:

$ sudo /usr/sbin/tcpdump -v -i any 'host 192.168.122.35 and port 53'
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
21:06:03.320754 IP (tos 0x0, ttl 64, id 29048, offset 0, flags [none], proto UDP (17), length 60)
    192.168.122.35.24119 > amdci6.domain: 23532+ A? domain.invalid. (32)
21:06:03.320754 IP (tos 0x0, ttl 64, id 29048, offset 0, flags [none], proto UDP (17), length 60)
    192.168.122.35.24119 > amdci6.domain: 23532+ A? domain.invalid. (32)
21:06:03.321633 IP (tos 0x0, ttl 64, id 27798, offset 0, flags [none], proto UDP (17), length 73)
    192.168.122.35.18137 > amdci6.domain: 61699+ PTR? 35.122.168.192.in-addr.arpa. (45)
21:06:03.321633 IP (tos 0x0, ttl 64, id 27798, offset 0, flags [none], proto UDP (17), length 73)
    192.168.122.35.18137 > amdci6.domain: 61699+ PTR? 35.122.168.192.in-addr.arpa. (45)
21:06:03.321701 IP (tos 0x0, ttl 64, id 44762, offset 0, flags [DF], proto UDP (17), length 113)
    amdci6.domain > 192.168.122.35.18137: 61699* 1/0/0 35.122.168.192.in-addr.arpa. PTR freebsd-debugging-amdci6-0. (85)
21:06:03.321707 IP (tos 0x0, ttl 64, id 44762, offset 0, flags [DF], proto UDP (17), length 113)
    amdci6.domain > 192.168.122.35.18137: 61699* 1/0/0 35.122.168.192.in-addr.arpa. PTR freebsd-debugging-amdci6-0. (85)
21:06:03.322188 IP (tos 0x0, ttl 64, id 27799, offset 0, flags [none], proto UDP (17), length 72)
    192.168.122.35.37631 > amdci6.domain: 23871+ PTR? 1.122.168.192.in-addr.arpa. (44)
21:06:03.322188 IP (tos 0x0, ttl 64, id 27799, offset 0, flags [none], proto UDP (17), length 72)
    192.168.122.35.37631 > amdci6.domain: 23871+ PTR? 1.122.168.192.in-addr.arpa. (44)
21:06:08.446737 IP (tos 0x0, ttl 64, id 29049, offset 0, flags [none], proto UDP (17), length 60)
    192.168.122.35.24119 > amdci6.domain: 23532+ A? domain.invalid. (32)
21:06:08.446737 IP (tos 0x0, ttl 64, id 29049, offset 0, flags [none], proto UDP (17), length 60)
    192.168.122.35.24119 > amdci6.domain: 23532+ A? domain.invalid. (32)
21:06:18.567376 IP (tos 0x0, ttl 64, id 29050, offset 0, flags [none], proto UDP (17), length 60)
    192.168.122.35.37009 > amdci6.domain: 36459+ AAAA? domain.invalid. (32)
21:06:18.567376 IP (tos 0x0, ttl 64, id 29050, offset 0, flags [none], proto UDP (17), length 60)
    192.168.122.35.37009 > amdci6.domain: 36459+ AAAA? domain.invalid. (32)
21:06:23.671046 IP (tos 0x0, ttl 64, id 29051, offset 0, flags [none], proto UDP (17), length 60)
    192.168.122.35.37009 > amdci6.domain: 36459+ AAAA? domain.invalid. (32)
21:06:23.671046 IP (tos 0x0, ttl 64, id 29051, offset 0, flags [none], proto UDP (17), length 60)
    192.168.122.35.37009 > amdci6.domain: 36459+ AAAA? domain.invalid. (32)
^C
14 packets captured
20 packets received by filter
2 packets dropped by kernel

FreeBSD VM:

$ sudo tcpdump -v port 53
tcpdump: listening on vtnet0, link-type EN10MB (Ethernet), capture size 262144 bytes
21:06:06.179751 IP (tos 0x0, ttl 64, id 29048, offset 0, flags [none], proto UDP (17), length 60)
    freebsd-debugging-amdci6-0.24119 > amdci6.domain: 23532+ A? domain.invalid. (32)
21:06:06.180634 IP (tos 0x0, ttl 64, id 27798, offset 0, flags [none], proto UDP (17), length 73)
    freebsd-debugging-amdci6-0.18137 > amdci6.domain: 61699+ PTR? 35.122.168.192.in-addr.arpa. (45)
21:06:06.180826 IP (tos 0x0, ttl 64, id 44762, offset 0, flags [DF], proto UDP (17), length 113)
    amdci6.domain > freebsd-debugging-amdci6-0.18137: 61699* 1/0/0 35.122.168.192.in-addr.arpa. PTR freebsd-debugging-amdci6-0. (85)
21:06:06.181193 IP (tos 0x0, ttl 64, id 27799, offset 0, flags [none], proto UDP (17), length 72)
    freebsd-debugging-amdci6-0.37631 > amdci6.domain: 23871+ PTR? 1.122.168.192.in-addr.arpa. (44)
21:06:06.194107 IP (tos 0x0, ttl 64, id 44764, offset 0, flags [DF], proto UDP (17), length 118)
    amdci6.domain > freebsd-debugging-amdci6-0.37631: 23871 2/0/0 1.122.168.192.in-addr.arpa. PTR amdci6., 1.122.168.192.in-addr.arpa. PTR amdci6.local. (90)
21:06:11.305743 IP (tos 0x0, ttl 64, id 29049, offset 0, flags [none], proto UDP (17), length 60)
    freebsd-debugging-amdci6-0.24119 > amdci6.domain: 23532+ A? domain.invalid. (32)
21:06:21.426439 IP (tos 0x0, ttl 64, id 29050, offset 0, flags [none], proto UDP (17), length 60)
    freebsd-debugging-amdci6-0.37009 > amdci6.domain: 36459+ AAAA? domain.invalid. (32)
21:06:26.530138 IP (tos 0x0, ttl 64, id 29051, offset 0, flags [none], proto UDP (17), length 60)
    freebsd-debugging-amdci6-0.37009 > amdci6.domain: 36459+ AAAA? domain.invalid. (32)
^C
8 packets captured
427 packets received by filter
0 packets dropped by kernel

From nobody Fri Jul 19 18:51:26 2024 X-Original-To: freebsd-hackers@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4WQf0f0nDhz5S6mZ for ; Fri, 19 Jul 2024 18:51:46 +0000 (UTC) (envelope-from kim@westryn.net) Received: from mail.westryn.net (mail.westryn.net [199.48.135.251]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4WQf0c4hJ2z47n2 for ; Fri, 19 Jul 2024 18:51:44 +0000 (UTC) (envelope-from kim@westryn.net) Authentication-Results: mx1.freebsd.org; dkim=pass header.d=westryn.net header.s=westrynnet header.b=m4AcWf0o; dmarc=pass (policy=none) header.from=westryn.net; spf=pass (mx1.freebsd.org: domain of kim@westryn.net designates 199.48.135.251 as permitted sender) smtp.mailfrom=kim@westryn.net Received: from smtpclient.apple (225x169.ouraynet.com [204.16.225.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.westryn.net (Postfix) with ESMTPSA id F0F7A10009F2 for ; Fri, 19 Jul 2024 12:51:37 -0600 (MDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=westryn.net; s=westrynnet; t=1721415098; bh=g6GFWIM+EWsAUykNAvl8t1DsfOYaz4APv/9VlC+ug0g=; h=From:Subject:Date:To; b=m4AcWf0oC20cnjR+kysJ5mY7lYo3/eMu9/3Ad4t5RKlHeqwjaNW1fhXn1IJm6O/JG x2sI3nAnKKEjOxgWRzC2SvwODb28oxKL2V5PRBvCnKfPzd3wFg4XaKJRIMIu0LxD2j rwJgaUwm56FnmCxQgiYg/fB1nT4JIflwwCC33AQpIJ4N8oXrJIrIDvXCK0YX5060E6 c0eX9WMjfu1lN1L6+H44ZwgJcr4wxnDeGxNUTzQK/PngNsVMwf5CO+4e64RfEYRdNm QZDEo8YHuleZJ7PNaCbX1I0sEz97H+IP9FggCH6GgsI0LwVvl0eRKM59ySc/Idy/9H I6skU7Yc9VdYg== From: Kim Shrier Content-Type: multipart/mixed; boundary="Apple-Mail=_4EA1EF40-4F62-4613-8385-DC0CACC9B62D" List-Id:
Technical discussions relating to FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-hackers List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-hackers@FreeBSD.org Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.600.62\)) Subject: MFC D45651 to stable/14 Message-Id: Date: Fri, 19 Jul 2024 12:51:26 -0600 To: FreeBSD Hackers X-Mailer: Apple Mail (2.3774.600.62) X-Spamd-Bar: --- X-Spamd-Result: default: False [-3.97 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.97)[-0.969]; DMARC_POLICY_ALLOW(-0.50)[westryn.net,none]; R_SPF_ALLOW(-0.20)[+ip4:199.48.135.251]; R_DKIM_ALLOW(-0.20)[westryn.net:s=westrynnet]; MIME_GOOD(-0.10)[multipart/mixed,text/plain]; RCVD_TLS_ALL(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DKIM_TRACE(0.00)[westryn.net:+]; HAS_ATTACHMENT(0.00)[]; RCPT_COUNT_ONE(0.00)[1]; TO_DN_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; ASN(0.00)[asn:36236, ipnet:199.48.132.0/22, country:US]; ARC_NA(0.00)[]; PREVIOUSLY_DELIVERED(0.00)[freebsd-hackers@freebsd.org]; FROM_EQ_ENVFROM(0.00)[]; RCVD_COUNT_ONE(0.00)[1]; MLMMJ_DEST(0.00)[freebsd-hackers@freebsd.org]; APPLE_MAILER_COMMON(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; MIME_TRACE(0.00)[0:+,1:+,2:~,3:+,4:~,5:+] X-Rspamd-Queue-Id: 4WQf0c4hJ2z47n2 --Apple-Mail=_4EA1EF40-4F62-4613-8385-DC0CACC9B62D Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8

I’m unsure of the proper channels for this request.

I would like https://reviews.freebsd.org/D45651 to be MFC’ed to stable/14 and if possible releng/14.1.

I have attached patch files to accomplish these updates.

--Apple-Mail=_4EA1EF40-4F62-4613-8385-DC0CACC9B62D Content-Disposition: attachment; filename=stable_14_tcc_symver.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="stable_14_tcc_symver.patch" Content-Transfer-Encoding: 7bit

diff --git a/include/stdlib.h b/include/stdlib.h
index 67a1cb82761d..ef0e196744cd 100644
--- a/include/stdlib.h
+++ b/include/stdlib.h
@@ -339,23 +339,26 @@ __uint64_t
  * parameter, and both are different from the ones expected by the historical
  * FreeBSD qsort_r() interface.
  *
- * Apply a workaround where we explicitly link against the historical
- * interface, qsort_r@FBSD_1.0, in case when qsort_r() is called with
- * the last parameter with a function pointer that exactly matches the
- * historical FreeBSD qsort_r() comparator signature, so applications
- * written for the historical interface can continue to work without
- * modification.
+ * Apply a workaround where we explicitly link against the historical interface,
+ * qsort_r@FBSD_1.0, in case when qsort_r() is called with the last parameter
+ * with a function pointer that exactly matches the historical FreeBSD qsort_r()
+ * comparator signature, so applications written for the historical interface
+ * can continue to work without modification. Toolchains that don't support
+ * symbol versioning don't define __sym_compat, so only provide this symbol in
+ * supported environments.
  */
+#ifdef __sym_compat
 #if defined(__generic) || defined(__cplusplus)
 void __qsort_r_compat(void *, size_t, size_t, void *,
     int (*)(void *, const void *, const void *));
 __sym_compat(qsort_r, __qsort_r_compat, FBSD_1.0);
 #endif
+#endif
 #if defined(__generic) && !defined(__cplusplus)
 #define qsort_r(base, nel, width, arg4, arg5) \
     __generic(arg5, int (*)(void *, const void *, const void *), \
         __qsort_r_compat, qsort_r)(base, nel, width, arg4, arg5)
-#elif defined(__cplusplus)
+#elif defined(__cplusplus) && defined(__sym_compat)
 __END_DECLS
 extern "C++" {
 static inline void qsort_r(void *base, size_t nmemb, size_t size,
diff --git a/sys/sys/cdefs.h b/sys/sys/cdefs.h
index 19b7d8fe427d..38be15666231 100644
--- a/sys/sys/cdefs.h
+++ b/sys/sys/cdefs.h
@@ -112,6 +112,8 @@
 #define __CC_SUPPORTS___INLINE 1
 #define __CC_SUPPORTS___INLINE__ 1
 
+#define __CC_SUPPORTS_SYMVER 1
+
 #define __CC_SUPPORTS___FUNC__ 1
 #define __CC_SUPPORTS_WARNING 1
 
@@ -121,6 +123,14 @@
 
 #endif /* __GNUC__ */
 
+/*
+ * TinyC pretends to be gcc 9.3. This is generally good enough to support
+ * everything FreeBSD... except for the .symver assembler directive.
+ */
+#ifdef __TINYC__
+#undef __CC_SUPPORTS_SYMVER
+#endif
+
 /*
  * Macro to test if we're using a specific version of gcc or later.
  */
@@ -540,10 +550,12 @@
 	__asm__(".section .gnu.warning." #sym); \
 	__asm__(".asciz \"" msg "\""); \
 	__asm__(".previous")
+#ifdef __CC_SUPPORTS_SYMVER
 #define __sym_compat(sym,impl,verid) \
 	__asm__(".symver " #impl ", " #sym "@" #verid)
 #define __sym_default(sym,impl,verid) \
 	__asm__(".symver " #impl ", " #sym "@@@" #verid)
+#endif /* __CC_SUPPORTS_SYMVER */
 #else
 #define __weak_reference(sym,alias) \
 	__asm__(".weak alias"); \
@@ -552,10 +564,12 @@
 	__asm__(".section .gnu.warning.sym"); \
 	__asm__(".asciz \"msg\""); \
 	__asm__(".previous")
+#ifdef __CC_SUPPORTS_SYMVER
 #define __sym_compat(sym,impl,verid) \
 	__asm__(".symver impl, sym@verid")
 #define __sym_default(impl,sym,verid) \
 	__asm__(".symver impl, sym@@@verid")
+#endif /* __CC_SUPPORTS_SYMVER */
 #endif /* __STDC__ */
 #endif /* __GNUC__ */
 

--Apple-Mail=_4EA1EF40-4F62-4613-8385-DC0CACC9B62D Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii

--Apple-Mail=_4EA1EF40-4F62-4613-8385-DC0CACC9B62D Content-Disposition: attachment; filename=releng_14.1_tcc_symver.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="releng_14.1_tcc_symver.patch" Content-Transfer-Encoding: 7bit

diff --git a/include/stdlib.h b/include/stdlib.h
index 67a1cb82761d..ef0e196744cd 100644
--- a/include/stdlib.h
+++ b/include/stdlib.h
@@ -339,23 +339,26 @@ __uint64_t
  * parameter, and both are different from the ones expected by the historical
  * FreeBSD qsort_r() interface.
  *
- * Apply a workaround where we explicitly link against the historical
- * interface, qsort_r@FBSD_1.0, in case when qsort_r() is called with
- * the last parameter with a function pointer that exactly matches the
- * historical FreeBSD qsort_r() comparator signature, so applications
- * written for the historical interface can continue to work without
- * modification.
+ * Apply a workaround where we explicitly link against the historical interface,
+ * qsort_r@FBSD_1.0, in case when qsort_r() is called with the last parameter
+ * with a function pointer that exactly matches the historical FreeBSD qsort_r()
+ * comparator signature, so applications written for the historical interface
+ * can continue to work without modification. Toolchains that don't support
+ * symbol versioning don't define __sym_compat, so only provide this symbol in
+ * supported environments.
  */
+#ifdef __sym_compat
 #if defined(__generic) || defined(__cplusplus)
 void __qsort_r_compat(void *, size_t, size_t, void *,
     int (*)(void *, const void *, const void *));
 __sym_compat(qsort_r, __qsort_r_compat, FBSD_1.0);
 #endif
+#endif
 #if defined(__generic) && !defined(__cplusplus)
 #define qsort_r(base, nel, width, arg4, arg5) \
     __generic(arg5, int (*)(void *, const void *, const void *), \
         __qsort_r_compat, qsort_r)(base, nel, width, arg4, arg5)
-#elif defined(__cplusplus)
+#elif defined(__cplusplus) && defined(__sym_compat)
 __END_DECLS
 extern "C++" {
 static inline void qsort_r(void *base, size_t nmemb, size_t size,
diff --git a/sys/sys/cdefs.h b/sys/sys/cdefs.h
index 4893ae1662b8..2872f1d5d554 100644
--- a/sys/sys/cdefs.h
+++ b/sys/sys/cdefs.h
@@ -112,6 +112,8 @@
 #define __CC_SUPPORTS___INLINE 1
 #define __CC_SUPPORTS___INLINE__ 1
 
+#define __CC_SUPPORTS_SYMVER 1
+
 #define __CC_SUPPORTS___FUNC__ 1
 #define __CC_SUPPORTS_WARNING 1
 
@@ -121,6 +123,14 @@
 
 #endif /* __GNUC__ */
 
+/*
+ * TinyC pretends to be gcc 9.3. This is generally good enough to support
+ * everything FreeBSD... except for the .symver assembler directive.
+ */
+#ifdef __TINYC__
+#undef __CC_SUPPORTS_SYMVER
+#endif
+
 /*
  * Macro to test if we're using a specific version of gcc or later.
  */
@@ -539,10 +549,12 @@
 	__asm__(".section .gnu.warning." #sym); \
 	__asm__(".asciz \"" msg "\""); \
 	__asm__(".previous")
+#ifdef __CC_SUPPORTS_SYMVER
 #define __sym_compat(sym,impl,verid) \
 	__asm__(".symver " #impl ", " #sym "@" #verid)
 #define __sym_default(sym,impl,verid) \
 	__asm__(".symver " #impl ", " #sym "@@@" #verid)
+#endif /* __CC_SUPPORTS_SYMVER */
 #else
 #define __weak_reference(sym,alias) \
 	__asm__(".weak alias"); \
@@ -551,10 +563,12 @@
 	__asm__(".section .gnu.warning.sym"); \
 	__asm__(".asciz \"msg\""); \
 	__asm__(".previous")
+#ifdef __CC_SUPPORTS_SYMVER
 #define __sym_compat(sym,impl,verid) \
 	__asm__(".symver impl, sym@verid")
 #define __sym_default(impl,sym,verid) \
 	__asm__(".symver impl, sym@@@verid")
+#endif /* __CC_SUPPORTS_SYMVER */
 #endif /* __STDC__ */
 #endif /* __GNUC__ */
 

--Apple-Mail=_4EA1EF40-4F62-4613-8385-DC0CACC9B62D Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii

Kim
_
C++ is an off-by-one error

--Apple-Mail=_4EA1EF40-4F62-4613-8385-DC0CACC9B62D-- From nobody Fri Jul 19 19:38:04 2024 X-Original-To: hackers@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4WQg2555qMz5QDWr for ; Fri, 19 Jul 2024 19:38:05 +0000 (UTC) (envelope-from jrm@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4WQg254c2gz4Hmf for ; Fri, 19 Jul 2024 19:38:05 +0000 (UTC) (envelope-from jrm@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1721417885; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type; bh=cd/l851VzHRMHRmOnntIT1KEAdYaXc89sbFLTt7VRIg=;
b=QgGIDd43hLhM+P/5AgC/xGh0Rbk1PvJYvFBjF/9LAY+dy2CrRaiTXSngJzrC/alyd9Ip0t yLhhYWNG9By8yph5zBnJKn9TqBeMjpRM6I3AbH1jKs3IzH8p6hlcoOzr8uKvSNtX1+yibN fkAJgYo4ciLj5zslv9nHDRl+mT0SCI+G3D0FijiOS+QZMJ1xAKM+EZjZhTZKbuo4u0CGuy l369X0KxMxFYhk0jeaQUaagDLmGZV7k5yPYVS/eGJgxCy2yZifQdooICMCbplIrFEakAOH kTGCrBOefXBpA36D7l03Zl+HIh3UJXtkewzIu5I/wO5qqKmx2VeBI+AYYKgUXg== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1721417885; a=rsa-sha256; cv=none; b=yrJs2TNKV9JXIRX74NB+M5Y+A7MLoHSN5xFv+g1m1bRHj9EB7qPa1TlyUQH70b6jJjDUB5 ZtRb0NeHJQzmshFAJvILuFrLV/6AQ6fPqHQDlCv4jJbmtRHl6Dm3qW7Xtsfs7znJQ8ogKI FfhPnD8P1ZlhHQx4a4imQBCeS9hPJVczDm72Z028fLaryQ8TkdlXxAasSjkh0zHPInEE40 ciIwo0rxcZ8zZMjRQsFNZ8V2gb4Bnaa61WAu7v+20TED7l7kT0qFgB4iEUWUuUDAcSuDa8 JFxxwIBJRjoxHjsfI+OlnoiR26d0JaM2AoDuq9/cnSC9wDbkkh2ss956MQ5+2w== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1721417885; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type; bh=cd/l851VzHRMHRmOnntIT1KEAdYaXc89sbFLTt7VRIg=; b=M8IAQo8Q6ZbCd8k+sXoJ1uUy4QmmJpQMYDHleFer/m7emJU135QIUww5JuD5eoWiR6dFib EysEbOJLT07mgzoxOX76oRUvxc58ozn3WTSJx9dnJdrghERs0Ey1IX87y/B6xarmseUwvR oRbwmqtzL3//ZrR1spZFYxVEq+HXulkzTfJ1oOqImd+XIsM+Aag8WR0j5lZnbguH7S2v2i CRX0rZq8RKEz5OdBs2lF2AP9jqE2geyhO29YZViIOQn1vRCxCS1r4kpf74HkXP02EM51/1 IN7Y2K3AI1/00rU2vV86bzO/g3UlG9Zvs0/0zyLbM9ggswRyeLxVLaLXrsLM2w== Received: from phe.ftfl.ca.ftfl.ca (drmons0544w-207-231-233-42.dhcp-dynamic.fibreop.ns.bellaliant.net [207.231.233.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) (Authenticated sender: jrm/mail) by smtp.freebsd.org (Postfix) with ESMTPSA id 4WQg252rR3z1S0M for ; Fri, 19 Jul 2024 19:38:05 +0000 (UTC) (envelope-from jrm@freebsd.org) From: Joseph Mingrone To: hackers@FreeBSD.org Subject: USB4/TBT3 support Date: Fri, 19 Jul 2024 16:38:04 -0300 Message-ID: 
<864j8lmaub.fsf@phe.ftfl.ca> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/31.0.50 List-Id: Technical discussions relating to FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-hackers List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-hackers@FreeBSD.org MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature" --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable

Hello,

Is anyone working on USB4 / Thunderbolt 3 (TBT3) support?

Scott Long did a lot of work on this a few years ago, but he had to move on to other things, so he passed things on to Hans Petter Selasky. Fortunately, Hans Petter dropped the code in a public repository.

https://github.com/hselasky/usb4
https://github.com/hselasky/usb4/commit/dd85c216a2a6bee5361c7166595ba6ca461578b5

Here is an overview of what Scott shared with me.

Mostly completed work:

- Debug/Trace framework
- NHI controller driver
- PCIe bridge driver
- WMI driver
- Integrated Connection Manager handshake and authentication handling
- Router and Config Space layer handling (in progress, almost complete)

Remaining work:

- tbtconfig (userland tool)
- man pages
- DMAR/IOMMU integration, PCIe tunnelling control
- Support for resetting and firmware flashing on the NHI via out-of-band control
- Host Connection Manager
- Cross-domain login
- ThunderboltIP

Here are the details that Scott shared.

The driver originally targeted the Thunderbolt 3 controllers that were sold under the names “AlpineRidge" and “Icelake”, in the late 2010's, before the USB standards group publicly released the USB4 spec. The driver set I wrote was complete enough to activate Thunderbolt3 peripherals that otherwise would be disabled by default when plugged in.
The driver also attempted to make it easier to identify things like PCIe tunnels in the topology, but that was mostly cosmetic. Unfortunately, the AlpineRidge chips proved to be extremely hard to work with despite their wide availability, and I spent way too much time fighting them and not enough time developing more useful functionality. The WMI driver was written to work around vexing problems with the Alpine Ridge controller that I never figured out.

Much of the infrastructure from the TBT3 support extends to modern USB4 controllers, but there are still a lot of missing pieces. The NHI driver doesn't know how to probe a USB4 controller yet, but that should be easy to fix. Even more important, though, is that the code lacks a functional USB4 Connection Manager. Most of the pieces required to traverse the topology, discover routers and adapters, read and write their properties, and build routes between endpoints exist now, but there's no state machine yet that integrates those pieces together into a real Connection Manager. Without that, no attached peripherals will actually run. The TBT3 controllers like AlpineRidge and IceLake have a connection manager in firmware, so that's why those controllers function even with minimal host OS support. This isn't especially hard code to write, but it's missing nonetheless.

Once the connection manager is written, it'll need to configure connections with the USB3, PCIe, and DisplayPort devices that operate over tunnels, and it'll need a cross domain handler for connecting to another host. USB3 tunnel support might require significant changes in the USB3 stack in order to work with USB4.

Additionally, you might need to write a USB-PD driver. Without it, negotiation on USB-C connectors for power delivery advertisements, cable orientation, alt mode configuration, and USB3 vs USB4 lane assignment might not work. If those negotiations are not handled then nothing that you plug into the port will even be seen by the controller.
I'd totally stay away from spending time on supporting Falcon Ridge, Alpine Ridge, and Titan Ridge controllers. They're old, they're extremely difficult to work with, and they're not worth the headache. In fact, just ignore all TBT3 controllers, and remove the Internal Connection Manager code. The ICM module isn't code that I'm all that proud of anyways =-). Focus on writing an HCM, supporting PCIe and DP tunneling, and integrating IOMMU protections into both the NHI driver and the PCIe tunnel drivers.

One thing that would be pretty awesome is ThunderboltIP support. It turns out that these controllers are really cheap 40Gbps devices, and have the potential to perform pretty well at line rate as a replacement for traditional 40Gb ethernet controllers, at least in a point-to-point configuration.

In short, a lot of work has been done, but a lot of work remains to be done. The USB4 spec is complicated, and requires intimate knowledge of the USB-PD, USB-C, and USB3 specs. There's also a spec for writing a Host Connection Manager that you’ll need to get familiar with.

If anyone has already started or would like to continue Scott's work, could you please let me know?
Joe --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQKkBAEBCgCOFiEEVbCTpybDiFVxIrrVNqQMg7DW754FAmaawJxfFIAAAAAALgAo aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldDU1 QjA5M0E3MjZDMzg4NTU3MTIyQkFENTM2QTQwQzgzQjBENkVGOUUQHGpybUBmcmVl YnNkLm9yZwAKCRA2pAyDsNbvnh70EACih+efU1InLLmkeMkx20cGYgKM7IYB+EUm rmRABm9usV0iSPPlvXffr47Pb8z4A/Q5eesf/cGux5RhcThosJB7pU9Kzs2C1LFl eSm0R3IYtcii7S+/8EhheAladjHTNsOqqP0IXNiCbwYqS/qaUFKJ946jPX1Lqs8V gzEeENta6ZJJP8yFghnIXl3XH5BD48SvrR3GC3O4Cerb+yZXNtDEKrwGhHkoRsWe 7JnPfjnFL8JQh4Tcr67ZxW6QNQ1m8InFQFvzBt6zTFKD7IeUCX65JWN1ScL0Jmk9 Bfh7znXqHmsfSMdlAtvnbHKZDFtrJNYeWBVaxzSWlVVD5j/lgmq85Qy9pMfubMkG UxEPtmmgRSc26lZz7EjemYaysaIckydiKr6PSBKCpmPNO013U14DZsPy1m+/AZ5a SXEXecZ6NMDUnMvHwyb06Xcf/Z8xjpuBr1Uf4MTEd1oJHqbt3P3Ody7/EMmI2NNy 2r3XCqDrEEDqsqpbCFCR+/kMjjAQFroIzE1xbhWEdtFLxDl5sTKS6Almf22wVGd+ +KD+sfHPk4KrTbTfsws49vac28tSwwRTcfcwOfqZ3Lxu2fqfGzCwlHdIWLhKdxlx 6F4c89vz/Nb787ZXDvpxb38QqPUkWeOauyMwz52RsDKj4iLYZZbPjsRniS1oBYhE r1UBfHvH3A== =tIcj -----END PGP SIGNATURE----- --=-=-=-- From nobody Fri Jul 19 23:38:01 2024 X-Original-To: hackers@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4WQmM01bkBz5QbjY for ; Fri, 19 Jul 2024 23:38:04 +0000 (UTC) (envelope-from yonas.yanfa@gmail.com) Received: from mail-qk1-x72b.google.com (mail-qk1-x72b.google.com [IPv6:2607:f8b0:4864:20::72b]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "WR4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4WQmLz6y0Gz4dKZ; Fri, 19 Jul 2024 23:38:03 +0000 (UTC) (envelope-from yonas.yanfa@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-qk1-x72b.google.com with SMTP id 
af79cd13be357-79ef8e0c294so140498985a.1; Fri, 19 Jul 2024 16:38:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1721432283; x=1722037083; darn=freebsd.org; h=content-transfer-encoding:in-reply-to:from:content-language :references:to:subject:user-agent:mime-version:date:message-id:from :to:cc:subject:date:message-id:reply-to; bh=mAqxBIg/b5cjeOmoPwmgCFfwxbodKJ1lvAZ/gmoXRXM=; b=TO+kHKpTZSubhr+QvvjCKziygyUFCBaLzNsxxVHAsG0pumjJh5ih3WmDq0Na7teHbE 4mMjLOwQBFxe24cpjIEfQ9Px5g8CftA9SpS16vv7tYsdbnrAO0pMzFD85d9HM+aRlR1i Be1Kt9w1RIUK/a4iTXHd95UaNtoBZVxYzOSXcG9EITsCtur7AtBJxeswydMWKcMXy0kz f90T7lyCZVJOGHtP8ZF4d8OrOzhEOGbobwNbQEn31h/TFpMgpAGNiOJtLjiQZKrQ9arB VW2lNXOOA1B63ZiAtn4dRK2YqFebXxMl+500yj9Uvf7slIUdqw4U0vPhFzgapkoYY6F5 bmBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721432283; x=1722037083; h=content-transfer-encoding:in-reply-to:from:content-language :references:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=mAqxBIg/b5cjeOmoPwmgCFfwxbodKJ1lvAZ/gmoXRXM=; b=b9xzKR+mlTI/Ho8KwgMt0iJ/SNZCyjon6HLdAwU2hkLnok6dN/LX9/ts1UkZdvxqyk sbzswIUNgOW83LU43OCsi6nwZ1bh+gHJC5yRJK5rMk20FyerJ6pmQzsEeHriruYKG5OB +xD1SVkOZIK2zahpNzIlt2wvcoLGaxuL8qRuRRHLgW5VMRKCrEyRqiDwFMkQYDUBqLRG NMhNGCBCaL0WQxDe+b+udz4AfPGO6oSrry0YXDyHcoma+NBRPv7r0jWwENfgp29+22XW nEliGDnkenYtybmv6lBIQpEdnfCk8s6aIRw0HUB/at4lEsQj2ulD2HRgtN50SFSJe3FH S4pQ== X-Forwarded-Encrypted: i=1; AJvYcCVyLnXHOiTTtsmWwKjcOsno0LzDe+yneXTJB0c48Jjp9SVb2HiEu2oG5r6oMpwPY9vvpwU9PJvsngdoHNyTcgayHk6r X-Gm-Message-State: AOJu0YxmwcGiCF30ZkuTKFSrvwgRTmuu7LrVuQpRs9hB+wK2l6rJZcKP zz1dczbszkQG9ujGFm9ifFRMzV0mGstredouK16gU7WjB9Z+K2n66lFJcA== X-Google-Smtp-Source: AGHT+IEux2ikFnxKntsl5UMxaR/fEBvVcikEwpWN/qYxnh7sbh7/BvKZuIN22ivIMfXM03Pv0mNXyg== X-Received: by 2002:a05:620a:4415:b0:79e:f745:5445 with SMTP id af79cd13be357-7a1a1973c42mr237744485a.31.1721432282836; Fri, 19 Jul 2024 16:38:02 
-0700 (PDT) Received: from [192.168.1.61] (pool-99-250-134-22.cpe.net.cable.rogers.com. [99.250.134.22]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7a19900d413sm138370385a.70.2024.07.19.16.38.02 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 19 Jul 2024 16:38:02 -0700 (PDT) Message-ID: <028fd8a9-a7cf-4b55-b84c-a3baeb5a9270@gmail.com> Date: Fri, 19 Jul 2024 19:38:01 -0400 List-Id: Technical discussions relating to FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-hackers List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-hackers@FreeBSD.org MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: USB4/TBT3 support To: Joseph Mingrone , hackers@FreeBSD.org References: <864j8lmaub.fsf@phe.ftfl.ca> Content-Language: en-US From: Yonas Yanfa In-Reply-To: <864j8lmaub.fsf@phe.ftfl.ca> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; TAGGED_FROM(0.00)[]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US] X-Rspamd-Queue-Id: 4WQmLz6y0Gz4dKZ It would be nice to have Thunderbolt 4 support - and Thunderbolt 5 when it comes out - as well. The speeds are quite high. I imagine many people would love to have 40Gbps - 120Gbps / 80Gbps - 240Gbps on their laptop FreeBSD machine connected to their NAS. Perhaps the FreeBSD Foundation could help find sponsorship for this work. Cheers, Yonas On 2024-07-19 3:38 p.m., Joseph Mingrone wrote: > Hello, > > Is anyone working on USB4 / Thunderbolt 3 (TBT3) support? > > Scott Long did a lot of work on this a few years ago, but he had to move > on to other things, so he passed things on to Hans Petter Selasky. > Fortunately, Hans Petter dropped the code in a public repository. 
> > https://github.com/hselasky/usb4 > https://github.com/hselasky/usb4/commit/dd85c216a2a6bee5361c7166595ba6ca461578b5 > > Here is an overview of what Scott shared with me. > > Mostly completed work: > > - Debug/Trace framework > - NHI controller driver > - PCIe bridge driver > - WMI driver > - Integrated Connection Manager handshake and authentication handling > - Router and Config Space layer handling (in progress, almost complete) > > Remaining work: > > - tbtconfig (userland tool) > - man pages > - DMAR/IOMMU integration, PCIe tunnelling control > - Support for resetting and firmware flashing on the NHI via out-of-band control > - Host Connection Manager > - Cross-domain login > - ThunderboltIP > > Here are the details that Scott shared. > > The driver originally targeted the Thunderbolt 3 controllers that > were sold under the names “AlpineRidge" and “Icelake”, in the late > 2010's, before the USB standards group publicly released the USB4 > spec. The driver set I wrote was complete enough to activate > Thunderbolt3 peripherals that otherwise would be disabled by > default when plugged in. The driver also attempted to make it > easier to identify things like PCIe tunnels in the topology, but > that was mostly cosmetic. Unfortunately, the AlpineRidge chips > proved to be extremely hard to work with despite their wide > availability, and I spent way too much time fighting them and not > enough time developing more useful functionality. The WMI driver > was written to work around vexing problems with the Alpine Ridge > controller that I never figured out. > > Much of the infrastructure from the TBT3 support extends to modern > USB4 controllers, but there are still a lot of missing pieces. The > NHI driver doesn't know how to probe a USB4 controller yet, but > that should be easy to fix. Even more important, though, is that > the code lacks a functional USB4 Connection Manager. 
Most of the > pieces required to traverse the topology, discover routers and > adapters, read and write their properties, and build routes between > endpoints exists now, but there's no state machine yet that > integrates those pieces together into a real Connection Manager. > Without that, no attached peripherals will actually run. The TBT3 > controllers like AlpineRidge and IceLake have a connection manager > in firmware, so that's why those controllers function even with > minimal host OS support. This isn't especially hard code to write, > but it's missing nonetheless. > > Once the connection manager is written, it'll need to configure > connections with the USB3, PCIe, and DisplayPort devices that > operate over tunnels, and it'll need a cross domain handler for > connecting to another host. USB3 tunnel support might require > significant changes in the USB3 stack in order to work with USB4. > > Additionally you might need to write a USB-PD driver. Without it, > negotiation on USB-C connectors for power delivery advertisements, > cable orientation, alt mode configuration, and USB3 vs USB4 lane > assignment might not work. If those negotiations are not handled > then nothing that you plug into the port will even be seen by the > controller. > > I'd totally stay away from spending time on supporting Falcon > Ridge, Alpine Ridge, and Titan Ridge controllers. They're old, > they're extremely difficult to work with, and they're not worth the > headache. In fact, just ignore all TBT3 controllers, and remove > the Internal Connection Manager code. The ICM module isn't code > that I'm all that proud of anyways =-). Focus on writing an HCM, > supporting PCIe and DP tunneling, and integrating IOMMU protections > into both the NHI driver and the PCIe tunnel drivers. > > One thing that would be pretty awesome is ThunderboltIP support. 
> It turns out that these controllers are really cheap 40Gbps > devices, and have the potential to perform pretty well at line rate > as a replacement for traditional 40Gb ethernet controllers, at > least in a point-to-point configuration. > > In short, a lot of work has been done, but a lot of work remains to be > done. The USB4 spec is complicated, and requires intimate knowledge > of the USB-PD, USB-C, and USB3 specs. There's also a spec for > writing a Host Connection Manager that you’ll need to get familiar with. > > If anyone has already started or would like to continue Scott's work, > could you please let me know? > > Joe