Date:      Thu, 23 Jan 2025 14:42:54 -0800
From:      Rick Macklem <rick.macklem@gmail.com>
To:        Gleb Smirnoff <glebius@freebsd.org>
Cc:        current@freebsd.org, rmacklem@freebsd.org
Subject:   Re: HEADS UP: NFS changes coming into CURRENT early February
Message-ID:  <CAM5tNy4xBjKeg87SKXSkzDaJN=1VKcm-8kbZ5hvaJ_Yqpr7hHw@mail.gmail.com>
In-Reply-To: <Z5CP2WBdW_vbqzil@cell.glebi.us>
References:  <Z5CP2WBdW_vbqzil@cell.glebi.us>

On Tue, Jan 21, 2025 at 10:27 PM Gleb Smirnoff <glebius@freebsd.org> wrote:
>
>
>   Hi,
>
> TLDR version:
> users of NFS with Kerberos (e.g. running gssd(8)) as well as users of NFS with
> TLS (e.g. running rpc.tlsclntd(8) or rpc.tlsservd(8)) as well as users of
> network lock manager (e.g. having 'options NFSLOCKD' and running rpcbind(8))
> are affected.  You would need to recompile & reinstall both the world and the
> kernel together.  Of course this is what you'd normally do when you track
> FreeBSD CURRENT, but better be warned.  I will post hashes of the specific
> revisions that break API/ABI when they are pushed.
>
> Longer version:
> last year I tried to check in a new implementation of unix(4) SOCK_STREAM and
> SOCK_SEQPACKET in d80a97def9a1, but was forced to back it out due to several
> kernel side abusers of a unix(4) socket.  The most difficult ones are the NFS
> related RPC services, that act as RPC clients talking to RPC servers in
> userland.  Since it is impossible to fully emulate a userland process
> connection to a unix(4) socket they need to work with the socket internal
> structures bypassing all the normal KPIs and conventions.  Of course they
> didn't tolerate the new implementation that totally eliminated intermediate
> buffer on the sending side.
>
> While the original motivation for the upcoming changes is the fact that I want
> to go forward with the new unix/stream and unix/seqpacket, I also tried to make
> kernel to userland RPC better.  You judge if I succeeded or not :) Here are
> some highlights:
>
> - Code footprint both in kernel clients and in userland daemons is reduced.
>   Example: gssd:    1 file changed, 5 insertions(+), 64 deletions(-)
>            kgssapi: 1 file changed, 26 insertions(+), 78 deletions(-)
>                     4 files changed, 1 insertion(+), 11 deletions(-)
> - You can easily see all RPC calls from kernel to userland with genl(1):
>   # genl monitor rpcnl
> - The new transport is multithreaded in kernel by default, so kernel clients
>   can send a bunch of RPCs without any serialization and if the userland
>   figures out how to parallelize their execution, such parallelization would
>   happen.  Note: new rpc.tlsservd(8) will use threads.
> - One ad-hoc single program syscall is removed - gssd_syscall.  Note:
>   rpctls syscall remains, but I have some ideas on how to improve that, too.
>   Not at this step though.
> - All sleeps of kernel RPC calls are now in a single place, and they all have
>   timeouts.  I believe NFS services are now much more resilient to hangs.
>   A deadlock when an NFS kernel thread is blocked on a unix socket buffer, and
>   the socket can't go away because its application is blocked in some other
>   syscall, is no longer possible.
>
> The code is posted on phabricator, reviews D48547 through D48552.
> Reviewers are very welcome!
>
> I share my branch on Github. It is usually rebased on today's CURRENT:
>
> https://github.com/glebius/FreeBSD/commits/gss-netlink/
>
> Early testers are very welcome!
Unfortunately it looks like I won't be able to test this until after it is
committed to main.  Since there are library changes, etc, it appears
that it will need a "make buildworld". On the laptop I currently have
running it, this will take about a week, if it finishes. (I usually do
"make buildworld"s on the universe machines, but since all I currently
have is flakey wifi, I don't think that is practical either.)

Once there is a snapshot of main that has it, I can download
and test that.

I will try and take a look at the stuff in phabricator, but given the
size of it and my lack of knowledge w.r.t. netlink, I doubt I'll have
much to say about it.

Hopefully someone else can do some review/testing?

rick

>
> --
> Gleb Smirnoff
>


