Date: Sun, 26 Jan 2025 14:33:23 -0800
From: Rick Macklem <rick.macklem@gmail.com>
To: Gleb Smirnoff <glebius@freebsd.org>
Cc: current@freebsd.org, rmacklem@freebsd.org
Subject: Re: HEADS UP: NFS changes coming into CURRENT early February
Message-ID: <CAM5tNy5EBUj6k0+Xw0TwetVR0xx0pp5i2G3WC9L4wFyEnnmDWw@mail.gmail.com>
In-Reply-To: <CAM5tNy7APTu3HVqYoMzb1YCOC7QiFzaRq9NGtYFAJi_uOu094Q@mail.gmail.com>
References: <Z5CP2WBdW_vbqzil@cell.glebi.us> <CAM5tNy7APTu3HVqYoMzb1YCOC7QiFzaRq9NGtYFAJi_uOu094Q@mail.gmail.com>
On Sun, Jan 26, 2025 at 1:44 PM Rick Macklem <rick.macklem@gmail.com> wrote:
>
> On Tue, Jan 21, 2025 at 10:27 PM Gleb Smirnoff <glebius@freebsd.org> wrote:
> >
> > Hi,
> >
> > TLDR version:
> > users of NFS with Kerberos (e.g. running gssd(8)) as well as users of NFS with
> > TLS (e.g. running rpc.tlsclntd(8) or rpc.tlsservd(8)) as well as users of
> > the network lock manager (e.g. having 'options NFSLOCKD' and running rpcbind(8))
> > are affected.  You would need to recompile & reinstall both the world and the
> > kernel together.  Of course this is what you'd normally do when you track
> > FreeBSD CURRENT, but better be warned.  I will post hashes of the specific
> > revisions that break API/ABI when they are pushed.
> >
> > Longer version:
> > last year I tried to check in a new implementation of unix(4) SOCK_STREAM and
> > SOCK_SEQPACKET in d80a97def9a1, but was forced to back it out due to several
> > kernel-side abusers of a unix(4) socket.  The most difficult ones are the
> > NFS-related RPC services, which act as RPC clients talking to RPC servers in
> > userland.  Since it is impossible to fully emulate a userland process
> > connection to a unix(4) socket, they need to work with the socket internal
> > structures, bypassing all the normal KPIs and conventions.  Of course they
> > didn't tolerate the new implementation that totally eliminated the intermediate
> > buffer on the sending side.
> >
> > While the original motivation for the upcoming changes is the fact that I want
> > to go forward with the new unix/stream and unix/seqpacket, I also tried to make
> > kernel-to-userland RPC better.
> > You judge if I succeeded or not :)  Here are some highlights:
> >
> > - Code footprint both in kernel clients and in userland daemons is reduced.
> >   Example: gssd:    1 file changed, 5 insertions(+), 64 deletions(-)
> >            kgssapi: 1 file changed, 26 insertions(+), 78 deletions(-)
> >                     4 files changed, 1 insertion(+), 11 deletions(-)
> > - You can easily see all RPC calls from kernel to userland with genl(1):
> >   # genl monitor rpcnl
> > - The new transport is multithreaded in kernel by default, so kernel clients
> >   can send a bunch of RPCs without any serialization, and if the userland
> >   figures out how to parallelize their execution, such parallelization would
> >   happen.  Note: the new rpc.tlsservd(8) will use threads.
> > - One ad-hoc single-program syscall is removed - gssd_syscall.  Note: the
> >   rpctls syscall remains, but I have some ideas on how to improve that, too.
> >   Not at this step, though.
> > - All sleeps of kernel RPC calls are now in a single place, and they all have
> >   timeouts.  I believe NFS services are now much more resilient to hangs.
> >   A deadlock where an NFS kernel thread is blocked on a unix socket buffer, and
> >   the socket can't go away because its application is blocked in some other
> >   syscall, is no longer possible.
> >
> > The code is posted on Phabricator, reviews D48547 through D48552.
> > Reviewers are very welcome!
> >
> > I share my branch on GitHub.  It is usually rebased on today's CURRENT:
> >
> > https://github.com/glebius/FreeBSD/commits/gss-netlink/
> >
> > Early testers are very welcome!
>
> Ok, I can now do minimal testing and crashed it...
>
> I did a mount with option "tls" and then partitioned it from the NFS server
> by doing "ifconfig bridge0 down".  Waited until the TCP connection closed
> and then did "ifconfig bridge0 up".
>
> The crash is a NULL pointer dereference at rpctls_impl.c:255 (in rpctls_connect(),
> called from nfscl_renewthread()).
> The problem is that you made rpctls_connect_handle a vnet'd variable.
> The client side (aka an NFS mount) does not happen inside a jail and
> cannot use any vnet'd variables.
> Why?  Well, any number of threads enter the NFS client via VOP_xxx()
> calls etc.  Any one of them might end up doing a TCP reconnect when the
> underlying TCP connection is broken and then heals.
>
> I don't know why you made rpctls_connect_handle a vnet'd variable,
> but it cannot be that way.
> (I once looked at making NFS mounts work inside a vnet prison and
> gave up when I realized any old thread ends up in the code and it
> would have taken many, many CURVNET_SET() calls to make it work.)
>
> In summary, no global variable on the client side can be vnet'd, and no
> global variable on the server side that is vnet'd can be shared with the
> client side code.

Ok, I now see you've fixed this crash.

I'd still like to limit commits to main to the ones that are required to use
netlink for the upcalls at this time.

rick

> I realize you are enthusiastic about this, but I'd suggest you back off to
> the minimal changes required to make this stuff work with netlink instead
> of unix domain sockets and stick with that, at least for the initial
> commit cycle.
>
> One thing to note is that few (if any) people who run main test this stuff.
> It may be 1-2 years before it sees third-party testing, and I can only do
> minimal testing until at least April.
>
> Anyhow, thanks for all the good work you are doing with this, rick
>
> > --
> > Gleb Smirnoff