Date: Thu, 23 Jan 2025 14:42:54 -0800
From: Rick Macklem <rick.macklem@gmail.com>
To: Gleb Smirnoff <glebius@freebsd.org>
Cc: current@freebsd.org, rmacklem@freebsd.org
Subject: Re: HEADS UP: NFS changes coming into CURRENT early February
Message-ID: <CAM5tNy4xBjKeg87SKXSkzDaJN=1VKcm-8kbZ5hvaJ_Yqpr7hHw@mail.gmail.com>
In-Reply-To: <Z5CP2WBdW_vbqzil@cell.glebi.us>
References: <Z5CP2WBdW_vbqzil@cell.glebi.us>

On Tue, Jan 21, 2025 at 10:27 PM Gleb Smirnoff <glebius@freebsd.org> wrote:
>
> Hi,
>
> TLDR version:
> users of NFS with Kerberos (e.g. running gssd(8)) as well as users of NFS with
> TLS (e.g. running rpc.tlsclntd(8) or rpc.tlsservd(8)) as well as users of
> network lock manager (e.g. having 'options NFSLOCKD' and running rpcbind(8))
> are affected.  You would need to recompile & reinstall both the world and the
> kernel together.  Of course this is what you'd normally do when you track
> FreeBSD CURRENT, but better be warned.  I will post hashes of the specific
> revisions that break API/ABI when they are pushed.
>
> Longer version:
> last year I tried to check-in a new implementation of unix(4) SOCK_STREAM and
> SOCK_SEQPACKET in d80a97def9a1, but was forced to back it out due to several
> kernel side abusers of a unix(4) socket.  The most difficult ones are the NFS
> related RPC services, that act as RPC clients talking to an RPC servers in
> userland.  Since it is impossible to fully emulate a userland process
> connection to a unix(4) socket they need to work with the socket internal
> structures bypassing all the normal KPIs and conventions.  Of course they
> didn't tolerate the new implementation that totally eliminated intermediate
> buffer on the sending side.
>
> While the original motivation for the upcoming changes is the fact that I want
> to go forward with the new unix/stream and unix/seqpacket, I also tried to make
> kernel to userland RPC better.  You judge if I succeeded or not :) Here are
> some highlights:
>
> - Code footprint both in kernel clients and in userland daemons is reduced.
>   Example: gssd:     1 file changed, 5 insertions(+), 64 deletions(-)
>            kgssapi:  1 file changed, 26 insertions(+), 78 deletions(-)
>                      4 files changed, 1 insertion(+), 11 deletions(-)
> - You can easily see all RPC calls from kernel to userland with genl(1):
>   # genl monitor rpcnl
> - The new transport is multithreaded in kernel by default, so kernel clients
>   can send a bunch of RPCs without any serialization and if the userland
>   figures out how to parallelize their execution, such parallelization would
>   happen.  Note: new rpc.tlsservd(8) will use threads.
> - One ad-hoc single program syscall is removed - gssd_syscall.  Note:
>   rpctls syscall remains, but I have some ideas on how to improve that, too.
>   Not at this step though.
> - All sleeps of kernel RPC calls are now in single place, and they all have
>   timeouts.  I believe NFS services are now much more resilient to hangs.
>   A deadlock when NFS kernel thread is blocked on unix socket buffer, and
>   the socket can't go away because its application is blocked in some other
>   syscall is no longer possible.
>
> The code is posted on phabricator, reviews D48547 through D48552.
> Reviewers are very welcome!
>
> I share my branch on Github.  It is usually rebased on today's CURRENT:
>
> https://github.com/glebius/FreeBSD/commits/gss-netlink/
>
> Early testers are very welcome!

Unfortunately it looks like I won't be able to test this until after
it is committed to main.

Since there are library changes, etc, it appears that it will need a
"make buildworld". On the laptop I currently have running it, this
will take about a week, if it finishes.
(I usually do "make buildworld"s on the universe machines, but since
all I currently have is flakey wifi, I don't think that is practical
either.)

Once there is a snapshot of main that has it, I can download and
test that.

I will try and take a look at the stuff in phabricator, but given
the size of it and my lack of knowledge w.r.t. netlink, I doubt I'll
have much to say about it.

Hopefully someone else can do some review/testing?

rick

>
> --
> Gleb Smirnoff
>