Date: Tue, 28 Oct 2025 20:42:57 +0000
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 289734] panic tcp_usr_close while running mount command after configure NFS over TLS
Message-ID: <bug-289734-7501-2RJ3zI7YbK@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-289734-7501@https.bugs.freebsd.org/bugzilla/>
References: <bug-289734-7501@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=289734

--- Comment #18 from Rick Macklem <rmacklem@FreeBSD.org> ---
(In reply to Gleb Smirnoff from comment #17)

I haven't found time to look at the code, but here is what the old
(FreeBSD-14) code does:

(A) - When the krpc receives a "needs a TLS handshake" request (a Null RPC
      with "STARTTLS" stuffed in it), the krpc does an upcall to the
      userland daemon (rpc.tlsservd).
(B) - The userland daemon (rpc.tlsservd) does a syscall that says "I need a
      file descriptor for the socket". The krpc cobbles together a file
      descriptor for the socket on the daemon's behalf.
      *** At this point the krpc marks the socket as "closed by the daemon,
          not by soclose() here in the kernel" and returns the file
          descriptor to the daemon.
(C) - The daemon sets the SSL library to use the socket file descriptor,
      notes that it is responsible for doing a close(s) on the socket and
      calls SSL_accept() to do the actual handshake.
(D) - After SSL_accept() returns, the daemon replies to the upcall done at
      (A) with the results of the TLS handshake.

Note that "***" in (B) is the exact point at which responsibility for
closing the socket is handed to the daemon (rpc.tlsservd).

My understanding is that the glebius@ patch got rid of (B), and my hunch is
that there is now a time window between (A) and (D) in which both the daemon
(rpc.tlsservd) and the krpc might do a [so]close() on the socket.

The easy way for me to fix this (since I am not familiar with glebius@'s
code) is to go back to the FreeBSD-14 code and make the minimal changes
needed for it to use netlink for the upcall instead of an AF_LOCAL socket
(which, as I understand it, was the original goal of the glebius@ patch).
--> In other words, return it to using the syscall at (B) and to using
    separate daemon processes (with a TCP connection pinned to one of them)
    instead of pthreads.

If glebius@ is ok with doing this, I can do so fairly quickly and come up
with a patch for testing.
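
For readers following the thread, the single hand-off point described at
"***" in (B) can be modelled in a few lines of userland C. This is only a
sketch of the idea, not the FreeBSD krpc or rpc.tlsservd code: struct conn,
the state names and both functions are hypothetical, and C11 atomics stand
in for whatever locking the kernel actually uses. It shows that when
responsibility for the close moves in one atomic step, a close attempted by
the non-owner is refused, so the socket is closed at most once.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states: who must close the socket, or that it is closed. */
enum sock_state { OWNED_BY_KERNEL, OWNED_BY_DAEMON, SOCK_CLOSED };

/* Hypothetical stand-in for per-connection state (not a real krpc struct). */
struct conn {
	_Atomic int state;	/* one of enum sock_state */
	int close_calls;	/* counts closes that really happened */
};

void
conn_init(struct conn *c)
{
	atomic_init(&c->state, OWNED_BY_KERNEL);
	c->close_calls = 0;
}

/*
 * Models "***" in step (B): responsibility for the close moves from the
 * kernel to the daemon in a single atomic step. Returns false if the
 * kernel no longer owned the socket (already handed off or closed).
 */
bool
conn_handoff_to_daemon(struct conn *c)
{
	int expected = OWNED_BY_KERNEL;

	return (atomic_compare_exchange_strong(&c->state, &expected,
	    OWNED_BY_DAEMON));
}

/*
 * Close on behalf of "who" (OWNED_BY_KERNEL or OWNED_BY_DAEMON).
 * Succeeds only if "who" is the current owner; any other caller, and any
 * second attempt, is refused instead of closing the socket again.
 */
bool
conn_close(struct conn *c, enum sock_state who)
{
	int expected = who;

	if (who == SOCK_CLOSED)
		return (false);
	if (!atomic_compare_exchange_strong(&c->state, &expected,
	    SOCK_CLOSED))
		return (false);
	c->close_calls++;	/* only the single winner reaches this line */
	return (true);
}
```

In this model, once conn_handoff_to_daemon() has succeeded, a later
conn_close(&c, OWNED_BY_KERNEL) returns false and only the daemon's close
takes effect. Without such a single hand-off point, both sides can believe
they own the close, which is the window suspected between (A) and (D).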
-- You are receiving this mail because: You are the assignee for the bug.
