Date: Sun, 24 Nov 2019 00:01:04 -0600
From: Kyle Evans <kevans@freebsd.org>
To: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: ptrace(2) debugging
Message-ID: <CACNAnaHtsAaULLp0icE_=vY4eq2CuJ6Oq4Zx868axaYXArSOeQ@mail.gmail.com>

Hi,

I'm working on implementing `reptyr -T` on FreeBSD because I'm pretty bad about starting long-running jobs outside of tmux and often want to reparent these jobs into tmux.

I've gotten to a point where it gets stuck in waitpid(2) when attempting to work over the session leader to make it ignore SIGHUP. The chain of operations looks roughly like this:

    PT_ATTACH -> waitpid -> kill(SIGCONT) -> PT_TO_SCE -> waitpid -> PT_TO_SCE -> waitpid

Each waitpid is paired with a PT_LWPINFO. The first waitpid observes SIGSTOP. The second waitpid observes SIGCONT. I would expect the third to observe PL_FLAG_SCE in ptrace_lwpinfo->pl_flags, but instead it hangs, as the target process is now sleep-inhibited and stuck in the "pause" wchan.

I've uploaded a truss excerpt at [0] in case it's helpful -- pid=10204 is the process I'm reparenting, initially just attached/detached to make sure reptyr *can* do this. pid=10187 is the sshd that it's running under, and pid=10188 is the shell running under that.

Does anyone have good advice on debugging this? It seems like it might be some kind of kernel bug, since reptyr has already done this same dance once before when grabbing sshd, and my attempts to distill it down to a simple test case failed. The FreeBSD part of reptyr needed some love, though, so that can't be discounted either.

Thanks,

Kyle Evans

[0] https://people.freebsd.org/~kevans/truss.log
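
For readers who want to poke at the same sequence, the chain of operations described above can be sketched in C roughly as follows. This is only an illustration and not reptyr's actual code: the function names (describe_status, wait_and_report, trace_to_syscall_entry) are made up for this sketch, and passing (caddr_t)1 as addr and 0 as data to PT_TO_SCE is an assumption, since the message does not say which values were used. The ptrace parts compile only on FreeBSD.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

#ifdef __FreeBSD__
#include <sys/ptrace.h>
#endif

/*
 * Describe a waitpid(2) status in buf.
 * Returns the stop signal if the child is stopped, -1 otherwise.
 */
static int
describe_status(int status, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	if (WIFSTOPPED(status)) {
		snprintf(buf, len, "stopped by signal %d", WSTOPSIG(status));
		return (WSTOPSIG(status));
	}
	if (WIFEXITED(status))
		snprintf(buf, len, "exited with %d", WEXITSTATUS(status));
	else if (WIFSIGNALED(status))
		snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
	else
		snprintf(buf, len, "other status 0x%x", (unsigned)status);
	return (-1);
}

#ifdef __FreeBSD__
/* waitpid(2) for a stop, then print what PT_LWPINFO reports about it. */
static int
wait_and_report(pid_t pid, const char *step)
{
	struct ptrace_lwpinfo pl;
	char desc[64];
	int status;

	if (waitpid(pid, &status, 0) == -1) {
		perror("waitpid");
		return (-1);
	}
	if (describe_status(status, desc, sizeof(desc)) == -1) {
		fprintf(stderr, "%s: %s\n", step, desc);
		return (-1);
	}
	memset(&pl, 0, sizeof(pl));
	if (ptrace(PT_LWPINFO, pid, (caddr_t)&pl, sizeof(pl)) == -1) {
		perror("PT_LWPINFO");
		return (-1);
	}
	printf("%s: %s, pl_event=%d, SCE=%d, SCX=%d\n", step, desc,
	    pl.pl_event, (pl.pl_flags & PL_FLAG_SCE) != 0,
	    (pl.pl_flags & PL_FLAG_SCX) != 0);
	return (0);
}

/* PT_ATTACH -> waitpid -> kill(SIGCONT) -> PT_TO_SCE -> waitpid -> ... */
static int
trace_to_syscall_entry(pid_t pid)
{
	int error = -1;

	if (ptrace(PT_ATTACH, pid, (caddr_t)0, 0) == -1) {
		perror("PT_ATTACH");
		return (-1);
	}
	if (wait_and_report(pid, "attach") == -1)	/* SIGSTOP reported */
		goto out;
	if (kill(pid, SIGCONT) == -1)
		goto out;
	/* Assumption: resume in place (addr 1), deliver no signal (data 0). */
	if (ptrace(PT_TO_SCE, pid, (caddr_t)1, 0) == -1)
		goto out;
	if (wait_and_report(pid, "first PT_TO_SCE") == -1) /* SIGCONT reported */
		goto out;
	if (ptrace(PT_TO_SCE, pid, (caddr_t)1, 0) == -1)
		goto out;
	/* The hang described above happens in this waitpid. */
	error = wait_and_report(pid, "second PT_TO_SCE");
out:
	(void)ptrace(PT_DETACH, pid, (caddr_t)1, 0);
	return (error);
}
#endif
```

Calling trace_to_syscall_entry() with the pid of a test process prints one line per stop, which makes it easier to see whether a given stop was a signal stop or a syscall-entry stop before the final waitpid blocks.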