Date: Thu, 29 Oct 2009 09:50:43 +0300 From: Eygene Ryabinkin <rea-fbsd@codelabs.ru> To: "Dorr H. Clark" <dclark@engr.scu.edu> Cc: freebsd-hackers@freebsd.org, freebsd-bugs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: ptrace problem 6.x/7.x - can someone explain this? Message-ID: <jlHTT4ItuuL5pOoOT8LOA8wKXP0@m7CpruDFVVXtHTpJF1FPSUwA%2BUQ> In-Reply-To: <Pine.GSO.4.21.0910271711580.17024-100000@nova46.dc.engr.scu.edu> References: <Pine.GSO.4.21.0810072312220.4889-100000@nova41.dc.engr.scu.edu> <Pine.GSO.4.21.0910271711580.17024-100000@nova46.dc.engr.scu.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
--Dxnq1zWXvFF0Q93v Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Dorr, good day. Tue, Oct 27, 2009 at 05:32:34PM -0700, Dorr H. Clark wrote: > We believe ptrace has a problem in 6.3; we have not tried other > releases. The same code, however, exists in 7.1. And in HEAD too. > The bug was first encountered in gdb... > > (gdb) det > Detaching from program: /usr/local/bin/emacs, process 66217 > (gdb) att 66224 > Attaching to program: /usr/local/bin/emacs, process 66224 > Error accessing memory address 0x281ba5a4: Device busy. > (gdb) det > Detaching from program: /usr/local/bin/emacs, process 66224 > ptrace: Device busy. > (gdb) quit <--- target process 66224 dies here > > To isolate this problem, a wrote a simple minded test program was > written that just attached and detached. This test program found > even the very first detach fails with EBUSY (see test source below): > > $ ./test1 -p 66217 -c 1 -d 10 > pid 66217 count 1 delay 10 > Start of pass 0 > Calling PT_ATTACH pid 66217 addr 0x0 sig 0 > Calling PT_DETACH pid 66217 addr 0xffffffff sig 0 > Call 0 to PT_DETACH returned -1, errno 16 > > Once again, the target process died when the ptracing test program > exitted, as would be expected if a detach had failed. > > The failure return was coming from the following test in kern_ptrace() > in sys_process.c > > /* not currently stopped */ > if ((p->p_flag & (P_STOPPED_SIG | P_STOPPED_TRACE)) == 0 || > p->p_suspcount != p->p_numthreads || > (p->p_flag & P_WAITED) == 0) { > error = EBUSY; > goto fail; > } Yes, the ptraced process should have been waited for, even after the PT_ATTACH call. This is somewhat documented in ptrace(2), ----- ----- but I agree that the wording is a bit sloppy. I'll try to produce slightly modified explanation in the manual page and will post the patch here and as the PR. I had modified your example to visually display the results of each wait() call that is made after ptrace() invocation. Here we go: ----- $ ./test -p 45901 pid 45901 count 2 delay 5 Start of pass 0 Calling PT_ATTACH pid 45901 addr 0x0 sig 0 Attached wait() yield 0x117f: stopped by signal 17; <-- after PT_ATTACH wait() yield 0x57f: stopped by signal 5; <-- after PT_STEP Calling PT_DETACH pid 45901 addr 0xffffffffffffffff sig 0 Detached. ----- As you see, the process is stopped just after the PT_ATTACH with the signal 17, SIGSTOP. PT_STEP follows with the delivery of the SIGTRAP. Both of these signals should be processed by the parent's wait(). And PT_DETACH works, apart from one thing: on my 8.0 PT_DETACH leads to the segfault of the traced program. I hadn't yet tried it on the other versions, so may be there is some bug in the code of test.c, or some bug in the ptrace() implementation -- can't say for sure. If anyone knows why the program segfaults -- please, speak up. The modified source of the test.s is attached. > This is applied to all operations except PT_TRACE_ME, PT_ATTACH, and > some instances of PT_CLEAR_STEP. > > P_WAITED is generally not true. In particular, it's not set > automatically when a process is PT_ATTACHed. It is cleared by > PT_DETACH and again when ptrace sends a signal (PT_CONTINUE, > PT_DETACH.) _But_ it's set in only two places, and they aren't in > ptrace code. > > 2 sys/kern/kern_exit.c kern_wait 773 p->p_flag |= P_WAITED; > 3 compat/svr4/svr4_misc.c svr4_sys_waitsys 1351 q->p_flag |= P_WAITED; > > The relevant one is the first one, primarily. Here's the code: > > mtx_lock_spin(&sched_lock); > if ((p->p_flag & P_STOPPED_SIG) && > (p->p_suspcount == p->p_numthreads) && > (p->p_flag & P_WAITED) == 0 && > (p->p_flag & P_TRACED || options & WUNTRACED)) { > mtx_unlock_spin(&sched_lock); > p->p_flag |= P_WAITED; > sx_xunlock(&proctree_lock); > td->td_retval[0] = p->p_pid; > if (status) > *status = W_STOPCODE(p->p_xstat); > PROC_UNLOCK(p); > return (0); > } > mtx_unlock_spin(&sched_lock); > > So it's only set on processes which are already traced. But it's not > set until someone calls wait4() on them - or the equivalent sysV > compatability routine. > > Gdb doesn't always wait4() for processes immediately opon tracing > them, and the ptrace man page does not imply this is needed. Hmm, there is at least one thread on the simular matter, http://sourceware.org/ml/gdb/2008-12/msg00041.html and people are saying that wait() still should be present. > Moreover, it's not clear why it should matter. The process > needs to be stopped in order for it to make sense to do most > of the things ptrace does. But - why should it need to be waited for? To see if it was really stopped, I presume. > And what kind of sense does this make to someone writing a debugging > tool, where the natural logic seems to be: > - attach to process - wait for the process' attachment by doing wait(). > - look at some stuff > - stick in some kind of breakpoint or similar and start it going again > (or 'step' it) > - wait for it to stop > - look at and modify stuff > - detach, or set it moving again > > By way of experiment, the test for P_WAITED was removed. Gdb no longer had > problems, and no new issues with gdb were encountered (although this > was just interactive, no "gdb coverage test" was attempted). By the way, I can't reproduce gdb faults with the 8.0 sources. Will try 7.x, but I think that I have no 6.x handy. -- Eygene _ ___ _.--. # \`.|\..----...-'` `-._.-'_.-'` # Remember that it is hard / ' ` , __.--' # to read the on-line manual )/' _/ \ `-_, / # while single-stepping the kernel. `-'" `"\_ ,_.-;_.-\_ ', fsc/as # _.-'_./ {_.' ; / # -- FreeBSD Developers handbook {_.-``-' {_/ # --Dxnq1zWXvFF0Q93v--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?jlHTT4ItuuL5pOoOT8LOA8wKXP0>