Date: Wed, 13 Jul 2016 06:30:36 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Mark Johnston <markj@FreeBSD.org>
Cc: freebsd-current@FreeBSD.org
Subject: Re: ptrace attach in multi-threaded processes
Message-ID: <20160713033036.GR38613@kib.kiev.ua>
In-Reply-To: <20160712182414.GC71220@wkstn-mjohnston.west.isilon.com>
References: <20160712011938.GA51319@wkstn-mjohnston.west.isilon.com>
 <20160712055753.GI38613@kib.kiev.ua>
 <20160712170502.GA71220@wkstn-mjohnston.west.isilon.com>
 <20160712175150.GP38613@kib.kiev.ua>
 <20160712182414.GC71220@wkstn-mjohnston.west.isilon.com>

On Tue, Jul 12, 2016 at 11:24:14AM -0700, Mark Johnston wrote:
> On Tue, Jul 12, 2016 at 08:51:50PM +0300, Konstantin Belousov wrote:
> > On Tue, Jul 12, 2016 at 10:05:02AM -0700, Mark Johnston wrote:
> > > On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> > > I suppose it is not strictly incorrect. I find it surprising that a
> > > PT_ATTACH followed by a PT_DETACH may leave the process in a different
> > > state than it was in before the attach. This means that it is not
> > > possible to gcore a process without potentially leaving it stopped, for
> > > instance. This result may occur in a single-threaded process
> > > as well, since a signal may already be queued when the PT_ATTACH handler
> > > sends SIGSTOP.
> > I still miss something. Isn't this an expected outcome from sending a
> > signal with STOP action ?
>
> It is. But I also expect a PT_DETACH operation to resume a stopped
> process, assuming that a second SIGSTOP was not posted while the
> process was suspended.
But as far as the situation was discussed, it seems that a real SIGSTOP
raced with PT_ATTACH. And the offered interpretation that SIGSTOP was
delivered 'a bit later' than PT_ATTACH would fit the description.

> > >
> > > Indeed, I somehow missed that. I had assumed that the leaked TDB_XSIG
> > > represented a bug in ptracestop().
> > It could, I did not make any statements that deny the bug:
>
> To be clear, the root of my issue comes from the following: the SIGSTOP
> from PT_ATTACH may be handled concurrently with a second signal
> delivered to a second thread in the same process. Then, the resulting
> behaviour depends on the order in which the recipient threads suspend in
> ptracestop(). If the thread that received SIGSTOP suspends last, its
> td_xsig will be overwritten with the userland-provided value in the
> PT_DETACH handler. If it suspends first, its td_xsig will be preserved,
> and upon PT_DETACH the process will be suspended again in issignal().
>
> I'm not sure if this is considered a bug. ptracestop() is handling all
> signals (including the SIGSTOP generated by the PT_ATTACH handler) in a
> consistent way, but this results in inconsistent behaviour from the
> perspective of a ptrace(2) consumer.
Still, I do not understand what is inconsistent. Let us look at it from
the other side (before, we discussed the implementation in the kernel).
Does this happen in gcore(1) ? If yes, gcore's interaction with ptrace(2)
looks like this:
	ptrace(PT_ATTACH, g_pid);
	waitpid(g_pid, &g_status, 0);
	...
	if (sig == SIGSTOP)
		sig = 0;
	ptrace(PT_DETACH, g_pid, 1, sig);

It sounds as if it is desirable for you to modify gcore(1) to consume
all signals, or at least all STOP signals, before PT_DETACH. I do not
understand why you want it, but that would probably give you the
behaviour you want:
	ptrace(PT_ATTACH, g_pid);
	waitpid(g_pid, &g_status, 0);
	...
	/* still consume implicit SIGSTOP from attach */
	if (sig == SIGSTOP)
		sig = 0;
	do {
		error = waitpid(g_pid, &g_status, WNOHANG | WSTOPPED);
	} while (error == 0);
	ptrace(PT_DETACH, g_pid, 1, sig);