Date: Tue, 12 Jul 2016 10:05:02 -0700
From: Mark Johnston
To: Konstantin Belousov
Cc: freebsd-current@FreeBSD.org
Subject: Re: ptrace attach in multi-threaded processes
Message-ID: <20160712170502.GA71220@wkstn-mjohnston.west.isilon.com>
References: <20160712011938.GA51319@wkstn-mjohnston.west.isilon.com>
 <20160712055753.GI38613@kib.kiev.ua>
In-Reply-To: <20160712055753.GI38613@kib.kiev.ua>

On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> On Mon, Jul 11, 2016 at 06:19:38PM -0700, Mark Johnston wrote:
> > Hi,
> >
> > It seems to be possible for ptrace(PT_ATTACH) to race with the delivery
> > of a signal to the same process. ptrace(PT_ATTACH) sets P_TRACED and
> > sends SIGSTOP to a thread in the target process. Consider the case where
> > a signal is delivered to a second thread, and both threads are executing
> > ast() concurrently. The two threads will both call issignal() and from
> > there call ptracestop() because P_TRACED is set, though they will be
> > serialized by the proc lock.
> > If the thread receiving SIGSTOP wins the
> > race, it will suspend first and set p->p_xthread. The second thread will
> > also suspend in ptracestop(), overwriting the p_xthread field set by the
> > first thread. Later, ptrace(PT_DETACH) will unsuspend the threads, but
> > it will set td->td_xsig only in the second thread. This means that the
> > first thread will return SIGSTOP from ptracestop() and subsequently
> > suspend the process, which seems rather incorrect.
> Why ? In particular, why delivering STOP after attach, in the described
> situation, is perceived as incorrect ? Parallel STOPs, one from attach,
> and other from kill(2), must result in two stops.

I suppose it is not strictly incorrect. I find it surprising that a
PT_ATTACH followed by a PT_DETACH may leave the process in a different
state than it was in before the attach. This means that it is not
possible to gcore a process without potentially leaving it stopped, for
instance. This result may occur in a single-threaded process as well,
since a signal may already be queued when the PT_ATTACH handler sends
SIGSTOP.

To me it just seems a bit strange that ptrace's mechanism for stopping
the target (sending SIGSTOP) interacts this way with ptrace's handling
of signals (ptracestop()). Specifically, PT_ATTACH does not rely on the
SA_STOP property of SIGSTOP to stop the process, but rather on the
special signal handling in ptracestop().

>
> The bit about overwriting p_xsig/p_xthread indeed initially sounds worrisome,
> but probably not too much. The only consequence of reassigning p_xthread
> is the selection of the 'lead' thread in sys_process.c, it seems.
>
> >
> > The above is just a theory to explain an unexpectedly-stopped
> > multi-threaded process that I've observed. Is there some mechanism I'm
> > missing that prevents multiple threads from suspending in ptracestop()
> > at the same time?
> > If not, then I think that's the root of the problem,
> > since p_xthread is pretty clearly not meant to be overwritten this way.
> Again, why ?
>
> Note the comment
>  * Just make wait() to work, the last stopped thread
>  * will win.
> which seems to point to the situation.

Indeed, I somehow missed that. I had assumed that the leaked TDB_XSIG
represented a bug in ptracestop().

>
> > Moreover, in my scenario I see a thread with TDB_XSIG set even after
> > ptrace(PT_DETACH) was called (P_TRACED is cleared).
> This is interesting, we indeed do not clear the flag consistently.
> But again, the only consequence seems to be a possible invalid reporting
> of events.
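
For reference, the sequence described at the top of the thread can be
sketched as a small user-space model. This is only an illustration
under stated assumptions, not kernel code: the model_* names and both
structs are hypothetical, and the fields merely echo p_xthread, td_xsig
and P_TRACED as they are discussed above. It models two threads
stopping one after the other in ptracestop(), with the detach path
rewriting td_xsig only for the thread recorded in p_xthread.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Toy model only: the names echo the kernel fields discussed in this
 * thread, but none of this is FreeBSD kernel code. */
struct model_thread {
    int td_xsig;       /* signal the modeled ptracestop() will return */
    int td_suspended;  /* nonzero while parked in the modeled stop */
};

struct model_proc {
    int p_traced;                    /* stands in for P_TRACED */
    struct model_thread *p_xthread;  /* last thread that reported a stop */
};

/* A thread stops for `sig`; the last thread to stop owns p_xthread. */
static void
model_ptracestop(struct model_proc *p, struct model_thread *td, int sig)
{
    assert(p->p_traced);
    td->td_xsig = sig;
    td->td_suspended = 1;
    p->p_xthread = td;  /* "the last stopped thread will win" */
}

/* Detach: only p_xthread has its td_xsig replaced by `sig`. */
static void
model_detach(struct model_proc *p, struct model_thread **tds, size_t n,
    int sig)
{
    if (p->p_xthread != NULL)
        p->p_xthread->td_xsig = sig;
    for (size_t i = 0; i < n; i++)
        tds[i]->td_suspended = 0;
    p->p_xthread = NULL;
    p->p_traced = 0;
}

/* Returns the signal the first thread still acts on after detach. */
int
model_attach_race(int other_sig)
{
    struct model_thread t1 = { 0, 0 }, t2 = { 0, 0 };
    struct model_thread *all[] = { &t1, &t2 };
    struct model_proc p = { 1, NULL };    /* attach has set P_TRACED */

    model_ptracestop(&p, &t1, SIGSTOP);   /* thread 1 wins the race */
    model_ptracestop(&p, &t2, other_sig); /* thread 2 overwrites p_xthread */
    model_detach(&p, all, 2, 0);          /* detach, delivering no signal */

    assert(t2.td_xsig == 0);              /* only thread 2 was rewritten */
    return (t1.td_xsig);
}
```

In this model, model_attach_race(SIGUSR1) returns SIGSTOP: the first
thread comes out of the stop still holding the SIGSTOP from the attach,
which corresponds to the process being left stopped after the detach.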