From: Tijl Coosemans
To: freebsd-stable@freebsd.org
Cc: Kostik Belousov, Gardner Bell, David Xu
Date: Sat, 16 Jan 2010 14:51:38 +0100
Subject: Re: process in STOP state
Message-Id: <201001161451.39399.tijl@coosemans.org>
In-Reply-To: <4B4FC56A.4020007@freebsd.org>

On Friday 15 January 2010 02:31:22 David Xu wrote:
> Tijl Coosemans wrote:
>>> Besides weird formatting of procstat -k output, I do not see
>>> anything wrong in the state of the process. It got SIGSTOP, I am
>>> sure. Attaching gdb helps because debugger gets signal reports
>>> instead of target process getting the signal actions on signal
>>> delivery.
>>>
>>> The only question is why the process gets SIGSTOP at all.
>>
>> Wine uses ptrace(2) sometimes. The SIGSTOP could have come from
>> that. I recently submitted
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=142757 describing a
>> problem with ptrace and signals, so you might want to give the
>> kernel patch a try.
>
> The problem in your patch is that ksi pointer can not be hold across
> thread sleeping, because once the process is resumed, there is no
> guarantee that the thread will run first, once the signal came from
> process's signal queue, other threads can remove the signal, and here
> your sigqueue_take(ksi) is dangerous code.

If other threads can run before the current thread then there's a
second problem next to the one in the PR (current thread deletes
signal that shouldn't be deleted). Then those other threads can see
that the SIGSTOP bit (or another signal) is still set and stop the
process a second time. This might be what happens in the OP's case.

So, the signal has to be cleared before suspending the process, but
then other threads can still deliver other signals which might change
delivery order and I don't see any way around that besides introducing
a per process signal lock that is also kept while the process is
stopped.

Comments?
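
To make the interleaving above concrete, here is a minimal sketch of it
as a single-threaded userland model. This is not kernel code and does
not use the real sigqueue API: struct model_proc, model_post(),
model_take(), model_run() and the MODEL_* constants are all hypothetical
names invented for illustration. It only assumes the ordering described
above, i.e. that after the process is resumed another thread gets to run
before the thread that suspended it.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NSIG    32  /* hypothetical: size of the toy pending set */
#define MODEL_SIGSTOP 17  /* hypothetical: just an index into that set */

/* Toy stand-in for the process-wide pending signal state. */
struct model_proc {
	bool pending[MODEL_NSIG];
	int  stops;             /* how often the process got stopped */
};

/* Mark a signal as pending on the process. */
static void
model_post(struct model_proc *p, int sig)
{
	assert(sig > 0 && sig < MODEL_NSIG);
	p->pending[sig] = true;
}

/* Clear a pending signal; returns true only if it was still pending. */
static bool
model_take(struct model_proc *p, int sig)
{
	bool was_pending = p->pending[sig];

	p->pending[sig] = false;
	return (was_pending);
}

/*
 * Replay one fixed schedule: thread A picks up the stop signal and
 * suspends the process, the process is resumed, and thread B runs
 * before A does.  Returns how many times the process was stopped.
 */
int
model_run(bool clear_before_suspend)
{
	struct model_proc p = {0};

	model_post(&p, MODEL_SIGSTOP);

	/* Thread A: optionally clear the signal, then suspend. */
	if (clear_before_suspend)
		(void)model_take(&p, MODEL_SIGSTOP);
	p.stops++;

	/* Process resumed; thread B is scheduled first. */
	if (model_take(&p, MODEL_SIGSTOP))
		p.stops++;      /* bit was still set: second stop */

	/* Thread A runs last; a late take finds nothing left to clear. */
	if (!clear_before_suspend)
		(void)model_take(&p, MODEL_SIGSTOP);

	return (p.stops);
}
```

In this model model_run(false) reports two stops and model_run(true)
reports one. It says nothing about the delivery order problem, which
would still need something like the per process lock.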