From: Jilles Tjoelker
To: Konstantin Belousov
Cc: Tijl Coosemans, Russ Cox, freebsd-current@FreeBSD.org
Date: Tue, 24 Sep 2013 21:19:49 +0200
Subject: Re: restarting SYSCALL system call on amd64 loses arguments
Message-ID: <20130924191949.GA12607@stack.nl>
In-Reply-To: <20130923213730.GX41229@kib.kiev.ua>
References: <20130923222613.548860a3@kalimero.tijl.coosemans.org> <20130923213730.GX41229@kib.kiev.ua>

On Tue, Sep 24, 2013 at 12:37:30AM +0300, Konstantin Belousov wrote:
> On Mon, Sep 23, 2013 at 10:26:13PM +0200, Tijl Coosemans wrote:
> > Has anyone taken a look at this PR yet?
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=182161
>
> This looks like a valid bug, but probably not a valid testcase.
> Let me elaborate. When a signal is delivered, return from the signal
> handler is performed by the sigreturn(2), which reloads the whole
> register file when crossing kernel->user boundary due to sys_sigreturn(9)
> setting PCB_FULL_IRET flag. As result, the whole trap frame at the
> time of the syscall entry is restored, and ERESTART return is not
> exercised.
>
> I was not able to reproduce the issue with the supplied test program
> on HEAD. I suspect that the program actually exposed the bug in the
> signal delivery in the threaded processes, which I introduced for 9.1
> and fixed in r251047 & r251365.

The ERESTART return happens if there is no signal or no longer a
signal. The latter is how the bug in the PR occurs: a SIGCHLD delivery
via handler in one thread races with a SIGCHLD acceptance in wait4() in
another thread. Note wait4() returning a value in the other thread in
the fourth line of the kdump output in the PR.

For some reason, I can reproduce this easily on my local quad-core
r255729 stable/9 system but not on ref9-amd64.freebsd.org or
ref10-amd64.freebsd.org.

I can also reproduce the bug on my local system by racing signal
delivery via handler with acceptance in sigtimedwait().

-- 
Jilles Tjoelker
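
For illustration, the race described above can be sketched roughly as
follows. This is an assumed minimal reconstruction, not the test program
attached to the PR: the names run_race(), reader(), reaper() and
on_sigchld() are hypothetical, and whether the race is actually hit
depends on timing and on which thread the kernel picks for SIGCHLD. One
thread sits in a restartable read() while another thread forks children
and reaps them with wait4(), so a SIGCHLD aimed at the first thread may
already be gone by the time that thread looks for it.

```c
#define _DEFAULT_SOURCE /* for wait4() on glibc; assumed harmless elsewhere */
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int pipefd[2];

/* Empty handler: it only has to exist so SIGCHLD is delivered via handler. */
static void on_sigchld(int sig)
{
    (void)sig;
}

/* Thread A: blocks in read(); with SA_RESTART the call is restarted. */
static void *reader(void *arg)
{
    char c = 0;
    ssize_t n = read(pipefd[0], &c, 1);

    /* 0 on success, errno if read() failed, -1 on EOF or wrong data. */
    *(int *)arg = (n == 1 && c == 'x') ? 0 : (n < 0 ? errno : -1);
    return NULL;
}

/* Thread B: creates children and reaps them with wait4(). */
static void *reaper(void *arg)
{
    int iterations = *(int *)arg;

    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);           /* child exits at once, raising SIGCHLD */
        if (pid < 0)
            break;              /* fork failed, stop early */
        while (wait4(pid, NULL, 0, NULL) < 0 && errno == EINTR)
            ;                   /* retry if interrupted */
    }
    return NULL;
}

/* Returns 0 if the restartable read() completed with the expected byte. */
int run_race(int iterations)
{
    struct sigaction sa;
    pthread_t a, b;
    int result = -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) != 0 || pipe(pipefd) != 0)
        return -1;

    if (pthread_create(&a, NULL, reader, &result) != 0)
        return -1;
    if (pthread_create(&b, NULL, reaper, &iterations) == 0)
        pthread_join(b, NULL);

    /* Let the reader finish; if write() fails, closing yields EOF. */
    if (write(pipefd[1], "x", 1) != 1)
        result = -1;
    close(pipefd[1]);
    pthread_join(a, NULL);
    close(pipefd[0]);
    return result;
}
```

Calling run_race(1000) from main() and treating a non-zero return as a
failure would be one way to use it. On a kernel without the problem the
read() should always come back with the byte written at the end; on an
affected system a restarted read() with lost arguments would presumably
show up as a non-zero result.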