Date: Thu, 20 May 2004 16:11:42 +1000
From: Tim Robbins
To: Daniel Eischen
cc: threads@freebsd.org
cc: Julian Elischer
Subject: Re: execve() and KSE
Message-ID: <20040520061142.GA3493@cat.robbins.dropbear.id.au>
List-Id: Threading on FreeBSD

On Thu, May 20, 2004 at 01:16:15AM -0400, Daniel Eischen wrote:
> On Wed, 19 May 2004, Julian Elischer wrote:
>
> > What is supposed to happen is that all the execve should stall awaiting
> > all the other kernel threads to abort/suicide and then it should proceed
> > with the execve as per normal.
> > it is possible this doesn't work right.. I haven't tried ti for a LONG
> > time..
>
> The program is bogus also.  First, you can't pass NULL to
> pthread_cond_wait() -- check the return values.  Second,
> you can't join to a thread that has done an exec() --
> the whole process has exec'd.
> I think you need to do
> this the old fashioned way (fork, exec, wait for child,
> etc).

The call to pthread_cond_wait() with a NULL mutex argument was a mistake
but the join was intentional. However, I'm not interested in the program;
I'm more interested in the way the kernel handles the execve() call (and
the general robustness of KSE heading up to 5.3-STABLE.)

The following patch makes the program do what I would expect: exit, instead
of getting stuck in the "running" state. It clears the P_SINGLE_EXIT and
TDF_SA flags after clearing P_SA in kern_execve(). Without this, the flags
are still set in the single-threaded process that comes out the other side
of the execve() syscall, and it ends up getting stuck in
sched_switch <- choosethread <- thread_exit <- thread_user_enter <- trap <-
calltrap.

(FWIW: there seems to be another nearby bug: the mtx_unlock(&Giant) call
in the kern_execve() ERESTART case may be erroneous, since I can't see
where Giant is acquired.)

==== //depot/user/tjr/freebsd-tjr/src/sys/kern/kern_exec.c#19 - /home/tim/p4/src/sys/kern/kern_exec.c ====
@@ -264,7 +264,8 @@
 		 * If we get here all other threads are dead,
 		 * so unset the associated flags and lose KSE mode.
 		 */
-		p->p_flag &= ~P_SA;
+		p->p_flag &= ~(P_SA|P_SINGLE_EXIT);
+		p->p_singlethread->td_flags &= ~TDF_SA;
 		td->td_mailbox = NULL;
 		thread_single_end();
 	}


Tim