From: David Schultz
To: Daniel Eischen
Cc: hackers@FreeBSD.ORG, Jason Evans, Julian Elischer
Date: Thu, 22 Jan 2009 02:17:03 -0500
Subject: Re: threaded, forked, rethreaded processes will deadlock
Message-ID: <20090122071703.GA61697@zim.MIT.EDU>

On Thu, Jan 22, 2009, Daniel Eischen wrote:
> On Wed, 21 Jan 2009, David Schultz wrote:
>
> >I think there *is* a real bug here, but there's two distinct ways
> >to fix it. When a threaded process forks, malloc acquires all its
> >locks so that its state is consistent after a fork. However, the
> >post-fork hook that's supposed to release these locks fails to do
> >so in the child because the child process isn't threaded, and
> >malloc_mutex_unlock() is optimized to be a no-op in
> >single-threaded processes. If the child *stays* single-threaded,
> >malloc() works by accident even with all the locks held because
> >malloc_mutex_lock() is also a no-op in single-threaded processes.
> >But if the child goes multi-threaded, then things break.
> >
> >Solution 1 is to actually unlock the locks in the child process,
> >which is what Brian is proposing.
> >
> >Solution 2 is to take the position that all of this pre- and
> >post-fork bloat in the fork() path is gratuitous and should be
> >removed. The rationale here is that if you fork with multiple
> >running threads, there's scads of ways in which the child's heap
> >could be inconsistent; fork hooks would be needed not just in
> >malloc(), but in stdio, third party libraries, etc. Why should
> >malloc() be special? It's the programmer's job to quiesce all the
> >threads before calling fork(), and if the programmer doesn't do
> >this, then POSIX only guarantees that async-signal-safe functions
> >will work.
> >
> >Note that Solution 2 also fixes Brian's problem if he quiesces all
> >of his worker threads before forking (as he should!) With the
> >pre-fork hook removed, all the locks will start out free in the
> >child. So that's what I vote for...
>
> The problem is that our own libraries (libthr included)
> need to malloc() for themselves, even after a fork() in
> the child. After a fork(), the malloc locks should be
> reinitialized in the child if it was threaded, so that
> our implementation actually works for all the async
> signal calls, fork(), exec(), etc. I forget the exact
> failure modes for very common cases, but if you remove
> the re-initialization of the malloc locks, I'm sure
> you will have problems.

If you can't implement functions that are required to be
async-signal-safe like fork() and exec() without malloc(), then
for now I guess we should go for something along the lines of what
Brian is proposing. If the app programmer has taken special pains
to ensure that all other threads are stopped when a fork happens,
the fork() call shouldn't return in the child with all the malloc
locks bogusly held.
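
For anyone who wants to see the failure mode in isolation, here is a rough sketch in C. It is a toy model, not the libc or libthr code: is_threaded, lock_held, and every toy_* name are made up for illustration, the "lock" is just an int, and the hooks are called by hand around fork(). It only tries to show how a pre-fork lock plus an unlock that is a no-op in single-threaded processes leaves the lock held in the child, and how an unconditional release along the lines of Solution 1 avoids that.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Toy model only: every name below is hypothetical. */
static int is_threaded;   /* stands in for the "process is threaded" flag */
static int lock_held;     /* stands in for one malloc mutex */

/* Lock and unlock are no-ops while the process is single-threaded. */
static void toy_lock(void)   { if (is_threaded) lock_held = 1; }
static void toy_unlock(void) { if (is_threaded) lock_held = 0; }

/* Pre-fork hook: take the lock so the state is consistent at fork time. */
static void toy_prefork(void) { toy_lock(); }

/* Post-fork hook with the problem described above: in the child the
 * unlock is skipped because the child is no longer threaded. */
static void toy_postfork_buggy(void) { toy_unlock(); }

/* Post-fork hook in the spirit of Solution 1: release unconditionally. */
static void toy_postfork_fixed(void) { lock_held = 0; }

/*
 * Fork from a "threaded" parent, then let the child become threaded again.
 * Returns 1 if the child still sees the lock held (its next toy_lock()
 * would model a deadlock), 0 if the lock is free, -1 on error.
 */
int child_lock_held_after_fork(int use_fix)
{
    int status;
    pid_t pid;

    is_threaded = 1;
    lock_held = 0;
    toy_prefork();

    pid = fork();
    if (pid == -1) {
        toy_unlock();
        return -1;
    }
    if (pid == 0) {
        is_threaded = 0;          /* only the forking thread survives */
        if (use_fix)
            toy_postfork_fixed();
        else
            toy_postfork_buggy();
        is_threaded = 1;          /* child starts a thread again */
        _exit(lock_held);         /* report the lock state to the parent */
    }

    toy_unlock();                 /* parent is still threaded, so this works */
    assert(lock_held == 0);
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling child_lock_held_after_fork(0) models the current behaviour and reports the lock as still held in the child; passing 1 models the unconditional release and reports it free.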