From: David Schultz
Date: Wed, 21 Jan 2009 23:56:37 -0500
To: Daniel Eischen
Cc: Brian Fundakowski Feldman, hackers@FreeBSD.ORG, Jason Evans, Julian Elischer
Subject: Re: threaded, forked, rethreaded processes will deadlock
Message-ID: <20090122045637.GA61058@zim.MIT.EDU>

I think there *is* a real bug here, but there are two distinct ways to fix it.
When a threaded process forks, malloc acquires all its locks so that its state is consistent after the fork. However, the post-fork hook that's supposed to release these locks fails to do so in the child because the child process isn't threaded, and malloc_mutex_unlock() is optimized to be a no-op in single-threaded processes. If the child *stays* single-threaded, malloc() works by accident even with all the locks held, because malloc_mutex_lock() is also a no-op in single-threaded processes. But if the child goes multi-threaded, then things break.

Solution 1 is to actually unlock the locks in the child process, which is what Brian is proposing.

Solution 2 is to take the position that all of this pre- and post-fork bloat in the fork() path is gratuitous and should be removed. The rationale here is that if you fork with multiple running threads, there are scads of ways in which the child's heap could be inconsistent; fork hooks would be needed not just in malloc(), but in stdio, third-party libraries, etc. Why should malloc() be special? It's the programmer's job to quiesce all the threads before calling fork(), and if the programmer doesn't do this, then POSIX only guarantees that async-signal-safe functions will work.

Note that Solution 2 also fixes Brian's problem if he quiesces all of his worker threads before forking (as he should!). With the pre-fork hook removed, all the locks will start out free in the child. So that's what I vote for...
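
For illustration, the lock handling that Solution 1 asks for can be sketched with the standard pthread_atfork() interface. This is only a sketch under stated assumptions: arena_lock, the three hook functions, worker() and child_can_rethread() are hypothetical names, an ordinary pthread mutex stands in for malloc's internal locks, and the parent is single-threaded to keep the example short. It is not the libc code under discussion.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for one of the allocator's internal locks. */
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

/* Pre-fork hook: take the lock so the state it protects is consistent. */
static void prefork(void) { pthread_mutex_lock(&arena_lock); }

/* Post-fork hook in the parent: release what prefork() acquired. */
static void postfork_parent(void) { pthread_mutex_unlock(&arena_lock); }

/*
 * Post-fork hook in the child: this must really release the lock, even
 * though the child has only one thread at this point. Making it a no-op
 * leaves the lock held for any thread the child creates later.
 */
static void postfork_child(void) { pthread_mutex_unlock(&arena_lock); }

/* Thread started in the child: records whether the lock was free. */
static void *worker(void *arg)
{
    int *ok = arg;

    /* trylock instead of lock, so a still-held lock fails fast
     * rather than deadlocking the demonstration. */
    if (pthread_mutex_trylock(&arena_lock) == 0) {
        *ok = 1;
        pthread_mutex_unlock(&arena_lock);
    }
    return NULL;
}

/*
 * Forks, goes multi-threaded in the child, and reports the result.
 * Returns 0 if the child's new thread could take the lock, 1 if it
 * could not, and -1 on a setup error.
 */
int child_can_rethread(void)
{
    static int registered;
    int status;
    pid_t pid;

    if (!registered) {
        if (pthread_atfork(prefork, postfork_parent, postfork_child) != 0)
            return -1;
        registered = 1;
    }

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        pthread_t t;
        int ok = 0;

        if (pthread_create(&t, NULL, worker, &ok) != 0)
            _exit(2);
        pthread_join(t, NULL);
        _exit(ok ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Built with -pthread and called from main(), child_can_rethread() should return 0. Turning postfork_child() into an empty function mimics the no-op unlock described above: the lock stays held in the child, the worker's trylock fails, and the function returns 1. Under Solution 2 none of the three hooks would exist, and the lock would start out free in the child provided no other thread held it at fork time.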