From owner-freebsd-hackers@FreeBSD.ORG Wed Mar 18 17:23:59 2009 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0AC5710656C8; Wed, 18 Mar 2009 17:23:59 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.terabit.net.ua (mail.terabit.net.ua [195.137.202.147]) by mx1.freebsd.org (Postfix) with ESMTP id 8BEFA8FC08; Wed, 18 Mar 2009 17:23:58 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from skuns.zoral.com.ua ([91.193.166.194] helo=mail.zoral.com.ua) by mail.terabit.net.ua with esmtps (TLSv1:AES256-SHA:256) (Exim 4.63 (FreeBSD)) (envelope-from ) id 1Ljyh1-000CL3-RU; Wed, 18 Mar 2009 18:32:35 +0200 Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id n2IGWNrM012767 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 18 Mar 2009 18:32:23 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3) with ESMTP id n2IGWNKj009409; Wed, 18 Mar 2009 18:32:23 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3/Submit) id n2IGWMt0009408; Wed, 18 Mar 2009 18:32:22 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 18 Mar 2009 18:32:22 +0200 From: Kostik Belousov To: Daniel Eischen Message-ID: <20090318163222.GE7716@deviant.kiev.zoral.com.ua> References: <4966F81C.3070406@elischer.org> <20090109163426.GC2825@green.homeunix.org> <49678BBC.8050306@elischer.org> <20090116211959.GA12007@green.homeunix.org> <49710BD6.7040705@FreeBSD.org> <20090120004135.GB12007@green.homeunix.org> <20090121230033.GC12007@green.homeunix.org> <20090122045637.GA61058@zim.MIT.EDU> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="N1GIdlSm9i+YlY4t" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua X-Virus-Scanned: mail.terabit.net.ua 1Ljyh1-000CL3-RU 04b5085783d7537d392265e86bde8f92 X-Terabit: YES Cc: David Schultz , hackers@freebsd.org, davidxu@freebsd.org, Jason Evans , Julian Elischer Subject: Re: threaded, forked, rethreaded processes will deadlock X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 Mar 2009 17:24:00 -0000 --N1GIdlSm9i+YlY4t Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Jan 22, 2009 at 12:42:56AM -0500, Daniel Eischen wrote: > On Wed, 21 Jan 2009, David Schultz wrote: >=20 > >I think there *is* a real bug here, but there's two distinct ways > >to fix it. When a threaded process forks, malloc acquires all its > >locks so that its state is consistent after a fork. However, the > >post-fork hook that's supposed to release these locks fails to do > >so in the child because the child process isn't threaded, and > >malloc_mutex_unlock() is optimized to be a no-op in > >single-threaded processes. If the child *stays* single-threaded, > >malloc() works by accident even with all the locks held because > >malloc_mutex_lock() is also a no-op in single-threaded processes. > >But if the child goes multi-threaded, then things break. > > > >Solution 1 is to actually unlock the locks in the child process, > >which is what Brian is proposing. > > > >Solution 2 is to take the position that all of this pre- and > >post-fork bloat in the fork() path is gratuitous and should be > >removed. The rationale here is that if you fork with multiple > >running threads, there's scads of ways in which the child's heap > >could be inconsistent; fork hooks would be needed not just in > >malloc(), but in stdio, third party libraries, etc. Why should > >malloc() be special? It's the programmer's job to quiesce all the > >threads before calling fork(), and if the programmer doesn't do > >this, then POSIX only guarantees that async-signal-safe functions > >will work. > > > >Note that Solution 2 also fixes Brian's problem if he quiesces all > >of his worker threads before forking (as he should!) With the > >pre-fork hook removed, all the locks will start out free in the > >child. So that's what I vote for... >=20 > The problem is that our own libraries (libthr included) > need to malloc() for themselves, even after a fork() in > the child. After a fork(), the malloc locks should be > reinitialized in the child if it was threaded, so that > our implementation actually works for all the async > signal calls, fork(), exec(), etc. I forget the exact > failure modes for very common cases, but if you remove > the re-initialization of the malloc locks, I'm sure > you will have problems. >=20 > Perhaps much of this malloc() stuff goes away when we > move to pthread locks that are not pointers to allocated > objects, but instead are actual objects/structures. > This needs to be done in order for mutexes/CVs/etc > to be PTHREAD_PROCESS_SHARED (placed in shared memory > and used by multiple processes). In other words, > pthread_mutex_t goes from this: >=20 > typedef struct pthread_mutex *pthread_mutex_t; >=20 > to something like this: >=20 > struct __pthread_mutex { > uint32_t lock; > ... > } > typedef struct __pthread_mutex pthread_mutex_t; >=20 > Same thing for CVs, and we probably should convert any other > locks used internally by libc/libpthread (spinlocks). >=20 > So after a fork(), there is no need to reallocate anything, > it can just be reinitialized if necessary. >=20 I looked at the issue once more recently, and I propose the following much less intrusive patch. It is somewhat hackish, but I think that it would be good to have this working. Most other Unixes do have working thread library after the fork. Any objections ? diff --git a/lib/libthr/thread/thr_fork.c b/lib/libthr/thread/thr_fork.c index bc410d1..ae6b9ad 100644 --- a/lib/libthr/thread/thr_fork.c +++ b/lib/libthr/thread/thr_fork.c @@ -173,14 +173,19 @@ _fork(void) /* Ready to continue, unblock signals. */=20 _thr_signal_unblock(curthread); =20 - if (unlock_malloc) + if (unlock_malloc) { + __isthreaded =3D 1; _malloc_postfork(); + __isthreaded =3D 0; + } =20 /* Run down atfork child handlers. */ TAILQ_FOREACH(af, &_thr_atfork_list, qe) { if (af->child !=3D NULL) af->child(); } + + THR_UMUTEX_UNLOCK(curthread, &_thr_atfork_lock); } else { /* Parent process */ errsave =3D errno; --N1GIdlSm9i+YlY4t Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (FreeBSD) iEYEARECAAYFAknBIhUACgkQC3+MBN1Mb4gSTwCeIIAdoAw9tSKhJ1ttiGe8LNwo 5zoAoOx5my0Upyo9shFZ1P/irQ60mREW =y+gs -----END PGP SIGNATURE----- --N1GIdlSm9i+YlY4t--