From owner-freebsd-hackers@FreeBSD.ORG Tue Aug 14 00:41:00 2012 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3FFDF1065672; Tue, 14 Aug 2012 00:41:00 +0000 (UTC) (envelope-from listlog2011@gmail.com) Received: from mail-yw0-f54.google.com (mail-yw0-f54.google.com [209.85.213.54]) by mx1.freebsd.org (Postfix) with ESMTP id CD7698FC16; Tue, 14 Aug 2012 00:40:59 +0000 (UTC) Received: by yhfs35 with SMTP id s35so4697606yhf.13 for ; Mon, 13 Aug 2012 17:40:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=Tfg4x9S1VhZZG/BaTbYZ8mh8Gnr67m/yUu/l7GepsZo=; b=y1XlGEFWtWaJVldM7bQCqZEZCgEEivIwcKBi19Q0yBI8LbGUYoJw2EfZau+Vp2KsWk O7bwUtJfJ0sMLdkmve/faPZh0cCN7I+hB8z3dSlC/48dbMYnJay1pO9l249gl9ok9qjw 5axZ6DR75ATu/tYpbJ0s3wAuo2R94CzMbkoIISR8FflnEwFKBWYHGjaHUhCdrXnCoILF fi+jvxMJ7jXwlH8yjPGGNX3fgOHDAhp2Wo84Akc5qfrafYphpqYVMuwFp7PnBXBdaoAA i4TRMTtAk0DkLvZwOZTeBGnTrezJONTCubiV5hR1Z0FRzMUMMOPNn/1cSX/fQ5zFa/Gw j7kA== Received: by 10.68.223.164 with SMTP id qv4mr35260310pbc.20.1344904858743; Mon, 13 Aug 2012 17:40:58 -0700 (PDT) Received: from xp5k.my.domain ([115.192.131.203]) by mx.google.com with ESMTPS id vh7sm308913pbc.22.2012.08.13.17.40.55 (version=SSLv3 cipher=OTHER); Mon, 13 Aug 2012 17:40:58 -0700 (PDT) Message-ID: <50299E95.80401@gmail.com> Date: Tue, 14 Aug 2012 08:40:53 +0800 From: David Xu User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120808 Thunderbird/13.0.1 MIME-Version: 1.0 To: Konstantin Belousov References: <20120730102408.GA19983@stack.nl> <20120730105303.GU2676@deviant.kiev.zoral.com.ua> <20120805215432.GA28704@stack.nl> <20120806082535.GI2676@deviant.kiev.zoral.com.ua> <20120809105648.GA79814@stack.nl> <20120809110850.GA2425@deviant.kiev.zoral.com.ua> <20120810101302.GF2425@deviant.kiev.zoral.com.ua> <5026F4B1.5030000@gmail.com> <20120813115021.GE2352@deviant.kiev.zoral.com.ua> In-Reply-To: <20120813115021.GE2352@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@freebsd.org, davidxu@freebsd.org, Jilles Tjoelker Subject: Re: system() using vfork() or posix_spawn() and libthr X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: davidxu@freebsd.org List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 00:41:00 -0000 On 2012/08/13 19:50, Konstantin Belousov wrote: > On Sun, Aug 12, 2012 at 08:11:29AM +0800, David Xu wrote: >> On 2012/08/10 18:13, Konstantin Belousov wrote: >>> On Thu, Aug 09, 2012 at 02:08:50PM +0300, Konstantin Belousov wrote: >>>> Third alternative, which seems to be even better, is to restore >>>> single-threading of the parent for vfork(). >> single-threading is slow for large threaded process, don't know if it >> is necessary for vfork(), POSIX says nothing about threaded process. > I agree that with both of your statements. But, being fast but allowing > silent process corruption is not good behaviour. Either we need to > actually support vfork() for threaded processes, or disable it with some > error code. > > I prefer to support it. I believe that vfork() should be wrapped by > libthr in the same way as fork() is wrapped. Not sure should we call > atfork handlers, for now I decided not to call, since the handlers > assume separate address spaces for parent/child. But we could only call > parent handler in child, however weird this sounds. > > The real complication with wrapping is the fact that we cannot return > from wrapper in child without destroying parent state. So I tried to > prototype the code to handle the wrapping in the same frame, thus > neccessity of using asm. > > Below is WIP, only for amd64 ATM. > > diff --git a/lib/libthr/arch/amd64/Makefile.inc b/lib/libthr/arch/amd64/Makefile.inc > index e6d99ec..476d26a 100644 > --- a/lib/libthr/arch/amd64/Makefile.inc > +++ b/lib/libthr/arch/amd64/Makefile.inc > @@ -1,3 +1,4 @@ > #$FreeBSD$ > > -SRCS+= pthread_md.c _umtx_op_err.S > +CFLAGS+=-I${.CURDIR}/../libc/${MACHINE_CPUARCH} > +SRCS+= pthread_md.c _umtx_op_err.S vfork.S > diff --git a/lib/libthr/arch/amd64/amd64/vfork.S b/lib/libthr/arch/amd64/amd64/vfork.S > new file mode 100644 > index 0000000..07d813d > --- /dev/null > +++ b/lib/libthr/arch/amd64/amd64/vfork.S > @@ -0,0 +1,74 @@ > +/*- > + * Copyright (c) 2012 Konstantin Belousov > + * Copyright (c) 1990 The Regents of the University of California. > + * All rights reserved. > + * > + * This code is derived from software contributed to Berkeley by > + * William Jolitz. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * 1. Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * 2. Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in the > + * documentation and/or other materials provided with the distribution. > + * 4. Neither the name of the University nor the names of its contributors > + * may be used to endorse or promote products derived from this software > + * without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > + * SUCH DAMAGE. > + */ > + > +#if defined(SYSLIBC_SCCS) && !defined(lint) > + .asciz "@(#)Ovfork.s 5.1 (Berkeley) 4/23/90" > +#endif /* SYSLIBC_SCCS and not lint */ > +#include > +__FBSDID("$FreeBSD$"); > + > +#include "SYS.h" > + > + .weak _vfork > + .set _vfork,__sys_vfork > + .weak vfork > + .set vfork,__sys_vfork > +ENTRY(__sys_vfork) > + call _thr_vfork_pre > + popq %rsi /* fetch return address (%rsi preserved) */ > + mov $SYS_vfork,%rax > + KERNCALL > + jb 2f > + cmpl $0,%eax > + jne 1f > + pushq %rsi > + pushq %rsi /* twice for stack alignment */ > + call _thr_vfork_post > + popq %rsi > + popq %rsi > + xorl %eax,%eax > +1: > + jmp *%rsi > +2: > + pushq %rsi > + pushq %rax > + call _thr_vfork_post > + call PIC_PLT(CNAME(__error)) > + popq %rdx > + movq %rdx,(%rax) > + movq $-1,%rax > + movq $-1,%rdx > + retq > +END(__sys_vfork) > + > + .section .note.GNU-stack,"",%progbits > diff --git a/lib/libthr/pthread.map b/lib/libthr/pthread.map > index 355edea..40d14b4 100644 > --- a/lib/libthr/pthread.map > +++ b/lib/libthr/pthread.map > @@ -157,6 +157,7 @@ FBSD_1.0 { > system; > tcdrain; > usleep; > + vfork; > wait; > wait3; > wait4; > diff --git a/lib/libthr/thread/Makefile.inc b/lib/libthr/thread/Makefile.inc > index ddde6e9..92e82ac 100644 > --- a/lib/libthr/thread/Makefile.inc > +++ b/lib/libthr/thread/Makefile.inc > @@ -55,4 +55,5 @@ SRCS+= \ > thr_switch_np.c \ > thr_symbols.c \ > thr_umtx.c \ > + thr_vfork.c \ > thr_yield.c > diff --git a/lib/libthr/thread/thr_private.h b/lib/libthr/thread/thr_private.h > index ba272fe..30f3de2 100644 > --- a/lib/libthr/thread/thr_private.h > +++ b/lib/libthr/thread/thr_private.h > @@ -777,6 +777,8 @@ int _thr_setscheduler(lwpid_t, int, const struct sched_param *) __hidden; > void _thr_signal_prefork(void) __hidden; > void _thr_signal_postfork(void) __hidden; > void _thr_signal_postfork_child(void) __hidden; > +void _thr_vfork_pre(void) __hidden; > +void _thr_vfork_post(void) __hidden; > void _thr_try_gc(struct pthread *, struct pthread *) __hidden; > int _rtp_to_schedparam(const struct rtprio *rtp, int *policy, > struct sched_param *param) __hidden; > @@ -833,6 +835,7 @@ pid_t __sys_getpid(void); > ssize_t __sys_read(int, void *, size_t); > ssize_t __sys_write(int, const void *, size_t); > void __sys_exit(int); > +pid_t __sys_vfork(void); > #endif > > static inline int > diff --git a/lib/libthr/thread/thr_vfork.c b/lib/libthr/thread/thr_vfork.c > new file mode 100644 > index 0000000..bed7db5 > --- /dev/null > +++ b/lib/libthr/thread/thr_vfork.c > @@ -0,0 +1,112 @@ > +/* > + * Copyright (c) 2012 Konstantin Belousov > + * Copyright (c) 2005 David Xu > + * Copyright (c) 2003 Daniel Eischen > + * All rights reserved. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * 1. Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * 2. Neither the name of the author nor the names of any co-contributors > + * may be used to endorse or promote products derived from this software > + * without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > + * SUCH DAMAGE. > + * > + * $FreeBSD$ > + */ > + > +/* > + * Copyright (c) 1995-1998 John Birrell > + * All rights reserved. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * 1. Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * 2. Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in the > + * documentation and/or other materials provided with the distribution. > + * 3. Neither the name of the author nor the names of any co-contributors > + * may be used to endorse or promote products derived from this software > + * without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY JOHN BIRRELL AND CONTRIBUTORS ``AS IS'' AND > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > + * SUCH DAMAGE. > + */ > + > +#include "namespace.h" > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include "un-namespace.h" > + > +#include "libc_private.h" > +#include "rtld_lock.h" > +#include "thr_private.h" > + > +static int rtld_locks[MAX_RTLD_LOCKS]; > +static int cancelsave, was_threaded; > + > +void > +_thr_vfork_pre(void) > +{ > + struct pthread *curthread; > + > + curthread = _get_curthread(); > + _thr_rwl_rdlock(&_thr_atfork_lock); > + cancelsave = curthread->no_cancel; > + curthread->no_cancel = 1; > + _thr_signal_block(curthread); > + _thr_signal_prefork(); > + if (_thr_isthreaded() != 0) { > + was_threaded = 1; > + _malloc_prefork(); > + _rtld_atfork_pre(rtld_locks); > + } else > + was_threaded = 0; > +} > + > +void > +_thr_vfork_post(void) > +{ > + struct pthread *curthread; > + > + _thr_signal_postfork(); > + if (was_threaded) { > + _rtld_atfork_post(rtld_locks); > + _malloc_postfork(); > + } > + curthread = _get_curthread(); > + _thr_signal_unblock(curthread); > + curthread->no_cancel = cancelsave; > + _thr_rwlock_unlock(&_thr_atfork_lock); > + if (curthread->cancel_async) > + _thr_testcancel(curthread); > +} > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index 6cb95cd..8735c25 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -150,11 +150,7 @@ sys_vfork(struct thread *td, struct vfork_args *uap) > int error, flags; > struct proc *p2; > > -#ifdef XEN > - flags = RFFDG | RFPROC; /* validate that this is still an issue */ > -#else > flags = RFFDG | RFPROC | RFPPWAIT | RFMEM; > -#endif > error = fork1(td, flags, 0, &p2, NULL, 0); > if (error == 0) { > td->td_retval[0] = p2->p_pid; > @@ -756,7 +752,7 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, > struct thread *td2; > struct vmspace *vm2; > vm_ooffset_t mem_charged; > - int error; > + int error, single_threaded; > static int curfail; > static struct timeval lastfail; > #ifdef PROCDESC > @@ -815,6 +811,19 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, > } > #endif > > + if (((p1->p_flag & (P_HADTHREADS | P_SYSTEM)) == P_HADTHREADS) && > + (flags & RFPPWAIT) != 0) { > + PROC_LOCK(p1); > + if (thread_single(SINGLE_BOUNDARY)) { > + PROC_UNLOCK(p1); > + error = ERESTART; > + goto fail2; > + } > + PROC_UNLOCK(p1); > + single_threaded = 1; > + } else > + single_threaded = 0; > + > mem_charged = 0; > vm2 = NULL; > if (pages == 0) > @@ -945,6 +954,12 @@ fail1: > if (vm2 != NULL) > vmspace_free(vm2); > uma_zfree(proc_zone, newproc); > + if (single_threaded) { > + PROC_LOCK(p1); > + thread_single_end(); > + PROC_UNLOCK(p1); > + } > +fail2: > #ifdef PROCDESC > if (((flags & RFPROCDESC) != 0) && (fp_procdesc != NULL)) { > fdclose(td->td_proc->p_fd, fp_procdesc, *procdescp, td); Single threading might be too excessive, the whole point of vfork is it is faster, if it is slow, I will use fork() instead. And another question is do you think child won't want to talk with another thread in parent process ? It is unlikely you can use locking in vfork wrapper, this becauses a vfork child can call vfork again, if child process dies after it acquired thread libraries' locks, it will cause lock leaks. Also, if a child process needs to enter RTLD to resolve a PLT, the rtld_bind rwlock will be acquired, if the child process dies after it acquired read lock, it also causes lock leak. I think Jilles patch is good enough to make posix_spawn and system() work, it seems Solaris also only allows SIG_IGN or SIG_DFL to be use in vfork child, same as Jilles' patch.