Date: Fri, 17 Aug 2012 15:43:12 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Jilles Tjoelker <jilles@stack.nl> Cc: freebsd-hackers@freebsd.org, davidxu@freebsd.org Subject: Re: system() using vfork() or posix_spawn() and libthr Message-ID: <20120817124312.GD33100@deviant.kiev.zoral.com.ua> In-Reply-To: <20120816223933.GA19925@stack.nl> References: <20120814081830.GA5883@deviant.kiev.zoral.com.ua> <502A1788.9090702@freebsd.org> <20120814094111.GB5883@deviant.kiev.zoral.com.ua> <502A6B7A.6070504@gmail.com> <20120814210911.GA90640@stack.nl> <502AE1D4.4060308@gmail.com> <20120815174942.GN5883@deviant.kiev.zoral.com.ua> <502C3D8B.4060008@gmail.com> <20120816114426.GR5883@deviant.kiev.zoral.com.ua> <20120816223933.GA19925@stack.nl>
next in thread | previous in thread | raw e-mail | index | archive | help
--yudcn1FV7Hsu/q59 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Aug 17, 2012 at 12:39:33AM +0200, Jilles Tjoelker wrote: > On Thu, Aug 16, 2012 at 02:44:26PM +0300, Konstantin Belousov wrote: > > My point is that the fact that fork() is called from dynamic context > > that was identified as the signal handler does not mean anything. > > It can be mis-identified for many reasons, which both me and you > > tried to partially enumerate above. >=20 > > The really important thing is the atomicity of the current context, > > e.g. synchronous SIGSEGV caused by a language runtime GC is very > > much safe place to call atfork handlers, since runtimes usually cause > > signal generations only at the safe place. >=20 > > I do not think that such approach as you described can be completed, > > the _Fork() seems the only robust way. >=20 > Agreed, that way (detecting signal handler) lies madness. >=20 > If need be, _Fork() is easily implemented and used. >=20 > > BTW, returning to Jilles proposal, can we call vfork() only for > > single-threaded parent ? I think it gives good boost for single-threaded > > case, and also eliminates the concerns of non-safety. >=20 > This would probably fix the safety issues but at a price. There is a > correlation between processes so large that they benefit greatly from > vfork and threaded processes. Ok, so I will continue with my patch. >=20 > On the other hand, I think direct calls to vfork() in applications are > risky and it may not be possible to support them safely in all > circumstances. However, if libc is calling vfork() such as via popen(), > system() or posix_spawn(), it should be possible even in a > multi-threaded process. In that case, the rtld and libthr problems can > be avoided by using aliases with hidden visibility for all functions the > vforked child needs to call (or any other method that prevents > interposition and hard-codes a displacement into libc.so). I do not see how using any aliases could help there. Basically, if mt process is not single-threaded for vfork, you can have both some parent thread and child enter rtld. This is complete mess. >=20 > There may still be a problem in programs that install signal handlers > because the set of async-signal-safe functions is larger than what can > be done in a vforked child. Userland can avoid this by masking affected > signals before calling vfork() and resetting them to SIG_DFL before > unmasking them. This will add many syscalls if the code does not know > which signals are affected (such as libc). Alternatively, the kernel > could map caught signals to the default action for processes with > P_PPWAIT (just like it does not stop such processes because of signals > or TTY job control). If rtld does not work, then any library function call from a signal handler is problematic. My goal is to get a working rtld and possibly malloc. BTW, not quite related, it seems that the placement of openat, setcontext and swapcontext in the pthread.map is wrong. openat is at FBSD_1.1 in libc, and *context are at FBSD_1.0 version, while libthr exports them at 1.2. App or library gets linked to arbitrary version depending on whether libphread was specified at link time, and interposition from libthr does not work. Below is the latest version of my patch for vfork, which survives (modified) tools/test/pthread_vfork_test. Patch is only for x86 right now. diff --git a/lib/libthr/arch/amd64/Makefile.inc b/lib/libthr/arch/amd64/Mak= efile.inc index e6d99ec..476d26a 100644 --- a/lib/libthr/arch/amd64/Makefile.inc +++ b/lib/libthr/arch/amd64/Makefile.inc @@ -1,3 +1,4 @@ #$FreeBSD$ =20 -SRCS+=3D pthread_md.c _umtx_op_err.S +CFLAGS+=3D-I${.CURDIR}/../libc/${MACHINE_CPUARCH} +SRCS+=3D pthread_md.c _umtx_op_err.S vfork.S diff --git a/lib/libthr/arch/amd64/amd64/vfork.S b/lib/libthr/arch/amd64/am= d64/vfork.S new file mode 100644 index 0000000..1f0714f --- /dev/null +++ b/lib/libthr/arch/amd64/amd64/vfork.S @@ -0,0 +1,77 @@ +/*- + * Copyright (c) 2012 Konstantin Belousov <kib@freebsd.org> + * Copyright (c) 1990 The Regents of the University of California. + * All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * William Jolitz. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 4. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURP= OSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENT= IAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STR= ICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY W= AY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <machine/asm.h> +__FBSDID("$FreeBSD$"); + +#include "SYS.h" + + .weak _vfork + .set _vfork,__sys_vfork + .weak vfork + .set vfork,__sys_vfork +ENTRY(__sys_vfork) + call _thr_vfork_pre + popq %rsi /* fetch return address (%rsi preserved) */ + mov $SYS_vfork,%rax + KERNCALL + jb 2f + cmpl $0,%eax + jne 1f + pushq %rsi + pushq %rsi /* twice for stack alignment */ + call CNAME(_thr_vfork_post_child) + popq %rsi + popq %rsi + xorl %eax,%eax + jmp *%rsi +1: + pushq %rsi + pushq %rax + call CNAME(_thr_vfork_post_child) + popq %rax + popq %rsi + jmp *%rsi +2: + pushq %rsi + pushq %rax + call CNAME(_thr_vfork_post_child) + call PIC_PLT(CNAME(__error)) + popq %rdx + movq %rdx,(%rax) + movq $-1,%rax + movq %rax,%rdx + retq +END(__sys_vfork) + + .section .note.GNU-stack,"",%progbits diff --git a/lib/libthr/arch/i386/Makefile.inc b/lib/libthr/arch/i386/Makef= ile.inc index 01290d5..13d55e2 100644 --- a/lib/libthr/arch/i386/Makefile.inc +++ b/lib/libthr/arch/i386/Makefile.inc @@ -1,3 +1,4 @@ # $FreeBSD$ =20 -SRCS+=3D pthread_md.c _umtx_op_err.S +CFLAGS+=3D-I${.CURDIR}/../libc/${MACHINE_CPUARCH} +SRCS+=3D pthread_md.c _umtx_op_err.S vfork.S diff --git a/lib/libthr/arch/i386/i386/vfork.S b/lib/libthr/arch/i386/i386/= vfork.S new file mode 100644 index 0000000..ca8896b --- /dev/null +++ b/lib/libthr/arch/i386/i386/vfork.S @@ -0,0 +1,82 @@ +/*- + * Copyright (c) 2012 Konstantin Belousov <kib@freebsd.org> + * Copyright (c) 1990 The Regents of the University of California. + * All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * William Jolitz. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 4. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURP= OSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENT= IAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STR= ICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY W= AY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <machine/asm.h> +__FBSDID("$FreeBSD$"); + +#include "SYS.h" + + .weak _vfork + .set _vfork,__sys_vfork + .weak vfork + .set vfork,__sys_vfork +ENTRY(__sys_vfork) + PIC_PROLOGUE + call PIC_PLT(CNAME(_thr_vfork_pre)) + PIC_EPILOGUE + popl %ecx /* my rta into ecx */ + mov $SYS_vfork,%eax + KERNCALL + jb error_ret + pushl %ecx + cmpl $0,%eax + jne parent_ret + PIC_PROLOGUE + call PIC_PLT(CNAME(_thr_vfork_post_child)) + PIC_EPILOGUE + popl %ecx + xorl %eax,%eax + jmp *%ecx +parent_ret: + pushl %eax + PIC_PROLOGUE + call PIC_PLT(CNAME(_thr_vfork_post_parent)) + PIC_EPILOGUE + popl %eax + popl %ecx + jmp *%ecx +error_ret: + pushl %ecx + pushl %eax + PIC_PROLOGUE + call PIC_PLT(CNAME(_thr_vfork_post_child)) + call PIC_PLT(CNAME(__error)) + PIC_EPILOGUE + popl %edx + movl %edx,(%eax) + movl $-1,%eax + movl %eax,%edx + retl +END(__sys_vfork) + + .section .note.GNU-stack,"",%progbits diff --git a/lib/libthr/pthread.map b/lib/libthr/pthread.map index 355edea..40d14b4 100644 --- a/lib/libthr/pthread.map +++ b/lib/libthr/pthread.map @@ -157,6 +157,7 @@ FBSD_1.0 { system; tcdrain; usleep; + vfork; wait; wait3; wait4; diff --git a/lib/libthr/thread/Makefile.inc b/lib/libthr/thread/Makefile.inc index ddde6e9..92e82ac 100644 --- a/lib/libthr/thread/Makefile.inc +++ b/lib/libthr/thread/Makefile.inc @@ -55,4 +55,5 @@ SRCS+=3D \ thr_switch_np.c \ thr_symbols.c \ thr_umtx.c \ + thr_vfork.c \ thr_yield.c diff --git a/lib/libthr/thread/thr_private.h b/lib/libthr/thread/thr_privat= e.h index ba272fe..951c3b8 100644 --- a/lib/libthr/thread/thr_private.h +++ b/lib/libthr/thread/thr_private.h @@ -757,6 +757,7 @@ void _thr_cancel_leave(struct pthread *, int) __hidden; void _thr_testcancel(struct pthread *) __hidden; void _thr_signal_block(struct pthread *) __hidden; void _thr_signal_unblock(struct pthread *) __hidden; +void _thr_signal_unblock_noref(struct pthread *) __hidden; void _thr_signal_init(void) __hidden; void _thr_signal_deinit(void) __hidden; int _thr_send_sig(struct pthread *, int sig) __hidden; @@ -777,6 +778,9 @@ int _thr_setscheduler(lwpid_t, int, const struct sched_= param *) __hidden; void _thr_signal_prefork(void) __hidden; void _thr_signal_postfork(void) __hidden; void _thr_signal_postfork_child(void) __hidden; +void _thr_vfork_pre(void) __hidden; +void _thr_vfork_post_child(void) __hidden; +void _thr_vfork_post_parent(void) __hidden; void _thr_try_gc(struct pthread *, struct pthread *) __hidden; int _rtp_to_schedparam(const struct rtprio *rtp, int *policy, struct sched_param *param) __hidden; @@ -833,6 +837,7 @@ pid_t __sys_getpid(void); ssize_t __sys_read(int, void *, size_t); ssize_t __sys_write(int, const void *, size_t); void __sys_exit(int); +pid_t __sys_vfork(void); #endif =20 static inline int diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c index 3dee8b7..3e5b25e 100644 --- a/lib/libthr/thread/thr_sig.c +++ b/lib/libthr/thread/thr_sig.c @@ -107,6 +107,13 @@ _thr_signal_unblock(struct pthread *curthread) __sys_sigprocmask(SIG_SETMASK, &curthread->sigmask, NULL); } =20 +void +_thr_signal_unblock_noref(struct pthread *curthread) +{ + if (curthread->sigblock =3D=3D 0) + __sys_sigprocmask(SIG_SETMASK, &curthread->sigmask, NULL); +} + int _thr_send_sig(struct pthread *thread, int sig) { diff --git a/lib/libthr/thread/thr_vfork.c b/lib/libthr/thread/thr_vfork.c new file mode 100644 index 0000000..80c9d1e --- /dev/null +++ b/lib/libthr/thread/thr_vfork.c @@ -0,0 +1,123 @@ +/* + * Copyright (c) 2012 Konstantin Belousov <kib@freebsd.org> + * Copyright (c) 2005 David Xu <davidxu@freebsd.org> + * Copyright (c) 2003 Daniel Eischen <deischen@freebsd.org> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Neither the name of the author nor the names of any co-contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURP= OSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENT= IAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STR= ICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY W= AY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +/* + * Copyright (c) 1995-1998 John Birrell <jb@cimlogic.com.au> + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the author nor the names of any co-contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY JOHN BIRRELL AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURP= OSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENT= IAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STR= ICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY W= AY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include "namespace.h" +#include <errno.h> +#include <link.h> +#include <string.h> +#include <stdlib.h> +#include <unistd.h> +#include <pthread.h> +#include <spinlock.h> +#include "un-namespace.h" + +#include "libc_private.h" +#include "rtld_lock.h" +#include "thr_private.h" + +static int rtld_locks[MAX_RTLD_LOCKS]; +static int cancelsave, was_threaded; + +void +_thr_vfork_pre(void) +{ + struct pthread *curthread; + + curthread =3D _get_curthread(); + _thr_rwl_wrlock(&_thr_atfork_lock); + cancelsave =3D curthread->no_cancel; + curthread->no_cancel =3D 1; + _thr_signal_block(curthread); + _thr_signal_prefork(); + if (_thr_isthreaded() !=3D 0) { + was_threaded =3D 1; + _malloc_prefork(); + _rtld_atfork_pre(rtld_locks); + } else + was_threaded =3D 0; +} + +void +_thr_vfork_post_child(void) +{ + struct pthread *curthread; + + _thr_signal_postfork(); + if (was_threaded) { + _rtld_atfork_post(rtld_locks); + _malloc_postfork(); + } + curthread =3D _get_curthread(); + _thr_signal_unblock(curthread); + curthread->no_cancel =3D cancelsave; + _thr_rwlock_unlock(&_thr_atfork_lock); + if (curthread->cancel_async) + _thr_testcancel(curthread); +} + +void +_thr_vfork_post_parent(void) +{ + struct pthread *curthread; + + curthread =3D _get_curthread(); + _thr_signal_unblock_noref(curthread); + if (curthread->cancel_async) + _thr_testcancel(curthread); +} diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index 46cdca1..134ba80 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -150,11 +150,7 @@ sys_vfork(struct thread *td, struct vfork_args *uap) int error, flags; struct proc *p2; =20 -#ifdef XEN - flags =3D RFFDG | RFPROC; /* validate that this is still an issue */ -#else flags =3D RFFDG | RFPROC | RFPPWAIT | RFMEM; -#endif =09 error =3D fork1(td, flags, 0, &p2, NULL, 0); if (error =3D=3D 0) { td->td_retval[0] =3D p2->p_pid; @@ -756,7 +752,7 @@ fork1(struct thread *td, int flags, int pages, struct p= roc **procp, struct thread *td2; struct vmspace *vm2; vm_ooffset_t mem_charged; - int error; + int error, single_threaded; static int curfail; static struct timeval lastfail; #ifdef PROCDESC @@ -815,6 +811,19 @@ fork1(struct thread *td, int flags, int pages, struct = proc **procp, } #endif =20 + if (((p1->p_flag & (P_HADTHREADS | P_SYSTEM)) =3D=3D P_HADTHREADS) && + (flags & RFPPWAIT) !=3D 0) { + PROC_LOCK(p1); + if (thread_single(SINGLE_BOUNDARY)) { + PROC_UNLOCK(p1); + error =3D ERESTART; + goto fail2; + } + PROC_UNLOCK(p1); + single_threaded =3D 1; + } else + single_threaded =3D 0; + mem_charged =3D 0; vm2 =3D NULL; if (pages =3D=3D 0) @@ -945,6 +954,12 @@ fail1: if (vm2 !=3D NULL) vmspace_free(vm2); uma_zfree(proc_zone, newproc); + if (single_threaded) { + PROC_LOCK(p1); + thread_single_end(); + PROC_UNLOCK(p1); + } +fail2: #ifdef PROCDESC if (((flags & RFPROCDESC) !=3D 0) && (fp_procdesc !=3D NULL)) { fdclose(td->td_proc->p_fd, fp_procdesc, *procdescp, td); --yudcn1FV7Hsu/q59 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAlAuPGAACgkQC3+MBN1Mb4hX7ACfT7TOl3/pYf8O6pqZCeGCNDBv 7V8AoMVz/suqGIqvU9J1hmiAJqcllF+Q =7/Wn -----END PGP SIGNATURE----- --yudcn1FV7Hsu/q59--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120817124312.GD33100>