Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 17 Aug 2012 18:08:47 +0200
From:      Jilles Tjoelker <jilles@stack.nl>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-hackers@freebsd.org, davidxu@freebsd.org
Subject:   Re: system() using vfork() or posix_spawn() and libthr
Message-ID:  <20120817160847.GA34417@stack.nl>
In-Reply-To: <20120817124312.GD33100@deviant.kiev.zoral.com.ua>
References:  <502A1788.9090702@freebsd.org> <20120814094111.GB5883@deviant.kiev.zoral.com.ua> <502A6B7A.6070504@gmail.com> <20120814210911.GA90640@stack.nl> <502AE1D4.4060308@gmail.com> <20120815174942.GN5883@deviant.kiev.zoral.com.ua> <502C3D8B.4060008@gmail.com> <20120816114426.GR5883@deviant.kiev.zoral.com.ua> <20120816223933.GA19925@stack.nl> <20120817124312.GD33100@deviant.kiev.zoral.com.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Aug 17, 2012 at 03:43:12PM +0300, Konstantin Belousov wrote:
> On Fri, Aug 17, 2012 at 12:39:33AM +0200, Jilles Tjoelker wrote:
> > On Thu, Aug 16, 2012 at 02:44:26PM +0300, Konstantin Belousov wrote:
> > > BTW, returning to Jilles proposal, can we call vfork() only for
> > > single-threaded parent ? I think it gives good boost for single-threaded
> > > case, and also eliminates the concerns of non-safety.

> > This would probably fix the safety issues but at a price. There is a
> > correlation between processes so large that they benefit greatly from
> > vfork and threaded processes.
> Ok, so I will continue with my patch.

> > On the other hand, I think direct calls to vfork() in applications are
> > risky and it may not be possible to support them safely in all
> > circumstances. However, if libc is calling vfork() such as via popen(),
> > system() or posix_spawn(), it should be possible even in a
> > multi-threaded process. In that case, the rtld and libthr problems can
> > be avoided by using aliases with hidden visibility for all functions the
> > vforked child needs to call (or any other method that prevents
> > interposition and hard-codes a displacement into libc.so).

> I do not see how using any aliases could help there. Basically, if mt
> process is not single-threaded for vfork, you can have both some parent
> thread and child enter rtld. This is complete mess.

If libc calls a function with hidden visibility, this proceeds directly
(not via the PLT) and rtld is not involved.

Several ways to do this are in section 2.2.7 Avoid Using Exported
Symbols of Ulrich Drepper's dsohowto. One of them is something like

extern __typeof(next) next_int
	__attribute((alias("next"), visibility("hidden")));

in the same source as the definition of the function

int next(void) { ...; }

As Drepper notes, the visibility attribute is not strictly required if a
version script keeps the symbol local but it might lead to better code.
At least on i386, though, the optimal direct near call instruction is
generated even without it. For example, _nsdispatch() calls
libc_dlopen() (kept local by the version script) without going through
the PLT (from the output of objdump -dS on the libc.so.7 in /usr/obj).

In the assembler syscall stubs using code from lib/libc/arch/SYS.h this
can be done by adding another .set (we currently have foo, _foo and
__sys_foo for a syscall foo; some syscalls have only _foo and
__sys_foo) such as __syshidden_foo. The ones that are actually used
then need prototypes (similar to the __sys_* ones in lib/libthr/?).

For some reason, the symbol .cerror (HIDENAME(cerror)) is also exported.
Either this should be kept local or a local uninterposable alias should
be added and used (as with the syscalls).

The function __error() (to get errno's address for the current thread)
is and must be called via the PLT (because libthr is separate).
Therefore, we should ensure it is called at least once before vfork so
calls in the child do not involve rtld. The implementations for the
various architectures use the TLS register (or memory location for ARM),
so they seem safe.

This should suffice to fix posix_spawn() but the _execvpe() used by
posix_spawnp also uses various string functions. If not all of these
have already been called earlier, this will not work. Making calls to
them not go through the PLT seems fairly hard, even though it would make
sense in general, so perhaps I should instead reimplement it such that
the parent does almost all of the work.

An alternative is to write the core of posix_spawn() in assembler using
system calls directly but I would rather avoid that :(

> > There may still be a problem in programs that install signal handlers
> > because the set of async-signal-safe functions is larger than what can
> > be done in a vforked child. Userland can avoid this by masking affected
> > signals before calling vfork() and resetting them to SIG_DFL before
> > unmasking them. This will add many syscalls if the code does not know
> > which signals are affected (such as libc). Alternatively, the kernel
> > could map caught signals to the default action for processes with
> > P_PPWAIT (just like it does not stop such processes because of signals
> > or TTY job control).

> If rtld does not work, then any library function call from a signal handler
> is problematic. My goal is to get a working rtld and possibly malloc.

> BTW, not quite related, it seems that the placement of openat,
> setcontext and swapcontext in the pthread.map is wrong. openat is
> at FBSD_1.1 in libc, and *context are at FBSD_1.0 version, while
> libthr exports them at 1.2. App or library gets linked to arbitrary
> version depending on whether libphread was specified at link time, and
> interposition from libthr does not work.

Oops :( Can this still be fixed (like by exporting identical functions
in multiple versions)?

> Below is the latest version of my patch for vfork, which survives (modified)
> tools/test/pthread_vfork_test. Patch is only for x86 right now.

Why does this patch call thread_single(SINGLE_BOUNDARY)? I think this
just causes breakage by causing short writes, [ERESTART] and the like
where not permitted by POSIX while not fixing userland's problems. From
userland's point of view, thread_single(SINGLE_BOUNDARY) just freezes
threads at whatever point they are executing, including while holding an
internal lock. Also, the thread_single_end() is before waiting for the
child to exec or exit (because that is now just before returning to
userland).

If single-threading is needed, it should be done in userland such as via
pthread_suspend_all_np(). This function already ensures no other thread
is in a libthr critical section. If rtld blocks SIGTHR during its
critical sections, it likely also waits for the threads to leave rtld.
As with kernel-level single-threading, this is slow and causes incorrect
short writes.

If single-threading is only necessary because umtx locks are formally
per-process instead of per-vmspace, then I suggest avoiding the trouble
and depending on the fact that they are per-vmspace.

> [snip]
> diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
> index 46cdca1..134ba80 100644
> --- a/sys/kern/kern_fork.c
> +++ b/sys/kern/kern_fork.c
> @@ -150,11 +150,7 @@ sys_vfork(struct thread *td, struct vfork_args *uap)
>  	int error, flags;
>  	struct proc *p2;
>  
> -#ifdef XEN
> -	flags = RFFDG | RFPROC; /* validate that this is still an issue */
> -#else
>  	flags = RFFDG | RFPROC | RFPPWAIT | RFMEM;
> -#endif		
>  	error = fork1(td, flags, 0, &p2, NULL, 0);
>  	if (error == 0) {
>  		td->td_retval[0] = p2->p_pid;
> @@ -756,7 +752,7 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp,
>  	struct thread *td2;
>  	struct vmspace *vm2;
>  	vm_ooffset_t mem_charged;
> -	int error;
> +	int error, single_threaded;
>  	static int curfail;
>  	static struct timeval lastfail;
>  #ifdef PROCDESC
> @@ -815,6 +811,19 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp,
>  	}
>  #endif
>  
> +	if (((p1->p_flag & (P_HADTHREADS | P_SYSTEM)) == P_HADTHREADS) &&
> +	    (flags & RFPPWAIT) != 0) {
> +		PROC_LOCK(p1);
> +		if (thread_single(SINGLE_BOUNDARY)) {
> +			PROC_UNLOCK(p1);
> +			error = ERESTART;
> +			goto fail2;
> +		}
> +		PROC_UNLOCK(p1);
> +		single_threaded = 1;
> +	} else
> +		single_threaded = 0;
> +
>  	mem_charged = 0;
>  	vm2 = NULL;
>  	if (pages == 0)
> @@ -945,6 +954,12 @@ fail1:
>  	if (vm2 != NULL)
>  		vmspace_free(vm2);
>  	uma_zfree(proc_zone, newproc);
> +	if (single_threaded) {
> +		PROC_LOCK(p1);
> +		thread_single_end();
> +		PROC_UNLOCK(p1);
> +	}
> +fail2:
>  #ifdef PROCDESC
>  	if (((flags & RFPROCDESC) != 0) && (fp_procdesc != NULL)) {
>  		fdclose(td->td_proc->p_fd, fp_procdesc, *procdescp, td);

-- 
Jilles Tjoelker



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120817160847.GA34417>