Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 11 Aug 2012 15:10:48 +0200
From:      Jilles Tjoelker <jilles@stack.nl>
To:        David Xu <davidxu@freebsd.org>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-hackers@freebsd.org
Subject:   Re: system() using vfork() or posix_spawn() and libthr
Message-ID:  <20120811131048.GA29572@stack.nl>
In-Reply-To: <50246EE4.9090409@freebsd.org>
References:  <20120730102408.GA19983@stack.nl> <20120730105303.GU2676@deviant.kiev.zoral.com.ua> <20120805215432.GA28704@stack.nl> <20120806082535.GI2676@deviant.kiev.zoral.com.ua> <20120809105648.GA79814@stack.nl> <50246EE4.9090409@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Aug 10, 2012 at 10:16:04AM +0800, David Xu wrote:
> On 2012/08/09 18:56, Jilles Tjoelker wrote:
> > On Mon, Aug 06, 2012 at 11:25:35AM +0300, Konstantin Belousov wrote:
> >> On Sun, Aug 05, 2012 at 11:54:32PM +0200, Jilles Tjoelker wrote:
> >>> On Mon, Jul 30, 2012 at 01:53:03PM +0300, Konstantin Belousov wrote:
> >>>> On Mon, Jul 30, 2012 at 12:24:08PM +0200, Jilles Tjoelker wrote:
> >>>>> People sometimes use system() from large address spaces where it would
> >>>>> improve performance greatly to use vfork() instead of fork().

> >>>>> A simple approach is to change fork() to vfork(), although I have not
> >>>>> tried this. It seems safe enough to use sigaction and sigprocmask system
> >>>>> calls in the vforked process.

> >>>>> Alternatively, we can have posix_spawn() do the vfork() with signal
> >>>>> changes. This avoids possible whining from compilers and static
> >>>>> analyzers about using vfork() in system.c. However, I do not like the
> >>>>> tricky code for signals and that it adds lines of code.

> >>>>> This is lightly tested.

> >>>> It is interesting to note that for some time our vfork(2) no longer
> >>>> stops the whole forked process (parent), only the forking thread is
> >>>> waiting for the child exit or exec. I am not sure is this point
> >>>> important for system(3), but determined code can notice the difference
> >>>> from the fork->vfork switch.

> >>> Neither fork nor vfork call thread_single(SINGLE_BOUNDARY), so this is
> >>> not a difference.
> >> It is the difference, because vforked child shares parent address space.

> >>> Thread singling may be noticeable from a failing execve() (but only in
> >>> the process doing execve()) and in the rare case of rfork() without
> >>> RFPROC.

> >> No, other running threads in parent affect vforked child till exec or exit.
> >> In fact, I would classify this as bug, but not a serious one.

> > There are some ugly ways this parallel execution is depended on. If the
> > vforked child calls sigaction() while another thread is also in
> > sigaction() for that signal, the vforked child needs to wait for the
> > other thread to release the lock.

> > This uses a per-process lock to synchronize threads in different
> > processes, which may not work properly.

> > If the vforked child is killed (such as by SIGKILL) while holding the
> > lock, the parent is not killed but its _thr_sigact is damaged.

> > These problems could be avoided in libthr by skipping the lock in
> > _sigaction() if a signal action is being set to SIG_DFL or SIG_IGN and
> > the old action is not queried. In those cases, _thr_sigact is not
> > touched so no lock is required. This change also helps applications,
> > provided they call sigaction() and not signal().

> > Alternatively, posix_spawn() and system() could use the sigaction system
> > call directly, bypassing libthr (if present). However, this will not
> > help applications that call vfork() and sigaction() themselves (such as
> > a shell that wants to implement ...&  using vfork()).

> > posix_spawn() likely still needs some adjustment so that having it reset
> > all signals (sigfillset()) to the default action will not cause it to
> > [EINVAL] because libthr does not allow changing SIGTHR's disposition.

> > Index: lib/libthr/thread/thr_sig.c
> > ===================================================================
> > --- lib/libthr/thread/thr_sig.c	(revision 238970)
> > +++ lib/libthr/thread/thr_sig.c	(working copy)
> > @@ -519,8 +519,16 @@
> >   		return (-1);
> >   	}
> >
> > -	if (act)
> > +	if (act) {
> >   		newact = *act;
> > +		/*
> > +		 * Short-circuit cases where we do not touch _thr_sigact.
> > +		 * This allows performing these safely in a vforked child.
> > +		 */
> > +		if ((newact.sa_handler == SIG_DFL ||
> > +		    newact.sa_handler == SIG_IGN)&&  oact == NULL)
> > +			return (__sys_sigaction(sig,&newact, NULL));
> > +	}
> >
> >   	__sys_sigprocmask(SIG_SETMASK,&_thr_maskset,&oldset);
> >   	_thr_rwl_wrlock(&_thr_sigact[sig-1].lock);
> >

> Your patch is better than nothing, I don't object.
> The problem is visble to you, but there is also invisible user - rtld.
> If a symbol is never used in parent process, but now it is used
> in a vforked child, the rtld will be involved, if the child
> is killed, the rtld's data structure may be in inconsistent state,
> such as locking or link list etcs...
> I think this problem might be a non-fixable problem.

Hmm. Rtld cannot be fixed like libthr because its data structures are
inherently in userland.

Perhaps signal handling should be different for a vforked child, like
the default action of a signal sent to a thread affects the entire
process and not just the thread. This cannot be implemented in the
calling code because resolving execve() itself also needs rtld (ugly
hacks like performing an execve() call that is guaranteed to fail
aside).

The rtld problem can be avoided specifically by linking with '-z now'.
This might be acceptable for sh and csh; most applications can use
posix_spawn() which would have to become a system call.

-- 
Jilles Tjoelker



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120811131048.GA29572>