Date: Wed, 10 Feb 2010 22:36:20 +0100
From: Jilles Tjoelker <jilles@stack.nl>
To: Naveen Gujje <gujjenaveen@gmail.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: System() returning ECHILD error on FreeBSD 7.2
Message-ID: <20100210213620.GA94346@stack.nl>
In-Reply-To: <39c945731002092314u4a8fd100q69c0735a11e9063a@mail.gmail.com>
References: <39c945731002092314u4a8fd100q69c0735a11e9063a@mail.gmail.com>

On Wed, Feb 10, 2010 at 12:44:57PM +0530, Naveen Gujje wrote:
> [SIGCHLD handler that calls waitpid()]
> And, in some other part of the code, we call system() to add an ethernet
> interface. This system() call is returning -1 with errno set to ECHILD,
> though the passed command is executed successfully.  I have noticed that,
> the problem is observed only after we register SigChildHandler. If I have a
> simple statement like system("ls") before and after the call to
> signal(SIGCHLD, SigChildHandler), the call before setting signal handler
> succeeds without errors and the call after setting signal handler returns -1
> with errno set to ECHILD.
> Here, I believe that within the system() call, the child exited before the
> parent got a chance to call _wait4 and thus resulted in ECHILD error. But,
> for the child to exit without notifying the parent, SIGCHLD has to be set to
> SIG_IGN in the parent and this is not the case, because we are already
> setting it to SigChildHandler. If I set SIGCHLD to SIG_DFL before calling
> system() then I don't see this problem.
> I would like to know how setting SIGCHLD to SIG_DFL or SigChildHandler is
> making the difference.
I think your process is multi-threaded. In a single-threaded process,
system()'s signal masking will ensure it will reap the zombie, leaving
the signal handler with nothing (in fact, as of FreeBSD 7.0 it will not
be called at all unless there are other child processes).
In a multi-threaded process, each thread has its own signal mask and
system() can only affect its own thread's signal mask. If another thread
has SIGCHLD unblocked, the signal handler will race with system() trying
to call waitpid() first.
Possible fixes are: using siginfo_t information to waitpid() only for
child processes you know about; setting up the signal masks so the bad
situation does not occur (note that the signal mask is inherited across
pthread_create()); or calling fork()/execve() and managing the child
process exit yourself.
Note that POSIX does not require system() to be thread-safe.
-- 
Jilles Tjoelker
