Date: Wed, 10 Feb 2010 09:52:06 -0800
From: Garrett Cooper <yanefbsd@gmail.com>
To: Naveen Gujje <gujjenaveen@gmail.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: System() returning ECHILD error on FreeBSD 7.2
Message-ID: <7d6fde3d1002100952g1518bc36r371020260e81a8c3@mail.gmail.com>
In-Reply-To: <39c945731002100925i2e466768peac89cdef15463f2@mail.gmail.com>
References: <39c945731002100925i2e466768peac89cdef15463f2@mail.gmail.com>

On Wed, Feb 10, 2010 at 9:25 AM, Naveen Gujje <gujjenaveen@gmail.com> wrote:
> Naveen Gujje <gujjenaveen at gmail.com> wrote:
>  >> signal(SIGCHLD, SigChildHandler);
>  >>
>  >> void
>  >> SigChildHandler(int sig)
>  >> {
>  >>   pid_t pid;
>  >>
>  >>   /* get status of all dead procs */
>  >>   do {
>  >>     int procstat;
>  >>     pid = waitpid(-1, &procstat, WNOHANG);
>  >>     if (pid < 0) {
>  >>       if (errno == EINTR)
>  >>         continue;               /* ignore it */
>  >>       else {
>  >>         if (errno != ECHILD)
>  >>           perror("getting waitpid");
>  >>         pid = 0;                /* break out */
>  >>       }
>  >>     }
>  >>     else if (pid != 0)
>  >>       syslog(LOG_INFO, "child process %d completed", (int) pid);
>  >>   } while (pid);
>  >>
>  >>   signal(SIGCHLD, SigChildHandler);
>  >> }
>
> >>There are several problems with your signal handler.
>
> >>First, the perror() and syslog() functions are not re-entrant,
> >>so they should not be used inside signal handlers.  This can
> >>lead to undefined behaviour.  Please refer to the sigaction(2)
> >>manual page for a list of functions that are considered safe
> >>to be used inside signal handlers.
>
> >>Second, you are using functions that may change the value of
> >>the global errno variable.  Therefore you must save its value
> >>at the beginning of the signal handler, and restore it at the
> >>end.
>
> >>Third (not a problem in this particular case, AFAICT, but
> >>still good to know):  Unlike SysV systems, BSD systems do
> >>_not_ automatically reset the signal action when the handler
> >>is called.  Therefore you do not have to call signal() again
> >>in the handler (but it shouldn't hurt either).
> >>Because of the semantic difference of the signal() function on
> >>different systems, it is preferable to use sigaction(2) instead in
> >>portable code.
>
> Okay, I followed your suggestion and changed my SigChildHandler to
>
> void
> SigChildHandler(int sig)
> {
>   pid_t pid;
>   int status;
>   int saved_errno = errno;
>
>   while (((pid = waitpid( (pid_t) -1, &status, WNOHANG)) > 0) ||
>          ((-1 == pid) && (EINTR == errno)))
>     ;
>
>   errno = saved_errno;
> }
>
> and used sigaction(2) to register this handler. Still, system(3) returns
> -1 with errno set to ECHILD.
>
>  >> And, in some other part of the code, we call system() to add an ethernet
>  >> interface. This system() call is returning -1 with errno set to ECHILD,
>  >> though the passed command is executed successfully.  I have noticed that
>  >> the problem is observed only after we register SigChildHandler. If I have a
>  >> simple statement like system("ls") before and after the call to
>  >> signal(SIGCHLD, SigChildHandler), the call before setting signal handler
>  >> succeeds without errors and the call after setting signal handler returns -1
>  >> with errno set to ECHILD.
>  >>
>  >> Here, I believe that within the system() call, the child exited before the
>  >> parent got a chance to call _wait4 and thus resulted in ECHILD error.
>
> >>I don't think that can happen.
>
>  >> But, for the child to exit without notifying the parent, SIGCHLD has to be
>  >> set to SIG_IGN in the parent and this is not the case, because we are already
>  >> setting it to SigChildHandler. If I set SIGCHLD to SIG_DFL before calling
>  >> system() then I don't see this problem.
>  >>
>  >> I would like to know how setting SIGCHLD to SIG_DFL or SigChildHandler is
>  >> making the difference.
>
> >>The system() function temporarily blocks SIGCHLD (i.e. it
> >>adds the signal to the process' signal mask).
> >>However, blocking is different from ignoring:  The signal is held
> >>as long as it is blocked, and as soon as it is removed
> >>from the mask, it is delivered, i.e. your signal handler
> >>is called right before the system() function returns.
>
> Yes, I agree with you. Here, I believe, the point in blocking SIGCHLD
> is to give preference to wait4() of system() over any other waitXXX() in
> parent process. But I still can't get the reason for wait4() to return -1.
>
> >>And since you don't save the errno value, your signal
> >>handler overwrites the value returned from the system()
> >>function.  So you get ECHILD.
>
> I had a debug print just after wait4() in system() and before we unblock
> SIGCHLD. And it's clear that wait4() is returning -1 with errno as ECHILD.

Isn't this section of the system(3) libcall essentially doing what you
want, such that you'll never be able to get the process status when you
call waitpid(2)?

        do {
                pid = _wait4(savedpid, &pstat, 0, (struct rusage *)0);
        } while (pid == -1 && errno == EINTR);
        break;

You typically get status via wait*(2) when using exec*(2) or via the
return codes from system(3), not system(3) with wait*(2)...

Thanks,
-Garrett
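
For reference, here is a minimal sketch in C of the pieces discussed in this thread, under stated assumptions. It shows an errno-preserving SIGCHLD handler registered with sigaction(2), the workaround Naveen reported (SIGCHLD at SIG_DFL around the system(3) call), and decoding the value system(3) returns instead of calling wait*(2) for that child. The names sigchld_reaper, install_sigchld_reaper, run_with_default_sigchld and exit_code_of are hypothetical and do not appear in the thread. SA_RESTART is also an assumption, since the thread does not say which flags were used.

```c
/* Assumption: a POSIX.1-2008 environment; request it before any include. */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical handler: reap every exited child using only
 * async-signal-safe calls, and leave errno as we found it. */
static void sigchld_reaper(int sig)
{
    int saved_errno = errno;
    pid_t pid;

    (void)sig;
    do {
        pid = waitpid((pid_t)-1, NULL, WNOHANG);
    } while (pid > 0 || (pid == -1 && errno == EINTR));

    errno = saved_errno;
}

/* Hypothetical helper: register the handler with sigaction(2).
 * Returns 0 on success, -1 on failure (errno set by sigaction). */
int install_sigchld_reaper(void)
{
    struct sigaction sa;

    sa.sa_handler = sigchld_reaper;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* assumption, not stated in the thread */
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical helper: run cmd through system(3) with SIGCHLD temporarily
 * at SIG_DFL, then restore the previous disposition. cmd must be non-NULL.
 * Returns whatever system(3) returned; errno is preserved from that call. */
int run_with_default_sigchld(const char *cmd)
{
    struct sigaction dfl, saved;
    int status, saved_errno;

    assert(cmd != NULL);

    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);
    dfl.sa_flags = 0;
    if (sigaction(SIGCHLD, &dfl, &saved) == -1)
        return -1;

    status = system(cmd);
    saved_errno = errno;

    (void)sigaction(SIGCHLD, &saved, NULL);
    errno = saved_errno;
    return status;
}

/* Hypothetical helper: turn the value returned by system(3) into the
 * command's exit code, or -1 if it failed or did not exit normally. */
int exit_code_of(int status)
{
    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

A caller would do install_sigchld_reaper() once at startup and then use exit_code_of(run_with_default_sigchld(cmd)) wherever system() was called directly. One caveat with this sketch: any other child that exits while SIGCHLD is at SIG_DFL may not trigger the handler, so it may stay unreaped until a later SIGCHLD arrives or the program calls waitpid(2) itself.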
