From: Garrett Cooper
To: Naveen Gujje
Cc: freebsd-hackers@freebsd.org
Date: Wed, 10 Feb 2010 09:52:06 -0800
Message-ID: <7d6fde3d1002100952g1518bc36r371020260e81a8c3@mail.gmail.com>
In-Reply-To: <39c945731002100925i2e466768peac89cdef15463f2@mail.gmail.com>
Subject: Re: System() returning ECHILD error
 on FreeBSD 7.2

On Wed, Feb 10, 2010 at 9:25 AM, Naveen Gujje wrote:
> Naveen Gujje wrote:
>  >> signal(SIGCHLD, SigChildHandler);
>  >>
>  >> void
>  >> SigChildHandler(int sig)
>  >> {
>  >>   pid_t pid;
>  >>
>  >>   /* get status of all dead procs */
>  >>   do {
>  >>     int procstat;
>  >>     pid = waitpid(-1, &procstat, WNOHANG);
>  >>     if (pid < 0) {
>  >>       if (errno == EINTR)
>  >>         continue;               /* ignore it */
>  >>       else {
>  >>         if (errno != ECHILD)
>  >>           perror("getting waitpid");
>  >>         pid = 0;                /* break out */
>  >>       }
>  >>     }
>  >>     else if (pid != 0)
>  >>       syslog(LOG_INFO, "child process %d completed", (int) pid);
>  >>   } while (pid);
>  >>
>  >>   signal(SIGCHLD, SigChildHandler);
>  >> }
>
> >>There are several problems with your signal handler.
>
> >>First, the perror() and syslog() functions are not re-entrant,
> >>so they should not be used inside signal handlers.  This can
> >>lead to undefined behaviour.  Please refer to the sigaction(2)
> >>manual page for a list of functions that are considered safe
> >>to be used inside signal handlers.
>
> >>Second, you are using functions that may change the value of
> >>the global errno variable.  Therefore you must save its value
> >>at the beginning of the signal handler, and restore it at the
> >>end.
>
> >>Third (not a problem in this particular case, AFAICT, but
> >>still good to know):  Unlike SysV systems, BSD systems do
> >>_not_ automatically reset the signal action when the handler
> >>is called.  Therefore you do not have to call signal() again
> >>in the handler (but it shouldn't hurt either).  Because of
> >>the semantic difference of the signal() function on different
> >>systems, it is preferable to use sigaction(2) instead in
> >>portable code.
>
> Okay, I followed your suggestion and changed my SigChildHandler to
>
> void
> SigChildHandler(int sig)
> {
>  pid_t pid;
>  int status;
>  int saved_errno = errno;
>
>  while (((pid = waitpid( (pid_t) -1, &status, WNOHANG)) > 0) ||
>         ((-1 == pid) && (EINTR == errno)))
>    ;
>
>  errno = saved_errno;
> }
>
> and used sigaction(2) to register this handler. Still, system(3) returns
> -1 with errno set to ECHILD.
>
>  >> And, in some other part of the code, we call system() to add an ethernet
>  >> interface. This system() call is returning -1 with errno set to ECHILD,
>  >> though the passed command is executed successfully.  I have noticed that,
>  >> the problem is observed only after we register SigChildHandler. If I have a
>  >> simple statement like system("ls") before and after the call to
>  >> signal(SIGCHLD, SigChildHandler), the call before setting signal handler
>  >> succeeds without errors and the call after setting signal handler returns -1
>  >> with errno set to ECHILD.
>  >>
>  >> Here, I believe that within the system() call, the child exited before the
>  >> parent got a chance to call _wait4 and thus resulted in ECHILD error.
>
> >>I don't think that can happen.
>
>  >> But, for the child to exit without notifying the parent, SIGCHLD has to be
>  >> set to SIG_IGN in the parent and this is not the case, because we are already
>  >> setting it to SigChildHandler.
>  >> If I set SIGCHLD to SIG_DFL before calling
>  >> system() then I don't see this problem.
>  >>
>  >> I would like to know how setting SIGCHLD to SIG_DFL or SigChildHandler is
>  >> making the difference.
>
> >>The system() function temporarily blocks SIGCHLD (i.e. it
> >>adds the signal to the process' signal mask).  However,
> >>blocking is different from ignoring:  The signal is held
> >>as long as it is blocked, and as soon as it is removed
> >>from the mask, it is delivered, i.e. your signal handler
> >>is called right before the system() function returns.
>
> Yes, I agree with you. Here, I believe, the point in blocking SIGCHLD
> is to give preference to wait4() of system() over any other waitXXX() in
> the parent process. But I still can't get the reason for wait4() to return -1.
>
> >>And since you don't save the errno value, your signal
> >>handler overwrites the value returned from the system()
> >>function.  So you get ECHILD.
>
> I had a debug print just after wait4() in system() and before we unblock
> SIGCHLD. And it's clear that wait4() is returning -1 with errno as ECHILD.

Isn't this section of the system(3) libcall essentially doing what you
want, such that you'll never be able to get the process status when you
call waitpid(2)?

        do {
                pid = _wait4(savedpid, &pstat, 0, (struct rusage *)0);
        } while (pid == -1 && errno == EINTR);
        break;

You typically get status via wait*(2) when using exec*(2) or via the
return codes from system(3), not system(3) with wait*(2)...

Thanks,
-Garrett