Date:      08 Dec 1997 19:36 EST
From:      "Andrew Atrens" <atrens@nortel.ca>
To:        hackers@FreeBSD.ORG
Subject:   possible linux emulation bug (was: dealing with zombies)
Message-ID:  <199712090607.WAA02766@hub.freebsd.org>

Hi again folks!

Rummaging around my mailbox, I found a hackers thread from Sept 01 to
Sept 07 on this topic. Specifically, on Sept 01, Greg Lehey writes:


> The semantics of SIGC[H]LD differ greatly between System V and BSD.
> Here's a quote from "Porting UNIX Software", page 213:
> 
>  System V treats the death of a child differently from other
>  implementations: The System V signal SIGCLD differs from the BSD and
>  POSIX.1 signal SIGCHLD and from all other signals by remaining active
>  until you call wait.  This can cause infinite recursion in the signal
>  handler if you reinstate the signal via signal or sigset before
>  calling wait.  If you use the POSIX.1 sigaction call, you don't have
>  to worry about this problem.
> 
>  When a child dies, it becomes a zombie.  As all voodoo fans know, a
>  zombie is one of the Living Dead, neither alive nor dead.  In UNIX
>  terminology, when a child process dies it becomes a zombie: the text
>  and data segments are freed, and the files are closed, but the
>  process table entry and some other information remain until it is
>  exorcized by the parent process, which is done by calling wait.  By
>  default, System V ignores SIGCLD and SIGCHLD, but the system creates
>  zombies, so you can find out about child status by calling wait.  If,
>  however, you change the default to explicitly ignore the signal, the
>  system ignores SIGCHLD and SIGCLD, but it also no longer creates
>  zombie processes.  If you set the disposition of SIGCHLD and SIGCLD
>  to ignore, but you call wait anyway, it waits until all child
>  processes have terminated, and then returns -1 (error), with errno
>  set to ECHILD.  You can achieve the same effect with sigaction by
>  specifying the SA_NOCLDWAIT flag in sa_flags.  There is no way to
>  achieve this behaviour in other versions of UNIX: if you find your
>  ported program is collecting zombies (which you will see with the ps
>  program), it might be that the program uses this feature to avoid
>  having to call wait.  If you experience this problem, you can solve
>  it by adding a signal handler for SIGCLD that just calls wait and
>  returns.
> 
>  The signal number for SIGCLD is the same as for SIGCHLD.  The
>  semantics depend on how you enable it: if you enable it with signal,
>  you get SIGCLD semantics (and unreliable signals), and if you enable
>  it with sigaction you get SIGCHLD and reliable signals.  Don't rely
>  on this, however.  Some versions of System V have special coding to
>  ensure that a separate SIGCLD signal is delivered for each child that
>  dies.
> 
> Greg


---


Given that Linux is in many ways SysV-ish, would this imply that my
WordPerfect zombies are the result of an implementation inconsistency
in the Linux emulation code?



In message "dealing with zombies", I wrote:

> Hi All,
> 
> I'm eval'ing Wordperfect-7.0-for-Linux on `FreeBSD 3.0-971012-SNAP'  and am
> seeing lots of zombies:
> 
>  1446  ??  Z      0:00.00  (xwpthes)
>  1460  ??  Z      0:00.00  (xwpgmk5)
>  1461  ??  Z      0:00.00  (xwpspell)
>  1462  ??  Z      0:00.00  (xwpspell)
>  1467  ??  Z      0:00.00  (wpp7)
>  1471  ??  Z      0:00.00  (xwpspell)
>  1472  ??  Z      0:00.00  (xwpspell)
>  1473  ??  Z      0:00.00  (xwpspell)
>  1513  ??  Z      0:00.00  (xwpthes)
>  1514  ??  Z      0:00.00  (xwpthes)
>  1531  ??  Z      0:00.00  (xwpthes)
>  1532  ??  Z      0:00.00  (xwpthes)
>  1533  ??  Z      0:00.00  (xwpthes)
>  1543  ??  Z      0:00.00  (xwpthes)
>  1551  ??  Z      0:00.00  (xwpthes)
>  
> As I understand it, the root cause is that (xwp) is failing to reap dead
> children. However, the children *do* get reaped when I exit xwp... it seems
> that as long as xwp is running, no reaping is done, and the zombies
> accumulate... :(
> 
> What I'm wondering is:
> 
> i.   Is this a fault/feature of the app (xwp), the Linux emulation code, or
>      the kernel (in a larger sense), and
> ii.  are there any *good* workarounds? (I seem to recall some discussion about
>      a tunable kernel parm for auto-reap or some such thing?)
> 
> 
> Cheers,
> 
> Andrew.
> 


