Date:      Thu, 30 Sep 2004 20:29:32 -0500
From:      Dan Nelson <dnelson@allantgroup.com>
To:        Jason Barnes <jbarnes@c3po.barnesos.net>
Cc:        questions@freebsd.org
Subject:   Re: process will not die.
Message-ID:  <20041001012932.GH22530@dan.emsphone.com>
In-Reply-To: <20040930160527.A58465@c3po.barnesos.net>
References:  <20040930160527.A58465@c3po.barnesos.net>

In the last episode (Sep 30), Jason Barnes said:
> 	While running an mpirun job on my dual-processor SMP system
> (FreeBSD 4-STABLE from August 28), my program (initiated with the
> command line 'mpirun -np 2 ../sphagr') periodically dies, leaving a
> process that I can't kill -9.  Here's the top:
> 
> 	here's ps -auxw | grep sph:
> 
> jbarnes   549  0.0  8.7 410076 90744  p2  R     3:39PM   3:01.97 sphagr -p4pg /usr/home/
> jbarnes   550  0.0  0.0     0    0  p2  Z     3:39PM   0:00.00  (sphagr)
> 
> 	The 550 process I kill -9ed, but it's still there, and now when I
> try to kill it, it says 'no such process'.

Processes in the Z (zombie) state have already exited, but their parent
process has not retrieved their status with one of the wait*()
functions.  That is why kill -9 has no effect: there is no running
process left to signal.  The entry in the process table will stay
until the parent collects the status.  You can run "ps axlp 550" and
look at the PPID column to determine the parent's pid.  The parent code
needs to either wait() for the child status, or if it doesn't need to
know when the child exits, ignore SIGCHLD or set the SA_NOCLDWAIT flag
with sigaction().

-- 
	Dan Nelson
	dnelson@allantgroup.com
