From owner-freebsd-stable@FreeBSD.ORG  Thu Feb 16 13:58:54 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id DA01216A420
	for <freebsd-stable@freebsd.org>; Thu, 16 Feb 2006 13:58:54 +0000 (GMT)
	(envelope-from gavin.atkinson@ury.york.ac.uk)
Received: from mail-gw1.york.ac.uk (mail-gw1.york.ac.uk [144.32.128.246])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 6C5B343D8B
	for <freebsd-stable@freebsd.org>; Thu, 16 Feb 2006 13:58:53 +0000 (GMT)
	(envelope-from gavin.atkinson@ury.york.ac.uk)
Received: from buffy.york.ac.uk (buffy-128.york.ac.uk [144.32.128.160])
	by mail-gw1.york.ac.uk (8.12.10/8.12.10) with ESMTP id k1GDwE67019546; 
	Thu, 16 Feb 2006 13:58:29 GMT
Received: from buffy.york.ac.uk (localhost [127.0.0.1])
	by buffy.york.ac.uk (8.13.4/8.13.4) with ESMTP id k1GDw8M6076961;
	Thu, 16 Feb 2006 13:58:08 GMT
	(envelope-from gavin.atkinson@ury.york.ac.uk)
Received: (from ga9@localhost)
	by buffy.york.ac.uk (8.13.4/8.13.4/Submit) id k1GDw8IO076960;
	Thu, 16 Feb 2006 13:58:08 GMT
	(envelope-from gavin.atkinson@ury.york.ac.uk)
X-Authentication-Warning: buffy.york.ac.uk: ga9 set sender to
	gavin.atkinson@ury.york.ac.uk using -f
From: Gavin Atkinson <gavin.atkinson@ury.york.ac.uk>
To: Dan Nelson <dnelson@allantgroup.com>
In-Reply-To: <20060215223432.GH70956@dan.emsphone.com>
References: <1140027060.83368.11.camel@r4.agava-guns.domain>
	<20060215194204.GC70956@dan.emsphone.com>
	<20060215215608.GA55676@xor.obsecurity.org>
	<20060215223432.GH70956@dan.emsphone.com>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Date: Thu, 16 Feb 2006 13:58:08 +0000
Message-Id: <1140098288.76342.44.camel@buffy.york.ac.uk>
Mime-Version: 1.0
X-Mailer: Evolution 2.4.2.1 FreeBSD GNOME Team Port 
X-York-MailScanner: Found to be clean
X-York-MailScanner-From: gavin.atkinson@ury.york.ac.uk
Cc: freebsd-stable@freebsd.org
Subject: Re: Strange process
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Feb 2006 13:58:55 -0000

On Wed, 2006-02-15 at 16:34 -0600, Dan Nelson wrote:
> In the last episode (Feb 15), Kris Kennaway said:
> > On Wed, Feb 15, 2006 at 01:42:04PM -0600, Dan Nelson wrote:
> > > In the last episode (Feb 15), Ivan Kolosovskiy said:
> > > > top:
> > > > PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> > > > 38410 findfile    1  96    0     0K     0K START  0   0:00  0.00% grotty
> > > > 
> > > > ps:
> > > > host$ ps -waux | grep grotty
> > > > findfile 38410  0,0  0,0     0     0  p6  REJ  19:57     0:00,25 [grotty]
> > > 
> > > E in the STAT column means the process is trying to exit, but
> > > can't. What does "ps lp 38410" print?  The MWCHAN column should say
> > > where in the kernel the process is stuck.
> > 
> > I often see this too.  For example:
> > 
> >   PID USERNAME    THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
> >  5357 kkenn         1  96    0     0K     0K START    0:00  0.35% xpdf
> > 
> > > ps -waux  | grep xpdf
> > kkenn    5357  0.3  0.0     0     0  ??  RE   Sun08PM   0:00.20 [xpdf]
> > 
> > > ps lp 5357
> >   UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
> 
> That syntax should have worked...  Try a plain "ps axl | grep xpdf"
> instead.
> 
> I think top's START state corresponds to the ~200-line window of code
> in kern_fork.c:fork1() between p_state=PRS_NEW and p_state=PRS_NORMAL,
> but I'm not positive.

In my case (again on 6.0-REL), I have four such processes in top:

  636 root        1 100    4     0K     0K START  0   0:00  5.08% bandwidthd
  612 root        1 100    4     0K     0K START  0   0:00  4.14% bandwidthd
  604 root        1 100    4     0K     0K START  0   0:00  3.39% bandwidthd
  602 root        1 119    4     0K     0K START  1   0:00  0.00% bandwidthd

and in ps -auxl | grep bandwidth :

root       636  5.1  0.0     0     0  d1- RNE  26Jan06   0:00.39 [bandwidthd]         0   595   5 100  4 -
root       612  4.1  0.0     0     0  d1- RNE  26Jan06   0:00.35 [bandwidthd]         0   594   4 100  4 -
root       604  3.4  0.0     0     0  d1- RNE  26Jan06   0:00.29 [bandwidthd]         0   596   5 100  4 -
root       602  0.0  0.0     0     0  d1- RNE  26Jan06   0:00.09 [bandwidthd]         0   597 316 119  4 -
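
For anyone who wants to check their own machines for these, here is a
minimal sketch in Python that picks out lines whose STAT column contains
E (exiting) from "ps -aux"-style output.  The function name find_exiting
is just illustrative, and the column order (USER PID %CPU %MEM VSZ RSS TT
STAT STARTED TIME COMMAND) is an assumption taken from the output quoted
in this thread; other ps formats would need different indices.

```python
# Sketch: find processes whose STAT column contains 'E' (exiting).
# ASSUMPTION: input lines follow the "ps -aux" column order seen in this
# thread: USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND ...


def find_exiting(ps_text):
    """Return (pid, stat, command) for each line whose STAT includes 'E'."""
    hits = []
    for line in ps_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header line, and anything too short to parse.
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        pid, stat, command = int(fields[1]), fields[7], fields[10]
        if "E" in stat:
            hits.append((pid, stat, command))
    return hits


# Two of the lines from the ps output above, copied verbatim.
SAMPLE = """\
root       636  5.1  0.0     0     0  d1- RNE  26Jan06   0:00.39 [bandwidthd]         0   595   5 100  4 -
root       602  0.0  0.0     0     0  d1- RNE  26Jan06   0:00.09 [bandwidthd]         0   597 316 119  4 -
"""

for pid, stat, command in find_exiting(SAMPLE):
    print(pid, stat, command)
```

On a live system the same function could be fed the text captured from
ps rather than a pasted sample.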

Note that in the top output, these processes have a non-zero WCPU
percentage (which does not change) - I don't know if this means the
processes did get to run briefly, or if they are frozen in time before
that part of the process structure has been cleared out.  This
percentage does not count against the system idle percentage in top:

CPU states:  0.0% user,  0.0% nice,  0.2% system,  0.2% interrupt, 99.6% idle
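
To put numbers on that mismatch, here is a rough back-of-the-envelope
check in Python.  It rests on two assumptions of mine: that the machine
has two CPUs (inferred only from the C column showing 0 and 1), and that
the "CPU states" line is averaged across all CPUs.  If either is wrong
the exact figures change, but the gap is large enough that the
conclusion probably does not.

```python
# WCPU values top reported for the four stuck bandwidthd processes.
wcpu = [5.08, 4.14, 3.39, 0.00]

# Idle percentage from the "CPU states" line.
idle = 99.6

# ASSUMPTION: two CPUs, inferred from the C column showing CPUs 0 and 1.
NCPU = 2

# Total CPU the stuck processes claim, as a percentage of one CPU.
claimed = sum(wcpu)

# The same figure as a share of the whole machine.
# ASSUMPTION: the "CPU states" line is an average over all CPUs.
claimed_system = claimed / NCPU

# Everything that was not idle, according to the "CPU states" line.
busy = 100.0 - idle

print(f"claimed by stuck processes: {claimed_system:.2f}% of the machine")
print(f"total non-idle time:        {busy:.2f}% of the machine")
print("WCPU counted in CPU states: ", claimed_system <= busy)
```

The claimed share is well above the total non-idle time, which is
consistent with the WCPU figures being stale rather than live.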

Hope that helps somebody figure out what is happening.  Sadly I've not
seen these on a machine with ddb in the kernel yet, so I can't get a
backtrace.  If anyone else seeing these has ddb, a backtrace would
probably be interesting to see.

Gavin