Message-ID: <47233334.8040005@FreeBSD.org>
Date: Sat, 27 Oct 2007 14:46:44 +0200
From: Kris Kennaway <kris@FreeBSD.org>
User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728)
MIME-Version: 1.0
To: Gunther Mayer <gunther.mayer@googlemail.com>
References: <47232945.10506@gmail.com>
In-Reply-To: <47232945.10506@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-questions@freebsd.org
Subject: Re: CPU usage 100% but no process hogging CPU

Gunther Mayer wrote:
> Hi there,
> 
> I'm having some capacity issues on the FreeBSD 6.2/Core 2 Duo/2GB RAM 
> server that I manage. For quite a few days now it constantly shows load 
> averages of around 1 and a CPU usage of around 100%. Yet summing up the 
> CPU usage of the individual processes running I hardly ever get to more 
> than 5%, regardless of how long I watch top.
> 
> A snapshot of my top output looks like this:
> 
> last pid: 96102;  load averages:  1.28,  1.15,  1.06    up 22+08:33:16  13:55:03
> 122 processes: 2 running, 119 sleeping, 1 zombie
> CPU states: 67.3% user,  0.0% nice, 32.7% system,  0.0% interrupt,  0.0% idle
> Mem: 474M Active, 974M Inact, 186M Wired, 68M Cache, 213M Buf, 93M Free
> Swap: 4064M Total, 4064M Free
> 
>  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
>  635 root        1 122    0 27304K  2644K select 656:38  1.27% syslog-ng
>  844 www        20  20    0   411M   300M kserel 360:13  0.00% java
>  837 user1       3  20    0 29048K  5672K kserel  34:30  0.00% radiusd
>  788 pgsql       1  96    0 13516K  3824K select  10:03  0.00% postgres
>  785 pgsql       1 115    0   120M  7436K select   9:02  0.00% postgres
>  787 pgsql       1   8    0   120M 41112K nanslp   5:15  0.00% postgres
> 
> syslog-ng is quite busy as I use it to capture logs of more than 50 
> remote sites. I have lots of slow queries in my postgres logs that I 
> think are related to this bottleneck, though unoptimised queries and an 
> ever growing amount of data are more likely to take the blame for that. 
> High disk I/O in this regard could explain the high system utilisation, 
> however.
> 
> I found out that I've been bitten by the freebsd-update bug 
> (http://security.freebsd.org/advisories/FreeBSD-EN-07:05.freebsd-update.asc) 
> which replaced my SMP kernel with a GENERIC one and I'm taking 
> corrective action early tomorrow morning, but surely even with just a 
> single CPU the load average should never be as high?
> 
> Where are those phantom CPU hogging processes?

A couple of points:

1) top -S will show what the kernel is doing, which may be relevant.

2) Because it only samples once a second (by default), top is poor at 
catching short-lived processes that use CPU for brief periods and then 
exit.  I don't know whether you have any on this workload, though.

3) In 6.x, threaded applications do not generate CPU usage data in top, 
i.e. java is probably using more than 0% of your CPU :)  I think this is 
fixed in 7.0 and maybe also with libthr.  Chances are you want to use 
libthr even in 6.x for performance reasons (libkse has atrocious 
performance).  Use libmap.conf to switch the libraries.
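
To make point 2 concrete, here is a toy Python model of periodic 
sampling.  It is only a sketch and not how top itself is implemented: 
the job names, start times and durations are made up, and the 1 second 
interval is the default mentioned above.  It shows that jobs which start 
and exit between two samples never appear in a process listing even 
though they account for half of the CPU time.

```python
def visible_at_samples(jobs, sample_times):
    """Return names of jobs that are alive at one or more sampling instants."""
    seen = set()
    for t in sample_times:
        for name, start, duration in jobs:
            if start <= t < start + duration:
                seen.add(name)
    return seen


def busy_fraction(jobs, total_time):
    """Fraction of total_time spent running jobs (assumes they never overlap)."""
    return sum(duration for _, _, duration in jobs) / total_time


# Hypothetical workload: ten jobs, each starting 0.25 s after a sample
# and running for 0.5 s, observed over 10 s with one sample per second.
short_jobs = [(f"job{i}", i + 0.25, 0.5) for i in range(10)]
sample_times = [float(t) for t in range(11)]

seen = visible_at_samples(short_jobs, sample_times)
busy = busy_fraction(short_jobs, 10.0)
print(f"jobs seen by sampler: {len(seen)} of {len(short_jobs)}")
print(f"CPU time actually used: {busy:.0%}")
```

A long-running process, by contrast, is alive at every sampling instant 
and is always listed, which is why only the short-lived ones go missing.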

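For point 3, a libmap.conf entry along these lines should do the switch.  
Treat it as a sketch: I am assuming the KSE-based library is installed as 
libpthread.so.2 and libthr as libthr.so.2 on your 6.2 system, so check 
the real names with ldd on the binary and ls /usr/lib before using it.

```
# /etc/libmap.conf
# Assumed library names; verify with ldd before relying on them.
# origin            target
libpthread.so.2     libthr.so.2
libpthread.so       libthr.so
```

The mapping is read when a program starts, so already-running processes 
(java, radiusd) need a restart before they pick up libthr.
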
Kris