From owner-freebsd-questions@FreeBSD.ORG  Thu Jul 31 01:35:09 2003
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id DD12C37B401
	for <freebsd-questions@freebsd.org>;
	Thu, 31 Jul 2003 01:35:09 -0700 (PDT)
Received: from rambo.401.cx (rambo.401.cx [80.65.205.166])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2D24743F3F
	for <freebsd-questions@freebsd.org>;
	Thu, 31 Jul 2003 01:33:48 -0700 (PDT)	(envelope-from listsub@401.cx)
Received: from 401.cx (132.dairy.twenty4help.se [80.65.195.132])
	by rambo.401.cx (8.12.9/8.12.9) with ESMTP id h6V8Xe7P000497;
	Thu, 31 Jul 2003 10:33:40 +0200 (CEST)
	(envelope-from listsub@401.cx)
Message-ID: <3F28D45E.4030109@401.cx>
Date: Thu, 31 Jul 2003 10:33:34 +0200
From: "Roger 'Rocky' Vetterberg" <listsub@401.cx>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
	rv:1.5a) Gecko/20030708 Thunderbird/0.1a
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Jamie <jamie@gnulife.org>
References: <20030728165345.A71147-100000@floyd.gnulife.org>
In-Reply-To: <20030728165345.A71147-100000@floyd.gnulife.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
cc: freebsd-questions@freebsd.org
Subject: Re: Server spinning out of control...
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 31 Jul 2003 08:35:10 -0000

Jamie wrote:
> 
>     That is a good idea, thanks. We did check that though. Went through
> each user's accounts checking their .forwards and procmaillrc files.
> 
>     We are running spamassassin 2.55, and in the global procmailrc file we
> call spamc which connects to a spamd running on another machine.
> 
>     Are you aware of any other system utilities that might be used to
> trace CPU consumption and trap problems? We've taken a lot of stabs in the
> dark with what it could be, and we'd like to try some solid diagnostic
> utils to shed more light.
> 
>    - Jamie

Try running systat -vm. It should give you a good overview of what 
happens when the load skyrockets.

I had a similar problem once. It was not as extreme as the one you 
describe, but the symptoms were the same. A few times a day one of our 
servers reported load averages of about 5.0-5.5. By the time I got 
there (a 30-second run to the server room) the server was always back 
to almost idle, averaging around 0.2-0.5. The only thing that set this 
server apart from most of the others was the NIC. The onboard NIC had 
died, so we had replaced it with a low-profile PCI NIC. I can't 
remember the exact make and model, but it was probably something cheap 
from the nearest computer store.
Using systat I noticed that during the bursts of high load the number 
of interrupts on the NIC went sky-high. We replaced the NIC with one 
from a better-known brand, and the server's load average flatlined. 
It's still doing exactly the same tasks but rarely goes above 0.1.
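
If you would rather log this than sit watching systat, something like 
the rough Python sketch below might help. It is not what I ran at the 
time. It assumes that "vmstat -i" prints one row per interrupt source 
ending in a total and a rate column, and it diffs the totals between 
two samples, since the rate column is an average since boot and will 
hide short bursts. The function names and the 5000 interrupts/second 
threshold are made up for illustration; pick a threshold that suits 
your hardware.

```python
import subprocess
import time


def parse_vmstat_i(text):
    """Return {interrupt source: total count} from `vmstat -i` style output."""
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows are assumed to end with two integers: total and rate.
        if len(fields) < 3 or not (fields[-1].isdigit() and fields[-2].isdigit()):
            continue
        name = " ".join(fields[:-2]).rstrip(":")
        if name.lower() == "total":
            continue  # skip the summary row
        totals[name] = int(fields[-2])
    return totals


def interrupt_rates(before, after, seconds):
    """Interrupts per second for each source between two snapshots."""
    return {
        name: (after[name] - before.get(name, 0)) / seconds
        for name in after
    }


def busy_sources(rates, threshold):
    """Sources whose rate exceeds the threshold, busiest first."""
    hot = [(name, rate) for name, rate in rates.items() if rate > threshold]
    return sorted(hot, key=lambda item: item[1], reverse=True)


def snapshot():
    """Run `vmstat -i` once and parse its output."""
    result = subprocess.run(
        ["vmstat", "-i"], capture_output=True, text=True, check=True
    )
    return parse_vmstat_i(result.stdout)


def watch(interval=5.0, threshold=5000.0):
    """Poll forever and print any source above the (arbitrary) threshold."""
    before = snapshot()
    while True:
        time.sleep(interval)
        after = snapshot()
        for name, rate in busy_sources(
            interrupt_rates(before, after, interval), threshold
        ):
            print(f"{time.ctime()} {name}: {rate:.0f} interrupts/s")
        before = after
```

Calling watch() from a script left running in the background would 
print a timestamped line whenever a device crosses the threshold, 
which you can then compare against the times the load spikes.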

--
R