From owner-freebsd-hackers@FreeBSD.ORG  Sun Feb  1 07:41:52 2004
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Date: Sun, 1 Feb 2004 16:41:43 +0100
From: Bogdan TARU <bgd@icomag.de>
To: freebsd-hackers@freebsd.org
Message-ID: <20040201154143.GA7837@icomag.de>
Mail-Followup-To: freebsd-hackers@freebsd.org
References: <20040123125040.GA42187@icomag.de>
	<40111803.25970.2F6461BE@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <40111803.25970.2F6461BE@localhost>
User-Agent: Mutt/1.4.1i
X-Virus-Scanned: by AMaViS
Subject: Re: 4.9 kernel panics on a poweredge 2650


	Hi Hackers,

 OK, here is some more information about my problem:

We have 3 webservers with identical hardware configurations, and the
same kernel and applications running on all three. They get mostly the
same traffic (DNS round-robin). They all run 4.9-RELEASE. I have
experienced repeatable crashes on all three, so a hardware problem is
very unlikely.

 I have come to think that the problem is with the kernel memory
space, which is running too low. I compiled the kernel from GENERIC
with the following modifications:

- maxusers set to 128
- enabled SMP (the CPUs are HTT-capable)
- KVA_PAGES set to 256 (each box has 2 GB of RAM and 2 GB of swap)
- PMAP_SHPGPERPROC=401 (for apache)
- ACCEPT_FILTER_DATA and ACCEPT_FILTER_HTTP
- removed unnecessary drivers from the kernel
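
As a rough sanity check on the KVA_PAGES figure in the list above, the
arithmetic can be sketched as follows. This assumes a non-PAE i386
kernel, where each KVA_PAGES unit covers 4 MB of kernel virtual address
space; kva_bytes is only a helper name made up for this illustration.

```python
# Assumption: non-PAE i386, where one KVA_PAGES unit is one
# page-directory entry mapping 4 MB of kernel virtual address space.
MB = 1024 * 1024
BYTES_PER_KVA_PAGE = 4 * MB


def kva_bytes(kva_pages: int) -> int:
    """Return the kernel virtual address space size for a KVA_PAGES value."""
    return kva_pages * BYTES_PER_KVA_PAGE


# Compare the configured value with two larger, purely illustrative ones.
for pages in (256, 384, 512):
    print(f"KVA_PAGES={pages}: {kva_bytes(pages) // MB} MB of kernel address space")
```

If that assumption holds, 256 works out to 1 GB, which is about what
vm.kvm_size reports further down. A larger value would give the kernel
more room, at the cost of a matching reduction in user address space.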

 /etc/sysctl.conf looks like:


net.inet.tcp.msl=100
net.inet.tcp.blackhole=1
# Hyperthreading
machdep.cpu_idle_hlt=1

kern.ipc.somaxconn=4096
kern.maxfiles=65535
vfs.vmiodirenable=1
kern.ipc.shm_use_phys=1
net.inet.tcp.sendspace=16384


 The boxes run without a problem for about 2-3 days, after which they
panic with 'page not present' in different processes (sshd, httpd,
etc). I guess the real reason for this is the low value of vm.kvm_free:


(web1)[~] sysctl -a | grep vm.kvm
vm.kvm_size: 1069543424
vm.kvm_free: 4190208

 But I don't know what causes that. The boxes are not that busy (they
don't even crash during peak-traffic times), and vmstat -m shows these
totals:

Memory Totals:  In Use       Free    Requests
                 5311K      7090K    15602606

 which also looks fairly normal. So, any idea where I should start
looking in order to see what 'eats' so much KVM space?
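
To put the numbers above side by side, here is a minimal sketch that
parses the sysctl output and compares it with the vmstat -m total. The
function name parse_sysctl and the sample string are illustrative only;
the values are copied from the output quoted in this message.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines (as printed by sysctl) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value)
    return values


# Values copied from the sysctl output on web1.
sample = """vm.kvm_size: 1069543424
vm.kvm_free: 4190208"""

kvm = parse_sysctl(sample)
used = kvm["vm.kvm_size"] - kvm["vm.kvm_free"]
free_pct = 100.0 * kvm["vm.kvm_free"] / kvm["vm.kvm_size"]

# "In Use" total from vmstat -m, reported in kilobytes.
malloc_in_use = 5311 * 1024
malloc_share = 100.0 * malloc_in_use / used

print(f"KVM in use: {used / (1024 * 1024):.0f} MB, free: {free_pct:.2f}%")
print(f"vmstat -m 'In Use' is {malloc_share:.2f}% of the KVM in use")
```

Run periodically against live sysctl output, the same calculation would
show whether the free figure shrinks steadily over the 2-3 days or drops
suddenly. It also makes clear that the vmstat -m total accounts for only
a small part of the address space in use.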

 Thank you,
 bogdan 


On Fri, Jan 23, 2004 at 12:48:03PM -0800, Andrew Kinney wrote:
> On 23 Jan 2004 at 13:50, Bogdan TARU wrote:
> 
> > 
> > 
> >  Hi hackers,
> > 
> >  I am experiencing kernel panics on a PowerEdge 2650 each day around
> >  3am (usually the machine comes back up at 3:04am). The kernel panics
> >  are reproducible by running /etc/periodic/security/100.chksetuid (in
> >  fact by running find on /usr with -perm). The problem lies
> >  somewhere in /usr/ports. Deleting the /usr/ports tree doesn't solve
> >  it; trying a cvs up of /usr/ports results in a crash again.
> > 
> 
> Our experience is that repetitive crashes when dealing with large 
> numbers of files (like the ports tree) generally point to hitting 
> some OS resource limit.  Some things to check that may or may not 
> apply to this particular problem:
> 
> sysctl vm.zone
> 
> Make sure you're not hitting any of those limits.
> 
> sysctl vm.kvm_size
> sysctl vm.kvm_free
> 
> If kvm_free is running low just prior to the crash, you might want to 
> increase your KVA_PAGES (see lint) and rebuild your kernel.
> 
> Of course, this is all hit-and-miss guesswork until you have a crash 
> dump, so getting a crash dump and a traceback from a kernel identical 
> to your running kernel with debugging symbols would be a logical 
> first step if you want to avoid any guessing.  If your tracebacks 
> show failures in random locations, you're probably looking at bad 
> RAM.  If you always fail in the same spot with each crash, then it is 
> just a matter of determining why and correcting it.
> 
> I believe the FreeBSD Developers' Handbook has instructions on how 
> to set up a system to do an automatic crash dump for any panic.  It is 
> relatively straightforward.
> 
> Sincerely,
> Andrew Kinney
> President and
> Chief Technology Officer
> Advantagecom Networks, Inc.
> http://www.advantagecom.net
>