From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 10 20:51:20 2004
From: Doug Ambrisko
Message-Id: <200402110451.i1B4pDjD006864@ambrisko.com>
In-Reply-To: <20040201154143.GA7837@icomag.de>
To: Bogdan TARU
cc: freebsd-hackers@freebsd.org
Date: Tue, 10 Feb 2004 20:51:13 -0800 (PST)
Subject: Re: 4.9 kernel panics on a poweredge 2650

Bogdan TARU writes:
|
| Hi Hackers,
|
| Ok, now some more infos about my problem:
|
| We have 3 identical webservers (as hw configuration), and the same
| kernel and applications running on all three. They get mostly the same
| traffic (dns round-robined). They all run 4.9-RELEASE. I have
| experienced repeatable crashes on all three, so there is no problem
| with the hardware (or the possibility of such a thing is too small).
|
| I have come to think that the problem is with the kernel memory
| space, which is too low. I have compiled the kernel from Generic, by
| performing the following modifications:
|
| - maxusers set to 128
| - activated SMP (the cpus are HTT-compatible)
| - kva_pages set 256 (each box has 2GB of ram and 2GB of swap)
| - PMAP_SHPGPERPROC=401 (for apache)
| - ACCEPT_FILTER_DATA and ACCEPT_FILTER_HTTP
| - removed unnecessary drivers from the kernel
|
| /etc/sysctl.conf looks like:
|
| net.inet.tcp.msl=100
| net.inet.tcp.blackhole=1
| # Hyperthreading
| machdep.cpu_idle_hlt=1
|
| kern.ipc.somaxconn=4096
| kern.maxfiles=65535
| vfs.vmiodirenable=1
| kern.ipc.shm_use_phys=1
| net.inet.tcp.sendspace=16384
|
| The boxes run w/o a problem for about 2-3 days, after which they
| panic with 'page not present' in different processes (sshd, httpd,
| etc). I guess the real reason for this is the low value for kvm_free:
|
| (web1)[~] sysctl -a | grep vm.kvm
| vm.kvm_size: 1069543424
| vm.kvm_free: 4190208

This isn't good: you have only about 4 MB of kernel memory left,
which is what leads to your panic. A quick fix to try is to bump
kva_pages up to 384. Just recompile the kernel with that and install it.

There are some undocumented or poorly documented sysctls that can free
up some memory. I should put something together, but I'm working on
some other issues right now. For some hints, look at vmstat -z and see
how much memory you use. Note that the limit can be read as
"allocated and gone from the system to be used only by this zone".
Trim down some things that are huge but not used much. Deriving the
tunable to do that via loader.conf can be a challenge, though.

Doug A.
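
For a sense of scale, the figures quoted in this thread can be checked
with a few lines of Python. This is only a rough sketch. It assumes a
non-PAE i386 kernel, where each KVA_PAGES unit covers 4 MB of kernel
address space; that assumption and the helper names are not from the
original message.

```python
# Values reported by "sysctl -a | grep vm.kvm" on web1 in this thread.
KVM_SIZE = 1069543424
KVM_FREE = 4190208

# Assumption: non-PAE i386, where one KVA_PAGES unit maps 4 MB of
# kernel virtual address space.
BYTES_PER_KVA_PAGE = 4 * 1024 * 1024
MIB = 1024 * 1024


def to_mib(nbytes):
    """Convert a byte count to MiB (hypothetical helper)."""
    return nbytes / MIB


def free_fraction(size, free):
    """Fraction of kernel virtual memory still unallocated."""
    return free / size


def kva_bytes(kva_pages):
    """Kernel address space reserved for a given KVA_PAGES value."""
    return kva_pages * BYTES_PER_KVA_PAGE


if __name__ == "__main__":
    print("kvm_free: %.2f MiB of %.2f MiB (%.2f%% free)" % (
        to_mib(KVM_FREE),
        to_mib(KVM_SIZE),
        100 * free_fraction(KVM_SIZE, KVM_FREE),
    ))
    for pages in (256, 384):
        print("KVA_PAGES=%d -> %.0f MiB of kernel address space" % (
            pages, to_mib(kva_bytes(pages))))
    extra = kva_bytes(384) - kva_bytes(256)
    print("going from 256 to 384 adds %.0f MiB" % to_mib(extra))
```

Under that assumption, the extra kernel address space comes out of the
4 GB that i386 shares between kernel and user processes, so large
userland processes get correspondingly less room.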