Date: Mon, 02 Feb 2004 12:04:20 -0800
From: "Andrew Kinney" <andykinney@advantagecom.net>
To: Bogdan TARU <bgd@icomag.de>, freebsd-hackers@freebsd.org
Subject: Re: 4.9 kernel panics on a poweredge 2650
Message-ID: <401E3CC4.10516.62BCDDCB@localhost>
In-Reply-To: <20040201154143.GA7837@icomag.de>
References: <40111803.25970.2F6461BE@localhost>
On 1 Feb 2004 at 16:41, Bogdan TARU wrote:

>
> - maxusers set to 128

This is probably too small. Let it autoscale (don't set it) unless
you have a specific reason not to let it scale on its own.

> - activated SMP (the cpus are HTT-compatible)
> - kva_pages set 256 (each box has 2GB of ram and 2Gb of swap)

256 is the default in 4.9 and is often too small for machines used
as web servers using numerous heavy/large Apache processes. Try
setting this to 512 instead.

> - PMAP_SHPGPERPROC=401 (for apache)

This seems a tad small, but might be okay if you have only a few
Apache processes or they are all small processes. Pay close
attention to sysctl vm.zone. When/if "free" plus "used" PV ENTRY
exceeds 90% of "limit", then you really need to increase
PMAP_SHPGPERPROC more because this means you are exceeding FreeBSD's
capability to forcibly recycle PV Entries, which is done at 90%.

> The boxes run w/o a problem for about 2-3 days, after which they
> panic with 'page not present' in different processes (sshd, httpd,
> etc). I guess the real reason for this is the low value for kvm_free:
>
>
> (web1)[~] sysctl -a | grep vm.kvm
> vm.kvm_size: 1069543424
> vm.kvm_free: 4190208
>
> But I don't know what causes that. The boxes are not that busy (they
> don't even crash during peak-traffic times), and vmstat -m shows me as
> a total:
>
> Memory Totals:  In Use    Free    Requests
>                  5311K   7090K    15602606
>
> which also looks sort of normal. So, any idea where I should start
> looking in order to see what 'eats' so much kvm space?
>

There's more to it than just looking at the amount "used" and
"free". You have to add the "free" and "used" to reach the total
allocated. Just because something is "free" doesn't mean it has been
deallocated. It only means that the page has been placed on the
"free" list of the particular allocation bucket. Once a page is
allocated to a particular bucket size, it cannot be reallocated to a
different bucket size.
It can only be added to or removed from the "free" list for that
particular bucket size. So, this breaks the illusion that something
listed as "free" memory is in fact free. It is only "free" to be
reallocated to that same bucket size.

If a particular bucket size needs more pages allocated and it
requests more pages when there aren't any more to be had (for
whatever reason), you're going to get a trap 12 panic.

In the case of Apache, this usually happens when many largish Apache
processes are spawned and KVM used by PV ENTRY begins to increase.
Then, one of two things will happen:

1. You hit the limit of PV ENTRY and get a trap 12 panic.
2. You run out of KVM as the amount of KVM consumed by PV ENTRY
   increases and get a trap 12 panic.

'vmstat -m' doesn't show the amount of KVM consumed by PV ENTRY
(28 bytes * (usedcount + freecount)), among other things, so it is
not a complete view of KVM usage.

KVM usage by PV ENTRY can be significant. On one of our systems,
PV ENTRY accounts for a little over 254 MBytes of KVM usage. We had
similar panics to yours about a year ago before we increased
KVA_PAGES to accommodate the higher PV ENTRY usage allowed by
increasing PMAP_SHPGPERPROC.

When you throw in the other KVM usage of the elements represented in
sysctl vm.zone (like VNODE, for instance), it is quite easy to exceed
KVM limits and get a trap 12 panic on a system with large disks,
large memory, heavy OS resource utilization, or any combination of
those.

At any rate, my previous recommendation on this same thread still
stands: get a crash dump and do a traceback using an identically
configured kernel with debug symbols. See the FreeBSD Developers'
Handbook. Such a procedure will allow you to see exactly what failed
and avoid all this guesswork, though it is likely that increasing
KVA_PAGES to a value higher than 256 will be what you need.

Sincerely,
Andrew Kinney
President and Chief Technology Officer
Advantagecom Networks, Inc.
http://www.advantagecom.net
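
The PV ENTRY arithmetic in the message above (28 bytes * (used + free),
and the 90% forced-recycle threshold) can be sketched in a few lines of
Python. This is only an illustration and not part of the original
message: the function names and the sample numbers are hypothetical, and
the real "used", "free" and "limit" values have to be read from the
PV ENTRY row of sysctl vm.zone on the machine in question.

```python
# Sketch of the PV ENTRY checks described in the message above.
# The sample numbers in __main__ are made up for illustration only.

PV_ENTRY_SIZE = 28         # bytes per PV entry, as stated in the message
RECYCLE_THRESHOLD = 0.90   # forced recycling happens at 90% of "limit"


def pv_entry_kvm_bytes(used, free, size=PV_ENTRY_SIZE):
    """KVM held by the PV ENTRY zone; "free" entries still count as allocated."""
    return size * (used + free)


def over_recycle_threshold(used, free, limit, threshold=RECYCLE_THRESHOLD):
    """True when used + free exceeds the given fraction of the zone limit."""
    return (used + free) > threshold * limit


if __name__ == "__main__":
    used, free, limit = 900_000, 50_000, 1_000_000  # hypothetical values
    kvm = pv_entry_kvm_bytes(used, free)
    print("PV ENTRY KVM usage: %.1f MB" % (kvm / (1024 * 1024)))
    if over_recycle_threshold(used, free, limit):
        print("used + free is above 90% of limit; "
              "consider raising PMAP_SHPGPERPROC")
```

If the second check fires, the message's advice is to raise
PMAP_SHPGPERPROC, and to raise KVA_PAGES as well so that the larger
PV ENTRY allocation still fits in KVM.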