Date: Thu, 11 Sep 2008 03:56:31 -0700
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: Michael Grant <mgrant@grant.org>
Cc: Kris Kennaway <kris@freebsd.org>, FreeBSD Stable List <freebsd-stable@freebsd.org>
Subject: Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
Message-ID: <20080911105631.GB25493@icarus.home.lan>
In-Reply-To: <62b856460809110308sa44f057mc08189a97efa9d0c@mail.gmail.com>
References: <BF6724CD748744908D602889CCF119F1@emea.hubersuhner.net>
  <487E0D1B.2060902@FreeBSD.org>
  <20080716203900.5jt4qce17gg0og0o@mail.basicnets.co.uk>
  <A403B8D27BE048E79A94B09C0C520854@emea.hubersuhner.net>
  <B4E29257-B805-4597-9024-E042F34243D1@mac.com>
  <62b856460807241309k3cea60dbh24eea677cd6751f7@mail.gmail.com>
  <4888E207.4020606@FreeBSD.org>
  <62b856460809110138o5fb10171h9832ac8b964fa3f6@mail.gmail.com>
  <20080911092047.GA24499@icarus.home.lan>
  <62b856460809110308sa44f057mc08189a97efa9d0c@mail.gmail.com>

On Thu, Sep 11, 2008 at 12:08:47PM +0200, Michael Grant wrote:
> On Thu, Sep 11, 2008 at 11:20 AM, Jeremy Chadwick <koitsu@freebsd.org> wrote:
> > On Thu, Sep 11, 2008 at 10:38:36AM +0200, Michael Grant wrote:
> >> My box crashed again:
> >>
> >> panic: kmem_malloc(4096): kmem_map too small: 1073741824 total allocated
> >> cpuid = 0
> >> Uptime: 33d11h12m58s
> >> Dumping 3327 MB (2 chunks)
> >>   chunk 0: 1MB (151 pages) ... ok
> >>   chunk 1: 3327MB (851568 pages) <---hung here
> >>
> >> Still no valid dump.
> >>
> >> There is 4gig of physical memory in the machine.
> >>
> >> In /boot/loader.conf, I currently have the following:
> >>
> >> vm.kmem_size=1G
> >> vm.kmem_size_max=1G
> >> vm.kmem_size_scale=2
> >>
> >> and in my kernel conf file I have:
> >>
> >> options KVA_PAGES=512
> >>
> >> It stayed up for 33 days this time. Is there anything else I can do?
> >
> > First and foremost: are you using ZFS on this machine? If so, there are
> > many tunables you can apply to try and limit this; I'm willing to bet
> > it's ARC which is doing it. See below.
> >
> > In general, it appears that you need to increase the maximum range of
> > kmem. The kernel attempted to utilise more than 1GB, and your limit is
> > 1G. My machines running RELENG_7 on amd64, with only 2GB of RAM
> > installed, use the following tunables in loader.conf:
> >
> > vm.kmem_size="1536M"
> > vm.kmem_size_max="1536M"
> >
> > If ZFS is in use, I recommend these as well:
> >
> > vfs.zfs.arc_min="16M"
> > vfs.zfs.arc_max="64M"
> > vfs.zfs.prefetch_disable="1"
> >
> > Do not increase kmem_size any larger than 1.5GB; the amount of RAM you
> > have in the machine, with regards to RELENG_7, will not help. This is a
> > known limitation which has been fixed in HEAD/CURRENT (where the limit
> > has been increased to 512GB). See the "Kernel" section below; you'll
> > see the applicable item.
> >
> > http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues
> >
> > Your only solution may be to run HEAD/CURRENT.
>
> I am not running ZFS. My file systems are ufs.
>
> This feels like some sort of memory leak in the kernel. Giving it
> more and more memory just seems to delay the crash. Are you saying
> the crash is fixed in HEAD/CURRENT?

It's an intentional crash, not a "the program tried to access NULL,
which crashed the machine" crash. The kernel wants more memory to
accomplish a certain thing, and it's not available. kris@ can explain
this in better terms than I can.

First and foremost, it would be good to find out what all you are
running on this machine (process-wise). A process could be tickling
something in the kernel which requires a large amount of kernel memory.
I can imagine something like MySQL would require this.

Ideally what needs to happen is to debug the kernel or get a full map of
kmem to find out what's using what. I believe vmstat -m or vmstat -z
output might help.

Obviously since the machine panics, you won't be able to run those
commands after the fact. I would recommend you set up a cronjob that
runs every 1-2 minutes and logs the output of both of those commands to
a file. When the panic happens, restart the system and look at the
logfile to see if you can figure out if anything suddenly starts taking
up a large amount of memory, or if it's a gradual thing (indicating a
memory leak).

If you can figure out what might be tickling the problem, you can
ultimately figure out if increasing kmem is the right thing to do, or if
there's a greater problem here.

> I'm running 6.3 by the way.
>
> I have put your changes into my loader.conf, we'll see how long it
> goes this time. I'm not quite in position to update everything to 7.x
> at the moment.

Our production webservers run RELENG_6 and RELENG_7, and we don't
encounter this kind of problem.
I'm not saying what you're experiencing is indicative of hardware issues
or something like that -- I'm simply saying I have loaded systems which
don't ever hit that condition. So figuring out what's causing it in
your case would be good.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
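
The logging cronjob suggested in the message could be sketched roughly as
follows. This is only a minimal sketch, not something taken from the
thread: the function name log_kmem_snapshot, the script path and the log
path are all hypothetical, and it assumes vmstat is in cron's PATH. Each
run appends one timestamped snapshot of vmstat -m and vmstat -z, so the
log can be read after a reboot to see whether some allocation grows
suddenly or gradually.

```shell
#!/bin/sh
# Hypothetical helper script, e.g. saved as /root/log_kmem.sh (assumed path).
# Appends one timestamped snapshot of "vmstat -m" and "vmstat -z" to a log.
#
# Possible crontab entry, every 2 minutes (paths are assumptions):
#   */2 * * * * /bin/sh /root/log_kmem.sh /var/log/kmem-usage.log

log_kmem_snapshot() {
    logfile="$1"
    {
        # Timestamp header so successive snapshots can be compared over time.
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        echo "--- vmstat -m ---"
        vmstat -m
        echo "--- vmstat -z ---"
        vmstat -z
    } >> "$logfile" 2>&1
}

# Only take a snapshot when a log file path is given on the command line.
if [ "$#" -ge 1 ]; then
    log_kmem_snapshot "$1"
fi
```

The log grows without bound at this interval, so it probably wants
rotating or trimming. The last snapshot or two before a panic may also
never reach the disk, so the trend over earlier entries is likely more
useful than the final one.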