Date: Thu, 21 May 2009 00:15:23 -0700 (PDT)
From: Nate Eldredge <neldredge@math.ucsd.edu>
To: Yuri <yuri@rawbw.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
Message-ID: <Pine.GSO.4.64.0905202344420.1483@zeno.ucsd.edu>
In-Reply-To: <4A14F58F.8000801@rawbw.com>
References: <4A14F58F.8000801@rawbw.com>

On Wed, 20 May 2009, Yuri wrote:

> Seems like failing system calls (mmap and sbrk) that allocate memory is more
> graceful and would allow the program to at least issue the reasonable error
> message.
> And more intelligent programs would be able to reduce used memory instead of
> just dying.

It's a feature, called "memory overcommit". It has a variety of pros and cons, and is somewhat controversial. One advantage is that programs often allocate memory (in various ways) that they will never use, which under a conservative policy would result in that memory being wasted, or programs failing unnecessarily. With overcommit, you sometimes allocate more memory than you have, on the assumption that some of it will not actually be needed.

Although memory allocated by mmap and sbrk usually does get used in fairly short order, there are other ways of allocating memory that are easy to overlook, and which may "allocate" memory that you don't actually intend to use. Probably the best example is fork().

For instance, consider the following program.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define SIZE 1000000000 /* 1 GB */

    int main(void) {
      char *buf = malloc(SIZE); /* 1 GB */
      if (buf == NULL) {
        perror("malloc");
        exit(1);
      }
      memset(buf, 'x', SIZE); /* touch the buffer */
      pid_t pid = fork();
      if (pid == 0) {
        /* child: replace this process image right away */
        execlp("true", "true", (char *)NULL);
        perror("true");
        _exit(1);
      } else if (pid > 0) {
        for (;;); /* do work */
      } else {
        perror("fork");
        exit(1);
      }
      return 0;
    }

Suppose we run this program on a machine with just over 1 GB of memory. The fork() should give the child a private "copy" of the 1 GB buffer, by setting it to copy-on-write. In principle, after the fork(), the child might want to rewrite the buffer, which would require an additional 1 GB to be available for the child's copy. So under a conservative allocation policy, the kernel would have to reserve that extra 1 GB at the time of the fork(). Since it can't do that on our hypothetical 1+ GB machine, the fork() must fail, and the program won't work.
However, in fact that memory is not going to be used, because the child is going to exec() right away, which will free the child's "copy". Indeed, this happens most of the time with fork() (but of course the kernel can't know when it will or won't). With overcommit, we pretend to give the child a writable private copy of the buffer, in hopes that it won't actually use more of it than we can fulfill with physical memory. If it doesn't use it, all is well; if it does use it, then disaster occurs and we have to start killing things.

So the advantage is you can run programs like the one above on machines that technically don't have enough memory to do so. The disadvantage, of course, is that if someone calls the bluff, then we kill random processes. However, this is not all that much worse than failing allocations: although programs can in theory handle failed allocations and respond accordingly, in practice they don't do so and just quit anyway. So in real life, both cases result in disaster when memory "runs out"; with overcommit, the disaster is a little less predictable but happens much less often.

If you google for "memory overcommit" you will see lots of opinions and debate about this feature on various operating systems.

There may be a way to enable the conservative behavior; I know Linux has an option to do this, but am not sure about FreeBSD. This might be useful if you are paranoid, or run programs that you know will gracefully handle running out of memory. IMHO for general use it is better to have overcommit, but I know there are those who disagree.

-- 
Nate Eldredge
neldredge@math.ucsd.edu