Date:      Tue, 12 Apr 2005 12:36:47 -0700 (PDT)
From:      Don Lewis <truckman@FreeBSD.org>
To:        dnelson@allantgroup.com
Cc:        vivek@khera.org
Subject:   Re: kernel killing processes when out of swap
Message-ID:  <200504121936.j3CJalHc036643@gw.catspoiler.org>
In-Reply-To: <20050412164536.GB4842@dan.emsphone.com>

On 12 Apr, Dan Nelson wrote:
> In the last episode (Apr 12), Nick Barnes said:
>> This is the well-known problem with my fantasy world in which the OS
>> doesn't overcommit any resources.  All those programs are broken, but
>> it's too costly to fix them.  If overcommit had been resisted more
>> effectively in the first place, those programs would have been
>> written properly.
> 
> Another issue is things like shared libraries; without overcommit you
> need to reserve the file size * the number of processes mapping it,
> since you can't guarantee they won't touch every COW page handed to
> them.  I think you can design a shlib scheme where you can map the libs
> RO; not sure if you would take a performance hit or if there are other
> considerations.

The data and bss sizes in most shared libraries are small, so I don't
think that is much of an issue.  The text pages are more of a problem
because of the need to do relocation fixups.  It would be nice to mark
the text pages read only after relocation and/or prelink the binaries
and shared libraries like recent versions of Linux do.  Text page
modifications to set debugger breakpoints would also have to be handled.

A bigger problem is the default stack size of 64 MB per process.  That
quickly adds up to a lot of reserved swap space.  One way of handling
that might be an ELF header field that could limit the stack size to a
smaller value for most binaries.  I don't happen to remember the default
SunOS 4.x stack size, but I suspect that SunOS 4.x overcommitted stack
space on the assumption that most processes wouldn't use anything close
to the limit.

> There's a similar problem when large processes want to
> fork+exec something; for a fraction of a second you need to reserve 2x
> the process's space until the exec frees it.  vfork solves that
> problem, at the expense of blocking the parent until the child's
> process is loaded.

The fork() case was a common failure mode that I ran into back when
I was using SunOS 4.  It was usually a fairly benign problem because the
fork() was triggered by an interactive command, and when it failed I
could usually recover from the problem by exiting some other process or
by freeing up some swap space by removing files from the swap-backed
/tmp directory.

In an earlier life, I had the displeasure of trying to run large
processes (~3x RAM) on a small-memory machine without either COW or
vfork().  Actually, fork() was required in at least some of the cases
because the process wanted to make a snapshot of itself to do processing
on its in-memory data in the background.  The machine would page like
crazy and swap other processes in and out for about an hour each time
the large process forked.


