From owner-freebsd-current  Thu Apr 15 16:32:26 1999
Delivered-To: freebsd-current@freebsd.org
From: Don Lewis <Don.Lewis@tsc.tdk.com>
Message-Id: <199904152329.QAA11601@salsa.gv.tsc.tdk.com>
Date: Thu, 15 Apr 1999 16:29:48 -0700
In-Reply-To: Peter Jeremy <peter.jeremy@auss2.alcatel.com.au>
       "Re: swap-related problems" (Apr 15, 12:14pm)
X-Mailer: Mail User's Shell (7.2.6 alpha(3) 7/19/95)
To: Peter Jeremy <peter.jeremy@auss2.alcatel.com.au>,
	mi@aldan.algebra.com
Subject: Re: swap-related problems
Cc: current@FreeBSD.ORG
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Apr 15, 12:14pm, Peter Jeremy wrote:
} Subject: Re: swap-related problems
} Mikhail Teterin <mi@misha.cisco.com> wrote:
} > Worse than that,
} >it may be possible to use it at malloc time, but unless your program
} >runs and touches every page, the memory may not be available later.
} 
} If you run and touch every page, you are guaranteed to have the
} memory available, but you also increase the chances of you being the
} largest process when the system runs out of swap - in which case you
} get killed.

This could also have a pretty severe runtime performance penalty.  An
implementation that just reserves space would not.
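A minimal sketch of the "touch every page" workaround, to show where the
runtime cost comes from.  The helper name alloc_and_touch() and the 4096
byte fallback page size are assumptions made for illustration, not an
existing API.  Every page in the allocation takes a fault up front, so
the cost grows with the size of the request.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate len bytes and write one byte per page-sized stride, plus the
 * final byte, so that every page of the block is instantiated before
 * the caller depends on it.  Returns NULL if malloc() itself fails.
 * Hypothetical helper, for illustration only.
 */
void *alloc_and_touch(size_t len, size_t *strides_touched)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = (ps > 0) ? (size_t)ps : 4096;   /* assumed fallback */
    volatile unsigned char *p = malloc(len);
    size_t n = 0;

    if (p == NULL)
        return NULL;

    for (size_t off = 0; off < len; off += page) {
        p[off] = 0;          /* the write forces the page into existence */
        n++;
    }
    if (len > 0)
        p[len - 1] = 0;      /* the block may not start on a page boundary */

    if (strides_touched != NULL)
        *strides_touched = n;
    return (void *)p;
}
```

A caller would use alloc_and_touch(len, NULL) in place of malloc(len) and
release the block with free() as usual.  Under overcommit the writes can
still get the process killed; they only move that risk to allocation time.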

} >If we are up to discussing the possible implementations, I'd suggest
} >that the system uses something other than SIGKILL to notify the
} >program it's time to pay for the over-commit speed and convenience.
} >I think, SIGBUS is appropriate, but I'm not sure.
} 
} I'm not sure this will gain a great deal.  Currently, if the kernel
} runs out of swap, it kills the largest runnable process.  For your
} proposal to work, it would need to kill the process that is requesting
} the space.  This raises a number of issues:
} 1) The problem is detected in vm_pageout_scan().  There's no obvious
}    (to me anyway) way to locate the process that is triggering the
}    space request.
} 2) The current approach kills the process hogging the greatest amount
}    of memory.  This minimises the likelihood that you'll run out of
}    swap again, quickly.
} 3) The process that triggered the request could potentially be in a
}    non-runnable state.  In this case, the signal would be lost (or
}    indefinitely delayed).
} 4) Since you're proposing a trap-able signal, the process may choose
}    to ignore it and attempt to continue on.
} 5) The process would require additional stack space to process the
}    signal (the signal handler frame and space for system calls to
}    free memory as a minimum).
} 
} The last three issues could result in system deadlock.

Yes.  On the other hand, with an implementation where malloc() returns
NULL, a carefully written application could log a message and wait for
more swap to become available, or checkpoint itself.
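A rough sketch of what such an application might do, assuming a system
where malloc() really does return NULL when no backing store can be
reserved.  The function name malloc_patiently() and its retry parameters
are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Try to allocate len bytes up to max_tries times.  Each failure is
 * logged to stderr, and the caller-chosen delay gives swap a chance to
 * become available before the next attempt.  Returns NULL if every
 * attempt fails, at which point the caller can checkpoint or exit.
 * Hypothetical helper, for illustration only.
 */
void *malloc_patiently(size_t len, int max_tries, unsigned delay_sec)
{
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        void *p = malloc(len);

        if (p != NULL)
            return p;

        fprintf(stderr, "malloc(%zu) failed, attempt %d of %d\n",
            len, attempt, max_tries);
        if (attempt < max_tries)
            sleep(delay_sec);    /* wait for memory to be freed elsewhere */
    }
    return NULL;
}
```

With overcommit enabled this loop rarely gets a chance to run, since
malloc() succeeds and the failure shows up later as a killed process.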

} Having said all that, I agree that it would be useful if FreeBSD had a
} knob to disable overcommit - either on a system-wide, or per-process
} basis.  I don't feel sufficiently strongly about it to actually do
} something about it.  (From a quick look at the current code in
} vm_pageout_scan(), it would be fairly easy to add a per-process flag
} to prevent the process being a candidate for killing.  ptrace(2) or
} setrlimit(2) seem the most obvious ways to control the flag.  This
} would seem to alleviate the most common problem - one or two large,
} critical processes (eg the Xserver) getting killed, but probably has
} some nasty downside that I've overlooked).

Like if the Xserver has a memory leak, it will keep growing until most
of the other processes on the machine are killed off, maybe even important
ones, like the process that manipulates the control rods in the nuclear
reactor ;-)

Actually, that brings up a good point.  It is generally considered bad
practice for safety critical programs to use dynamic memory allocation
since it is so hard to guarantee that there won't be a memory allocation
failure.  With memory overcommit, it is possible for a process that
doesn't dynamically allocate memory to be killed anyway, because a page
fault could happen when it first accesses BSS.
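To illustrate the BSS case: the sketch below has no malloc() calls at
all, yet the static table is not backed until it is written.  The names
table and prefault_table(), the 4 MB size, and the 4096 byte fallback
page size are assumptions chosen for the example.  Writing to each page
at startup is one way a program might try to take those faults early
rather than in the middle of its real work.

```c
#include <stddef.h>
#include <unistd.h>

#define TABLE_SIZE (4u * 1024u * 1024u)

/* Zero-initialized static storage: placed in BSS, no malloc() involved. */
static unsigned char table[TABLE_SIZE];

/*
 * Write one byte per page-sized stride of the table, plus the final
 * byte, so that the first-touch page faults happen here instead of
 * later.  Returns the number of strides written.
 * Hypothetical helper, for illustration only.
 */
size_t prefault_table(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = (ps > 0) ? (size_t)ps : 4096;   /* assumed fallback */
    volatile unsigned char *p = table;
    size_t n = 0;

    for (size_t off = 0; off < sizeof table; off += page) {
        p[off] = 0;              /* first write faults the page in */
        n++;
    }
    p[sizeof table - 1] = 0;     /* the table may not be page aligned */
    return n;
}
```

This does not remove the failure mode; if swap is exhausted during the
loop, the process can still be the one that gets killed.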

