Date: Thu, 15 Apr 1999 16:29:48 -0700
From: Don Lewis <Don.Lewis@tsc.tdk.com>
To: Peter Jeremy <peter.jeremy@auss2.alcatel.com.au>, mi@aldan.algebra.com
Cc: current@FreeBSD.ORG
Subject: Re: swap-related problems
Message-ID: <199904152329.QAA11601@salsa.gv.tsc.tdk.com>
In-Reply-To: Peter Jeremy <peter.jeremy@auss2.alcatel.com.au> "Re: swap-related problems" (Apr 15, 12:14pm)

On Apr 15, 12:14pm, Peter Jeremy wrote:
} Subject: Re: swap-related problems
} Mikhail Teterin <mi@misha.cisco.com> wrote:
} > Worse than that,
} >it may be possible to use it at malloc time, but unless your program
} >runs and touches every page, the memory may not be available later.
}
} If you run and touch every page, you are guaranteed to have the
} memory available, but you also increase the chances of you being the
} largest process when the system runs out of swap - in which case you
} get killed.

This could also have a pretty severe runtime performance penalty.  An
implementation that just reserves space would not.

} >If we are up to discussing the possible implementations, I'd suggest
} >that the system uses something other than SIGKILL to notify the
} >program it's time to pay for the over-commit speed and convenience.
} >I think, SIGBUS is appropriate, but I'm not sure.
}
} I'm not sure this will gain a great deal.  Currently, if the kernel
} runs out of swap, it kills the largest runnable process.  For your
} proposal to work, it would need to kill the process that is requesting
} the space.  This raises a number of issues:
} 1) The problem is detected in vm_pageout_scan().  There's no obvious
}    (to me anyway) way to locate the process that is triggering the
}    space request.
} 2) The current approach kills the process hogging the greatest amount
}    of memory.  This minimises the likelihood that you'll run out of
}    swap again, quickly.
} 3) The process that triggered the request could potentially be in a
}    non-runnable state.  In this case, the signal would be lost (or
}    indefinitely delayed).
} 4) Since you're proposing a trap-able signal, the process may choose
}    to ignore it and attempt to continue on.
} 5) The process would require additional stack space to process the
}    signal (the signal handler frame and space for system calls to
}    free memory as a minimum).
}
} The last three issues could result in system deadlock.

Yes.
On the other hand, with an implementation where malloc() returns NULL,
a carefully written application could log a message and wait for more
swap to become available, or checkpoint itself.

} Having said all that, I agree that it would be useful if FreeBSD had a
} knob to disable overcommit - either on a system-wide, or per-process
} basis.  I don't feel sufficiently strongly about it to actually do
} something about it.  (From a quick look at the current code in
} vm_pageout_scan(), it would be fairly easy to add a per-process flag
} to prevent the process being a candidate for killing.  ptrace(2) or
} setrlimit(2) seem the most obvious ways to control the flag.  This
} would seem to alleviate the most common problem - one or two large,
} critical processes (eg the Xserver) getting killed, but probably has
} some nasty downside that I've overlooked).

Like if the Xserver has a memory leak, it will keep growing until most
of the other processes on the machine are killed off, maybe even
important ones, like the process that manipulates the control rods in
the nuclear reactor ;-)

Actually, that brings up a good point.  It is generally considered bad
practice for safety-critical programs to use dynamic memory allocation,
since it is so hard to guarantee that there won't be a memory
allocation failure.  With memory overcommit, it is possible for a
process that doesn't dynamically allocate memory at all to be killed,
because a fault could happen when accessing BSS.
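
To make the "log a message and wait" idea concrete, here is a minimal
sketch of what such an application might do.  It only helps on a system
where malloc() really does return NULL when space cannot be reserved;
with overcommit, the allocation can succeed and the failure shows up
later when the pages are touched.  The helper name alloc_or_wait() and
its retry parameters are hypothetical, chosen for illustration, and not
taken from any existing code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an existing API): try to allocate `size`
 * bytes.  If malloc() returns NULL, log the failure and wait
 * `delay_sec` seconds before retrying, up to `max_tries` attempts.
 * Returns NULL if every attempt fails, so the caller can checkpoint
 * itself or shut down cleanly instead of being killed.
 */
void *alloc_or_wait(size_t size, int max_tries, unsigned delay_sec)
{
    assert(max_tries > 0);

    for (int attempt = 1; attempt <= max_tries; attempt++) {
        void *p = malloc(size);
        if (p != NULL)
            return p;

        fprintf(stderr, "malloc(%zu) failed (attempt %d of %d)\n",
                size, attempt, max_tries);

        /* Give the system time to free up swap before retrying. */
        if (attempt < max_tries)
            sleep(delay_sec);
    }
    return NULL;
}
```

A caller would use it as `buf = alloc_or_wait(len, 5, 10);` and treat a
NULL result as the cue to checkpoint.  The retry count and delay are
arbitrary here; a real program would pick them to suit its workload.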