Date: Thu, 15 Apr 1999 12:14:23 +1000
From: Peter Jeremy <peter.jeremy@auss2.alcatel.com.au>
To: mi@aldan.algebra.com
Cc: current@FreeBSD.ORG
Subject: Re: swap-related problems
Message-ID: <99Apr15.120110est.40366@border.alcanet.com.au>

Mikhail Teterin <mi@misha.cisco.com> wrote:
> Worse then that,
> it may be possible to use it at malloc time, but unless your program
> runs and touches every page, the memory may not be available later.

If you run and touch every page, you are guaranteed to have the memory
available, but you also increase the chances of being the largest
process when the system runs out of swap - in which case you get killed.

> If we are up to discussing the possible implementations, I'd suggest
> that the system uses something other then SIGKILL to notify the
> program it's time to pay for the over-commit speed and convenience.
> I think, SIGBUS is appropriate, but I'm not sure.

I'm not sure this will gain a great deal.  Currently, if the kernel
runs out of swap, it kills the largest runnable process.  For your
proposal to work, it would need to signal the process that is
requesting the space instead.  This raises a number of issues:

1) The problem is detected in vm_pageout_scan().  There's no obvious
   (to me anyway) way to locate the process that is triggering the
   space request.
2) The current approach kills the process hogging the greatest amount
   of memory.  This minimises the likelihood that you'll quickly run
   out of swap again.
3) The process that triggered the request could potentially be in a
   non-runnable state.  In this case, the signal would be lost (or
   indefinitely delayed).
4) Since you're proposing a trappable signal, the process may choose
   to ignore it and attempt to continue on.
5) The process would require additional stack space to process the
   signal (the signal handler frame and space for system calls to
   free memory as a minimum).

The last three issues could result in system deadlock.

> That our
> malloc does not conform to standards (for whatever reasons),

The problem is the system, rather than malloc.  I'm not sure that the
ANSI C standard forbids the current system behaviour.  The behaviour is
quite common, so I'm sure it's not forbidden by POSIX or various other
Unix specs.
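
For reference, the "touch every page" approach discussed at the top of
this message can be sketched roughly as follows.  This is only a
minimal illustration: alloc_and_touch() is a hypothetical helper name,
not an existing FreeBSD interface, and the 4096-byte fallback page size
is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `size` bytes and write one byte to every page, so that the
 * system has to back each page with real memory or swap immediately
 * rather than at some later first use.
 *
 * alloc_and_touch() is a hypothetical helper, not a FreeBSD API.
 * Returns NULL if malloc() itself fails.  On an overcommitting system
 * the process may instead be killed while the loop below is running.
 */
void *
alloc_and_touch(size_t size)
{
	long ps = sysconf(_SC_PAGESIZE);
	/* Assumption: fall back to 4096 if the page size is unavailable. */
	size_t pagesize = ps > 0 ? (size_t)ps : 4096;
	volatile unsigned char *p;
	size_t off;

	assert(size > 0);
	p = malloc(size);
	if (p == NULL)
		return (NULL);
	/* One write per page is enough to force the page to be allocated. */
	for (off = 0; off < size; off += pagesize)
		p[off] = 0;
	/* The last byte may sit on a page the stride above did not reach. */
	p[size - 1] = 0;
	return ((void *)p);
}
```

Note that this only moves the failure earlier.  As described above, a
process that has touched everything is also more likely to be the
largest one when swap runs out.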

> that something should be done about it.

This is a commonly used phrase around here.  Basically, you have 3
choices:

1) Fix the problem yourself (i.e. submit patches that fix the problem).
2) Convince someone else that the problem is critical enough for them
   to fix it for you.
3) Pay someone to fix it for you (this is an extension of the previous
   item).

> That "something" must start
> with documenting the flaw...

Feel free to write a PR.  Including the changes you want to the
documentation (in text format if you don't want to write a patch to
the man page) increases the chances of the changes getting done.

Having said all that, I agree that it would be useful if FreeBSD had
a knob to disable overcommit - either on a system-wide or a per-process
basis.  I don't feel sufficiently strongly about it to actually do
something about it.

(From a quick look at the current code in vm_pageout_scan(), it would
be fairly easy to add a per-process flag to prevent the process being
a candidate for killing.  ptrace(2) or setrlimit(2) seem the most
obvious ways to control the flag.  This would seem to alleviate the
most common problem - one or two large, critical processes (e.g. the
X server) getting killed - but it probably has some nasty downside that
I've overlooked.)

Peter
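
To make the per-process flag idea concrete, here is a rough user-space
model of the selection policy: pick the largest runnable process, and
skip any process that has the exemption flag set.  It is a sketch under
stated assumptions, not the code in vm_pageout_scan().  struct
proc_info, its fields, and pick_victim() are all hypothetical names
invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's per-process data. */
struct proc_info {
	int	pid;
	size_t	size;		/* memory use, in pages */
	int	runnable;	/* non-zero if the process is runnable */
	int	no_kill;	/* the proposed per-process exemption flag */
};

/*
 * Return the largest runnable process that is not exempt, or NULL if
 * no process qualifies.
 */
const struct proc_info *
pick_victim(const struct proc_info *procs, size_t n)
{
	const struct proc_info *best = NULL;
	size_t i;

	assert(procs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* Skip non-runnable processes and explicitly exempt ones. */
		if (!procs[i].runnable || procs[i].no_kill)
			continue;
		if (best == NULL || procs[i].size > best->size)
			best = &procs[i];
	}
	return (best);
}
```

One downside shows up immediately in this model: if every large process
sets the flag, pick_victim() returns NULL and nothing can be killed to
free swap.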