Date: Mon, 13 Mar 2006 11:48:06 -0800 (PST)
From: Jon Dama <jd@ugcs.caltech.edu>
To: Peter Jeremy <peterjeremy@optushome.com.au>
Cc: Dmitry Pryanishnikov <dmitry@atlantis.dp.ua>, Kostik Belousov <kostikbel@gmail.com>, freebsd-stable@freebsd.org, Michael Proto <mike@jellydonut.org>
Subject: Re: RELENG_4 on flash disk and swap
Message-ID: <Pine.LNX.4.53.0603131119220.30166@hurl.ugcs.caltech.edu>
In-Reply-To: <20060310193248.GC688@turion.vk2pj.dyndns.org>
References: <20060302181625.I3905@atlantis.atlantis.dp.ua> <76FAD2DB-CD18-42D4-95C8-F016CFB17B00@segpub.com.au> <20060303110936.R86586@atlantis.atlantis.dp.ua> <20060303185157.GB692@turion.vk2pj.dyndns.org> <20060304001224.G356@atlantis.atlantis.dp.ua> <20060304065138.GD692@turion.vk2pj.dyndns.org> <20060310121758.S80837@atlantis.atlantis.dp.ua> <20060310123942.GI37572@deviant.kiev.zoral.com.ua> <20060310153737.X40396@atlantis.atlantis.dp.ua> <20060310193248.GC688@turion.vk2pj.dyndns.org>
If you feel this situation is undesirable, the first thing to do is to put together the patches necessary to allow the kernel to actually track how much RAM+swap might be needed to cover the address-space allocations that have been granted. This isn't trivial: just start thinking about shared allocations, forking, copy-on-write, etc.

In order to make this change "costless" I suspect you'll have to hide it behind a kernel config option. Maybe you'll bill it as mere instrumentation.

Then worry about convincing people that overcommit shouldn't be the only option. Once you have your kernel config option to enable proper accounting, it should be a short hop to making a sysctl that can disable overcommit and enforce limits based on the previously mentioned "accounting". Most importantly, though, you won't need to convince anyone that the default ought to be changed.

SIGDANGER has essentially been rejected universally by everyone but its creators (IBM), and as it is unusual, don't expect anyone to write a program that uses it. Ditto for any solution that involves madvise or expecting programs to prefault their pages.

Another suggestion: build a time machine to go back to 1990 and get early (pages guaranteed) and late (overcommitted) allocation written into POSIX.

A somewhat accepted approach is to require that allocations be backed, but also to support a MAP_NORESERVE flag in mmap to permit overcommitted allocations.

Anyway, no matter what, you must first give the kernel the necessary accounting code.

For the record: I believe in overcommit, but I recognize that it violates the semantics people were (foolishly) taught in school.

Also, when the system is page-starved it kills the largest consumer of pages that has the same UID as the process that pushed the system over the limit, not merely the largest consumer of pages overall. So you see, running critical services that carefully pre-allocate and fault their memory is possible within the overcommit framework.

Jon
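
To make the accounting idea concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not FreeBSD kernel code: the names CommitAccounting, Mapping, Process and the strict switch (standing in for the sysctl) are all hypothetical, and charging a fork for every private writable page is just one possible worst-case policy. With strict off, the counter is mere instrumentation; with strict on, an allocation that would exceed RAM+swap is refused.

```python
from dataclasses import dataclass, field


class CommitError(MemoryError):
    """Raised when strict accounting refuses an allocation."""


@dataclass
class Mapping:
    pages: int
    writable: bool = True
    shared: bool = False
    noreserve: bool = False  # hypothetical flag, modelled on MAP_NORESERVE

    def private_reserved(self) -> bool:
        # Private, writable, reserved pages may each need their own copy
        # after a fork (copy-on-write), so they count again in the child.
        return self.writable and not self.shared and not self.noreserve


@dataclass
class Process:
    mappings: list = field(default_factory=list)


class CommitAccounting:
    """Toy tracker of pages promised against RAM + swap (an assumption-laden model)."""

    def __init__(self, ram_pages: int, swap_pages: int, strict: bool = False):
        self.limit = ram_pages + swap_pages
        self.committed = 0
        self.strict = strict  # stands in for a sysctl that disables overcommit

    def _charge(self, pages: int) -> None:
        if self.strict and self.committed + pages > self.limit:
            left = self.limit - self.committed
            raise CommitError(f"need {pages} pages, only {left} uncommitted")
        self.committed += pages

    def mmap(self, proc, pages, writable=True, shared=False, noreserve=False):
        # Read-only and noreserve mappings are not charged; a shared
        # writable mapping is charged once, here, and never again on fork.
        cost = pages if (writable and not noreserve) else 0
        self._charge(cost)
        mapping = Mapping(pages, writable, shared, noreserve)
        proc.mappings.append(mapping)
        return mapping

    def fork(self, parent):
        # Worst case: the child ends up writing every private writable page.
        cost = sum(m.pages for m in parent.mappings if m.private_reserved())
        self._charge(cost)
        return Process(list(parent.mappings))


if __name__ == "__main__":
    acct = CommitAccounting(ram_pages=100, swap_pages=50, strict=True)
    parent = Process()
    acct.mmap(parent, 60)
    acct.fork(parent)
    print(acct.committed, "of", acct.limit, "pages committed")
```

Even this toy shows why fork is the awkward case: a 60-page process on a 150-page machine can fork once under strict accounting, but not twice, although copy-on-write would usually leave most of those pages shared.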