From owner-freebsd-stable@FreeBSD.ORG Mon Mar 13 19:48:07 2006
Date: Mon, 13 Mar 2006 11:48:06 -0800 (PST)
From: Jon Dama
To: Peter Jeremy
Cc: Dmitry Pryanishnikov, Kostik Belousov, freebsd-stable@freebsd.org, Michael Proto
Subject: Re: RELENG_4 on flash disk and swap
In-Reply-To: <20060310193248.GC688@turion.vk2pj.dyndns.org>

If you feel this situation is undesirable, the first thing to do
is to put together the patches necessary to allow the kernel to actually track how much ram+swap might be needed to cover the address-space allocations that have been granted. This isn't trivial: just start thinking about shared allocations, forking, copy-on-write, etc.

In order to make this change "costless" I suspect you'll have to hide it behind a kernel config option. Maybe you'll bill it as mere instrumentation.

Then worry about convincing people that overcommit shouldn't be the only option. Once you have your kernel config option to enable proper accounting, it should be a short hop to a sysctl that disables overcommit and enforces limits based on the previously mentioned "accounting". Most importantly, you won't need to convince anyone that the default ought to be changed.

SIGDANGER has essentially been rejected by everyone but its creators (IBM), and since it is unusual, don't expect anyone to write a program that uses it. Ditto for any solution that involves madvise or expects programs to prefault their pages.

Another suggestion: build a time machine to go back to 1990 and get early (pages guaranteed) and late (overcommitted) allocation written into POSIX.

A somewhat accepted approach is to require that allocations be backed, while also supporting a MAP_NORESERVE flag in mmap to permit overcommitted allocations.

Anyway, no matter what, you must first give the kernel the necessary accounting code.

For the record: I believe in overcommit, but I recognize that it violates the semantics people were (foolishly) taught in school.

Also, when the system is page-starved, it does not merely kill the largest consumer of pages: it kills the largest consumer of pages that has the same UID as the process that pushed the system over the limit. So you see, running critical services that carefully pre-allocate and fault their memory is possible within the overcommit framework.

Jon
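
To make the accounting problem concrete, here is a toy sketch in Python of what "tracking how much ram+swap might be needed" could look like. It is not kernel code and is not taken from any FreeBSD patch. The names CommitAccount, Mapping, mmap and fork are all made up for illustration, and the charging rules are assumptions: a shared mapping is charged once, and a private writable mapping is charged again on fork because copy-on-write may end up copying every page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical record of one address-space allocation."""
    pages: int
    shared: bool = False  # shared mappings are charged only once


class CommitAccount:
    """Toy strict-commit accounting: refuse anything ram+swap cannot back."""

    def __init__(self, ram_pages, swap_pages):
        self.limit = ram_pages + swap_pages
        self.committed = 0

    def charge(self, pages):
        # Enforce the limit at allocation time, not at page-fault time.
        if self.committed + pages > self.limit:
            raise MemoryError("allocation would exceed ram+swap")
        self.committed += pages

    def mmap(self, proc, pages, shared=False):
        # A process is modelled as a plain list of Mapping objects.
        self.charge(pages)
        proc.append(Mapping(pages, shared))

    def fork(self, parent):
        # Worst case for copy-on-write: the child touches every private
        # page, so the private pages must be reserved a second time.
        # Shared mappings stay shared and cost nothing extra.
        need = sum(m.pages for m in parent if not m.shared)
        self.charge(need)
        return list(parent)


acct = CommitAccount(ram_pages=100, swap_pages=50)
proc = []
acct.mmap(proc, 60)               # private, writable
acct.mmap(proc, 20, shared=True)  # shared
child = acct.fork(proc)           # reserves the 60 private pages again
```

In this model a second fork of proc is refused up front, since another 60 pages would push the reservation past the 150-page limit. Under overcommit the fork would succeed and the shortage would only show up later, when pages are actually touched.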