From: Marcel Moolenaar
Date: Mon, 15 Oct 2012 09:01:19 -0700
Subject: Re: Behavior of madvise(MADV_FREE)
To: Jason Evans
Cc: Poul-Henning Kamp, "freebsd-arch@freebsd.org Arch", Tim LaBerge, Alan Cox
References: <9FEBC10C-C453-41BE-8829-34E830585E90@xcllnt.net> <4835.1350062021@critter.freebsd.dk>
List-Id: Discussion related to FreeBSD architecture

On Oct 12, 2012, at 3:05 PM, Jason Evans wrote:

> On Oct 12, 2012, at 1:54 PM, Marcel Moolenaar wrote:
>> BTW: MADV_DONTNEED in Linux seems to behave like MADV_FREE
>> in FreeBSD -- at least according to the manpage. Which makes
>> me wonder how standard madvise(2) is anyway.
>
> MADV_DONTNEED on Linux immediately dissociates the physical page from the VM mapping, such that subsequent access results in a zero-filled page being soft-faulted into place.
>
> MADV_FREE is *way* nicer than MADV_DONTNEED in the context of malloc. jemalloc has a really discouraging amount of complexity that is directly a result of working around the performance overhead of MADV_DONTNEED.

I've been letting this thread sink in, so I'm responding to the last
message.

Vendors like Juniper want reliable VM statistics to prevent
over-provisioning. While the stats don't need to be exact at all
times (i.e. instantaneous), having the stats catch up to a new
steady state is very desirable. In other words: it's not that
helpful to have lots of memory sitting on the inactive queue
indefinitely.

Also, pushing the complexity of choosing exactly which hint to give
the kernel under different scenarios into applications isn't
appealing at all. It just doesn't scale. If some VM changes warrant
a new hint to madvise(), you may end up changing multiple daemons.

It seems better to have just 1 hint (i.e. MADV_FREE) and have the
kernel change its behaviour depending on the situation. When
there's plenty of memory, the kernel may even ignore the hint.
Under severe memory pressure it may want to free up the page right
away so that it can give it to some thread that's waiting for a
page. At the edge of needing to swap, complex algorithms may be
worthwhile -- or maybe not. I don't know.

This leads to:

1. Keep MADV_FREE as it behaves in FreeBSD right now, or make it
   even more sloppy.
2. Have an idle thread that moves inactive pages to the cache or
   free queue if they've been inactive for X minutes, for some
   tunable X. Have it back off when the pageout daemon kicks in.
3. Have MADV_FREE behave like Linux's MADV_DONTNEED when the
   machine is under (significant/severe/some) memory pressure.

Thoughts?

--
Marcel Moolenaar
marcel@xcllnt.net
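
P.S. To make the "one hint, kernel decides" idea a bit more concrete,
here is a rough userland sketch in C of the kind of decision the
kernel could make when it sees MADV_FREE. Everything in it is
hypothetical: the enum, the struct, the function name and the two
thresholds are made up for illustration and do not correspond to
anything in the FreeBSD VM code. It only models the three cases
mentioned above (ignore, lazy, eager).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical actions a kernel could take on an MADV_FREE hint. */
enum hint_action {
    HINT_IGNORE,  /* plenty of memory: leave the pages alone */
    HINT_LAZY,    /* mark pages reclaimable and free them later */
    HINT_EAGER    /* severe pressure: release the pages right away,
                     roughly what Linux's MADV_DONTNEED does */
};

/* Hypothetical snapshot of VM state; field names are illustrative. */
struct vm_pressure {
    size_t free_pages;    /* pages currently free */
    size_t plenty_pages;  /* at or above this, memory is plentiful */
    size_t severe_pages;  /* below this, threads may be waiting for pages */
};

/* Pick an action for an MADV_FREE hint from the current free-page count. */
enum hint_action
madv_free_policy(const struct vm_pressure *vp)
{
    /* The thresholds must be ordered for the three bands to make sense. */
    assert(vp->severe_pages <= vp->plenty_pages);

    if (vp->free_pages >= vp->plenty_pages)
        return HINT_IGNORE;
    if (vp->free_pages < vp->severe_pages)
        return HINT_EAGER;
    return HINT_LAZY;
}
```

The point of the sketch is only that the policy lives in one place:
daemons keep calling madvise() with the same hint, and tuning the
thresholds (or replacing the whole function) needs no application
changes.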