Date:      Mon, 15 Oct 2012 09:01:19 -0700
From:      Marcel Moolenaar <marcel@xcllnt.net>
To:        Jason Evans <jasone@FreeBSD.org>
Cc:        Poul-Henning Kamp <phk@phk.freebsd.dk>, "freebsd-arch@freebsd.org Arch" <freebsd-arch@FreeBSD.org>, Tim LaBerge <tlaberge@juniper.net>, Alan Cox <alc@rice.edu>
Subject:   Re: Behavior of madvise(MADV_FREE)
Message-ID:  <F67D539D-8BE3-4817-8466-C76DE43AE252@xcllnt.net>
In-Reply-To: <F71ACE9D-297E-4565-BB8D-D95D46D90708@freebsd.org>
References:  <9FEBC10C-C453-41BE-8829-34E830585E90@xcllnt.net> <4835.1350062021@critter.freebsd.dk> <E6A52D27-0D6A-4175-9ECA-ADE25BFF35C2@xcllnt.net> <F71ACE9D-297E-4565-BB8D-D95D46D90708@freebsd.org>


On Oct 12, 2012, at 3:05 PM, Jason Evans <jasone@FreeBSD.org> wrote:

> On Oct 12, 2012, at 1:54 PM, Marcel Moolenaar wrote:
>> BTW: MADV_DONTNEED in Linux seems to behave like MADV_FREE
>> in FreeBSD -- at least according to the manpage. Which makes
>> me wonder how standard madvise(2) is anyway.
>
> MADV_DONTNEED on Linux immediately dissociates the physical page from
> the VM mapping, such that subsequent access results in a zero-filled
> page being soft-faulted into place.
>
> MADV_FREE is *way* nicer than MADV_DONTNEED in the context of malloc.
> jemalloc has a really discouraging amount of complexity that is directly
> a result of working around the performance overhead of MADV_DONTNEED.
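
For readers less familiar with the two hints contrasted above, here is a minimal sketch of how an allocator-style caller might hand a range back to the kernel. The helper name `release_range` is hypothetical and does not come from the thread or from jemalloc; the sketch assumes a private anonymous mapping and simply prefers MADV_FREE where the platform defines it.

```c
/* glibc: make sure madvise() is declared even under strict -std= modes. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the thread): tell the kernel that the
 * contents of [addr, addr + len) are no longer needed, while keeping
 * the mapping itself valid for later reuse.
 *
 * Returns 0 on success, -1 on failure with errno set by madvise(2).
 */
static int
release_range(void *addr, size_t len)
{
#ifdef MADV_FREE
	/* Lazy hint: the kernel may reclaim the pages later, or never. */
	if (madvise(addr, len, MADV_FREE) == 0)
		return (0);
#endif
	/*
	 * Fallback. On Linux this drops the pages right away; on other
	 * systems MADV_DONTNEED may be a weaker hint.
	 */
	return (madvise(addr, len, MADV_DONTNEED));
}
```

After a successful call the caller should treat the old contents of the range as undefined until it writes to the pages again, since with the lazy hint they may or may not have been discarded.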

I've been letting this thread sink in -- responding to the last message.

Vendors like Juniper want reliable VM statistics to prevent
over-provisioning. While the stats don't need to be exact at
all times (i.e. instantaneous), having the stats catch up to
a new steady state is very desirable. In other words: it's
not that helpful to have lots of memory on the inactive queue
indefinitely.

Also, moving the complexity of exactly which hint to give the
kernel under different scenarios into applications isn't appealing.
It just doesn't scale. If some VM changes warrant a new hint
to madvise(), you may end up changing multiple daemons. It
seems better to have just 1 hint (i.e. MADV_FREE) and have the
kernel change its behaviour depending on the situation. When
there's plenty of memory, you may even ignore the hint. Under
severe memory pressure you may want to free up the page right
away so that you can give it to some thread that's waiting
for a page. At the edge of needing to swap, complex algorithms
may be worthwhile -- or maybe not. I don't know.

This leads to:
1.  Keep MADV_FREE as it behaves in FreeBSD right now or make
    it even more sloppy.
2.  Have an idle thread that moves inactive pages to the cache
    or free queue if they've been inactive for X minutes, for
    some tunable X. Have it back off when the pageout daemon
    kicks in.
3.  Have MADV_FREE behave like Linux's MADV_DONTNEED when the
    machine is under (significant/severe/some) memory pressure.
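
To make the options above concrete, here is a small sketch of the decision logic they imply. It is an illustration under stated assumptions, not FreeBSD kernel code: the enums, both function names, and the idea of three discrete pressure levels are all hypothetical, and the thread leaves the actual thresholds open.

```c
/* Hypothetical pressure levels; the thread does not define thresholds. */
enum mem_pressure {
	PRESSURE_NONE,      /* plenty of free memory */
	PRESSURE_MODERATE,  /* near the edge of needing to swap */
	PRESSURE_SEVERE     /* threads are waiting for pages */
};

/* What the kernel could do with an MADV_FREE hint (assumed names). */
enum free_action {
	ACTION_IGNORE,      /* treat the hint as a no-op */
	ACTION_LAZY,        /* mark reclaimable, reclaim later if needed */
	ACTION_IMMEDIATE    /* free now, like Linux's MADV_DONTNEED */
};

/* Options 1 and 3: one hint, with behaviour chosen by the kernel. */
static enum free_action
madv_free_policy(enum mem_pressure p)
{
	switch (p) {
	case PRESSURE_NONE:
		return (ACTION_IGNORE);
	case PRESSURE_SEVERE:
		return (ACTION_IMMEDIATE);
	default:
		return (ACTION_LAZY);
	}
}

/*
 * Option 2: should an idle thread move a page off the inactive queue?
 * x_minutes is the tunable X; the thread backs off whenever the
 * pageout daemon is running. Returns 1 to reclaim, 0 to leave alone.
 */
static int
idle_should_reclaim(unsigned inactive_minutes, unsigned x_minutes,
    int pageout_active)
{
	if (pageout_active)
		return (0);
	return (inactive_minutes >= x_minutes);
}
```

The point of the sketch is that applications keep issuing a single hint, while any change in policy stays inside these two functions.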

Thoughts?

--
Marcel Moolenaar
marcel@xcllnt.net
