Date: Sat, 20 Oct 2012 13:44:03 -0500
From: Alan Cox <alc@rice.edu>
To: Marcel Moolenaar <marcel@xcllnt.net>
Cc: Poul-Henning Kamp <phk@phk.freebsd.dk>, Tim LaBerge <tlaberge@juniper.net>, Jason Evans <jasone@FreeBSD.org>, Alan Cox <alc@rice.edu>, "freebsd-arch@freebsd.org Arch" <freebsd-arch@FreeBSD.org>
Subject: Re: Behavior of madvise(MADV_FREE)
Message-ID: <5082F0F3.1070102@rice.edu>
In-Reply-To: <F67D539D-8BE3-4817-8466-C76DE43AE252@xcllnt.net>
References: <9FEBC10C-C453-41BE-8829-34E830585E90@xcllnt.net>
    <4835.1350062021@critter.freebsd.dk>
    <E6A52D27-0D6A-4175-9ECA-ADE25BFF35C2@xcllnt.net>
    <F71ACE9D-297E-4565-BB8D-D95D46D90708@freebsd.org>
    <F67D539D-8BE3-4817-8466-C76DE43AE252@xcllnt.net>

On 10/15/2012 11:01, Marcel Moolenaar wrote:
> On Oct 12, 2012, at 3:05 PM, Jason Evans<jasone@FreeBSD.org> wrote:
>
>> On Oct 12, 2012, at 1:54 PM, Marcel Moolenaar wrote:
>>> BTW: MADV_DONTNEED in Linux seems to behave like MADV_FREE
>>> in FreeBSD -- at least according to the manpage. Which makes
>>> me wonder how standard madvise(2) is anyway.
>> MADV_DONTNEED on Linux immediately dissociates the physical page
>> from the VM mapping, such that subsequent access results in a
>> zero-filled page being soft-faulted into place.
>>
>> MADV_FREE is *way* nicer than MADV_DONTNEED in the context of
>> malloc. jemalloc has a really discouraging amount of complexity
>> that is directly a result of working around the performance
>> overhead of MADV_DONTNEED.
> I've been letting this thread sink in -- responding to last.
>
> Vendors, like Juniper want reliable VM statistics to prevent
> over-provisioning. While the stats don't need to be exact at
> all times (i.e. instantaneous), having the stats catch up to
> a new steady state is very desirable. In other words: it's
> not that helpful to have lots of memory on the inactive queue
> indefinitely.

I'm sympathetic. Once upon a time, I was often called upon to explain
to network administrators why their idle web cache didn't have oodles
of "free" memory and how this wasn't a problem.

> Also, moving the complexity of exactly which hint to give the
> kernel under different scenarios isn't that appealing at all.
> It just doesn't scale.

I think that you're being a bit too pessimistic here. If your use case
really corresponds to "this memory is free and will not be reused (or
reallocated for a very long time)", then that is qualitatively very
different from the way malloc(3) uses MADV_FREE. malloc(3)'s use of
MADV_FREE is highly speculative. It doesn't really know what the
application is going to do in the future. I don't think that having
two distinct hints that distinguish between "speculative" and
"non-speculative" uses would be problematic. The distinction is real
and also easy to explain. The only danger is that application writers
really don't understand their application and use the wrong hint.

> ... If some VM changes warrant a new hint
> to madvise(), you may end up changing multiple daemons. It
> seems better to have just 1 hint (i.e. MADV_FREE) and have the
> kernel change its behaviour depending on the situation. When
> there's plenty of memory, you may even ignore the hint. Under
> severe memory pressure you may want to free up the page right
> away so that you can give it to some thread that's waiting
> for a page.

How is this really different from the existing behavior? If a thread
is waiting for a page, then the page daemon is running. In particular,
it is moving pages from the head of the inactive queue, where they were
placed by MADV_FREE, to the cache/free queue and waking up the waiting
thread when the aggregate cache/free target is met.

> At the edge of needing to swap, complex algorithms
> may be worthwhile -- or maybe not. I don't know.
>
> This leads to:
> 1. Keep MADV_FREE as it behaves in FreeBSD right now or make
> it even more sloppy.

I'm not sure that I understand what you mean by "sloppy" here. Can you
elaborate?

> 2. Have an idle thread that moves inactive pages to the cache
> or free queue if they've been inactive for X minutes, for
> some tunable X. Have it back off when the pageout daemon
> kicks in.

The existing page daemon already wakes up periodically and looks around
for something to do. In particular, have a look at
vm_pageout_page_stats(). That function tries to do something analogous
to what you propose. In part, it tries to prevent munmap(2)ed
file-backed pages from getting stuck in the active queue.

> 3. Have MADV_FREE behave like Linux's MADV_DONTNEED when the
> machine is under significant/severe/some) memory pressure.
>
> Thoughts?
>
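
For illustration only, the "speculative" versus "non-speculative" distinction discussed above might look roughly like this from the caller's side. This is a sketch under stated assumptions, not an existing interface: the `release_hint` enum and the `release_range()` function are hypothetical names. Because no separate non-speculative hint exists in the discussion above, both cases issue the same advice here, and the enum only marks where a second hint would plug in. The runtime fallback to MADV_DONTNEED is likewise an assumption, added for systems that reject MADV_FREE.

```c
#define _DEFAULT_SOURCE /* glibc: expose madvise(2) and MAP_ANONYMOUS */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical: the two kinds of caller distinguished in the thread. */
enum release_hint {
    RELEASE_SPECULATIVE,     /* malloc(3)-style: memory may be reused soon */
    RELEASE_NON_SPECULATIVE  /* caller expects no reuse for a long time */
};

/*
 * Tell the kernel that the page-aligned range [addr, addr + len) holds
 * no useful data. The contents are undefined afterwards until the
 * caller writes to the range again.
 * Returns 0 on success, -1 on failure (errno is set by madvise).
 */
static int
release_range(void *addr, size_t len, enum release_hint hint)
{
    long pagesize = sysconf(_SC_PAGESIZE);

    assert(pagesize > 0);
    assert((uintptr_t)addr % (uintptr_t)pagesize == 0);

    /* Assumption: both hints map to the same advice for now. */
    (void)hint;

#ifdef MADV_FREE
    if (madvise(addr, len, MADV_FREE) == 0)
        return 0;
    if (errno != EINVAL)
        return -1;
    /* MADV_FREE rejected: fall through to the older advice. */
#endif
    /* Note: MADV_DONTNEED semantics differ between operating systems. */
    return madvise(addr, len, MADV_DONTNEED);
}
```

A caller would pass a page-aligned, private anonymous mapping (for example one obtained from mmap(2) with MAP_PRIVATE | MAP_ANONYMOUS) and must treat the old contents as gone once the call returns 0. Writing to the range again makes it ordinary memory once more.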