Date: Fri, 12 Oct 2012 14:00:20 -0700
From: Marcel Moolenaar <marcel@xcllnt.net>
To: Alan Cox <alc@rice.edu>
Cc: Tim LaBerge <tlaberge@juniper.net>, "freebsd-arch@freebsd.org Arch" <freebsd-arch@freebsd.org>
Subject: Re: Behavior of madvise(MADV_FREE)
Message-ID: <186E5ECB-120E-4E49-96B4-485E2676C05F@xcllnt.net>
In-Reply-To: <50785A62.5050603@rice.edu>
References: <9FEBC10C-C453-41BE-8829-34E830585E90@xcllnt.net> <50785A62.5050603@rice.edu>

On Oct 12, 2012, at 10:58 AM, Alan Cox <alc@rice.edu> wrote:

>> Now on to the questions:
>> 1. madvise(MADV_FREE) marks the pages as clean and moves
>>    them to the inactive queue. Why isn't the reference
>>    state cleared on either the page or the TLB?
>
> It is, at least 31 out of 32 times that vm_page_dontneed() is called.
> From vm_page_dontneed(), which is called by madvise(MADV_FREE):
>
>         /*
>          * Clear any references to the page.  Otherwise, the page daemon will
>          * immediately reactivate the page.
>          *
>          * Perform the pmap_clear_reference() first.  Otherwise, a concurrent
>          * pmap operation, such as pmap_remove(), could clear a reference in
>          * the pmap and set PGA_REFERENCED on the page before the
>          * pmap_clear_reference() had completed.  Consequently, the page would
>          * appear referenced based upon an old reference that occurred before
>          * this function ran.
>          */
>         pmap_clear_reference(m);
>         vm_page_aflag_clear(m, PGA_REFERENCED);

Ah... I missed this. I didn't look in vm_page_dontneed() for this.
I thought current FreeBSD behaved the same as 6.1-ish.

>> 2. Why aren't the pages moved to the cache queue in the
>>    first place?
>
> Because this would make madvise(MADV_FREE) considerably more
> expensive, for example, the pages would have to be unmapped. Your
> situation may be different, but more often than not, people call
> madvise(MADV_FREE) when memory is plentiful, and there is no need to
> do anything. In other words, the page daemon isn't going to need to
> run anytime soon. For example, when madvise(MADV_FREE) is used in
> implementations of malloc() and free(), the vast majority of calls to
> madvise(MADV_FREE) are pointless. They are pointless in that soon
> after the madvise(MADV_FREE) call by the free() implementation, either
> (1) the application turns around and allocates more memory causing the
> MADV_FREE'd memory to be used once again or (2) the process terminates
> before the page daemon runs. Consequently, the implementation of
> madvise(MADV_FREE) does the minimal necessary work so that if memory
> does become scarce and the page daemon has to run, the MADV_FREE'd
> pages are first in line for reclamation.

Understood. Thanks.

>> Ad 2:
>> MADV_DONTNEED is there to signal that the pages contain
>> valid data, but that the page is not needed right now.
>> Using this, pages get moved to the inactive queue. That
>> makes sense. But MADV_FREE signals that there's no valid
>> data anymore and that the page may be demand zeroed on
>> next reference. The page is not inactive. It's free. If
>> the page was zeroed before calling MADV_FREE, the page
>> really caches contents that can be recreated later
>> (the demand zero).
>
> There is also another way of looking at it. By leaving the pages
> allocated and mapped, you are saving time, i.e., CPU cycles, for the
> all too common case that the MADV_FREE'd pages are used again in the
> near future.
>
> It wouldn't be illogical to have two variants of MADV_FREE. One for
> use by folks like yourself who can say definitively that the pages
> won't be accessed again and should really be freed, and the current
> implementation for more speculative uses like in the malloc() and
> free() implementation. Better yet, the second case would be replaced
> by a notification from the kernel to the process when memory is
> actually becoming scarce so that we won't waste cycles on any
> pointless madvise() calls by the process.

This aligns with phk@'s suggestion of MADV_RECYCLE. We may want to
play with this.

-- 
Marcel Moolenaar
marcel@xcllnt.net
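
For anyone following the thread without the userland side in front of them, here is a minimal sketch of the pattern under discussion: a region is dirtied, handed back with madvise(MADV_FREE), and then touched again shortly afterwards, which is the "pointless" malloc()/free() case described above. The function name madv_free_demo() is hypothetical and not from the thread, and the fallback to MADV_DONTNEED when MADV_FREE is rejected is a portability assumption of mine, not something either implementation requires.

```c
/* Assumption: on glibc this exposes madvise() and MAP_ANON under strict -std modes. */
#define _DEFAULT_SOURCE

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_FREE
/* Assumption: very old headers may lack MADV_FREE; substitute MADV_DONTNEED. */
#define MADV_FREE MADV_DONTNEED
#endif

/*
 * Hypothetical demo (not from the thread): dirty an anonymous region,
 * advise the kernel that its contents are disposable, then reuse it.
 * Returns 0 on success, -1 on failure.
 */
int
madv_free_demo(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	size_t len;
	int ok;

	if (pagesz <= 0)
		return (-1);
	len = 16 * (size_t)pagesz;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (buf == MAP_FAILED)
		return (-1);

	/* Dirty every page so the kernel has something to discard. */
	memset(buf, 0xA5, len);

	/* Tell the kernel the contents no longer matter. */
	if (madvise(buf, len, MADV_FREE) != 0) {
		/* Assumption: retry with MADV_DONTNEED if MADV_FREE is rejected. */
		if (errno != EINVAL ||
		    madvise(buf, len, MADV_DONTNEED) != 0) {
			munmap(buf, len);
			return (-1);
		}
	}

	/*
	 * Reuse the region.  The mapping is still valid; what the rest of
	 * the page holds is unspecified: either the old data (page not yet
	 * reclaimed) or zeroes (page reclaimed and demand-zeroed).
	 */
	buf[0] = 1;
	assert(buf[0] == 1);
	ok = (buf[1] == 0xA5 || buf[1] == 0);

	munmap(buf, len);
	return (ok ? 0 : -1);
}
```

Calling madv_free_demo() from a small main() and checking for a 0 return is enough to exercise it. The point of the sketch is that the caller cannot rely on either outcome for the old contents, which is why the kernel is free to do only the minimal work at madvise() time.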