From: Marcel Moolenaar
Date: Fri, 12 Oct 2012 14:00:20 -0700
Subject: Re: Behavior of madvise(MADV_FREE)
To: Alan Cox
Cc: Tim LaBerge, "freebsd-arch@freebsd.org Arch"
Message-Id: <186E5ECB-120E-4E49-96B4-485E2676C05F@xcllnt.net>
In-Reply-To: <50785A62.5050603@rice.edu>
References: <9FEBC10C-C453-41BE-8829-34E830585E90@xcllnt.net> <50785A62.5050603@rice.edu>
List-Id: Discussion related to FreeBSD architecture

On Oct 12, 2012, at 10:58 AM, Alan Cox wrote:

>> Now on to the questions:
>> 1. madvise(MADV_FREE) marks the pages as clean and moves
>> them to the inactive queue. Why isn't the reference
>> state cleared on either the page or the TLB?
>
> It is, at least 31 out of 32 times that vm_page_dontneed() is called.
> From vm_page_dontneed(), which is called by madvise(MADV_FREE):
>
>     /*
>      * Clear any references to the page.
Otherwise, the page = daemon will > * immediately reactivate the page. > * > * Perform the pmap_clear_reference() first. Otherwise, a = concurrent > * pmap operation, such as pmap_remove(), could clear a = reference in > * the pmap and set PGA_REFERENCED on the page before the > * pmap_clear_reference() had completed. Consequently, the = page would > * appear referenced based upon an old reference that occurred = before > * this function ran. > */ > pmap_clear_reference(m); > vm_page_aflag_clear(m, PGA_REFERENCED); Ah... I missed this. I didn't look in vm_page_dontneed() for this. I thought current FreeBSD behaved the same as 6.1-ish. >> 2. Why aren't the pages moved to the cache queue in the >> first place? >=20 > Because this would make madvise(MADV_FREE) considerably more = expensive, for example, the pages would have to be unmapped. Your = situation may be different, but more often than not, people call = madvise(MADV_FREE) when memory is plentiful, and there is no need to do = anything. In other words, the page daemon isn't going to need to run = anytime soon. For example, when madvise(MADV_FREE) is used in = implementations of malloc() and free(), the vast majority of calls to = madvise(MADV_FREE) are pointless. They are pointless in that soon after = the madvise(MADV_FREE) call by the free() implementation, either (1) the = application turns around and allocates more memory causing the = MADV_FREE'd memory to be used once again or (2) the process terminates = before the page daemon runs. Consequently, the implementation of = madvise(MADV_FREE) does the minimal necessary work so that if memory = does become scarce and the page daemon has to run, that the MADV_FREE'd = pages are first in line for reclamation. Understood. Thanks. >> Ad 2: >> MADV_DONTNEED is there to signal that the pages contain >> valid data, but that the page is not needed right now. >> Using this, pages get moved to the inactive queue. That >> makes sense. 
>> But MADV_FREE signals that there's no valid
>> data anymore and that the page may be demand zeroed on
>> next reference. The page is not inactive. It's free. If
>> the page was zeroed before calling MADV_FREE, the page
>> really caches contents that can be recreated later
>> (the demand zero).
>
> There is also another way of looking at it. By leaving the pages
> allocated and mapped, you are saving time, i.e., CPU cycles, for the
> all too common case that the MADV_FREE'd pages are used again in the
> near future.
>
> It wouldn't be illogical to have two variants of MADV_FREE. One
> for use by folks like yourself who can say definitively that the pages
> won't be accessed again and should really be freed, and the current
> implementation for more speculative uses like in the malloc() and free()
> implementation. Better yet, the second case would be replaced by a
> notification from the kernel to the process when memory is actually
> becoming scarce so that we won't waste cycles on any pointless madvise()
> calls by the process.

This aligns with phk@'s suggestion of MADV_RECYCLE. We may want
to play with this.

-- 
Marcel Moolenaar
marcel@xcllnt.net