From: Svatopluk Kraus
To: Alan Cox
Cc: alc@freebsd.org, Wojciech Puchar, Kostik Belousov, hackers@freebsd.org, Grzegorz Kulewski
Date: Mon, 31 Oct 2011 13:59:31 +0100
Subject: Re: mmap performance and memory use
In-Reply-To: <4EAA3FBC.3090907@rice.edu>
References: <20111006160159.GQ1511@deviant.kiev.zoral.com.ua> <4E8FF4B8.7010300@rice.edu> <4EA747B5.9040304@rice.edu> <4EAA3FBC.3090907@rice.edu>

On Fri, Oct 28, 2011 at 7:38 AM, Alan
Cox wrote:
> On 10/26/2011 06:23, Svatopluk Kraus wrote:
>>
>> Hi,
>>
>> well, I'm working on a new port (arm11 mpcore) and pmap_enter_object()
>> is what I'm debugging right now. I did not find any way in
>> userland to force the kernel to call pmap_enter_object(), which makes a
>> SUPERPAGE mapping without promotion. I tried calling mmap() with
>> MAP_PREFAULT_READ without success. I tried calling madvise() with
>> MADV_WILLNEED without success too.
>>
>
> mmap() should call pmap_enter_object() if MAP_PREFAULT_READ was specified.
> I'm surprised to hear that it's not happening for you.

Yes, it's really not happening for me.

mmap() with MAP_PREFAULT_READ case:
----------------------------------------------------------------
vm_mmap() in sys/vm/vm_mmap.c (r225617)
line 1501 - if MAP_ANON then docow = 0
line 1525 - vm_map_find() is called with zeroed docow

The zeroed docow is propagated down the call stack, so vm_map_pmap_enter() is not even called in vm_map_insert(). Most likely, this is correct. (Anonymous object -> no physical memory allocation in advance -> no SUPERPAGE mapping without promotion.)

madvise() with MADV_WILLNEED case:
----------------------------------------------------------
vm_map_pmap_enter() in sys/vm/vm_map.c (r223825)
line 1814 - vm_page_find_least() is called

During madvise(), vm_map_pmap_enter() is called. However, in that call, vm_page_find_least() returns NULL. It returns NULL if no page is allocated in the object with a pindex greater than or equal to the pindex parameter. The loop following the call means that if no page is allocated for the SUPERPAGE (i.e. for the given region), pmap_enter_object() is not called, and this is correct.

>> Moreover, the SUPERPAGE mapping is made read-only at first. So, even if
>> I have a SUPERPAGE mapping without promotion, the mapping is demoted
>> after the first write, and promoted again after all underlying pages are
>> accessed by write. There is no longer any 4K page table saving.
>>
>
> Yes, that is all true.
> It is possible to change things so that the page
> table pages are reclaimed after a time, and not kept around indefinitely.
> However, this is not high on my personal priority list. Before that, it is
> more likely that I will add an option to avoid the demotion on write, if we
> don't have to copy the entire superpage to do so.

Well, I just wanted to remark that there is no 4K page table saving now. However, SUPERPAGE promotions still give a big saving in TLB entries. I'm not pushing you to do anything. I understand that allocating physical pages in advance is not a good idea, and that it goes against the great copy-on-write feature. However, something like MAP_PREFAULT_WRITE on MAP_ANON, which allocates all physical pages in advance and does a SUPERPAGE mapping without promotion, sounds like a good but really specific feature that could be useful sometimes. Nevertheless, IMHO, such a specific feature is not worth implementing.

Svata
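
For reference, the userland experiment described above can be sketched roughly as follows. This is only an illustration under assumptions, not the test program actually used: the helper names map_anon_prefault() and touch_pages_write() are hypothetical, MAP_PREFAULT_READ is defined to 0 where the platform lacks it, and since no MAP_PREFAULT_WRITE flag exists, the second helper merely approximates the idea by writing to every page. Whether the kernel then maps or promotes a superpage depends on the pmap and is not guaranteed by this code.

```c
#define _DEFAULT_SOURCE	/* exposes madvise()/MAP_ANON on glibc; assumed harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* MAP_PREFAULT_READ is FreeBSD-specific; compile it away on other systems. */
#ifndef MAP_PREFAULT_READ
#define MAP_PREFAULT_READ 0
#endif
#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/*
 * Hypothetical helper: map `len` bytes of anonymous memory with the two
 * hints tried in the message. Returns NULL if mmap() fails.
 */
static void *
map_anon_prefault(size_t len)
{
	void *p;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANON | MAP_PREFAULT_READ, -1, 0);
	if (p == MAP_FAILED)
		return (NULL);
	/* Advisory only; the result is deliberately ignored. */
	(void)madvise(p, len, MADV_WILLNEED);
	return (p);
}

/*
 * Hypothetical helper: write one byte to every page so that each page is
 * actually allocated. Returns the number of pages touched.
 */
static size_t
touch_pages_write(void *base, size_t len)
{
	volatile unsigned char *p = base;
	long pagesz = sysconf(_SC_PAGESIZE);
	size_t off, n = 0;

	assert(pagesz > 0);
	for (off = 0; off < len; off += (size_t)pagesz) {
		p[off] = 1;
		n++;
	}
	return (n);
}
```

A caller would map a region of at least one superpage in size (for example 4 MB), call touch_pages_write() on it, and then check from the kernel side whether pmap_enter_object() or a promotion was reached.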