Date:      Thu, 05 Apr 2012 11:41:16 -0500
From:      Alan Cox <alc@rice.edu>
To:        Andrey Zonov <andrey@zonov.org>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-hackers@freebsd.org, alc@freebsd.org
Subject:   Re: problems with mmap() and disk caching
Message-ID:  <4F7DCB2C.7070709@rice.edu>
In-Reply-To: <4F7C1620.6040703@zonov.org>
References:  <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7C1620.6040703@zonov.org>

On 04/04/2012 04:36, Andrey Zonov wrote:
> On 04.04.2012 11:17, Konstantin Belousov wrote:
>>
>> Calling madvise(MADV_RANDOM) fixes the issue, because the code to
>> deactivate/cache the pages is turned off. On the other hand, it also
>> turns off read-ahead for faulting, and the first loop becomes eternally
>> long.
>
> Now it takes 5 times longer.  Anyway, thanks for the explanation.
>
>>
>> Doing MADV_WILLNEED does not fix the problem indeed, since willneed
>> reactivates the pages of the object at the time of call. To use
>> MADV_WILLNEED, you would need to call it between faults/memcpy.
>>
>
> I played with it, but no luck so far.
>
>>>
>>> I've also never seen super pages. How do I make them work?
>> They just work, at least for me. Look at the output of procstat -v
>> after enough loops finished to not cause disk activity.
>>
>
> The problem was in my test program.  I fixed it, now I see super pages 
> but I'm still not satisfied.  There are several tests below:
>
> 1. With madvise(MADV_RANDOM) I see almost all super pages:
> $ ./mmap /mnt/random-1024 5
> mmap:  1 pass took:  26.438535 (none:      0; res: 262144; super: 511; other:      0)
> mmap:  2 pass took:   0.187311 (none:      0; res: 262144; super: 511; other:      0)
> mmap:  3 pass took:   0.184953 (none:      0; res: 262144; super: 511; other:      0)
> mmap:  4 pass took:   0.186007 (none:      0; res: 262144; super: 511; other:      0)
> mmap:  5 pass took:   0.185790 (none:      0; res: 262144; super: 511; other:      0)
>
> Should it be 512?
>

Check the starting virtual address.  It is probably not aligned on a 
superpage boundary.  Hence, a few pages at the start and end of your 
mapped region are not in a superpage.
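
The arithmetic behind that can be sketched as follows.  This is only an
illustration, assuming 2 MB superpages (which matches 512 superpages for a
262144-page, 1 GB mapping); the helper names round_up() and
full_superpages() are made up for this example and are not part of the test
program or of any FreeBSD API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed superpage size: 2 MB (512 pages of 4 KB each). */
#define SUPERPAGE_SIZE ((uintptr_t)2 * 1024 * 1024)

/* Round addr up to the next multiple of align (align must be a power of two). */
uintptr_t round_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Count the superpage-aligned, superpage-sized chunks that lie entirely
 * inside [start, start + len).  Only those chunks can be mapped as
 * superpages; the partial chunks at either end cannot.
 */
size_t full_superpages(uintptr_t start, size_t len)
{
    uintptr_t first = round_up(start, SUPERPAGE_SIZE);      /* first aligned address */
    uintptr_t last = (start + len) & ~(SUPERPAGE_SIZE - 1); /* last aligned boundary */

    return (last > first) ? (size_t)((last - first) / SUPERPAGE_SIZE) : 0;
}
```

For a 1 GB mapping, a start address on a 2 MB boundary gives 512, while a
start address one 4 KB page past a boundary gives 511, which is the count
reported in test 1.  Passing the pointer returned by mmap(), cast to
uintptr_t, shows which case applies.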

> 2. Without madvise(MADV_RANDOM):
> $ ./mmap /mnt/random-1024 50
> mmap:  1 pass took:   7.629745 (none: 262112; res:     32; super: 0; other:      0)
> mmap:  2 pass took:   7.301720 (none: 261202; res:    942; super: 0; other:      0)
> mmap:  3 pass took:   7.261416 (none: 260226; res:   1918; super: 1; other:      0)
> [skip]
> mmap: 49 pass took:   0.155368 (none:      0; res: 262144; super: 323; other:      0)
> mmap: 50 pass took:   0.155438 (none:      0; res: 262144; super: 323; other:      0)
>
> Only 323 super pages.
>
> 3. If I just re-run the test, I don't see super pages with any size of
> "block".
>
> $ ./mmap /mnt/random-1024 5 $((1<<30))
> mmap:  1 pass took:   1.013939 (none:      0; res: 262144; super: 0; other:      0)
> mmap:  2 pass took:   0.267082 (none:      0; res: 262144; super: 0; other:      0)
> mmap:  3 pass took:   0.270711 (none:      0; res: 262144; super: 0; other:      0)
> mmap:  4 pass took:   0.268940 (none:      0; res: 262144; super: 0; other:      0)
> mmap:  5 pass took:   0.269634 (none:      0; res: 262144; super: 0; other:      0)
>
> 4. If I activate madvise(MADV_WILLNEED) in the copy loop and re-run the
> test, then I see super pages only if I use a "block" greater than 2 MB.
>
> $ ./mmap /mnt/random-1024 1 $((1<<21))
> mmap:  1 pass took:   0.299722 (none:      0; res: 262144; super: 0; other:      0)
> $ ./mmap /mnt/random-1024 1 $((1<<22))
> mmap:  1 pass took:   0.271828 (none:      0; res: 262144; super: 170; other:      0)
> $ ./mmap /mnt/random-1024 1 $((1<<23))
> mmap:  1 pass took:   0.333188 (none:      0; res: 262144; super: 258; other:      0)
> $ ./mmap /mnt/random-1024 1 $((1<<24))
> mmap:  1 pass took:   0.339250 (none:      0; res: 262144; super: 303; other:      0)
> $ ./mmap /mnt/random-1024 1 $((1<<25))
> mmap:  1 pass took:   0.418812 (none:      0; res: 262144; super: 324; other:      0)
> $ ./mmap /mnt/random-1024 1 $((1<<26))
> mmap:  1 pass took:   0.360892 (none:      0; res: 262144; super: 335; other:      0)
> $ ./mmap /mnt/random-1024 1 $((1<<27))
> mmap:  1 pass took:   0.401122 (none:      0; res: 262144; super: 342; other:      0)
> $ ./mmap /mnt/random-1024 1 $((1<<28))
> mmap:  1 pass took:   0.478764 (none:      0; res: 262144; super: 345; other:      0)
> $ ./mmap /mnt/random-1024 1 $((1<<29))
> mmap:  1 pass took:   0.607266 (none:      0; res: 262144; super: 346; other:      0)
> $ ./mmap /mnt/random-1024 1 $((1<<30))
> mmap:  1 pass took:   0.901269 (none:      0; res: 262144; super: 347; other:      0)
>
> 5. If I activate madvise(MADV_WILLNEED) immediately after mmap() then 
> I see some number of super pages (the number from test #2).
>
> $ ./mmap /mnt/random-1024 5
> mmap:  1 pass took:   0.178666 (none:      0; res: 262144; super: 323; other:      0)
> mmap:  2 pass took:   0.158889 (none:      0; res: 262144; super: 323; other:      0)
> mmap:  3 pass took:   0.157229 (none:      0; res: 262144; super: 323; other:      0)
> mmap:  4 pass took:   0.156895 (none:      0; res: 262144; super: 323; other:      0)
> mmap:  5 pass took:   0.162938 (none:      0; res: 262144; super: 323; other:      0)
>
> 6. If I read the file manually before the test, then I don't see super
> pages with any size of "block", and madvise(MADV_WILLNEED) doesn't help.
>
> $ ./mmap /mnt/random-1024 5 $((1<<30))
> mmap:  1 pass took:   0.996767 (none:      0; res: 262144; super: 0; other:      0)
> mmap:  2 pass took:   0.311129 (none:      0; res: 262144; super: 0; other:      0)
> mmap:  3 pass took:   0.317430 (none:      0; res: 262144; super: 0; other:      0)
> mmap:  4 pass took:   0.314437 (none:      0; res: 262144; super: 0; other:      0)
> mmap:  5 pass took:   0.310757 (none:      0; res: 262144; super: 0; other:      0)
>
>

When you read manually, i.e., perform a dd in advance of running your 
test program, the VM subsystem doesn't know that you intend to later 
mmap() the data.  Moreover, it doesn't know what the alignment of that 
mapping will be.  So, when it allocates physical memory for the file 
during the running of dd, it only allocates ordinary pages.

I suspect that the rest of your results are explained by the overzealous 
behavior of the sequential access / cache-behind heuristic that Kostik 
described.

Alan



