Date:      Thu, 5 Apr 2012 20:31:38 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Alan Cox <alc@rice.edu>
Cc:        alc@freebsd.org, freebsd-hackers@freebsd.org, Andrey Zonov <andrey@zonov.org>
Subject:   Re: problems with mmap() and disk caching
Message-ID:  <20120405173138.GX2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <4F7DC037.9060803@rice.edu>
References:  <4F7B495D.3010402@zonov.org> <20120404071746.GJ2358@deviant.kiev.zoral.com.ua> <4F7DC037.9060803@rice.edu>


On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote:
> On 04/04/2012 02:17, Konstantin Belousov wrote:
> >On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
> >>Hi,
> >>
> >>I open the file, then call mmap() on the whole file and get a pointer,
> >>then I work with this pointer.  I expect that a page needs to be touched
> >>only once to get it into memory (disk cache?), but this doesn't work!
> >>
> >>I wrote a test (attached) and ran it on a 1G file generated from
> >>/dev/random; the result is the following:
> >>
> >>Prepare file:
> >># swapoff -a
> >># newfs /dev/ada0b
> >># mount /dev/ada0b /mnt
> >># dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024
> >>
> >>Purge cache:
> >># umount /mnt
> >># mount /dev/ada0b /mnt
> >>
> >>Run test:
> >>$ ./mmap /mnt/random-1024 30
> >>mmap:  1 pass took:   7.431046 (none: 262112; res:     32; super:      0; other:      0)
> >>mmap:  2 pass took:   7.356670 (none: 261648; res:    496; super:      0; other:      0)
> >>mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:      0; other:      0)
> >>mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:      0; other:      0)
> >>mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:      0; other:      0)
> >>mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:      0; other:      0)
> >>mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:      0; other:      0)
> >>mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:      0; other:      0)
> >>mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:      0; other:      0)
> >>mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:      0; other:      0)
> >>mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:      0; other:      0)
> >>mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:      0; other:      0)
> >>mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:      0; other:      0)
> >>mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:      0; other:      0)
> >>mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:      0; other:      0)
> >>mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:      0; other:      0)
> >>mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:      0; other:      0)
> >>mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:      0; other:      0)
> >>mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:      0; other:      0)
> >>mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super:      0; other:      0)
> >>mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:      0; other:      0)
> >>mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:      0; other:      0)
> >>mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:      0; other:      0)
> >>mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:      0; other:      0)
> >>mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:      0; other:      0)
> >>mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:      0; other:      0)
> >>mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:      0; other:      0)
> >>mmap: 28 pass took:   0.157508 (none:      0; res: 262144; super:      0; other:      0)
> >>mmap: 29 pass took:   0.156169 (none:      0; res: 262144; super:      0; other:      0)
> >>mmap: 30 pass took:   0.156550 (none:      0; res: 262144; super:      0; other:      0)
> >>
> >>If I run this:
> >>$ cat /mnt/random-1024 > /dev/null
> >>before the test, then the result is the following:
> >>
> >>$ ./mmap /mnt/random-1024 5
> >>mmap:  1 pass took:   0.337657 (none:      0; res: 262144; super:      0; other:      0)
> >>mmap:  2 pass took:   0.186137 (none:      0; res: 262144; super:      0; other:      0)
> >>mmap:  3 pass took:   0.186132 (none:      0; res: 262144; super:      0; other:      0)
> >>mmap:  4 pass took:   0.186535 (none:      0; res: 262144; super:      0; other:      0)
> >>mmap:  5 pass took:   0.190353 (none:      0; res: 262144; super:      0; other:      0)
> >>
> >>This is what I expect.  But why doesn't this work without reading the
> >>file manually?
> >The issue seems to be some change in the behaviour of the reserv or
> >phys allocator. I Cc:ed Alan.
>
> I'm pretty sure that the behavior here hasn't significantly changed in
> about twelve years.  Otherwise, I agree with your analysis.
>
> On more than one occasion, I've been tempted to change:
>
>                                         pmap_remove_all(mt);
>                                         if (mt->dirty != 0)
>                                                 vm_page_deactivate(mt);
>                                         else
>                                                 vm_page_cache(mt);
>
> to:
>
>                                         vm_page_dontneed(mt);
>
> because I suspect that the current code does more harm than good.  In
> theory, it saves activations of the page daemon.  However, more often
> than not, I suspect that we are spending more on page reactivations than
> we are saving on page daemon activations.  The sequential access
> detection heuristic is just too easily triggered.  For example, I've
> seen it triggered by demand paging of the gcc text segment.  Also, I
> think that pmap_remove_all() and especially vm_page_cache() are too
> severe for a detection heuristic that is so easily triggered.

Yes, I agree that such a change would be an improvement, and I expect
that Andrey will test it.

On the other hand, I do think that the allocator should prefer unnamed
pages to pages which still have valid content. On my 12G desktop,
I have never seen more than 100MB of cached pages, and similar numbers
are observed on the 32-48GB servers. I suppose that this is related.
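
For anyone who wants to reproduce the measurement without the attachment,
something along these lines might do. It is only a rough sketch, not
Andrey's test program: it maps a file read-only, optionally touches one byte
per page, and asks mincore(2) which pages are resident, which roughly
corresponds to the "none" and "res" columns above. The names
file_residency() and struct residency are made up for illustration, and the
"super" and "other" columns are not reproduced. It assumes that bit 0 of each
mincore() byte means "resident", which holds for MINCORE_INCORE on FreeBSD
and for the low bit on Linux.

```c
#define _DEFAULT_SOURCE	/* lets glibc declare mincore() under strict -std= */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical result type: page counts for one pass over the file. */
struct residency {
	size_t none;	/* pages mincore() reports as not resident */
	size_t res;	/* pages mincore() reports as resident */
};

/*
 * Map the whole file, optionally touch one byte per page, then count
 * resident pages.  Returns 0 on success and -1 on any error.
 */
int
file_residency(const char *path, int touch, struct residency *out)
{
	struct stat st;
	size_t i, len, npages, pagesz;
	unsigned char *vec;
	void *p;
	int fd, rc;

	assert(out != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	if (fstat(fd, &st) != 0 || st.st_size <= 0) {
		close(fd);
		return (-1);
	}
	len = (size_t)st.st_size;
	pagesz = (size_t)sysconf(_SC_PAGESIZE);
	npages = (len + pagesz - 1) / pagesz;

	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping stays valid after close */
	if (p == MAP_FAILED)
		return (-1);
	vec = malloc(npages);
	if (vec == NULL) {
		munmap(p, len);
		return (-1);
	}
	if (touch) {
		/* volatile read so the compiler cannot drop the access */
		const volatile unsigned char *vp = p;

		for (i = 0; i < len; i += pagesz)
			(void)vp[i];
	}
	/* The void * cast covers both the char * and unsigned char * prototypes. */
	rc = mincore(p, len, (void *)vec);
	out->none = out->res = 0;
	for (i = 0; rc == 0 && i < npages; i++) {
		if (vec[i] & 1)	/* assumed: bit 0 set means resident */
			out->res++;
		else
			out->none++;
	}
	free(vec);
	munmap(p, len);
	return (rc == 0 ? 0 : -1);
}
```

Calling file_residency(path, 1, &r) in a loop and printing r.none and r.res
after each pass should show whether one touch per page is enough to make the
file resident, or whether it takes many passes as in the output quoted above.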



