Date:      Mon, 5 Jul 2021 19:32:00 +0300
From:      Vitaliy Gusev <gusev.vitaliy@gmail.com>
To:        kostikbel@gmail.com
Cc:        freebsd-hackers@freebsd.org, gljennjohn@gmail.com, Mark Johnston <markj@freebsd.org>
Subject:   Re: madvise(MADV_FREE) doesn't work in some cases?
Message-ID:  <2390FA9B-319E-45D4-BEA7-10878E43AD4B@gmail.com>
In-Reply-To: <D542E8C1-4E97-48E8-8748-BBA19B2216EC@gmail.com>
References:  <D5749BDF-36B5-4AE9-A75F-2A702DF71F8C@gmail.com> <20210703065420.6dbafb5f@ernst.home> <D542E8C1-4E97-48E8-8748-BBA19B2216EC@gmail.com>



Hi,
> > Does it mean madvise() doesn't work well in FreeBSD or test does
> > something wrong?
>
> Your program does not exactly what you described above.  There is a generic
> race to consume memory, and some specific details about madvise(2) on FreeBSD.
>
> From the code, you do:
> - mmap anonymous private region
> - fork
> - both child and parent start touching the mmaped region.
>
> Two processes race to consume 1/2 of RAM on your system.  If one of
> them happen to execute faster then another, you do get to the case where
> one of them does madvise().  But it could be that processes execute in
> lockstep, and try to eat all the memory before going to madvise().
> Did you excluded this case?
I believe I did everything right. You can see the sleeps that serialise
execution. To check again, I modified the test to print timestamps and to
use MADV_DONTNEED:

Here is the source: http://cpp.sh/2rd4f
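
For readers who cannot open the link, the shape of the test can be sketched roughly as follows. This is not the program at cpp.sh: the names run_fork_test and touch_and_advise are made up for illustration, pipes stand in for the sleeps so the ordering is deterministic, and the region size is left to the caller so it can be tried with a small, safe value first.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose MAP_ANON and madvise() under strict glibc modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Dirty every page of the region, then pass the advice to the kernel. */
static int touch_and_advise(const char *who, char *base, size_t len, int advice)
{
    printf("%s %d: touch\n", who, (int)getpid());
    memset(base, 0xA5, len);
    printf("%s %d: madvise\n", who, (int)getpid());
    fflush(stdout);
    return madvise(base, len, advice);
}

/* Hypothetical helper: 0 if both processes finished touch + madvise, else -1. */
int run_fork_test(size_t len, int advice)
{
    int child_done[2], parent_done[2];
    char token = 'x';

    assert(len > 0);
    if (pipe(child_done) != 0 || pipe(parent_done) != 0)
        return -1;

    /* Anonymous private region, mapped before fork() so both sides share it CoW. */
    char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    fflush(stdout);                      /* avoid duplicated stdio buffers */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                      /* child: touches and advises first */
        close(child_done[0]);
        close(parent_done[1]);
        int rc = touch_and_advise("child", base, len, advice);
        if (write(child_done[1], &token, 1) != 1)
            _exit(2);
        /* Stay alive, still mapping the region, until the parent is done. */
        while (read(parent_done[0], &token, 1) > 0)
            ;
        _exit(rc == 0 ? 0 : 1);
    }

    close(child_done[1]);
    close(parent_done[0]);
    int rc = -1;
    if (read(child_done[0], &token, 1) == 1)   /* wait for the child's madvise */
        rc = touch_and_advise("parent", base, len, advice);
    close(parent_done[1]);               /* EOF lets the child exit */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        rc = -1;
    else if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        rc = -1;                         /* e.g. the child was killed */
    close(child_done[0]);
    munmap(base, len);
    return rc;
}
```

Calling run_fork_test(len, MADV_DONTNEED) or run_fork_test(len, MADV_FREE) with len near half of RAM on a machine without swap corresponds to the run shown next; a return of -1 there would mean one of the two processes did not survive.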

I ran:

$ ./mmapfork 2300
mmap 0x801000000 pid 40628
end 0x890c00000 len 0x8fc00000
pid 40628
pid 40629
40629: [1625500831] touch
40629: [1625500832] sleep before madvise
40629: [1625500833] madvise
40629: [1625500834] Press enter to exit
40628: [1625500845] touch
40628: [1625500846] sleep before madvise
40628: [1625500851] madvise
40628: [1625500852] Press enter to exit

You can see that the second process (pid 40628) started touching memory
about 11 seconds after the first one (pid 40629) had already called
madvise() on the whole range of touched memory.

And finally in dmesg:

pid 40629 (mmapfork), jid 0, uid 1001, was killed: out of swap space

So this is the same result as I described in my first email.

> Now, about the specific of madvise(MADV_FREE) on FreeBSD.  Due to the way
> CoW is implemented with the shadow chain of objects, we cannot drop the
> top of the shadow chain, otherwise instead of returning zeroed pages next
> time, we would return content back in the time.  It was relatively recent
> discovery, see bf5661f4a1af6931ec4b6, PR 240061.
>
Thanks, I will look at it.
> To explain it in simplified form, when there is potential old content
> under the CoW copy for the mapping, we cannot drop CoW-ed pages. This
> is the motivation why madvise(MADV_FREE) does nothing for your program.
> When you run two instances without fork, there is no previous content
> and no Cow, so madvise() can safely remove the pages from the object,
> and on the next access they are zero-filled.

Do I understand correctly that it should work with MADV_DONTNEED? But the
"dontneed" variant doesn't work either.
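
One way to see what a given advice value does to the contents of a mapping when no fork is involved is a small probe like the one below. It is only a sketch: probe_after_advise and the 0xA5 fill value are made up for illustration, and it looks at page contents only, not at how much memory the kernel actually reclaimed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose MAP_ANON and madvise() under strict glibc modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define FILL 0xA5   /* arbitrary non-zero pattern, chosen for illustration */

/*
 * Hypothetical probe: fill a private anonymous mapping, apply the advice,
 * and return the first byte seen afterwards (0 means zero-filled, FILL
 * means the old contents are still there). Returns -1 on error.
 */
int probe_after_advise(size_t len, int advice)
{
    assert(len > 0);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, FILL, len);            /* make every page dirty */

    int seen = -1;
    if (madvise(p, len, advice) == 0)
        seen = p[0];                 /* first access after the advice */

    munmap(p, len);
    return seen;
}
```

Both 0 and 0xA5 are legitimate results: the advice is a hint, so depending on the operating system and the advice value the kernel may discard the pages immediately or keep the old contents until it needs the memory.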
>
> You can read more details in the referenced commit, as well as some musings
> about way to make it somewhat better.
>
> I must say, that trying to allocated 1/2 + 1/2 of RAM this way, on a system
> without swap, is the way to ask for troubles anyway.
I have just noticed that other operating systems cope with this well,
whereas FreeBSD has trouble. Perhaps something in madvise() is not finished?

----
Vitaliy Gusev




