Date: Mon, 5 Jul 2021 19:32:00 +0300
From: Vitaliy Gusev <gusev.vitaliy@gmail.com>
To: kostikbel@gmail.com
Cc: freebsd-hackers@freebsd.org, gljennjohn@gmail.com, Mark Johnston <markj@freebsd.org>
Subject: Re: madvise(MADV_FREE) doesn't work in some cases?
Message-ID: <2390FA9B-319E-45D4-BEA7-10878E43AD4B@gmail.com>
In-Reply-To: <D542E8C1-4E97-48E8-8748-BBA19B2216EC@gmail.com>
References: <D5749BDF-36B5-4AE9-A75F-2A702DF71F8C@gmail.com> <20210703065420.6dbafb5f@ernst.home> <D542E8C1-4E97-48E8-8748-BBA19B2216EC@gmail.com>
Hi,

> > Does it mean madvise() doesn't work well in FreeBSD or test does something wrong?
>
> Your program does not exactly what you described above. There is a generic
> race to consume memory, and some specific details about madvise(2) on FreeBSD.
>
> From the code, you do:
> - mmap anonymous private region
> - fork
> - both child and parent start touching the mmaped region.
>
> Two processes race to consume 1/2 of RAM on your system. If one of
> them happen to execute faster then another, you do get to the case where
> one of them does madvise(). But it could be that processes execute in
> lockstep, and try to eat all the memory before going to madvise().
> Did you excluded this case?

I believe I did everything right. You can see the sleeps that serialise execution. To check again, I modified the test, added timestamp printing, and switched to MADV_DONTNEED.

Here is the source: http://cpp.sh/2rd4f

I ran:

$ ./mmapfork 2300
mmap 0x801000000 pid 40628
end 0x890c00000 len 0x8fc00000
pid 40628
pid 40629
40629: [1625500831] touch
40629: [1625500832] sleep before madvise
40629: [1625500833] madvise
40629: [1625500834] Press enter to exit
40628: [1625500845] touch
40628: [1625500846] sleep before madvise
40628: [1625500851] madvise
40628: [1625500852] Press enter to exit

You can see that the child started running 11 seconds after the parent had already called madvise() on the whole range of touched memory.

And finally, in dmesg:

pid 40629 (mmapfork), jid 0, uid 1001, was killed: out of swap space

So the result is the same as I described in the first email.

> Now, about the specific of madvise(MADV_FREE) on FreeBSD.
> Due to the way
> CoW is implemented with the shadow chain of objects, we cannot drop the
> top of the shadow chain, otherwise instead of returning zeroed pages next
> time, we would return content back in the time. It was relatively recent
> discovery, see bf5661f4a1af6931ec4b6, PR 240061.
>

Thanks, I will look at it.

> To explain it in simplified form, when there is potential old content
> under the CoW copy for the mapping, we cannot drop CoW-ed pages. This
> is the motivation why madvise(MADV_FREE) does nothing for your program.
> When you run two instances without fork, there is no previous content
> and no Cow, so madvise() can safely remove the pages from the object,
> and on the next access they are zero-filled.

Do I understand correctly that it should work with MADV_DONTNEED? But the "dontneed" variant doesn't work.

>
> You can read more details in the referenced commit, as well as some musings
> about way to make it somewhat better.
>
> I must say, that trying to allocated 1/2 + 1/2 of RAM this way, on a system
> without swap, is the way to ask for troubles anyway.

I just wanted to point out that other operating systems handle this well, whereas FreeBSD has trouble. Perhaps something in madvise() is unfinished?

----
Vitaliy Gusev
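
For readers without access to the linked source, the sequence discussed in this thread (anonymous private mmap, fork, both processes touching the region, then madvise) can be sketched roughly as below. This is not the original mmapfork test: the names touch_pages and run_scenario are made up for illustration, the region is only a few pages instead of half of RAM, and the sleeps and timestamps are left out. It only shows the order of the calls and says nothing about whether the kernel actually frees the pages.

```c
/* Illustrative sketch only; NOT the original mmapfork test.
 * touch_pages() and run_scenario() are hypothetical names. */
#define _DEFAULT_SOURCE /* assumption: needed on glibc for MAP_ANON and madvise() */
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write one byte per page so that every page is really faulted in. */
static void touch_pages(unsigned char *p, size_t npages, size_t pagesz)
{
    for (size_t i = 0; i < npages; i++)
        p[i * pagesz] = 0xA5;
}

/* mmap an anonymous private region, fork, touch it in both processes,
 * then pass `advice` to madvise() in both.
 * Returns 0 if every call succeeded in parent and child, -1 otherwise. */
static int run_scenario(size_t npages, int advice)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * pagesz;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, len);
        return -1;
    }

    /* Both parent and child reach this point. After fork the pages are
     * copy-on-write, so each write gives the writer its own copy. */
    touch_pages(p, npages, pagesz);
    int rc = madvise(p, len, advice);

    if (pid == 0)
        _exit(rc == 0 ? 0 : 1); /* child reports its result via exit status */

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(p, len);

    if (waited < 0 || rc != 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

Calling run_scenario(16, MADV_DONTNEED) from main() runs the sequence once on 16 pages. A return value of 0 only means that mmap, fork and madvise did not report an error; checking whether memory was really released would need something extra, such as watching the resident set size of both processes.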