Date:      Fri, 21 Nov 2025 11:03:44 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Michal Meloun <mmel@freebsd.org>
Cc:        FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: mmap( MAP_ANON) is broken on current. (was Still seeing Failed assertion: "p[i] == 0" on armv7 buildworld)
Message-ID:  <aSAq8Ds6nCA24YEI@kib.kiev.ua>
In-Reply-To: <aSAklF9D8haCAaNU@kib.kiev.ua>
References:  <8657a2f4-cb32-49a5-bbf6-cd5a4394c7be@FreeBSD.org> <aSAklF9D8haCAaNU@kib.kiev.ua>


On Fri, Nov 21, 2025 at 10:36:42AM +0200, Konstantin Belousov wrote:
> On Fri, Nov 21, 2025 at 08:12:55AM +0100, Michal Meloun wrote:
> > I have confirmed that the jemalloc assertions are caused by mmap()
> > misbehaving. It can return non-zeroed page(s) for mmap(MAP_ANON), which is
> > clearly a bug.
> > 
> > I have confirmed this on native ARMv7, and according to Mark, it is also
> > reproducible on ARM32 and i386 jails. I think I saw it also on a
> > memory-constrained (4 GB) aarch64, but I cannot reproduce it yet.
> > 
> > Does anybody have an idea how to identify the VM faults associated with
> > anonymous mmap, so that this failure can be detected in the kernel? Or any
> > other hint?
> 
> I think it would be much more visible if freshly allocated anonymous pages
> were corrupted.  A similar mechanism to get zeroed pages is used to get
> fresh page table pages, and corruption there must cause a lot of kernel
> page faults with 'invalid PTE bit' hw reports.
> But of course everything is possible.
> 
> The VM has an optimization where we track known-to-be-zeroed free pages
> separately, by marking them with the PG_ZERO flag. If an allocation needs a
> zeroed page and the flag is set, we skip calling pmap_zero_page() on it.
> 
> Also, in vm_page_free_prep() when we are told that the page is zeroed,
> with DIAGNOSTIC enabled, on amd64 and arm64, we do check for that.
> 
> So let's add a slow check to the vm_fault code that a supposedly zeroed
> page is indeed zeroed.  Can you try to catch the issue with the patch
> applied and DIAGNOSTIC enabled?  The patch is arch-agnostic and I believe
> it should work on armv7, although it obviously causes a slowdown.
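
The PG_ZERO fast path and the proposed slow check described in the quoted text can be modeled in user space roughly as follows. This is only an illustrative sketch, not the kernel code and not the patch itself: every model_* name, MODEL_PG_ZERO, and MODEL_PAGE_SIZE are hypothetical, and memset() merely stands in for pmap_zero_page().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE	4096
#define MODEL_PG_ZERO	0x1	/* hypothetical stand-in for PG_ZERO */

/* Hypothetical model of a page: a flag word plus the page contents. */
struct model_page {
	unsigned flags;
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Slow check: true only if every byte of the page is zero. */
static bool
model_page_is_zero(const struct model_page *m)
{
	for (size_t i = 0; i < MODEL_PAGE_SIZE; i++) {
		if (m->data[i] != 0)
			return (false);
	}
	return (true);
}

/*
 * Hand out a page that the caller expects to be zeroed.
 * Returns true if the page really is zeroed on return, false if the
 * "known zeroed" flag was set but the contents were not zero.
 */
static bool
model_get_zeroed_page(struct model_page *m)
{
	assert(m != NULL);
	if ((m->flags & MODEL_PG_ZERO) == 0) {
		/* Not known to be zeroed: zero it explicitly. */
		memset(m->data, 0, sizeof(m->data));
		return (true);
	}
	/* Flag set: the fast path skips zeroing, so verify the claim. */
	return (model_page_is_zero(m));
}
```

In this model a false return corresponds to the situation the diagnostic is meant to catch: a page flagged as zeroed whose contents are not actually zero.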

I also made the vm_page_free_prep() check machine-independent (MI).
Please use https://reviews.freebsd.org/D53850 instead of the previous
patch.
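
For reference, the user-visible symptom reported in this thread (non-zero bytes in fresh MAP_ANON memory) can be probed from user space with something like the sketch below. It is a minimal illustration only: the function name is hypothetical, and a single clean pass proves nothing, since the reports suggest the problem shows up under load such as buildworld.

```c
/* Exposes MAP_ANON under strict -std= modes on glibc; harmless elsewhere. */
#define _DEFAULT_SOURCE

#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map npages of private anonymous memory and scan it for non-zero bytes.
 * Reading each byte also faults the pages in.
 * Returns 0 if every byte is zero, 1 if a non-zero byte was found,
 * and -1 if mmap() itself failed.  The function name is hypothetical.
 */
static int
anon_pages_are_dirty(size_t npages)
{
	assert(npages > 0);

	size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
	size_t len = npages * pagesz;
	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		return (-1);

	int dirty = 0;
	for (size_t i = 0; i < len; i++) {
		if (p[i] != 0) {
			dirty = 1;
			break;
		}
	}
	munmap(p, len);
	return (dirty);
}
```

Calling anon_pages_are_dirty() repeatedly with varying sizes while the machine is under memory pressure would be one way to look for the failure; a return value of 1 would correspond to the condition that trips the jemalloc assertion.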


