Date: Sat, 22 Nov 2025 16:05:48 +0200
From: Konstantin Belousov
To: Michal Meloun
Cc: FreeBSD Current
Subject: Re: mmap( MAP_ANON) is broken on current. (was Still seeing Failed assertion: "p[i] == 0" on armv7 buildworld)
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Sat, Nov 22, 2025 at 03:48:08PM +0200, Konstantin Belousov wrote:
> On Sat, Nov 22, 2025 at 01:23:00PM +0100, Michal Meloun wrote:
> > 
> > 
> > On 21.11.2025 21:02, Konstantin Belousov wrote:
> > > On Fri, Nov 21, 2025 at 09:54:23PM +0200, Konstantin Belousov wrote:
> > > > On Fri, Nov 21, 2025 at 08:08:47PM +0100, Michal Meloun wrote:
> > > > > First, many thanks for your efforts, but this check doesn't trigger when the
> > > > > problem occurs
> > > > > 
> > > > Hm, ok. This is a data point, in fact.
> > > > 
> > > > > 
> > > > > To be more precise, testing case
> > > > > on fresh kernel(d8bfcacd12aba73188c44a157c707908e275825d)
> > > > > with PMAP_DEBUG defined in pmap-v6.c and with
> > > > > trivial zero check for first page at this place ->
> > > > > https://cgit.freebsd.org/src/tree/contrib/jemalloc/src/pages.c#n281
> > > > > 
> > > > > causes this failure:
> > > > > 
> > > > > __je_pages_map: addr: 0x0, ret: 0x3087b000, size: 4096, alignment: 4096,
> > > > > prot: 0x00000003, flags: 0x0C001002
> > > > > __je_pages_map: i: 0, p[i]: 0xFFFFFFFF, p: 0x3087b000
> > > > > __je_pages_map: i: 23, p[i]: 0x308E5F94, p: 0x3087b000
> > > > 
> > > > Could you, please, when the failure is detected, spawn 'procstat -v '
> > > > and dump the memory map of the process? To be clear, I want to see all
> > > > of this:
> > > > - the address of the mapping returned by mmap
> > > > - its size
> > > > - the location of the first non-zero byte
> > > > - memory map
> > > 
> > > Also, regardless of the output above, please try this as a wild guess:
> > > 
> > > diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
> > > index 5b4517d2bf0c..5c6ed51706bf 100644
> > > --- a/sys/vm/vm_object.c
> > > +++ b/sys/vm/vm_object.c
> > > @@ -2222,7 +2222,7 @@ vm_object_coalesce(vm_object_t prev_object, vm_ooffset_t prev_offset,
> > >  	 * Remove any pages that may still be in the object from a previous
> > >  	 * deallocation.
> > >  	 */
> > > -	if (next_pindex < prev_object->size) {
> > > +	if (true || next_pindex < prev_object->size) {
> > >  		vm_object_page_remove(prev_object, next_pindex, next_pindex +
> > >  		    next_size, 0);
> > >  #if 0
> > > 
> > I finally found the right way to obtain both parts of report in synchronized
> > and straightforward way.
> > The outputs from procstat -v are in attachments, I hope that the mailing
> > list won't eat them.
> > These are without this patch.
> > 
> > About this patch - this does not solve the problem, but it measurably
> > reduces its likelihood.
> 
> So in both cases you reported below (skipped) the problem indeed appeared
> in the case where we extend existing mapping, potentially causing the
> object to reuse the dangling pages after the object's end. This must
> explain why the debugging patch did not catch anything.
> 
> It is somewhat strange that the vm_object_coalesce() patch has the
> non-deterministic effect, but let's see. Below is the big hammer,
> disabling the extension for the anon mappings altogether. Again, I want to
> know if it helps. This is a debugging aid, not a fix. It should cause
> large(r) fragmentation of the process map, and possibly much larger
> kernel memory use, but I hope it is usable for test.
> 
> diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c
> index 6b09552c5fee..6a1db58d2d13 100644
> --- a/sys/vm/vm_map.c
> +++ b/sys/vm/vm_map.c
> @@ -1732,7 +1732,7 @@ vm_map_insert1(vm_map_t map, vm_object_t object, vm_ooffset_t offset,
>  			vm_object_clear_flag(object, OBJ_ONEMAPPING);
>  			VM_OBJECT_WUNLOCK(object);
>  		}
> -	} else if ((prev_entry->eflags & ~MAP_ENTRY_USER_WIRED) ==
> +	} else if (false && (prev_entry->eflags & ~MAP_ENTRY_USER_WIRED) ==
>  	    protoeflags &&
>  	    (cow & (MAP_STACK_AREA | MAP_VN_EXEC)) == 0 &&
>  	    prev_entry->end == start && (prev_entry->cred == cred ||
> 

Independently of the patch above, please test the following enhancement
to the vm_object_coalesce() patch. I have some hope that this one might
be a proper fix.

commit 012ea1ce604a59d790661c25656c1c641d33d6ec
Author: Konstantin Belousov
Date:   Sat Nov 22 16:02:50 2025 +0200

    vm_object_coalesce(): fix logic to detect coalesce possibility, simplify

diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
index 5b4517d2bf0c..d59327cf22ca 100644
--- a/sys/vm/vm_object.c
+++ b/sys/vm/vm_object.c
@@ -2188,9 +2188,14 @@ vm_object_coalesce(vm_object_t prev_object, vm_ooffset_t prev_offset,
 	prev_size >>= PAGE_SHIFT;
 	next_size >>= PAGE_SHIFT;
 	next_pindex = OFF_TO_IDX(prev_offset) + prev_size;
+	KASSERT(next_pindex + next_size > prev_object->size,
+	    ("vm_object_coalesce: "
+	    "obj %p next_pindex %#jx next_size %#jx obj_size %#jx",
+	    prev_object, (uintmax_t)next_pindex, (uintmax_t)next_size,
+	    (uintmax_t)prev_object->size));
 
-	if (prev_object->ref_count > 1 &&
-	    prev_object->size != next_pindex &&
+	if (prev_object->ref_count > 1 ||
+	    prev_object->size != next_pindex ||
 	    (prev_object->flags & OBJ_ONEMAPPING) == 0) {
 		VM_OBJECT_WUNLOCK(prev_object);
 		return (FALSE);
@@ -2222,26 +2227,13 @@ vm_object_coalesce(vm_object_t prev_object, vm_ooffset_t prev_offset,
 	 * Remove any pages that may still be in the object from a previous
 	 * deallocation.
 	 */
-	if (next_pindex < prev_object->size) {
-		vm_object_page_remove(prev_object, next_pindex, next_pindex +
-		    next_size, 0);
-#if 0
-		if (prev_object->cred != NULL) {
-			KASSERT(prev_object->charge >=
-			    ptoa(prev_object->size - next_pindex),
-			    ("object %p overcharged 1 %jx %jx", prev_object,
-				(uintmax_t)next_pindex, (uintmax_t)next_size));
-			prev_object->charge -= ptoa(prev_object->size -
-			    next_pindex);
-		}
-#endif
-	}
+	vm_object_page_remove(prev_object, next_pindex, next_pindex +
+	    next_size, 0);
 
 	/*
 	 * Extend the object if necessary.
 	 */
-	if (next_pindex + next_size > prev_object->size)
-		prev_object->size = next_pindex + next_size;
+	prev_object->size = next_pindex + next_size;
 
 	VM_OBJECT_WUNLOCK(prev_object);
 	return (TRUE);
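
To make the first hunk easier to read, here is a minimal user-space sketch
of only the early-return test, before and after the change. It is not
kernel code: struct obj_model and the functions old_refuses() and
new_refuses() are hypothetical names made up for this illustration, and
the struct carries just the three values the test reads (ref_count, size
in pages, and whether OBJ_ONEMAPPING is set).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the few vm_object fields the test reads.
 * This is an illustration only, not the kernel structure.
 */
struct obj_model {
	int ref_count;		/* references held on the object */
	uint64_t size;		/* object size, in pages */
	bool onemapping;	/* models (flags & OBJ_ONEMAPPING) != 0 */
};

/* Removed lines: refuse to coalesce only if all three conditions hold. */
static bool
old_refuses(const struct obj_model *o, uint64_t next_pindex)
{
	return (o->ref_count > 1 && o->size != next_pindex &&
	    !o->onemapping);
}

/* Added lines: refuse to coalesce if any one of the conditions holds. */
static bool
new_refuses(const struct obj_model *o, uint64_t next_pindex)
{
	return (o->ref_count > 1 || o->size != next_pindex ||
	    !o->onemapping);
}
```

For example, with ref_count 1, size 10, OBJ_ONEMAPPING set and
next_pindex 4, old_refuses() returns false (coalescing proceeds) while
new_refuses() returns true, because the new range does not start exactly
at the object's end. With next_pindex 10 both variants allow coalescing.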