Date:      Thu, 11 Jun 2020 20:29:27 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Justin Hibbits <chmeeedalf@gmail.com>
Cc:        "vangyzen@freebsd.org" <vangyzen@FreeBSD.org>, svn-src-head@freebsd.org, FreeBSD Current <freebsd-current@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Brandon Bergren <bdragon@FreeBSD.org>
Subject:   Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311
Message-ID:  <1EDCA498-0B67-4374-B7CA-1ECDA8EE32AD@yahoo.com>
In-Reply-To: <20200611212532.59f677be@ralga.knownspace>
References:  <C24EE1A1-FAED-42C2-8204-CA7B1D20A369@yahoo.com> <695E6836-F860-4557-B7DE-CC1EDB347F18@yahoo.com> <DCABCD83-27B0-4F2D-9410-69102294A98E@yahoo.com> <121B9B09-141B-4DC3-918B-1E7CFB99E779@yahoo.com> <8AAB0462-3FA8-490C-8D8D-7C15B1C9E2DE@yahoo.com> <18E62746-80DB-4195-977D-4FF32D0129EE@yahoo.com> <F5953A6B-56CE-4D1C-8C18-58D44B639881@yahoo.com> <D0C483E5-3F6A-4816-A6BA-3D2C82C24F8E@yahoo.com> <C440956F-139E-4EF7-A68E-FE35D9934BD3@yahoo.com> <9562EEE4-62EF-4164-91C0-948CC0432984@yahoo.com> <9B68839B-AEC8-43EE-B3B6-B696A4A57DAE@yahoo.com> <359C9C7D-4106-42B5-AAB5-08EF995B8100@yahoo.com> <20200513105632.06db9e21@titan.knownspace> <B1225914-43BC-44EF-A73E-D06B890229C6@yahoo.com> <20200611155545.55526f7c@ralga.knownspace> <5542B85D-1C3A-41D8-98CE-3C02E990C3EB@yahoo.com> <20200611164216.47f82775@ralga.knownspace> <DEA9A860-5DEE-49EE-97F1-DBDB39D5C0A3@yahoo.com> <DCB0BC72-1666-49F3-A838-B2A0D653A0C2@yahoo.com> <20200611212532.59f677be@ralga.knownspace>



On 2020-Jun-11, at 19:25, Justin Hibbits <chmeeedalf at gmail.com> wrote:

> On Thu, 11 Jun 2020 17:30:24 -0700
> Mark Millard <marklmi@yahoo.com> wrote:
>
>> On 2020-Jun-11, at 16:49, Mark Millard <marklmi at yahoo.com> wrote:
>>
>>> On 2020-Jun-11, at 14:42, Justin Hibbits <chmeeedalf at gmail.com>
>>> wrote:
>>>
>>> On Thu, 11 Jun 2020 14:36:37 -0700
>>> Mark Millard <marklmi@yahoo.com> wrote:
>>>
>>>> On 2020-Jun-11, at 13:55, Justin Hibbits <chmeeedalf at gmail.com>
>>>> wrote:
>>>>
>>>>> On Wed, 10 Jun 2020 18:56:57 -0700
>>>>> Mark Millard <marklmi@yahoo.com> wrote:
>>> . . .
>>>>
>>>>
>>>>> That said, the attached patch effectively copies
>>>>> what's done in OEA64 into OEA pmap.  Can you test it?
>>>>
>>>> I'll try it once I get a chance, probably later
>>>> today.
>>>> . . .
>>>
>>> No luck at the change being a fix, I'm afraid.
>>>
>>> I verified that the build ended up with
>>>
>>> 00926cb0 <moea_protect+0x2ec> bl      008e8dc8 <PHYS_TO_VM_PAGE>
>>> 00926cb4 <moea_protect+0x2f0> mr      r27,r3
>>> 00926cb8 <moea_protect+0x2f4> addi    r3,r3,36
>>> 00926cbc <moea_protect+0x2f8> hwsync
>>> 00926cc0 <moea_protect+0x2fc> lwarx   r25,0,r3
>>> 00926cc4 <moea_protect+0x300> li      r4,0
>>> 00926cc8 <moea_protect+0x304> stwcx.  r4,0,r3
>>> 00926ccc <moea_protect+0x308> bne-    00926cc0 <moea_protect+0x2fc>
>>> 00926cd0 <moea_protect+0x30c> andi.   r3,r25,128
>>> 00926cd4 <moea_protect+0x310> beq     00926ce0 <moea_protect+0x31c>
>>> 00926cd8 <moea_protect+0x314> mr      r3,r27
>>> 00926cdc <moea_protect+0x318> bl      008e9874 <vm_page_dirty_KBI>
>>>
>>> in the installed kernel. So I doubt a
>>> mis-build would be involved. It is a
>>> head -r360311 based context still. World is
>>> without MALLOC_PRODUCTION so that jemalloc
>>> code executes its asserts, catching more
>>> and earlier than otherwise.
>>>
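
For readers following the listing above: the lwarx/stwcx. retry loop reads as an atomic read-and-clear of a 32-bit word, followed by a test of bit 128 (0x80) and a conditional call. Below is a minimal user-space sketch of that shape in C11. It is not the kernel code. struct sketch_page, sketch_readandclear and sketch_note_change are hypothetical names, atomic_exchange stands in for atomic_readandclear_32, and setting a flag stands in for the call to vm_page_dirty_KBI.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 0x80 is the mask taken from "andi. r3,r25,128" above. */
#define SKETCH_CHG_MASK 0x80u

/* Hypothetical stand-in for the page structure; not a kernel type. */
struct sketch_page {
	_Atomic uint32_t attrs;	/* the word the listing reads and zeroes */
	bool dirty;		/* stands in for the effect of vm_page_dirty() */
};

/* Atomically fetch the old value and store 0 in its place. */
static uint32_t
sketch_readandclear(_Atomic uint32_t *p)
{
	assert(p != NULL);
	return atomic_exchange(p, 0);
}

/* Consume the accumulated attribute bits and act on the "changed" bit. */
static void
sketch_note_change(struct sketch_page *pg)
{
	uint32_t refchg;

	assert(pg != NULL);
	refchg = sketch_readandclear(&pg->attrs);
	if (refchg & SKETCH_CHG_MASK)
		pg->dirty = true;
}
```

The point of the exchange is that the old bits are consumed exactly once: a second call on the same page sees zero unless something set the bits again in between.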
>>> First test . . .
>>>
>>> The only thing that the witness kernel reported was:
>>>
>>> Jun 11 15:58:16 FBSDG4S2 kernel: lock order reversal:
>>> Jun 11 15:58:16 FBSDG4S2 kernel:  1st 0x216fb00 Mountpoints (UMA zone) @ /usr/src/sys/vm/uma_core.c:4387
>>> Jun 11 15:58:16 FBSDG4S2 kernel:  2nd 0x1192d2c kernelpmap (kernelpmap) @ /usr/src/sys/powerpc/aim/mmu_oea.c:1524
>>> Jun 11 15:58:16 FBSDG4S2 kernel: stack backtrace:
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #0 0x5ec164 at witness_debugger+0x94
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #1 0x5ebe3c at witness_checkorder+0xb50
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #2 0x536d5c at __mtx_lock_flags+0xcc
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #3 0x92636c at moea_kextract+0x5c
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #4 0x965d30 at pmap_kextract+0x98
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #5 0x8bfdbc at zone_release+0xf0
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #6 0x8c7854 at bucket_drain+0x2f0
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #7 0x8c728c at bucket_free+0x54
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #8 0x8c74fc at bucket_cache_reclaim+0x1bc
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #9 0x8c7004 at zone_reclaim+0x128
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #10 0x8c3a40 at uma_reclaim+0x170
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #11 0x8c3f70 at uma_reclaim_worker+0x68
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #12 0x50fbac at fork_exit+0xb0
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #13 0x9684ac at fork_trampoline+0xc
>>>
>>> The processes that were hit were listed as:
>>>
>>> Jun 11 15:59:11 FBSDG4S2 kernel: pid 971 (cron), jid 0, uid 0: exited on signal 11 (core dumped)
>>> Jun 11 16:02:59 FBSDG4S2 kernel: pid 1111 (stress), jid 0, uid 0: exited on signal 6 (core dumped)
>>> Jun 11 16:03:27 FBSDG4S2 kernel: pid 871 (mountd), jid 0, uid 0: exited on signal 6 (core dumped)
>>> Jun 11 16:03:40 FBSDG4S2 kernel: pid 1065 (su), jid 0, uid 0: exited on signal 6
>>> Jun 11 16:04:13 FBSDG4S2 kernel: pid 1088 (su), jid 0, uid 0: exited on signal 6
>>> Jun 11 16:04:28 FBSDG4S2 kernel: pid 968 (sshd), jid 0, uid 0: exited on signal 6
>>>
>>> Jun 11 16:05:42 FBSDG4S2 kernel: pid 1028 (login), jid 0, uid 0: exited on signal 6
>>>
>>> Jun 11 16:05:46 FBSDG4S2 kernel: pid 873 (nfsd), jid 0, uid 0: exited on signal 6 (core dumped)
>>>
>>>
>>> Rebooting and rerunning and showing the stress output and such
>>> (I did not capture copies during the first test, but the first
>>> test had similar messages at the same sort of points):
>>>
>>> Second test . . .
>>>
>>> # stress -m 2 --vm-bytes 1700M
>>> stress: info: [1166] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
>>> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
>>> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
>>> ^C
>>>
>>> # exit
>>> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)"
>>> Abort trap
>>>
>>> The other stuff was similar to the first test, not repeated here.
>>
>> The updated code looks odd to me for how "m" is
>> handled (part of an egrep to ensure I show all the
>> usage of m):
>>
>> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
>>        vm_page_t       m;
>>                        if (pm != kernel_pmap && m != NULL &&
>>                            (m->a.flags & PGA_EXECUTABLE) == 0 &&
>>                                if ((m->oflags & VPO_UNMANAGED) == 0)
>>                                        vm_page_aflag_set(m, PGA_EXECUTABLE);
>>                                m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
>>                                refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
>>                                        vm_page_dirty(m);
>>                                        vm_page_aflag_set(m, PGA_REFERENCED);
>>
>> Or more completely, with notes mixed in:
>>
>> void
>> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
>>    vm_prot_t prot)
>> {
>>        . . .
>>        vm_page_t       m;
>>        . . .
>>        for (pvo = RB_NFIND(pvo_tree, &pm->pmap_pvo, &key);
>>            pvo != NULL && PVO_VADDR(pvo) < eva; pvo = tpvo) {
>>                . . .
>>                if (pt != NULL) {
>>                        . . .
>>                        if (pm != kernel_pmap && m != NULL &&
>>
>> NOTE: m seems to be uninitialized but tested for being NULL
>> above.
>>
>>                            (m->a.flags & PGA_EXECUTABLE) == 0 &&
>>
>> Note: This looks to potentially be using a random, non-NULL
>> value for m during evaluation of m->a.flags .
>>
>>                        . . .
>>
>>                        if ((pvo->pvo_vaddr & PVO_MANAGED) &&
>>                            (pvo->pvo_pte.prot & VM_PROT_WRITE)) {
>>                                m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
>>
>> Note: m finally is potentially initialized(/set).
>>
>>                                refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
>>                                if (refchg & PTE_CHG)
>>                                        vm_page_dirty(m);
>>                                if (refchg & PTE_REF)
>>                                        vm_page_aflag_set(m, PGA_REFERENCED);
>>                                . . .
>>
>> Note: So, if m is set above, then the next loop
>> iteration(s) would use this then-old value
>> instead of an initialized value.
>>=20
>> It looks to me like at least one assignment
>> to m is missing.
>>
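
The notes above describe a variable that is only assigned late in the loop body, so an earlier test in the next iteration sees the previous iteration's value. The following is a minimal sketch of that pattern in plain C, not kernel code. All names (sketch_page, sketch_entry, sketch_late_assign, sketch_early_assign) are hypothetical, and m is started at NULL so that the sketch has defined behavior, which the kernel code as quoted does not guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct sketch_page {
	int id;
};

struct sketch_entry {
	struct sketch_page *page;	/* page backing this entry, may be NULL */
	int managed;			/* nonzero: the late branch assigns m */
};

/*
 * Shape described in the notes: m is assigned only in the late
 * "managed" branch. seen[i] records which page id the early test
 * would look at in iteration i (-1 when m is NULL).
 */
static void
sketch_late_assign(const struct sketch_entry *e, size_t n, int *seen)
{
	struct sketch_page *m = NULL;

	assert(e != NULL && seen != NULL);
	for (size_t i = 0; i < n; i++) {
		seen[i] = (m != NULL) ? m->id : -1;	/* early use of m */
		if (e[i].managed)
			m = e[i].page;			/* only assignment */
	}
}

/* Same loop with m assigned at the top of every iteration. */
static void
sketch_early_assign(const struct sketch_entry *e, size_t n, int *seen)
{
	assert(e != NULL && seen != NULL);
	for (size_t i = 0; i < n; i++) {
		struct sketch_page *m = e[i].page;	/* assigned first */

		seen[i] = (m != NULL) ? m->id : -1;
	}
}
```

With three entries backed by pages 1, 2 and 3, where only the first and third are managed, the late-assignment version tests nothing in the first iteration and page 1 in both later iterations, while the early-assignment version tests pages 1, 2 and 3 in turn.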
>> moea64_pvo_protect has pg that seems analogous to
>> m and has:
>>
>>        pg = PHYS_TO_VM_PAGE(pvo->pvo_pte.pa & LPTE_RPGN);
>> . . .
>>        if (pm != kernel_pmap && pg != NULL &&
>>            (pg->a.flags & PGA_EXECUTABLE) == 0 &&
>>            (pvo->pvo_pte.pa & (LPTE_I | LPTE_G | LPTE_NOEXEC)) == 0) {
>>                if ((pg->oflags & VPO_UNMANAGED) == 0)
>>                        vm_page_aflag_set(pg, PGA_EXECUTABLE);
>>
>> . . .
>>        if (pg != NULL && (pvo->pvo_vaddr & PVO_MANAGED) &&
>>            (oldprot & VM_PROT_WRITE)) {
>>                refchg |= atomic_readandclear_32(&pg->md.mdpg_attrs);
>>                if (refchg & LPTE_CHG)
>>                        vm_page_dirty(pg);
>>                if (refchg & LPTE_REF)
>>                        vm_page_aflag_set(pg, PGA_REFERENCED);
>>
>>
>> This might suggest something about what is missing.
>
> Can you try moving the assignment to 'm' to right below the
> moea_pte_change() call?

Panics during boot. svnlite diff shown later.

That change got me a panic just after the lines about ada0
and cd0 details. (I do not know which internal stage that
was.) Hand transcribed from a picture of the screen:

panic: vm_page_free_prep: mapping flags set in page 0xd032a078
. . .
panic
vm_page_free_prep
vm_page_free_toq
vm_page_free
vm_object_collapse
vm_object_deallocate
vm_map_process_deferred
vm_map_remove
exec_new_vmspace
exec_elf32_imgact
kern_execve
sys_execve
trap
powerpc_interrupt
user SC trap by 0x100d7af8 . . .




# svnlite diff /usr/src/sys/powerpc/aim/mmu_oea.c
Index: /usr/src/sys/powerpc/aim/mmu_oea.c
===================================================================
--- /usr/src/sys/powerpc/aim/mmu_oea.c	(revision 360322)
+++ /usr/src/sys/powerpc/aim/mmu_oea.c	(working copy)
@@ -1773,6 +1773,9 @@
 {
 	struct	pvo_entry *pvo, *tpvo, key;
 	struct	pte *pt;
+	struct	pte old_pte;
+	vm_page_t	m;
+	int32_t	refchg;
 
 	KASSERT(pm == &curproc->p_vmspace->vm_pmap || pm == kernel_pmap,
 	    ("moea_protect: non current pmap"));
@@ -1800,12 +1803,31 @@
 		pvo->pvo_pte.pte.pte_lo &= ~PTE_PP;
 		pvo->pvo_pte.pte.pte_lo |= PTE_BR;
 
+		old_pte = *pt;
+
 		/*
 		 * If the PVO is in the page table, update that pte as well.
 		 */
 		if (pt != NULL) {
 			moea_pte_change(pt, &pvo->pvo_pte.pte, pvo->pvo_vaddr);
+			m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
+			if (pm != kernel_pmap && m != NULL &&
+			    (m->a.flags & PGA_EXECUTABLE) == 0 &&
+			    (pvo->pvo_pte.pa & (PTE_I | PTE_G)) == 0) {
+				if ((m->oflags & VPO_UNMANAGED) == 0)
+					vm_page_aflag_set(m, PGA_EXECUTABLE);
+				moea_syncicache(pvo->pvo_pte.pa & PTE_RPGN,
+				    PAGE_SIZE);
+			}
 			mtx_unlock(&moea_table_mutex);
+			if ((pvo->pvo_vaddr & PVO_MANAGED) &&
+			    (pvo->pvo_pte.prot & VM_PROT_WRITE)) {
+				refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
+				if (refchg & PTE_CHG)
+					vm_page_dirty(m);
+				if (refchg & PTE_REF)
+					vm_page_aflag_set(m, PGA_REFERENCED);
+			}
 		}
 	}
 	rw_wunlock(&pvh_global_lock);


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



