Date: Thu, 11 Jun 2020 21:25:32 -0500
From: Justin Hibbits <chmeeedalf@gmail.com>
To: Mark Millard <marklmi@yahoo.com>
Cc: "vangyzen@freebsd.org" <vangyzen@FreeBSD.org>, svn-src-head@freebsd.org, FreeBSD Current <freebsd-current@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Brandon Bergren <bdragon@FreeBSD.org>
Subject: Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311
Message-ID: <20200611212532.59f677be@ralga.knownspace>
In-Reply-To: <DCB0BC72-1666-49F3-A838-B2A0D653A0C2@yahoo.com>
References: <C24EE1A1-FAED-42C2-8204-CA7B1D20A369@yahoo.com> <695E6836-F860-4557-B7DE-CC1EDB347F18@yahoo.com> <DCABCD83-27B0-4F2D-9410-69102294A98E@yahoo.com> <121B9B09-141B-4DC3-918B-1E7CFB99E779@yahoo.com> <8AAB0462-3FA8-490C-8D8D-7C15B1C9E2DE@yahoo.com> <18E62746-80DB-4195-977D-4FF32D0129EE@yahoo.com> <F5953A6B-56CE-4D1C-8C18-58D44B639881@yahoo.com> <D0C483E5-3F6A-4816-A6BA-3D2C82C24F8E@yahoo.com> <C440956F-139E-4EF7-A68E-FE35D9934BD3@yahoo.com> <9562EEE4-62EF-4164-91C0-948CC0432984@yahoo.com> <9B68839B-AEC8-43EE-B3B6-B696A4A57DAE@yahoo.com> <359C9C7D-4106-42B5-AAB5-08EF995B8100@yahoo.com> <20200513105632.06db9e21@titan.knownspace> <B1225914-43BC-44EF-A73E-D06B890229C6@yahoo.com> <20200611155545.55526f7c@ralga.knownspace> <5542B85D-1C3A-41D8-98CE-3C02E990C3EB@yahoo.com> <20200611164216.47f82775@ralga.knownspace> <DEA9A860-5DEE-49EE-97F1-DBDB39D5C0A3@yahoo.com> <DCB0BC72-1666-49F3-A838-B2A0D653A0C2@yahoo.com>

On Thu, 11 Jun 2020 17:30:24 -0700
Mark Millard <marklmi@yahoo.com> wrote:

> On 2020-Jun-11, at 16:49, Mark Millard <marklmi at yahoo.com> wrote:
>
> > On 2020-Jun-11, at 14:42, Justin Hibbits <chmeeedalf at gmail.com>
> > wrote:
> >
> > On Thu, 11 Jun 2020 14:36:37 -0700
> > Mark Millard <marklmi@yahoo.com> wrote:
> >
> >> On 2020-Jun-11, at 13:55, Justin Hibbits <chmeeedalf at gmail.com>
> >> wrote:
> >>
> >>> On Wed, 10 Jun 2020 18:56:57 -0700
> >>> Mark Millard <marklmi@yahoo.com> wrote:
> > . . .
> >>
> >>
> >>> That said, the attached patch effectively copies
> >>> what's done in OEA6464 into OEA pmap. Can you test it?
> >>
> >> I'll try it once I get a chance, probably later
> >> today.
> >> . . .
> >
> > No luck at the change being a fix, I'm afraid.
> >
> > I verified that the build ended up with
> >
> > 00926cb0 <moea_protect+0x2ec> bl      008e8dc8 <PHYS_TO_VM_PAGE>
> > 00926cb4 <moea_protect+0x2f0> mr      r27,r3
> > 00926cb8 <moea_protect+0x2f4> addi    r3,r3,36
> > 00926cbc <moea_protect+0x2f8> hwsync
> > 00926cc0 <moea_protect+0x2fc> lwarx   r25,0,r3
> > 00926cc4 <moea_protect+0x300> li      r4,0
> > 00926cc8 <moea_protect+0x304> stwcx.  r4,0,r3
> > 00926ccc <moea_protect+0x308> bne-    00926cc0 <moea_protect+0x2fc>
> > 00926cd0 <moea_protect+0x30c> andi.   r3,r25,128
> > 00926cd4 <moea_protect+0x310> beq     00926ce0 <moea_protect+0x31c>
> > 00926cd8 <moea_protect+0x314> mr      r3,r27
> > 00926cdc <moea_protect+0x318> bl      008e9874 <vm_page_dirty_KBI>
> >
> > in the installed kernel. So I doubt a
> > mis-build would be involved. It is a
> > head -r360311 based context still. World is
> > without MALLOC_PRODUCTION so that jemalloc
> > code executes its asserts, catching more
> > and earlier than otherwise.
> >
> > First test . . .
> >
> > The only thing that the witness kernel reported was:
> >
> > Jun 11 15:58:16 FBSDG4S2 kernel: lock order reversal:
> > Jun 11 15:58:16 FBSDG4S2 kernel: 1st 0x216fb00 Mountpoints (UMA zone) @ /usr/src/sys/vm/uma_core.c:4387
> > Jun 11 15:58:16 FBSDG4S2 kernel: 2nd 0x1192d2c kernelpmap (kernelpmap) @ /usr/src/sys/powerpc/aim/mmu_oea.c:1524
> > Jun 11 15:58:16 FBSDG4S2 kernel: stack backtrace:
> > Jun 11 15:58:16 FBSDG4S2 kernel: #0 0x5ec164 at witness_debugger+0x94
> > Jun 11 15:58:16 FBSDG4S2 kernel: #1 0x5ebe3c at witness_checkorder+0xb50
> > Jun 11 15:58:16 FBSDG4S2 kernel: #2 0x536d5c at __mtx_lock_flags+0xcc
> > Jun 11 15:58:16 FBSDG4S2 kernel: #3 0x92636c at moea_kextract+0x5c
> > Jun 11 15:58:16 FBSDG4S2 kernel: #4 0x965d30 at pmap_kextract+0x98
> > Jun 11 15:58:16 FBSDG4S2 kernel: #5 0x8bfdbc at zone_release+0xf0
> > Jun 11 15:58:16 FBSDG4S2 kernel: #6 0x8c7854 at bucket_drain+0x2f0
> > Jun 11 15:58:16 FBSDG4S2 kernel: #7 0x8c728c at bucket_free+0x54
> > Jun 11 15:58:16 FBSDG4S2 kernel: #8 0x8c74fc at bucket_cache_reclaim+0x1bc
> > Jun 11 15:58:16 FBSDG4S2 kernel: #9 0x8c7004 at zone_reclaim+0x128
> > Jun 11 15:58:16 FBSDG4S2 kernel: #10 0x8c3a40 at uma_reclaim+0x170
> > Jun 11 15:58:16 FBSDG4S2 kernel: #11 0x8c3f70 at uma_reclaim_worker+0x68
> > Jun 11 15:58:16 FBSDG4S2 kernel: #12 0x50fbac at fork_exit+0xb0
> > Jun 11 15:58:16 FBSDG4S2 kernel: #13 0x9684ac at fork_trampoline+0xc
> >
> > The processes that were hit were listed as:
> >
> > Jun 11 15:59:11 FBSDG4S2 kernel: pid 971 (cron), jid 0, uid 0: exited on signal 11 (core dumped)
> > Jun 11 16:02:59 FBSDG4S2 kernel: pid 1111 (stress), jid 0, uid 0: exited on signal 6 (core dumped)
> > Jun 11 16:03:27 FBSDG4S2 kernel: pid 871 (mountd), jid 0, uid 0: exited on signal 6 (core dumped)
> > Jun 11 16:03:40 FBSDG4S2 kernel: pid 1065 (su), jid 0, uid 0: exited on signal 6
> > Jun 11 16:04:13 FBSDG4S2 kernel: pid 1088 (su), jid 0, uid 0: exited on signal 6
> > Jun 11 16:04:28 FBSDG4S2 kernel: pid 968 (sshd), jid 0, uid 0: exited on signal 6
> >
> > Jun 11 16:05:42 FBSDG4S2 kernel: pid 1028 (login), jid 0, uid 0: exited on signal 6
> >
> > Jun 11 16:05:46 FBSDG4S2 kernel: pid 873 (nfsd), jid 0, uid 0: exited on signal 6 (core dumped)
> >
> >
> > Rebooting and rerunning and showing the stress output and such
> > (I did not capture copies during the first test, but the first
> > test had similar messages at the same sort of points):
> >
> > Second test . . .
> >
> > # stress -m 2 --vm-bytes 1700M
> > stress: info: [1166] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> > ^C
> >
> > # exit
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)"
> > Abort trap
> >
> > The other stuff was similar to the first test, not repeated here.
>
> The updated code looks odd to me for how "m" is
> handled (part of an egrep to ensure I show all the
> usage of m):
>
> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
>         vm_page_t m;
>                         if (pm != kernel_pmap && m != NULL &&
>                             (m->a.flags & PGA_EXECUTABLE) == 0 &&
>                                 if ((m->oflags & VPO_UNMANAGED) == 0)
>                                         vm_page_aflag_set(m, PGA_EXECUTABLE);
>                         m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
>                         refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
>                                 vm_page_dirty(m);
>                                 vm_page_aflag_set(m, PGA_REFERENCED);
>
> Or more completely, with notes mixed in:
>
> void
> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
>     vm_prot_t prot)
> {
> . . .
>         vm_page_t m;
> . . .
>         for (pvo = RB_NFIND(pvo_tree, &pm->pmap_pvo, &key);
>             pvo != NULL && PVO_VADDR(pvo) < eva; pvo = tpvo) {
> . . .
>                 if (pt != NULL) {
> . . .
>                         if (pm != kernel_pmap && m != NULL &&
>
> NOTE: m seems to be uninitialized but tested for being NULL
>       above.
>
>                             (m->a.flags & PGA_EXECUTABLE) == 0 &&
>
> Note: This looks to potentially be using a random, non-NULL
>       value for m during evaluation of m->a.flags .
>
> . . .
>
>                 if ((pvo->pvo_vaddr & PVO_MANAGED) &&
>                     (pvo->pvo_pte.prot & VM_PROT_WRITE)) {
>                         m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
>
> Note: m finally is potentially initialized(/set).
>
>                         refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
>                         if (refchg & PTE_CHG)
>                                 vm_page_dirty(m);
>                         if (refchg & PTE_REF)
>                                 vm_page_aflag_set(m, PGA_REFERENCED);
> . . .
>
> Note: So, if m is set above, then the next loop
>       iteration(s) would use this then-old value
>       instead of an initialized value.
>
> It looks to me like at least one assignment
> to m is missing.
>
> moea64_pvo_protect has pg that seems analogous to
> m and has:
>
>         pg = PHYS_TO_VM_PAGE(pvo->pvo_pte.pa & LPTE_RPGN);
> . . .
>         if (pm != kernel_pmap && pg != NULL &&
>             (pg->a.flags & PGA_EXECUTABLE) == 0 &&
>             (pvo->pvo_pte.pa & (LPTE_I | LPTE_G | LPTE_NOEXEC)) == 0) {
>                 if ((pg->oflags & VPO_UNMANAGED) == 0)
>                         vm_page_aflag_set(pg, PGA_EXECUTABLE);
>
> . . .
>         if (pg != NULL && (pvo->pvo_vaddr & PVO_MANAGED) &&
>             (oldprot & VM_PROT_WRITE)) {
>                 refchg |= atomic_readandclear_32(&pg->md.mdpg_attrs);
>                 if (refchg & LPTE_CHG)
>                         vm_page_dirty(pg);
>                 if (refchg & LPTE_REF)
>                         vm_page_aflag_set(pg, PGA_REFERENCED);
>
>
> This might suggest some about what is missing.

Can you try moving the assignment to 'm' to right below the
moea_pte_change() call?

- Justin

>
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
>
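
For illustration only, here is a minimal userland sketch of the pattern
under discussion: the page pointer is looked up at the start of every
loop iteration and checked for NULL before any use, so no iteration can
see an uninitialized value or one left over from a previous pass. This
is not the kernel code and not the attached patch. The types, the
phys_to_page() helper, the ATTR_CHG constant, and protect_range() are
hypothetical stand-ins loosely modeled on the names quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the FreeBSD kernel types. */
struct page  { uint32_t attrs; int dirty; };
struct entry { uint32_t pa; int managed; int writable; };

#define ATTR_CHG   0x80u   /* assumed "changed" bit, for illustration */
#define NPAGES     4
#define PAGE_SHIFT 12

static struct page pages[NPAGES];

/* Toy analogue of a phys-to-page lookup: NULL when pa has no page. */
static struct page *
phys_to_page(uint32_t pa)
{
	uint32_t idx = pa >> PAGE_SHIFT;

	return (idx < NPAGES ? &pages[idx] : NULL);
}

/*
 * Walk the entries. The lookup happens at the top of each iteration,
 * before any test of m, and m is scoped to the loop body so a stale
 * value cannot carry over. Returns how many pages were marked dirty.
 */
static int
protect_range(const struct entry *e, size_t n)
{
	int dirtied = 0;

	assert(e != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct page *m = phys_to_page(e[i].pa);

		if (m == NULL)
			continue;
		if (e[i].managed && e[i].writable) {
			/* Read and clear the attribute bits (non-atomic here). */
			uint32_t refchg = m->attrs;

			m->attrs = 0;
			if (refchg & ATTR_CHG) {
				m->dirty = 1;
				dirtied++;
			}
		}
	}
	return (dirtied);
}
```

Declaring m inside the loop body is a deliberate choice in this sketch:
with the declaration at function scope, as in the quoted code, a missing
assignment is easy to overlook because the variable stays in scope and
holds whatever an earlier iteration left behind.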
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20200611212532.59f677be>