Date:      Thu, 11 Jun 2020 17:30:24 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Justin Hibbits <chmeeedalf@gmail.com>
Cc:        "vangyzen@freebsd.org" <vangyzen@FreeBSD.org>, svn-src-head@freebsd.org, FreeBSD Current <freebsd-current@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Brandon Bergren <bdragon@FreeBSD.org>
Subject:   Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311
Message-ID:  <DCB0BC72-1666-49F3-A838-B2A0D653A0C2@yahoo.com>
In-Reply-To: <DEA9A860-5DEE-49EE-97F1-DBDB39D5C0A3@yahoo.com>
References:  <C24EE1A1-FAED-42C2-8204-CA7B1D20A369@yahoo.com> <8479DD58-44F6-446A-9CA5-D01F0F7C1B38@yahoo.com> <17ACDA02-D7EF-4F26-874A-BB3E935CD072@yahoo.com> <695E6836-F860-4557-B7DE-CC1EDB347F18@yahoo.com> <DCABCD83-27B0-4F2D-9410-69102294A98E@yahoo.com> <121B9B09-141B-4DC3-918B-1E7CFB99E779@yahoo.com> <8AAB0462-3FA8-490C-8D8D-7C15B1C9E2DE@yahoo.com> <18E62746-80DB-4195-977D-4FF32D0129EE@yahoo.com> <F5953A6B-56CE-4D1C-8C18-58D44B639881@yahoo.com> <D0C483E5-3F6A-4816-A6BA-3D2C82C24F8E@yahoo.com> <C440956F-139E-4EF7-A68E-FE35D9934BD3@yahoo.com> <9562EEE4-62EF-4164-91C0-948CC0432984@yahoo.com> <9B68839B-AEC8-43EE-B3B6-B696A4A57DAE@yahoo.com> <359C9C7D-4106-42B5-AAB5-08EF995B8100@yahoo.com> <20200513105632.06db9e21@titan.knownspace> <B1225914-43BC-44EF-A73E-D06B890229C6@yahoo.com> <20200611155545.55526f7c@ralga.knownspace> <5542B85D-1C3A-41D8-98CE-3C02E990C3EB@yahoo.com> <20200611164216.47f82775@ralga.knownspace> <DEA9A860-5DEE-49EE-97F1-DBDB39D5C0A3@yahoo.com>

On 2020-Jun-11, at 16:49, Mark Millard <marklmi at yahoo.com> wrote:

> On 2020-Jun-11, at 14:42, Justin Hibbits <chmeeedalf at gmail.com> wrote:
>
> On Thu, 11 Jun 2020 14:36:37 -0700
> Mark Millard <marklmi@yahoo.com> wrote:
>
>> On 2020-Jun-11, at 13:55, Justin Hibbits <chmeeedalf at gmail.com>
>> wrote:
>>
>>> On Wed, 10 Jun 2020 18:56:57 -0700
>>> Mark Millard <marklmi@yahoo.com> wrote:
> . . .
>>
>>
>>> That said, the attached patch effectively copies
>>> what's done in OEA6464 into OEA pmap.  Can you test it?
>>
>> I'll try it once I get a chance, probably later
>> today.
>> . . .
>
> No luck at the change being a fix, I'm afraid.
>
> I verified that the build ended up with
>
> 00926cb0 <moea_protect+0x2ec> bl      008e8dc8 <PHYS_TO_VM_PAGE>
> 00926cb4 <moea_protect+0x2f0> mr      r27,r3
> 00926cb8 <moea_protect+0x2f4> addi    r3,r3,36
> 00926cbc <moea_protect+0x2f8> hwsync
> 00926cc0 <moea_protect+0x2fc> lwarx   r25,0,r3
> 00926cc4 <moea_protect+0x300> li      r4,0
> 00926cc8 <moea_protect+0x304> stwcx.  r4,0,r3
> 00926ccc <moea_protect+0x308> bne-    00926cc0 <moea_protect+0x2fc>
> 00926cd0 <moea_protect+0x30c> andi.   r3,r25,128
> 00926cd4 <moea_protect+0x310> beq     00926ce0 <moea_protect+0x31c>
> 00926cd8 <moea_protect+0x314> mr      r3,r27
> 00926cdc <moea_protect+0x318> bl      008e9874 <vm_page_dirty_KBI>
>
> in the installed kernel. So I doubt a
> mis-build would be involved. It is a
> head -r360311 based context still. World is
> without MALLOC_PRODUCTION so that jemalloc
> code executes its asserts, catching more
> and earlier than otherwise.
>
> First test . . .
>
> The only thing that the witness kernel reported was:
>
> Jun 11 15:58:16 FBSDG4S2 kernel: lock order reversal:
> Jun 11 15:58:16 FBSDG4S2 kernel:  1st 0x216fb00 Mountpoints (UMA zone) @ /usr/src/sys/vm/uma_core.c:4387
> Jun 11 15:58:16 FBSDG4S2 kernel:  2nd 0x1192d2c kernelpmap (kernelpmap) @ /usr/src/sys/powerpc/aim/mmu_oea.c:1524
> Jun 11 15:58:16 FBSDG4S2 kernel: stack backtrace:
> Jun 11 15:58:16 FBSDG4S2 kernel: #0 0x5ec164 at witness_debugger+0x94
> Jun 11 15:58:16 FBSDG4S2 kernel: #1 0x5ebe3c at witness_checkorder+0xb50
> Jun 11 15:58:16 FBSDG4S2 kernel: #2 0x536d5c at __mtx_lock_flags+0xcc
> Jun 11 15:58:16 FBSDG4S2 kernel: #3 0x92636c at moea_kextract+0x5c
> Jun 11 15:58:16 FBSDG4S2 kernel: #4 0x965d30 at pmap_kextract+0x98
> Jun 11 15:58:16 FBSDG4S2 kernel: #5 0x8bfdbc at zone_release+0xf0
> Jun 11 15:58:16 FBSDG4S2 kernel: #6 0x8c7854 at bucket_drain+0x2f0
> Jun 11 15:58:16 FBSDG4S2 kernel: #7 0x8c728c at bucket_free+0x54
> Jun 11 15:58:16 FBSDG4S2 kernel: #8 0x8c74fc at bucket_cache_reclaim+0x1bc
> Jun 11 15:58:16 FBSDG4S2 kernel: #9 0x8c7004 at zone_reclaim+0x128
> Jun 11 15:58:16 FBSDG4S2 kernel: #10 0x8c3a40 at uma_reclaim+0x170
> Jun 11 15:58:16 FBSDG4S2 kernel: #11 0x8c3f70 at uma_reclaim_worker+0x68
> Jun 11 15:58:16 FBSDG4S2 kernel: #12 0x50fbac at fork_exit+0xb0
> Jun 11 15:58:16 FBSDG4S2 kernel: #13 0x9684ac at fork_trampoline+0xc
>
> The processes that were hit were listed as:
>
> Jun 11 15:59:11 FBSDG4S2 kernel: pid 971 (cron), jid 0, uid 0: exited on signal 11 (core dumped)
> Jun 11 16:02:59 FBSDG4S2 kernel: pid 1111 (stress), jid 0, uid 0: exited on signal 6 (core dumped)
> Jun 11 16:03:27 FBSDG4S2 kernel: pid 871 (mountd), jid 0, uid 0: exited on signal 6 (core dumped)
> Jun 11 16:03:40 FBSDG4S2 kernel: pid 1065 (su), jid 0, uid 0: exited on signal 6
> Jun 11 16:04:13 FBSDG4S2 kernel: pid 1088 (su), jid 0, uid 0: exited on signal 6
> Jun 11 16:04:28 FBSDG4S2 kernel: pid 968 (sshd), jid 0, uid 0: exited on signal 6
>
> Jun 11 16:05:42 FBSDG4S2 kernel: pid 1028 (login), jid 0, uid 0: exited on signal 6
>
> Jun 11 16:05:46 FBSDG4S2 kernel: pid 873 (nfsd), jid 0, uid 0: exited on signal 6 (core dumped)
>
>
> Rebooting and rerunning and showing the stress output and such
> (I did not capture copies during the first test, but the first
> test had similar messages at the same sort of points):
>
> Second test . . .
>
> # stress -m 2 --vm-bytes 1700M
> stress: info: [1166] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> ^C
>
> # exit
> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)"
> Abort trap
>
> The other stuff was similar to the first test, not repeated here.

The updated code looks odd to me for how "m" is
handled (part of an egrep to ensure I show all the
usage of m):

moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
        vm_page_t       m;
                        if (pm != kernel_pmap && m != NULL &&
                            (m->a.flags & PGA_EXECUTABLE) == 0 &&
                                if ((m->oflags & VPO_UNMANAGED) == 0)
                                        vm_page_aflag_set(m, PGA_EXECUTABLE);
                                m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
                                refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
                                        vm_page_dirty(m);
                                        vm_page_aflag_set(m, PGA_REFERENCED);

Or more completely, with notes mixed in:

void
moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
    vm_prot_t prot)
{
        . . .
        vm_page_t       m;
        . . .
        for (pvo = RB_NFIND(pvo_tree, &pm->pmap_pvo, &key);
            pvo != NULL && PVO_VADDR(pvo) < eva; pvo = tpvo) {
                . . .
                if (pt != NULL) {
                        . . .
                        if (pm != kernel_pmap && m != NULL &&

NOTE: m seems to be uninitialized but tested for being NULL
above.

                            (m->a.flags & PGA_EXECUTABLE) == 0 &&

Note: This looks to potentially be using a random, non-NULL
value for m during evaluation of m->a.flags.

                        . . .

                        if ((pvo->pvo_vaddr & PVO_MANAGED) &&
                            (pvo->pvo_pte.prot & VM_PROT_WRITE)) {
                                m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);

Note: m finally is potentially initialized(/set).

                                refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
                                if (refchg & PTE_CHG)
                                        vm_page_dirty(m);
                                if (refchg & PTE_REF)
                                        vm_page_aflag_set(m, PGA_REFERENCED);
. . .

Note: So, if m is set above, then the next loop
iteration(s) would use this then-old value
instead of an initialized value.

It looks to me like at least one assignment
to m is missing.
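
To make the concern concrete, here is a small user-space C model of the
pattern. It is only a sketch, not the kernel code: struct page,
struct entry, PG_EXEC, protect_stale, and protect_fresh are made-up
names for illustration. protect_stale tests m before the current
iteration has assigned it, so it acts on the previous entry's page;
protect_fresh looks the page up at the top of each iteration. The model
starts m at NULL so its behavior is defined, whereas the kernel code as
shown does not initialize m at all.

```c
#include <stddef.h>

#define PG_EXEC 0x1u            /* hypothetical stand-in for PGA_EXECUTABLE */

struct page  { unsigned flags; };                 /* toy page, not vm_page */
struct entry { struct page *pg; int noexec; };    /* toy mapping, not a pvo */

/*
 * Suspected pattern: m is declared outside the loop and is only
 * assigned at the end of each iteration, after it has been tested.
 */
static void
protect_stale(const struct entry *e, size_t n)
{
	struct page *m = NULL;  /* NULL only to keep the model well defined */

	for (size_t i = 0; i < n; i++) {
		/* m still refers to the page of entry i-1 here. */
		if (m != NULL && (m->flags & PG_EXEC) == 0 && !e[i].noexec)
			m->flags |= PG_EXEC;
		m = e[i].pg;    /* assignment arrives too late for entry i */
	}
}

/* Same loop with the lookup moved ahead of every use of m. */
static void
protect_fresh(const struct entry *e, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct page *m = e[i].pg;

		if (m != NULL && (m->flags & PG_EXEC) == 0 && !e[i].noexec)
			m->flags |= PG_EXEC;
	}
}
```

With three entries where only the middle one is marked noexec,
protect_stale ends up flagging just the middle page, while
protect_fresh flags the first and last pages and leaves the middle one
alone.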

moea64_pvo_protect has pg that seems analogous to
m and has:

        pg = PHYS_TO_VM_PAGE(pvo->pvo_pte.pa & LPTE_RPGN);
. . .
        if (pm != kernel_pmap && pg != NULL &&
            (pg->a.flags & PGA_EXECUTABLE) == 0 &&
            (pvo->pvo_pte.pa & (LPTE_I | LPTE_G | LPTE_NOEXEC)) == 0) {
                if ((pg->oflags & VPO_UNMANAGED) == 0)
                        vm_page_aflag_set(pg, PGA_EXECUTABLE);

. . .
        if (pg !=3D NULL && (pvo->pvo_vaddr & PVO_MANAGED) &&
            (oldprot & VM_PROT_WRITE)) {
                refchg |= atomic_readandclear_32(&pg->md.mdpg_attrs);
                if (refchg & LPTE_CHG)
                        vm_page_dirty(pg);
                if (refchg & LPTE_REF)
                        vm_page_aflag_set(pg, PGA_REFERENCED);


This might suggest something about what is missing.
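
As a rough user-space illustration of the second moea64 fragment (a
sketch under my own assumptions, not a proposed patch): the page
pointer is checked for NULL before the attribute word is atomically
read and cleared, and only then are the bits examined. Here C11
atomic_exchange stands in for atomic_readandclear_32, and struct page,
harvest_refchg, ATTR_CHG, and ATTR_REF are hypothetical names. The 0x80
value for the changed bit matches the "andi. r3,r25,128" in the
disassembly; the 0x100 value for the referenced bit is just an
assumption for the model.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define ATTR_CHG 0x080u  /* matches the andi. ...,128 test in the disassembly */
#define ATTR_REF 0x100u  /* assumed value, for this model only */

struct page {                   /* toy page, not the kernel's vm_page */
	_Atomic uint32_t attrs; /* plays the role of md.mdpg_attrs */
	int dirty;
	int referenced;
};

/*
 * Guard against a NULL page first, then read and clear the attribute
 * word in one atomic step and act on the bits that were set.
 */
static void
harvest_refchg(struct page *pg)
{
	uint32_t refchg;

	if (pg == NULL)
		return;
	refchg = atomic_exchange(&pg->attrs, 0);
	if (refchg & ATTR_CHG)
		pg->dirty = 1;          /* stands in for vm_page_dirty() */
	if (refchg & ATTR_REF)
		pg->referenced = 1;     /* stands in for the PGA_REFERENCED flag */
}
```

The point of the exchange is that the read and the clear cannot be
separated by another update to the word, which is what the
lwarx/stwcx. loop in the disassembly provides on powerpc.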


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



