Date:      Sun, 5 May 2019 21:13:42 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Justin Hibbits <chmeeedalf@gmail.com>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: Remote Firewire dcons use shows my G5 "hang-up point" reports have been wrong: shows lines not shown on the problem context's display
Message-ID:  <D441256F-E683-4FB5-8B21-28ECF8B75FCA@yahoo.com>
In-Reply-To: <E4E3E09D-4945-45E8-91C3-0C89653F06D6@yahoo.com>
References:  <E4E3E09D-4945-45E8-91C3-0C89653F06D6@yahoo.com>

[Results based on what firewire dcons shows.]

On 2019-May-5, at 15:25, Mark Millard <marklmi at yahoo.com> wrote:

> I got remote FireWire dcons use going (partially),
>
> https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-dcons.html
>
> # more /mnt/boot/loader.conf
> hw.firewire.phydma_enable=1
> hw.firewire.dcons_crom.force_console=1
> #dcons_gdb=1
> #boot_multicons="YES"
>
> [Trying to turn on boot_multicons gets stuck at
> db> prompts *very* early, and the display on the
> target ends up with mixed material from two
> streams of output. So: no use of dcons for
> low-level console activity.]
>
> The above combination does show output on the
> remote, observing PowerMac during part
> of the boot sequence. And . . .
>
> It shows more lines of output just before the
> overall hang-up than the target machine's
> display does.
>
> It appears that for most or all of the hang-ups
> that I've reported, my report indicated a
> point earlier than the actual hang-up: more
> of the messages were actually generated and
> the code got farther than I knew.
>
> So there might be a separate issue of the
> screen display stopping its updates before
> the old PowerMac G5 has actually hung up
> (overall). Or it may just be that the timing
> relationships do not get a chance to allow
> the "next screen update" or some such.
>
> It appears that tracking down hang-up points
> via print messages shown on the normal display
> cannot generally be done on the G5s. FireWire
> dcons use may be required (unless there is some
> alternative).
>
> I'll note that I was not able to break into ddb
> via the dcons connection. It appeared that I
> had no ability to input to the target if I had
> no such ability on the directly-connected USB
> keyboard at the same time. In other words: I
> gained no control but got better observational
> data.
>
>
> FYI:
>
> I used (replacing the ??'s with the appropriate
> hexadecimal digits):
>
> dconschat -e \# -br -TC 5555 -t ??-??-??-??-??-??-??-?? &
> telnet localhost 5555
>
> on the observing PowerMac. (I happened to use a
> G4 as the observer of the target G5.)
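[As a concrete sketch of the observer-side setup: the EUI-64 below is
a made-up placeholder, not a real device; the target's actual EUI-64
can be looked up with fwcontrol(8) on the observing machine, per the
handbook chapter linked above.]

```shell
#!/bin/sh
# Observer-side sketch. The EUI-64 is a placeholder; substitute the
# target's actual value (fwcontrol(8) lists the FireWire nodes).
eui="01-23-45-67-89-ab-cd-ef"
port=5555

# Assemble the same command line as above, with the placeholder
# substituted for the ??-...-?? digits:
cmd="dconschat -e \\# -br -TC ${port} -t ${eui}"
printf '%s\n' "$cmd"
# Run that in the background, then: telnet localhost ${port}
```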

Testing showed that most of the messages I had
added before this discovery were a waste (too early).
So I deleted them.

All failures, as seen via the firewire dcons based
displays, occurred during cpu_mp_unleash. So I added
messages to cpu_mp_unleash to narrow the range of
possible failure points.

For a successful boot the messages from there look
like:

cpu_mp_unleash: start / before mtx_init
cpu_mp_unleash: in cpu loop at start with cpus so far 0
cpu_mp_unleash: bsp to be set awake
Adding CPU 0, hwref=ff89d680, awake=1
cpu_mp_unleash: in cpu loop at start with cpus so far 1
Waking up CPU 1 (dev=ff89eb70)
Adding CPU 1, hwref=ff89eb70, awake=1
cpu_mp_unleash: in cpu loop at start with cpus so far 2
Waking up CPU 2 (dev=ff89f248)
Adding CPU 2, hwref=ff89f248, awake=1
cpu_mp_unleash: in cpu loop at start with cpus so far 3
Waking up CPU 3 (dev=ff89f920)
Adding CPU 3, hwref=ff89f920, awake=1
cpu_mp_unleash: before timebase code
cpu_mp_unleash: before looping until expected are awake
Launching APs: 1 3 cpu_mp_unleash: looped until expected were awake
2
cpu_mp_unleash: before DELAY
cpu_mp_unleash: after DELAY

(I'm ignoring possibly interlaced "cd0" messages.)

But, for the dozen-plus hang-ups from my testing so
far, every hang-up shows one and the same sequence of
output:

cpu_mp_unleash: start / before mtx_init
cpu_mp_unleash: in cpu loop at start with cpus so far 0
cpu_mp_unleash: bsp to be set awake
Adding CPU 0, hwref=ff89d680, awake=1
cpu_mp_unleash: in cpu loop at start with cpus so far 1
Waking up CPU 1 (dev=ff89eb70)
cpu_mp_unleash: cpu skipped
cpu_mp_unleash: in cpu loop at start with cpus so far 2
Waking up CPU 2 (dev=ff89f248)

(There is no more output shown.)

The normal display had varying earlier stopping points
for what it displayed for the hang-ups. But the firewire
dcons display was uniform in what it reported and where
it stopped for a hang-up.


Notes about other testing of variations:

The firewire dcons reports for hang-ups were the same
for n_slbs-1 based and for (n_slbs-1)/2 based looping
of moea64_bootstrap_slb_prefault .



For reference, the patches currently in use in my
-r347003 based testing are shown below. The extended
comment reports the basics for what has been learned
about handle_kernel_slb_spill activity [through the
end of the numa_mem_regions(...) activity].

# svnlite diff /mnt/usr/src/sys/ | more
Index: /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c
===================================================================
--- /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c    (revision 347003)
+++ /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c    (working copy)
@@ -956,10 +956,35 @@
        virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS;

        /*
-        * Map the entire KVA range into the SLB. We must not fault there.
+        * Map the initial kernel slb slots.
+        *
+        * Notes based on a 2-socket/2-core-each G5 PowerMac11,2
+        * based investigation . . .
+        *
+        * handle_kernel_slb_spill has been observed to be used in
+        * early booting for dar<DMAP_START, srr0<DMAP_START,
+        * DMAP_START<=dar<KVA_START, in addition to KVA_START<=dar .
+        * (Naming only suggestive here, not necessarily from the
+        * code.) Such happened some during this loop and later. (No
+        * activity after numa_mem_regions(...) was investigated.)
+        *
+        * The count of handle_kernel_slb_spill contributions by
+        * some stages varied from boot to boot. (The other stages
+        * had no observed use of handle_kernel_slb_spill for the
+        * little bit of testing done with counts being kept.)
+        *
+        * Thus the below will not establish any form of long term
+        * completeness for the kernel slb slots being just for KVA
+        * address ranges, on the bsp or on the other cpus. By the
+        * time the slb information is copied for other cpus, it
+        * likely has already been changed some by (varying)
+        * handle_kernel_slb_spill activity. (Including the use
+        * of mftb()%n_slbs for picking what slot to replace
+        * during the varying number of spills.)
         */
        #ifdef __powerpc64__
-       for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
+       i = 0;
+       for (va = virtual_avail; va < virtual_end && i<n_slbs-1; va += SEGMENT_LENGTH, i++)
                moea64_bootstrap_slb_prefault(va, 0);
        #endif

@@ -1090,8 +1115,8 @@
        CPU_SET(PCPU_GET(cpuid), &pm->pm_active);

        #ifdef __powerpc64__
-       PCPU_SET(aim.userslb, pm->pm_slb);
-       __asm __volatile("slbmte %0, %1; isync" ::
+       PCPU_SET(aim.userslb, pm->pm_slb); // no slbie needed?
+       __asm __volatile("isync; slbmte %0, %1; isync" ::
            "r"(td->td_pcb->pcb_cpu.aim.usr_vsid), "r"(USER_SLB_SLBE));
        #else
        PCPU_SET(curpmap, pm->pmap_phys);
@@ -1104,7 +1129,7 @@
 {
        pmap_t  pm;

-       __asm __volatile("isync; slbie %0" :: "r"(USER_ADDR));
+       __asm __volatile("isync; slbie %0; isync" :: "r"(USER_ADDR));

        pm = &td->td_proc->p_vmspace->vm_pmap;
        CPU_CLR(PCPU_GET(cpuid), &pm->pm_active);
@@ -1956,7 +1981,7 @@
            (uintptr_t)uaddr >> ADDR_SR_SHFT;
        curthread->td_pcb->pcb_cpu.aim.usr_vsid = slbv;
 #ifdef __powerpc64__
-       __asm __volatile ("slbie %0; slbmte %1, %2; isync" ::
+       __asm __volatile ("isync; slbie %0; slbmte %1, %2; isync" ::
            "r"(USER_ADDR), "r"(slbv), "r"(USER_SLB_SLBE));
 #else
        __asm __volatile("mtsr %0,%1; isync" :: "n"(USER_SR), "r"(slbv));
Index: /mnt/usr/src/sys/powerpc/aim/moea64_native.c
===================================================================
--- /mnt/usr/src/sys/powerpc/aim/moea64_native.c        (revision 347003)
+++ /mnt/usr/src/sys/powerpc/aim/moea64_native.c        (working copy)
@@ -406,7 +406,7 @@
         */

        #ifdef __powerpc64__
-               __asm __volatile ("slbia");
+               __asm __volatile ("isync; slbia");
                __asm __volatile ("slbmfee %0,%1; slbie %0;" : "=r"(seg0) :
                    "r"(0));

@@ -417,6 +417,7 @@
                        __asm __volatile ("slbmte %0, %1" ::
                            "r"(slb[i].slbv), "r"(slb[i].slbe));
                }
+               __asm __volatile ("isync");
        #else
                for (i = 0; i < 16; i++)
                        mtsrin(i << ADDR_SR_SHFT, kernel_pmap->pm_sr[i]);
Index: /mnt/usr/src/sys/powerpc/aim/slb.c
===================================================================
--- /mnt/usr/src/sys/powerpc/aim/slb.c  (revision 347003)
+++ /mnt/usr/src/sys/powerpc/aim/slb.c  (working copy)
@@ -457,7 +457,7 @@
        /* If it is for this CPU, put it in the SLB right away */
        if (pmap_bootstrapped) {
                /* slbie not required */
-               __asm __volatile ("slbmte %0, %1" ::
+               __asm __volatile ("isync; slbmte %0, %1; isync" ::
                    "r"(slbcache[i].slbv), "r"(slbcache[i].slbe));
        }

Index: /mnt/usr/src/sys/powerpc/aim/trap_subr64.S
===================================================================
--- /mnt/usr/src/sys/powerpc/aim/trap_subr64.S  (revision 347003)
+++ /mnt/usr/src/sys/powerpc/aim/trap_subr64.S  (working copy)
@@ -65,6 +65,7 @@

        li      %r29, 0                 /* Set the counter to zero */

+       isync
        slbia
        slbmfee %r31,%r29
        clrrdi  %r31,%r31,28
@@ -71,6 +72,7 @@
        slbie   %r31
 1:     ld      %r31, 0(%r28)           /* Load SLB entry pointer */
        cmpdi   %r31, 0                 /* If NULL, stop */
+       isync
        beqlr

        ld      %r30, 0(%r31)           /* Load SLBV */
@@ -96,6 +98,7 @@
        /* Otherwise, set up SLBs */
        li      %r29, 0                 /* Set the counter to zero */

+       isync
        slbia
        slbmfee %r31,%r29
        clrrdi  %r31,%r31,28
@@ -105,6 +108,7 @@

        ld      %r31, 8(%r28)           /* Load SLBE */
        cmpdi   %r31, 0                 /* If SLBE is not valid, stop */
+       isync
        beqlr
        ld      %r30, 0(%r28)           /* Load SLBV  */
        slbmte  %r30, %r31              /* Install SLB entry */
@@ -113,6 +117,7 @@
        addi    %r29, %r29, 1
        cmpdi   %r29, 64                /* Repeat if we are not at the end */
        blt     1b
+       isync
        blr
=20
 /*
Index: /mnt/usr/src/sys/powerpc/include/cpufunc.h
===================================================================
--- /mnt/usr/src/sys/powerpc/include/cpufunc.h  (revision 347003)
+++ /mnt/usr/src/sys/powerpc/include/cpufunc.h  (working copy)
@@ -155,15 +155,8 @@
        return (tb);
 }

-static __inline void
-mttb(u_quad_t time)
-{
+// mttb moved to after intr_restore

-       mtspr(TBR_TBWL, 0);
-       mtspr(TBR_TBWU, (uint32_t)(time >> 32));
-       mtspr(TBR_TBWL, (uint32_t)(time & 0xffffffff));
-}
-
 static __inline void
 eieio(void)
 {
@@ -202,6 +195,19 @@
        mtmsr(msr);
 }

+static __inline void
+mttb(u_quad_t time)
+{
+       const uint32_t   high= time>>32;
+       const uint32_t   low=  time&0xffffffffu;
+
+       const register_t predisable_msr= intr_disable();
+       mtspr(TBR_TBWL, 0);
+       mtspr(TBR_TBWU, high);
+       mtspr(TBR_TBWL, low);
+       intr_restore(predisable_msr);
+}
+
 static __inline struct pcpu *
 get_pcpu(void)
 {
Index: /mnt/usr/src/sys/powerpc/powerpc/mp_machdep.c
===================================================================
--- /mnt/usr/src/sys/powerpc/powerpc/mp_machdep.c       (revision 347003)
+++ /mnt/usr/src/sys/powerpc/powerpc/mp_machdep.c       (working copy)
@@ -231,6 +231,7 @@
        if (mp_ncpus <= 1)
                return;

+printf("cpu_mp_unleash: start / before mtx_init\n");
        mtx_init(&ap_boot_mtx, "ap boot", NULL, MTX_SPIN);

        cpus = 0;
@@ -239,11 +240,14 @@
        tlb1_ap_prep();
 #endif
        STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) {
+printf("cpu_mp_unleash: in cpu loop at start with cpus so far %d\n",cpus);
                cpus++;
                if (!pc->pc_bsp) {
                        if (bootverbose)
                                printf("Waking up CPU %d (dev=%x)\n",
                                    pc->pc_cpuid, (int)pc->pc_hwref);
+printf("Waking up CPU %d (dev=%x)\n",
+    pc->pc_cpuid, (int)pc->pc_hwref);

                        ret = platform_smp_start_cpu(pc);
                        if (ret == 0) {
@@ -252,6 +256,7 @@
                                        DELAY(1000);
                        }
                } else {
+printf("cpu_mp_unleash: bsp to be set awake\n");
                        pc->pc_awake = 1;
                }
                if (pc->pc_awake) {
@@ -259,13 +264,21 @@
                                printf("Adding CPU %d, hwref=%jx, awake=%x\n",
                                    pc->pc_cpuid, (uintmax_t)pc->pc_hwref,
                                    pc->pc_awake);
+printf("Adding CPU %d, hwref=%jx, awake=%x\n",
+    pc->pc_cpuid, (uintmax_t)pc->pc_hwref,
+    pc->pc_awake);
                        smp_cpus++;
                } else
+{
+printf("cpu_mp_unleash: cpu skipped\n");
                        CPU_SET(pc->pc_cpuid, &stopped_cpus);
+}
        }

        ap_awake = 1;

+printf("cpu_mp_unleash: before timebase code\n");
+
        /* Provide our current DEC and TB values for APs */
        ap_timebase = mftb() + 10;
        __asm __volatile("msync; isync");
@@ -275,8 +288,10 @@

        platform_smp_timebase_sync(ap_timebase, 0);

+printf("cpu_mp_unleash: before looping until expected are awake\n");
        while (ap_awake < smp_cpus)
                ;
+printf("cpu_mp_unleash: looped until expected were awake\n");

        if (smp_cpus != cpus || cpus != mp_ncpus) {
                printf("SMP: %d CPUs found; %d CPUs usable; %d CPUs woken\n",
@@ -286,8 +301,10 @@
        if (smp_cpus > 1)
                atomic_store_rel_int(&smp_started, 1);

+printf("cpu_mp_unleash: before DELAY\n");
        /* Let the APs get into the scheduler */
        DELAY(10000);
+printf("cpu_mp_unleash: after DELAY\n");

 }

Index: /mnt/usr/src/sys/powerpc/powerpc/trap.c
===================================================================
--- /mnt/usr/src/sys/powerpc/powerpc/trap.c     (revision 347003)
+++ /mnt/usr/src/sys/powerpc/powerpc/trap.c     (working copy)
@@ -453,8 +453,8 @@
 #if defined(__powerpc64__) && defined(AIM)
                case EXC_DSE:
                        if (td->td_pcb->pcb_cpu.aim.usr_vsid != 0 &&
-                           (frame->dar & SEGMENT_MASK) == USER_ADDR) {
-                               __asm __volatile ("slbmte %0, %1" ::
+                           (frame->dar & SEGMENT_MASK) == USER_ADDR) { // no slbie needed?
+                               __asm __volatile ("isync; slbmte %0, %1; isync" ::
                                        "r"(td->td_pcb->pcb_cpu.aim.usr_vsid),
                                        "r"(USER_SLB_SLBE));
                                return;
@@ -712,8 +712,8 @@
         * Speculatively restore last user SLB segment, which we know is
         * invalid already, since we are likely to do copyin()/copyout().
         */
-       if (td->td_pcb->pcb_cpu.aim.usr_vsid != 0)
-               __asm __volatile ("slbmte %0, %1; isync" ::
+       if (td->td_pcb->pcb_cpu.aim.usr_vsid != 0) // no slbie needed?
+               __asm __volatile ("isync; slbmte %0, %1; isync" ::
                    "r"(td->td_pcb->pcb_cpu.aim.usr_vsid), "r"(USER_SLB_SLBE));
 #endif





===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



