Date: Sun, 5 May 2019 21:13:42 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Justin Hibbits <chmeeedalf@gmail.com>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject: Re: Remote Firewire dcons use shows my G5 "hang-up point" reports have been wrong: shows lines not shown on the problem context's display
Message-ID: <D441256F-E683-4FB5-8B21-28ECF8B75FCA@yahoo.com>
In-Reply-To: <E4E3E09D-4945-45E8-91C3-0C89653F06D6@yahoo.com>
References: <E4E3E09D-4945-45E8-91C3-0C89653F06D6@yahoo.com>
[Results based on what firewire dcons shows.]

On 2019-May-5, at 15:25, Mark Millard <marklmi at yahoo.com> wrote:

> I got remote FireWire dcons use going (partially),
> 
> https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-dcons.html
> 
> # more /mnt/boot/loader.conf
> hw.firewire.phydma_enable=1
> hw.firewire.dcons_crom.force_console=1
> #dcons_gdb=1
> #boot_multicons="YES"
> 
> [Trying to turn on boot_multicons gets stuck at
> db> prompts *very* early and the display on the
> target ends up with mixed material from two
> streams of output. So: no use of dcons for
> low level console activity.]
> 
> The above combination does show output on the
> remote, observing PowerMac, during part
> of the boot sequence. And . . .
> 
> It shows more lines of output just before the
> overall hang-up than the target machine's
> display does.
> 
> It appears that for most or all the hang-ups
> that I've reported, my report indicated a
> point earlier than the actual hang-up: more
> of the messages were actually generated and
> the code got farther than I knew.
> 
> So there might be a separate issue of the
> screen display stopping its updates before
> the old PowerMac G5 has actually hung up
> (overall). Or it may just be that timing
> relationships do not get a chance to allow
> the "next screen update" or some such.
> 
> It appears that tracking down hang-up points
> via print messages shown on the normal display
> cannot generally be done on the G5s. FireWire
> dcons use may be required (unless there is some
> alternative).
> 
> I'll note that I was not able to break into ddb
> via the dcons connection. It appeared that I
> had no ability to input to the target unless I
> also had that ability on the directly-connected
> USB keyboard at the same time. In other words:
> I gained no control but got better observational
> data.
> 
> 
> FYI:
> 
> I used (replacing ??'s with appropriate
> hexadecimal digits):
> 
> dconschat -e \# -br -TC 5555 -t ??-??-??-??-??-??-??-?? &
> telnet localhost 5555
> 
> on the observing PowerMac. (I happened to use a
> G4 as the observer of the target G5.)

Testing showed that most of the messages I had added before this
discovery were a waste (too early). So I deleted them.

All failures were reported to be during cpu_mp_unleash as seen via
firewire dcons based displays. So I added messages to cpu_mp_unleash
to narrow the range for failure points. For a successful boot the
messages from there look like:

cpu_mp_unleash: start / before mtx_init
cpu_mp_unleash: in cpu loop at start with cpus so far 0
cpu_mp_unleash: bsp to be set awake
Adding CPU 0, hwref=ff89d680, awake=1
cpu_mp_unleash: in cpu loop at start with cpus so far 1
Waking up CPU 1 (dev=ff89eb70)
Adding CPU 1, hwref=ff89eb70, awake=1
cpu_mp_unleash: in cpu loop at start with cpus so far 2
Waking up CPU 2 (dev=ff89f248)
Adding CPU 2, hwref=ff89f248, awake=1
cpu_mp_unleash: in cpu loop at start with cpus so far 3
Waking up CPU 3 (dev=ff89f920)
Adding CPU 3, hwref=ff89f920, awake=1
cpu_mp_unleash: before timebase code
cpu_mp_unleash: before looping until expected are awake
Launching APs: 1 3
cpu_mp_unleash: looped until expected were awake
2
cpu_mp_unleash: before DELAY
cpu_mp_unleash: after DELAY

(I'm ignoring possibly interlaced "cd0" messages.)
But, for the dozen+ hang-ups from my testing so far, there is only
one sequence that each hang-up shares with the other hang-ups:

cpu_mp_unleash: start / before mtx_init
cpu_mp_unleash: in cpu loop at start with cpus so far 0
cpu_mp_unleash: bsp to be set awake
Adding CPU 0, hwref=ff89d680, awake=1
cpu_mp_unleash: in cpu loop at start with cpus so far 1
Waking up CPU 1 (dev=ff89eb70)
cpu_mp_unleash: cpu skipped
cpu_mp_unleash: in cpu loop at start with cpus so far 2
Waking up CPU 2 (dev=ff89f248)

(There is no more output shown.)

The normal display had varying earlier stopping points for what it
displayed for the hang-ups. But the firewire dcons display was
uniform in what it reported and where it stopped for a hang-up.

Notes about other testing of variations:

The firewire dcons reports for hang-ups were the same for n_slbs-1
based and for (n_slbs-1)/2 based looping of
moea64_bootstrap_slb_prefault.

For reference, the patches currently in use in my -r347003 based
testing are shown below. The extended comment reports the basics for
what has been learned about handle_kernel_slb_spill activity [through
the end of the numa_mem_regions(...) activity].

# svnlite diff /mnt/usr/src/sys/ | more
Index: /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c
===================================================================
--- /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/aim/mmu_oea64.c	(working copy)
@@ -956,10 +956,35 @@
 	virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS;
 
 	/*
-	 * Map the entire KVA range into the SLB. We must not fault there.
+	 * Map the initial kernel slb slots.
+	 *
+	 * Notes based on a 2-socket/2-core-each G5 PowerMac11,2
+	 * based investigation . . .
+	 *
+	 * handle_kernel_slb_spill has been observed to be used in
+	 * early booting for dar<DMAP_START, srr0<DMAP_START,
+	 * DMAP_START<=dar<KVA_START, in addition to KVA_START<=dar .
+	 * (Naming only suggestive here, not necessarily from the
+	 * code.) Such happened some during this loop and later. (No
+	 * activity after numa_mem_regions(...) was investigated.)
+	 *
+	 * The count of handle_kernel_slb_spill contributions by
+	 * some stages varied from boot to boot. (The other stages
+	 * had no observed use of handle_kernel_slb_spill for the
+	 * little bit of testing done with counts being kept.)
+	 *
+	 * Thus the below will not establish any form of long term
+	 * completeness for the kernel slb slots being just for KVA
+	 * address ranges, on the bsp or on the other cpus. By the
+	 * time the slb information is copied for other cpus, it
+	 * likely has already been changed some by (varying)
+	 * handle_kernel_slb_spill activity. (Including the use
+	 * of mftb()%n_slbs for picking what slot to replace
+	 * during the varying number of spills.)
 	 */
 #ifdef __powerpc64__
-	for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
+	i = 0;
+	for (va = virtual_avail; va < virtual_end && i<n_slbs-1; va += SEGMENT_LENGTH, i++)
 		moea64_bootstrap_slb_prefault(va, 0);
 #endif
 
@@ -1090,8 +1115,8 @@
 	CPU_SET(PCPU_GET(cpuid), &pm->pm_active);
 
 #ifdef __powerpc64__
-	PCPU_SET(aim.userslb, pm->pm_slb);
-	__asm __volatile("slbmte %0, %1; isync" ::
+	PCPU_SET(aim.userslb, pm->pm_slb); // no slbie needed?
+	__asm __volatile("isync; slbmte %0, %1; isync" ::
 	    "r"(td->td_pcb->pcb_cpu.aim.usr_vsid), "r"(USER_SLB_SLBE));
 #else
 	PCPU_SET(curpmap, pm->pmap_phys);
@@ -1104,7 +1129,7 @@
 {
 	pmap_t	pm;
 
-	__asm __volatile("isync; slbie %0" :: "r"(USER_ADDR));
+	__asm __volatile("isync; slbie %0; isync" :: "r"(USER_ADDR));
 
 	pm = &td->td_proc->p_vmspace->vm_pmap;
 	CPU_CLR(PCPU_GET(cpuid), &pm->pm_active);
@@ -1956,7 +1981,7 @@
 	    (uintptr_t)uaddr >> ADDR_SR_SHFT;
 	curthread->td_pcb->pcb_cpu.aim.usr_vsid = slbv;
 #ifdef __powerpc64__
-	__asm __volatile ("slbie %0; slbmte %1, %2; isync" ::
+	__asm __volatile ("isync; slbie %0; slbmte %1, %2; isync" ::
 	    "r"(USER_ADDR), "r"(slbv), "r"(USER_SLB_SLBE));
 #else
 	__asm __volatile("mtsr %0,%1; isync" :: "n"(USER_SR), "r"(slbv));
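[Aside, as context for the comment's mention of mftb()%n_slbs: the
victim choice is made in slb_insert_kernel in sys/powerpc/aim/slb.c.
Below is a standalone sketch of that selection policy, not kernel
code; the n_slbs figure and the USER_SLB_SLOT value here are
illustrative assumptions rather than copies from the headers:

#include <stdint.h>

static int n_slbs = 64;          /* SLB size; 64 on a 970-class G5 */
#define USER_SLB_SLOT 0          /* assumed value of the reserved slot */

extern uint64_t mftb(void);      /* timebase read, as in cpufunc.h */

/*
 * When every kernel SLB slot already holds a valid entry, the low
 * bits of the timebase pick a victim, skipping the slot reserved
 * for the user segment. The timebase differs from boot to boot, so
 * the evictions (and thus the surviving kernel slb contents) differ
 * too -- matching the boot-to-boot variation in spill counts noted
 * in the comment above.
 */
static int
pick_victim_slot(void)
{
	int i = (int)(mftb() % (uint64_t)n_slbs);

	if (i == USER_SLB_SLOT)
		i = (i + 1) % n_slbs;    /* never evict the user slot */
	return (i);
}

]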
+ __asm __volatile("isync; slbmte %0, %1; isync" :: "r"(td->td_pcb->pcb_cpu.aim.usr_vsid), "r"(USER_SLB_SLBE)); #else PCPU_SET(curpmap, pm->pmap_phys); @@ -1104,7 +1129,7 @@ { pmap_t pm; =20 - __asm __volatile("isync; slbie %0" :: "r"(USER_ADDR)); + __asm __volatile("isync; slbie %0; isync" :: "r"(USER_ADDR)); =20 pm =3D &td->td_proc->p_vmspace->vm_pmap; CPU_CLR(PCPU_GET(cpuid), &pm->pm_active); @@ -1956,7 +1981,7 @@ (uintptr_t)uaddr >> ADDR_SR_SHFT; curthread->td_pcb->pcb_cpu.aim.usr_vsid =3D slbv; #ifdef __powerpc64__ - __asm __volatile ("slbie %0; slbmte %1, %2; isync" :: + __asm __volatile ("isync; slbie %0; slbmte %1, %2; isync" :: "r"(USER_ADDR), "r"(slbv), "r"(USER_SLB_SLBE)); #else __asm __volatile("mtsr %0,%1; isync" :: "n"(USER_SR), = "r"(slbv)); Index: /mnt/usr/src/sys/powerpc/aim/moea64_native.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /mnt/usr/src/sys/powerpc/aim/moea64_native.c (revision = 347003) +++ /mnt/usr/src/sys/powerpc/aim/moea64_native.c (working copy) @@ -406,7 +406,7 @@ */ =20 #ifdef __powerpc64__ - __asm __volatile ("slbia"); + __asm __volatile ("isync; slbia"); __asm __volatile ("slbmfee %0,%1; slbie %0;" : = "=3Dr"(seg0) : "r"(0)); =20 @@ -417,6 +417,7 @@ __asm __volatile ("slbmte %0, %1" ::=20 "r"(slb[i].slbv), "r"(slb[i].slbe));=20 } + __asm __volatile ("isync"); #else for (i =3D 0; i < 16; i++) mtsrin(i << ADDR_SR_SHFT, = kernel_pmap->pm_sr[i]); Index: /mnt/usr/src/sys/powerpc/aim/slb.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /mnt/usr/src/sys/powerpc/aim/slb.c (revision 347003) +++ /mnt/usr/src/sys/powerpc/aim/slb.c (working copy) @@ -457,7 +457,7 @@ /* If it is for this CPU, put it in the SLB right away */ if (pmap_bootstrapped) { /* slbie not required */ - __asm __volatile ("slbmte %0, %1" ::=20 + __asm __volatile ("isync; slbmte %0, %1; isync" ::=20 "r"(slbcache[i].slbv), "r"(slbcache[i].slbe));=20 } =20 Index: /mnt/usr/src/sys/powerpc/aim/trap_subr64.S =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /mnt/usr/src/sys/powerpc/aim/trap_subr64.S (revision 347003) +++ /mnt/usr/src/sys/powerpc/aim/trap_subr64.S (working copy) @@ -65,6 +65,7 @@ =20 li %r29, 0 /* Set the counter to zero */ =20 + isync slbia slbmfee %r31,%r29 =20 clrrdi %r31,%r31,28 @@ -71,6 +72,7 @@ slbie %r31 1: ld %r31, 0(%r28) /* Load SLB entry pointer */ cmpdi %r31, 0 /* If NULL, stop */ + isync beqlr =20 ld %r30, 0(%r31) /* Load SLBV */ @@ -96,6 +98,7 @@ /* Otherwise, set up SLBs */ li %r29, 0 /* Set the counter to zero */ =20 + isync slbia slbmfee %r31,%r29 =20 clrrdi %r31,%r31,28 @@ -105,6 +108,7 @@ =20 ld %r31, 8(%r28) /* Load SLBE */ cmpdi %r31, 0 /* If SLBE is not valid, stop */ + isync beqlr ld %r30, 0(%r28) /* Load SLBV */ slbmte %r30, %r31 /* Install SLB entry */ @@ -113,6 +117,7 @@ addi %r29, %r29, 1 cmpdi %r29, 64 /* Repeat if we are not at the = end */ blt 1b=20 + isync blr =20 /* Index: /mnt/usr/src/sys/powerpc/include/cpufunc.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= 
Index: /mnt/usr/src/sys/powerpc/include/cpufunc.h
===================================================================
--- /mnt/usr/src/sys/powerpc/include/cpufunc.h	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/include/cpufunc.h	(working copy)
@@ -155,15 +155,8 @@
 	return (tb);
 }
 
-static __inline void
-mttb(u_quad_t time)
-{
+// mttb moved to after intr_restore
 
-	mtspr(TBR_TBWL, 0);
-	mtspr(TBR_TBWU, (uint32_t)(time >> 32));
-	mtspr(TBR_TBWL, (uint32_t)(time & 0xffffffff));
-}
-
 static __inline void
 eieio(void)
 {
@@ -202,6 +195,19 @@
 	mtmsr(msr);
 }
 
+static __inline void
+mttb(u_quad_t time)
+{
+	const uint32_t high = time>>32;
+	const uint32_t low = time&0xffffffffu;
+
+	const register_t predisable_msr = intr_disable();
+	mtspr(TBR_TBWL, 0);
+	mtspr(TBR_TBWU, high);
+	mtspr(TBR_TBWL, low);
+	intr_restore(predisable_msr);
+}
+
 static __inline struct pcpu *
 get_pcpu(void)
 {
Index: /mnt/usr/src/sys/powerpc/powerpc/mp_machdep.c
===================================================================
--- /mnt/usr/src/sys/powerpc/powerpc/mp_machdep.c	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/powerpc/mp_machdep.c	(working copy)
@@ -231,6 +231,7 @@
 	if (mp_ncpus <= 1)
 		return;
 
+printf("cpu_mp_unleash: start / before mtx_init\n");
 	mtx_init(&ap_boot_mtx, "ap boot", NULL, MTX_SPIN);
 
 	cpus = 0;
@@ -239,11 +240,14 @@
 	tlb1_ap_prep();
 #endif
 	STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) {
+printf("cpu_mp_unleash: in cpu loop at start with cpus so far %d\n",cpus);
 		cpus++;
 		if (!pc->pc_bsp) {
 			if (bootverbose)
 				printf("Waking up CPU %d (dev=%x)\n",
 				    pc->pc_cpuid, (int)pc->pc_hwref);
+printf("Waking up CPU %d (dev=%x)\n",
+    pc->pc_cpuid, (int)pc->pc_hwref);
 
 			ret = platform_smp_start_cpu(pc);
 			if (ret == 0) {
@@ -252,6 +256,7 @@
 				DELAY(1000);
 			}
 		} else {
+printf("cpu_mp_unleash: bsp to be set awake\n");
 			pc->pc_awake = 1;
 		}
 		if (pc->pc_awake) {
@@ -259,13 +264,21 @@
 				printf("Adding CPU %d, hwref=%jx, awake=%x\n",
 				    pc->pc_cpuid, (uintmax_t)pc->pc_hwref,
 				    pc->pc_awake);
+printf("Adding CPU %d, hwref=%jx, awake=%x\n",
+    pc->pc_cpuid, (uintmax_t)pc->pc_hwref,
+    pc->pc_awake);
 			smp_cpus++;
 		} else
+{
+printf("cpu_mp_unleash: cpu skipped\n");
 			CPU_SET(pc->pc_cpuid, &stopped_cpus);
+}
 	}
 
 	ap_awake = 1;
 
+printf("cpu_mp_unleash: before timebase code\n");
+
 	/* Provide our current DEC and TB values for APs */
 	ap_timebase = mftb() + 10;
 	__asm __volatile("msync; isync");
@@ -275,8 +288,10 @@
 
 	platform_smp_timebase_sync(ap_timebase, 0);
 
+printf("cpu_mp_unleash: before looping until expected are awake\n");
 	while (ap_awake < smp_cpus)
 		;
+printf("cpu_mp_unleash: looped until expected were awake\n");
 
 	if (smp_cpus != cpus || cpus != mp_ncpus) {
 		printf("SMP: %d CPUs found; %d CPUs usable; %d CPUs woken\n",
@@ -286,8 +301,10 @@
 	if (smp_cpus > 1)
 		atomic_store_rel_int(&smp_started, 1);
 
+printf("cpu_mp_unleash: before DELAY\n");
 	/* Let the APs get into the scheduler */
 	DELAY(10000);
+printf("cpu_mp_unleash: after DELAY\n");
 
 }
 
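[Aside on the mttb rework in cpufunc.h above: the 64-bit timebase can
only be written as three 32-bit SPR writes, with TBL parked at zero
first so an increment cannot carry into TBU mid-update. The point of
the intr_disable/intr_restore wrapping is to keep an interrupt from
landing between those writes, where the timebase is torn. A sketch of
the guarded sequence with stand-in names (the real code uses
register_t and the mtspr macro with TBR_TBWL/TBR_TBWU):

#include <stdint.h>

typedef uint64_t msr_t;            /* stand-in for register_t */

extern msr_t intr_disable(void);   /* as in powerpc cpufunc.h */
extern void intr_restore(msr_t msr);
extern void write_tbl(uint32_t v); /* stand-in: mtspr(TBR_TBWL, v) */
extern void write_tbu(uint32_t v); /* stand-in: mtspr(TBR_TBWU, v) */

/*
 * Guarded timebase write mirroring the patched mttb(): zero TBL so
 * TBU cannot pick up a carry mid-update, then set TBU and TBL, all
 * with external interrupts masked so no handler can run against a
 * half-written timebase during the ap_timebase handoff above.
 */
static void
mttb_guarded(uint64_t time)
{
	const msr_t saved = intr_disable();

	write_tbl(0);
	write_tbu((uint32_t)(time >> 32));
	write_tbl((uint32_t)(time & 0xffffffffu));
	intr_restore(saved);
}

]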
Index: /mnt/usr/src/sys/powerpc/powerpc/trap.c
===================================================================
--- /mnt/usr/src/sys/powerpc/powerpc/trap.c	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/powerpc/trap.c	(working copy)
@@ -453,8 +453,8 @@
 #if defined(__powerpc64__) && defined(AIM)
 	case EXC_DSE:
 		if (td->td_pcb->pcb_cpu.aim.usr_vsid != 0 &&
-		    (frame->dar & SEGMENT_MASK) == USER_ADDR) {
-			__asm __volatile ("slbmte %0, %1" ::
+		    (frame->dar & SEGMENT_MASK) == USER_ADDR) { // no slbie needed?
+			__asm __volatile ("isync; slbmte %0, %1; isync" ::
 			    "r"(td->td_pcb->pcb_cpu.aim.usr_vsid),
 			    "r"(USER_SLB_SLBE));
 			return;
@@ -712,8 +712,8 @@
 	 * Speculatively restore last user SLB segment, which we know is
 	 * invalid already, since we are likely to do copyin()/copyout().
 	 */
-	if (td->td_pcb->pcb_cpu.aim.usr_vsid != 0)
-		__asm __volatile ("slbmte %0, %1; isync" ::
+	if (td->td_pcb->pcb_cpu.aim.usr_vsid != 0) // no slbie needed?
+		__asm __volatile ("isync; slbmte %0, %1; isync" ::
 		    "r"(td->td_pcb->pcb_cpu.aim.usr_vsid), "r"(USER_SLB_SLBE));
 #endif
 

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)