Date:      Sat, 16 Feb 2019 14:37:36 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        Justin Hibbits <chmeeedalf@gmail.com>
Cc:        Mark Millard via freebsd-ppc <freebsd-ppc@freebsd.org>
Subject:   Re: Some evidence about the PowerMac G5 multiprocessor boot hang ups with the modern VM_MAX_KERNEL_ADDRESS value [found yet more staging info]
Message-ID:  <1C94675F-98DE-421D-83E3-0C79206B8F1B@yahoo.com>
In-Reply-To: <BE2FA129-6B50-49AB-AA58-17D0B6E4B8AB@yahoo.com>
References:  <11680D15-D43D-4115-AF4F-5F6E4E0022C9@yahoo.com> <9FBCA729-CE80-44CD-8873-431853E55231@yahoo.com> <1F3411CF-3D28-43C0-BEF1-4672B5CC1543@yahoo.com> <20190215151710.35545a26@ralga.knownspace> <6445CE54-26AA-4E21-B17E-921D72D4081A@yahoo.com> <20190215160942.1b282f71@ralga.knownspace> <744610C7-90EB-42A0-8B08-AFA0F12E5994@yahoo.com> <20190215180421.61afcae3@ralga.knownspace> <C35575A8-6316-40CB-B6E7-B2FB7F438EA4@yahoo.com> <518C5B96-75C4-4C24-BDEE-68A542242CA3@yahoo.com> <A800DDED-B1A1-4ACF-87B5-AFEA683F48F2@yahoo.com> <BE2FA129-6B50-49AB-AA58-17D0B6E4B8AB@yahoo.com>

[I referred to the 2 resets where I should have referred to
the point between the last reset and the pc_awake loop. So
fixing that . . .]

On 2019-Feb-16, at 13:15, Mark Millard <marklmi at yahoo.com> wrote:

> [I got a successful boot and so report its messages
> from starting the CPUs. It has one interesting value
> displayed.]
>
> On 2019-Feb-16, at 12:49, Mark Millard <marklmi at yahoo.com> wrote:
>
>> [I added to moea64_cpu_bootstrap_native to see
>> more staging information.]
>>
>> On 2019-Feb-16, at 12:07, Mark Millard <marklmi at yahoo.com> wrote:
>>
>>> [I needed to allow more time after the 2 resets before
>>> having CPU 0 look at the memory. It was reporting
>>> older values instead of my added writes. The odd
>>> non-zero value was from before the activity of interest.]
>>>
>>> I start with the new result found, then give supporting
>>> material.
>>>
>>> I've now seen hangs with:
>>>
>>> *(unsigned long*)0xc0000000000000f0=0x10
>>>
>>> for CPU 3. So the following completed:
>>>
>>> void
>>> cpudep_ap_early_bootstrap(void)
>>> {
>>> #ifndef __powerpc64__
>>>      register_t reg;
>>> #endif
>>>
>>>      switch (mfpvr() >> 16) {
>>>      case IBM970:
>>>      case IBM970FX:
>>>      case IBM970MP:
>>>              /* Restore HID4 and HID5, which are necessary for the MMU */
>>>
>>> #ifdef __powerpc64__
>>>              mtspr(SPR_HID4, bsp_state[2]); powerpc_sync(); isync();
>>>              mtspr(SPR_HID5, bsp_state[3]); powerpc_sync(); isync();
>>> #else
>>>              __asm __volatile("ld %0, 16(%2); sync; isync;   \
>>>                  mtspr %1, %0; sync; isync;"
>>>                  : "=r"(reg) : "K"(SPR_HID4), "b"(bsp_state));
>>>              __asm __volatile("ld %0, 24(%2); sync; isync;   \
>>>                  mtspr %1, %0; sync; isync;"
>>>                  : "=r"(reg) : "K"(SPR_HID5), "b"(bsp_state));
>>> #endif
>>>              powerpc_sync();
>>>              break;
>>>      case IBMPOWER8:
>>>      case IBMPOWER8E:
>>>      case IBMPOWER9:
>>> #ifdef __powerpc64__
>>>              if (mfmsr() & PSL_HV) {
>>>                      isync();
>>>                      /*
>>>                       * Direct interrupts to SRR instead of HSRR and
>>>                       * reset LPCR otherwise
>>>                       */
>>>                      mtspr(SPR_LPID, 0);
>>>                      isync();
>>>
>>>                      mtspr(SPR_LPCR, lpcr);
>>>                      isync();
>>>              }
>>> #endif
>>>              break;
>>>      }
>>>
>>>      __asm __volatile("mtsprg 0, %0" :: "r"(ap_pcpu));
>>>      powerpc_sync();
>>>
>>>      *(unsigned long*)0xc0000000000000f0 = 0x10; // HACK!!!
>>>      powerpc_sync(); // HACK!!!
>>> }
>>>
>>> but the following (and later) did not complete:
>>>
>>> void
>>> pmap_cpu_bootstrap(int ap)
>>> {
>>>      /*
>>>       * No KTR here because our console probably doesn't work yet
>>>       */
>>>
>>>      return (MMU_CPU_BOOTSTRAP(mmu_obj, ap));
>>>
>>>      *(unsigned long*)0xc0000000000000f0 = 0x20; // HACK!!!
>>>      powerpc_sync(); // HACK!!!
>>> }
>>>
>>>
>>> . . .
>>
>> The below additions to moea64_cpu_bootstrap_native
>> lead to:
>>
>> *(unsigned long*)0xc0000000000000f0=0x25
>>
>> which indicates that moea64_cpu_bootstrap_native
>> got to its end but pmap_cpu_bootstrap (the caller
>> via MMU_CPU_BOOTSTRAP) did not record its:
>>
>> *(unsigned long*)0xc0000000000000f0 = 0x20;
>>
>> from after the call. moea64_cpu_bootstrap_native
>> (and MMU_CPU_BOOTSTRAP) seems to have trouble
>> returning to pmap_cpu_bootstrap.
>>
>>
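One thing worth ruling out before blaming the return path: in the
pmap_cpu_bootstrap quoted above, the 0x20 store sits after the return
statement, so it cannot execute whether or not the call returns. The
following is a minimal user-space sketch of that effect, not kernel
code. The names stage_word, callee, marker_after_return,
marker_before_return, run_after, and run_before are all hypothetical
stand-ins; stage_word plays the role of the word at
0xc0000000000000f0 and callee plays the role of MMU_CPU_BOOTSTRAP.

```c
/* Stand-in for the stage word at 0xc0000000000000f0 (user space only). */
static volatile unsigned long stage_word;

/* Stand-in for MMU_CPU_BOOTSTRAP: records the callee's last stage. */
static void
callee(void)
{
    stage_word = 0x25;
}

/* Same shape as the quoted pmap_cpu_bootstrap: marker after the return. */
static void
marker_after_return(void)
{
    callee();
    return;
    stage_word = 0x20;  /* unreachable: never executed */
}

/* Marker moved so that it runs once the callee has returned. */
static void
marker_before_return(void)
{
    callee();
    stage_word = 0x20;
}

/* Helpers that report the final stage value for each variant. */
unsigned long
run_after(void)
{
    stage_word = 0;
    marker_after_return();
    return (stage_word);
}

unsigned long
run_before(void)
{
    stage_word = 0;
    marker_before_return();
    return (stage_word);
}
```

Here run_after() leaves the callee's 0x25 in place while run_before()
ends with 0x20. So a 0x25 that is never replaced by 0x20 does not, by
itself, show that the call failed to return.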
>> The below // HACK!!! lines are what I added:
>>
>> static void
>> moea64_cpu_bootstrap_native(mmu_t mmup, int ap)
>> {
>>       int i = 0;
>>       #ifdef __powerpc64__
>>       struct slb *slb = PCPU_GET(aim.slb);
>>       register_t seg0;
>>       #endif
>>
>>       /*
>>        * Initialize segment registers and MMU
>>        */
>>=20
>>       mtmsr(mfmsr() & ~PSL_DR & ~PSL_IR);
>>
>>       *(unsigned long*)0xc0000000000000f0 = 0x21; // HACK!!!
>>       powerpc_sync(); // HACK!!!
>>
>>       /*
>>        * Install kernel SLB entries
>>        */
>>
>>       #ifdef __powerpc64__
>>               __asm __volatile ("slbia");
>>               __asm __volatile ("slbmfee %0,%1; slbie %0;" : "=r"(seg0) :
>>                   "r"(0));
>>
>>               *(unsigned long*)0xc0000000000000f0 = 0x22; // HACK!!!
>>               powerpc_sync(); // HACK!!!
>>
>>               for (i = 0; i < n_slbs; i++) {
>>                       if (!(slb[i].slbe & SLBE_VALID))
>>                               continue;
>>
>>                       __asm __volatile ("slbmte %0, %1" ::
>>                           "r"(slb[i].slbv), "r"(slb[i].slbe));
>>               }
>>
>>               *(unsigned long*)0xc0000000000000f0 = 0x23; // HACK!!!
>>               powerpc_sync(); // HACK!!!
>>       #else
>>               for (i = 0; i < 16; i++)
>>                       mtsrin(i << ADDR_SR_SHFT, kernel_pmap->pm_sr[i]);
>>       #endif
>>
>>       /*
>>        * Install page table
>>        */
>>
>>       __asm __volatile ("ptesync; mtsdr1 %0; isync"
>>           :: "r"(((uintptr_t)moea64_pteg_table & ~DMAP_BASE_ADDRESS)
>>                    | (uintptr_t)(flsl(moea64_pteg_mask >> 11))));
>>
>>       *(unsigned long*)0xc0000000000000f0 = 0x24; // HACK!!!
>>       powerpc_sync(); // HACK!!!
>>
>>       tlbia();
>>
>>       *(unsigned long*)0xc0000000000000f0 = 0x25; // HACK!!!
>>       powerpc_sync(); // HACK!!!
>> }
>>
>
> From a successful boot, for reference:
>
> Adding CPU 0, hwref=cd38, awake=1
> Trying to mount root from ufs:/dev/ufs/FBSDG5L2rootfs [rw,noatime]...
> Waking up CPU 3 (dev=c480)
> powermac_smp_start_cpu 's OF_getprop for CPU 3, hwref=c480, awake=0: res=4, reset=8c
> powermac_smp_start_cpu for CPU 3, hwref=c480, awake=0: rstvec_virtbase=0xe000000087fd2000
> powermac_smp_start_cpu for CPU 3, hwref=c480, awake=0: rstvec=0xe000000087fd208c
> Before reset 4&0 for CPU 3, hwref=c480, awake=0
> After reset 4&0 for CPU 3, hwref=c480, awake=0, *(unsigned long*)0xc0000000000000e0=0x0, *(unsigned long*)0xc0000000000000f0=0x25
> After attempted wait for awake CPU 3, hwref=c480, awake=1, *(unsigned long*)0xc0000000000000e0=0xc0000000016c6100, *(unsigned long*)0xc0000000000000f0=0x51
> cpu_mp_unleash attempting to wait for pc_awake: CPU 3, hwref=c480, awake=1
> cpu_mp_unleash after platform_smp_start_cpu and waiting: CPU 3, hwref=c480, awake=1
> Adding CPU 3, hwref=c480, awake=1
> Waking up CPU 2 (dev=c768)
> powermac_smp_start_cpu 's OF_getprop for CPU 2, hwref=c768, awake=0: res=4, reset=8b
> powermac_smp_start_cpu for CPU 2, hwref=c768, awake=0: rstvec=0xe000000087fd208b
> Before reset 4&0 for CPU 2, hwref=c768, awake=0
> After reset 4&0 for CPU 2, hwref=c768, awake=0, *(unsigned long*)0xc0000000000000e0=0xc0000000016c6100, *(unsigned long*)0xc0000000000000f0=0x51
> After attempted wait for awake CPU 2, hwref=c768, awake=1, *(unsigned long*)0xc0000000000000e0=0xc0000000016c5100, *(unsigned long*)0xc0000000000000f0=0x51
> cpu_mp_unleash attempting to wait for pc_awake: CPU 2, hwref=c768, awake=1
> cpu_mp_unleash after platform_smp_start_cpu and waiting: CPU 2, hwref=c768, awake=1
> Adding CPU 2, hwref=c768, awake=1
> Waking up CPU 1 (dev=ca50)
> powermac_smp_start_cpu 's OF_getprop for CPU 1, hwref=ca50, awake=0: res=4, reset=8a
> powermac_smp_start_cpu for CPU 1, hwref=ca50, awake=0: rstvec=0xe000000087fd208a
> Before reset 4&0 for CPU 1, hwref=ca50, awake=0
> After reset 4&0 for CPU 1, hwref=ca50, awake=0, *(unsigned long*)0xc0000000000000e0=0xc0000000016c5100, *(unsigned long*)0xc0000000000000f0=0x51
> After attempted wait for awake CPU 1, hwref=ca50, awake=1, *(unsigned long*)0xc0000000000000e0=0xc0000000016c4100, *(unsigned long*)0xc0000000000000f0=0x51
> cpu_mp_unleash attempting to wait for pc_awake: CPU 1, hwref=ca50, awake=1
> cpu_mp_unleash after platform_smp_start_cpu and waiting: CPU 1, hwref=ca50, awake=1
> Adding CPU 1, hwref=ca50, awake=1
> machdep_ap_bootstrap before ap_boot_mtx lock: AP CPU #3 launched
> machdep_ap_bootstrap before ap_boot_mtx lock: AP CPU #2 launched
> machdep_ap_bootstrap before ap_boot_mtx lock: AP CPU #1 launched
> SMP: AP CPU #3 launched
> SMP: AP CPU #2 launched
> SMP: AP CPU #1 launched
> machdep_ap_bootstrap after smp_started!=0: AP CPU #3 launched
> machdep_ap_bootstrap after smp_started!=0: AP CPU #2 launched
> machdep_ap_bootstrap after smp_started!=0: AP CPU #1 launched

The below has the bad reference to the 2 resets:

> Interestingly the 0x25 shows before the CPU 3 tied resets instead of
> the later 0x20 (from pmap_cpu_bootstrap), 0x30 (from
> cpudep_ap_bootstrap), or 0x40 (from cpudep_ap_setup). The 0x51
> does show after the pc_awake loop.
>

The 0x25 was shown after the 2 resets but before the pc_awake
wait loop. The 0x51 was seen after the pc_awake wait loop.
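
For reading the reported values, here is a small lookup sketch that
maps each stage value to the place this thread says wrote it. It is
only an illustration: struct stage_desc, stage_table, and stage_name
are hypothetical names, and the descriptions are taken from the quoted
HACK lines. 0x51 is deliberately left out because its write site is
not quoted in this message, so it decodes as "unknown".

```c
#include <stddef.h>

/* Hypothetical table entry: a stage value and where it was written. */
struct stage_desc {
    unsigned long value;
    const char *where;
};

/* Only the stage values whose write sites are named in this thread. */
static const struct stage_desc stage_table[] = {
    { 0x10, "cpudep_ap_early_bootstrap: end" },
    { 0x20, "pmap_cpu_bootstrap: after MMU_CPU_BOOTSTRAP" },
    { 0x21, "moea64_cpu_bootstrap_native: after mtmsr" },
    { 0x22, "moea64_cpu_bootstrap_native: after slbia/slbie" },
    { 0x23, "moea64_cpu_bootstrap_native: after slbmte loop" },
    { 0x24, "moea64_cpu_bootstrap_native: after mtsdr1" },
    { 0x25, "moea64_cpu_bootstrap_native: after tlbia (end)" },
    { 0x30, "cpudep_ap_bootstrap" },
    { 0x40, "cpudep_ap_setup" },
};

/* Return the description for a stage value, or "unknown". */
const char *
stage_name(unsigned long value)
{
    size_t i;

    for (i = 0; i < sizeof(stage_table) / sizeof(stage_table[0]); i++) {
        if (stage_table[i].value == value)
            return (stage_table[i].where);
    }
    return ("unknown");
}
```

With this, the 0x25 seen after the resets decodes to the end of
moea64_cpu_bootstrap_native.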

I'm going to delete the code reporting a bunch of information
that has been stable for both successful boots and hang-up
boots (for the modern VM_MAX_KERNEL_ADDRESS value).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)