Date: Tue, 26 Feb 2019 13:11:52 -0800
From: Mark Millard <marklmi@yahoo.com>
To: Justin Hibbits <chmeeedalf@gmail.com>
Cc: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Dennis Clarke <dclarke@blastwave.org>
Subject: Re: An experimental hack that appears to allow old PowerMacG5 4-core (system total) system to boot reliably (head -r343884 based context)
Message-ID: <E0F61356-1B3F-4894-9979-A5D27D7E4686@yahoo.com>
In-Reply-To: <466B6E08-5631-41FB-A1FD-263C27519F65@yahoo.com>
References: <AE42887B-3B50-452F-85AA-CCB382179124@yahoo.com> <CAHSQbTBX3UR0M6V4sOjO9KFMWbi32bCR5TmBj6kF%2BgF9hF_kLg@mail.gmail.com> <466B6E08-5631-41FB-A1FD-263C27519F65@yahoo.com>
[I explicitly note that my hack is racy. It appears that I've finally
had an example.]

On 2019-Feb-24, at 13:50, Mark Millard <marklmi at yahoo.com> wrote:

> On 2019-Feb-24, at 13:07, Justin Hibbits <chmeeedalf at gmail.com> wrote:
>
>> On Sat, Feb 23, 2019 at 1:36 PM Mark Millard <marklmi@yahoo.com> wrote:
>>>
>>> For sys/powerpc/aim/mp_cpudep.c 's cpudep_ap_bootstrap I added as shown below:
>>>
>>> +extern void hack_into_slb_if_needed(void* vap); // HACK!!!
>>> +
>>>  uintptr_t
>>>  cpudep_ap_bootstrap(void)
>>>  {
>>>  . . .
>>> +        hack_into_slb_if_needed(pcpup->pc_curpcb); // HACK!!!
>>> +
>>>          sp = pcpup->pc_curpcb->pcb_sp;

In the above, after the implicit slb_insert_kernel, but before the
pcpup->pc_curpcb->pcb_sp access attempt, the slb entry could be
replaced again. There are, after all, other threads in operation
before SI_SUB_SMP starts:

        SI_SUB_KTHREAD_INIT     = 0xe000000,    /* init process*/
        SI_SUB_KTHREAD_PAGE     = 0xe400000,    /* pageout daemon*/
        SI_SUB_KTHREAD_VM       = 0xe800000,    /* vm daemon*/
        SI_SUB_KTHREAD_BUF      = 0xea00000,    /* buffer daemon*/
        SI_SUB_KTHREAD_UPDATE   = 0xec00000,    /* update daemon*/
        SI_SUB_KTHREAD_IDLE     = 0xee00000,    /* idle procs*/
#ifndef EARLY_AP_STARTUP
        SI_SUB_SMP              = 0xf000000,    /* start the APs*/
#endif

I've finally had one boot hang-up, apparently from this happening.

>>> and in src/sys/powerpc/aim/slb.c I added an implementation:
>>>
>>> +void hack_into_slb_if_needed(void* vap); // HACK!!!
>>> +void hack_into_slb_if_needed(void* vap) // HACK!!!
>>> +{ // HACK!!!
>>> +        struct slb *cache= PCPU_GET(aim.slb);
>>> +        vm_offset_t va= (vm_offset_t)vap;
>>> +        uint64_t slbv= kernel_va_to_slbv(va);
>>> +        uint64_t esid= va>>ADDR_SR_SHFT;
>>> +        uint64_t slbe= (esid<<SLBE_ESID_SHIFT) | SLBE_VALID;
>>> +        int i;
>>> +
>>> +        for (i = 0; i < n_slbs; i++) {
>>> +                if (i == USER_SLB_SLOT)
>>> +                        continue;
>>> +                if (cache[i].slbe == (slbe | i))
>>> +                        break;
>>> +        }
>>> +
>>> +        if (i==n_slbs)
>>> +                slb_insert_kernel(slbe,slbv);
>>> +} // HACK!!!
>>> +
>>>
>>> So far I've not had any boot hang-ups after this.
>>>
>>> Given the random nature of the hang-ups it will be a
>>> while before I conclude for sure how reliable this
>>> change makes booting, but so far so good.
>>>
>>> (I recognize that the "break" could be "return"
>>> and then the "if (i==n_slbs)" would not be
>>> needed.)
>>>
>>>
>>> Other issues not fixed by this:
>>>
>>> This does not change the buf*daemon* randomly getting
>>> hung up (and so timing out on shutdown). This appears
>>> to be the same issue that leads to the fans sometimes
>>> starting to run full-rate because of pmac_thermal
>>> being hung up.
>>>
>>> For buf*daemon* "top -SHIopid" before shutdown shows
>>> just the ones that will not hang up. The same goes for
>>> seeing beforehand for pmac_thermal vs. the fans.
>>>
>>> ===
>>> Mark Millard
>>
>> Hi Mark,
>>
>> Fantastic work tracking this down!  So the problem is we now can fault
>> when accessing KVA space.  I think we should allow this, otherwise we
>> can hamper performance with reduced KVA size.  I'll have to think
>> about how best to do this.  Would you be willing to test patches I
>> come up with?
>
> I'll try to test whatever updates you want but there may be some
> issues with timeliness.
>
> The reason for the "sometimes" boot-failure is that the entry in the
> slb for the PCB/stack for the CPU being added has sometimes been
> replaced already before the CPU the pcb is for has been sufficiently
> configured to allow automatic handling --and other times has not
> yet been replaced: the random slb replacement mechanism.
>
> There already is code to handle slb entry replacements but it does
> not work for a CPU still being set up (at the stage of the
> sometimes failure). At least that is what I expect for:
>
> # grep -r "handle_kernel_slb_spill" /usr/src/sys/powerpc/
> /usr/src/sys/powerpc/aim/trap_subr64.S: bl      handle_kernel_slb_spill
> /usr/src/sys/powerpc/powerpc/trap.c:    void handle_kernel_slb_spill(int, register_t, register_t);
> /usr/src/sys/powerpc/powerpc/trap.c:handle_kernel_slb_spill(int type, register_t dar, register_t srr0)
>
> So my hack was to separately do the potential replacement in that
> early time frame to allow the configuration for the CPU to get
> far enough along for the existing mechanism to work. (At least
> that is what I expect that I did.)
>
> So far I've had no boot failures of any kind with the hack.
> I've removed the hacks for reporting information and things
> still work.
>
> But I've not tried anything extensive after booting because
> things like buf*daemon* threads and pmac_thermal are randomly
> hanging up in/at:
>
> mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c
> (mi_switch seems to have called sched_switch based on the
> "+0x134" and the code in that area --but sched_switch is not
> listed)
>
> I've no clue what is safe when one or more buf*daemon* threads
> make no progress.
>
> For shutdown that frequently leads to timeouts for stopping some
> buf*daemon* threads (when all 8 time out it takes about 8 minutes).
> The buf*daemon* threads that fail are the ones that "top -SHIopid"
> no longer shows.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
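
For readers following the thread, here is a minimal user-space sketch of
the check-then-insert logic in the quoted hack_into_slb_if_needed, using
the "return" form the message mentions. It is only a model under stated
assumptions: every model_* name and MODEL_* constant is hypothetical and
is not the FreeBSD definition, the slbv value is not modeled, and the
caller picks the victim slot explicitly, whereas the message describes
the kernel's replacement as random.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins only; NOT the FreeBSD kernel definitions. */
#define MODEL_N_SLBS          64            /* slots in the modeled cache */
#define MODEL_USER_SLB_SLOT   0             /* slot skipped by the search */
#define MODEL_ADDR_SR_SHFT    28            /* assumed segment shift */
#define MODEL_SLBE_ESID_SHIFT 28            /* assumed ESID position */
#define MODEL_SLBE_VALID      0x08000000ULL /* assumed valid bit */

struct model_slb {
	uint64_t slbv;
	uint64_t slbe;
};

/* Build the slbe value (without the slot index) for the segment of va. */
static uint64_t
model_va_to_slbe(uint64_t va)
{
	uint64_t esid = va >> MODEL_ADDR_SR_SHFT;

	return ((esid << MODEL_SLBE_ESID_SHIFT) | MODEL_SLBE_VALID);
}

/* Return true if va's segment is cached in some non-user slot. */
bool
model_slb_has_va(const struct model_slb *cache, uint64_t va)
{
	uint64_t slbe = model_va_to_slbe(va);
	int i;

	for (i = 0; i < MODEL_N_SLBS; i++) {
		if (i == MODEL_USER_SLB_SLOT)
			continue;
		if (cache[i].slbe == (slbe | (uint64_t)i))
			return (true);	/* early return replaces "break" */
	}
	return (false);
}

/*
 * Insert va's segment into slot "victim" unless it is already cached.
 * Returns true if an insert was performed.
 */
bool
model_slb_ensure_va(struct model_slb *cache, uint64_t va, int victim)
{
	assert(victim > MODEL_USER_SLB_SLOT && victim < MODEL_N_SLBS);

	if (model_slb_has_va(cache, va))
		return (false);
	cache[victim].slbe = model_va_to_slbe(va) | (uint64_t)victim;
	cache[victim].slbv = 0;	/* slbv is not modeled here */
	return (true);
}
```

In this model, a second model_slb_ensure_va call for the same address
performs no insert, and a later insert of a different segment into the
same victim slot evicts the first entry again. That eviction between the
check and the later access is the window the message calls racy.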