From: Mark Millard
Date: Tue, 26 Feb 2019 13:11:52 -0800
To: Justin Hibbits
Cc: FreeBSD PowerPC ML, Dennis Clarke
Subject: Re: An experimental hack that appears to allow old PowerMacG5 4-core (system total) system to boot reliably (head -r343884 based context)
In-Reply-To: <466B6E08-5631-41FB-A1FD-263C27519F65@yahoo.com>
References: <466B6E08-5631-41FB-A1FD-263C27519F65@yahoo.com>
List-Id: Porting FreeBSD to the PowerPC

[I explicitly note that my hack is racy. It appears that I've finally
had an example.]

On 2019-Feb-24, at 13:50, Mark Millard wrote:

> On 2019-Feb-24, at 13:07, Justin Hibbits wrote:
>
>> On Sat, Feb 23, 2019 at 1:36 PM Mark Millard wrote:
>>>
>>> For sys/powerpc/aim/mp_cpudep.c 's cpudep_ap_bootstrap I added as shown below:
>>>
>>> +extern void hack_into_slb_if_needed(void* vap); // HACK!!!
>>> +
>>> uintptr_t
>>> cpudep_ap_bootstrap(void)
>>> {
>>> . . .
>>> +       hack_into_slb_if_needed(pcpup->pc_curpcb); // HACK!!!
>>> +
>>>         sp = pcpup->pc_curpcb->pcb_sp;

In the above, after the implicit slb_insert_kernel, but before the
pcpup->pc_curpcb-> dereference is attempted, the slb entry could be
replaced again. There are, after all, other threads in operation
before SI_SUB_SMP starts:

        SI_SUB_KTHREAD_INIT     = 0xe000000,    /* init process*/
        SI_SUB_KTHREAD_PAGE     = 0xe400000,    /* pageout daemon*/
        SI_SUB_KTHREAD_VM       = 0xe800000,    /* vm daemon*/
        SI_SUB_KTHREAD_BUF      = 0xea00000,    /* buffer daemon*/
        SI_SUB_KTHREAD_UPDATE   = 0xec00000,    /* update daemon*/
        SI_SUB_KTHREAD_IDLE     = 0xee00000,    /* idle procs*/
#ifndef EARLY_AP_STARTUP
        SI_SUB_SMP              = 0xf000000,    /* start the APs*/
#endif

I've finally had one boot hang-up, apparently from this happening.
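The window described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every name in it (toy_slb, toy_ensure, toy_pcb_survives, TOY_N_SLOTS) is hypothetical and not part of the kernel, the cache has just 4 slots, and a round-robin victim pointer stands in for the random replacement so that the outcome is repeatable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: none of these names exist in the FreeBSD kernel. */
#define TOY_N_SLOTS 4

struct toy_slb {
	uint64_t esid[TOY_N_SLOTS];
	bool     valid[TOY_N_SLOTS];
	unsigned next_victim;	/* round-robin stand-in for random replacement */
};

/* True when some valid slot currently maps the given segment. */
static bool
toy_lookup(const struct toy_slb *c, uint64_t esid)
{
	for (size_t i = 0; i < TOY_N_SLOTS; i++)
		if (c->valid[i] && c->esid[i] == esid)
			return (true);
	return (false);
}

/* Unconditionally place esid in the victim slot, evicting what was there. */
static void
toy_insert(struct toy_slb *c, uint64_t esid)
{
	unsigned slot = c->next_victim;

	assert(slot < TOY_N_SLOTS);
	c->esid[slot] = esid;
	c->valid[slot] = true;
	c->next_victim = (slot + 1) % TOY_N_SLOTS;
}

/* The check-then-insert step: insert only when no slot maps the segment. */
static void
toy_ensure(struct toy_slb *c, uint64_t esid)
{
	if (!toy_lookup(c, esid))
		toy_insert(c, esid);
}

/*
 * Returns true when the PCB segment is still mapped at the time of use,
 * given `intervening` unrelated insertions between the check and the use.
 */
static bool
toy_pcb_survives(unsigned intervening)
{
	struct toy_slb c = { 0 };
	const uint64_t pcb_esid = 0x100;	/* arbitrary illustrative value */

	toy_ensure(&c, pcb_esid);		/* check + insert */
	for (unsigned k = 0; k < intervening; k++)
		toy_insert(&c, 0x200 + k);	/* other activity in the window */
	return (toy_lookup(&c, pcb_esid));	/* the later dereference */
}
```

In this model the entry survives up to 3 intervening insertions and is gone after the 4th, which is the shape of the problem: the check and the later use are not atomic with respect to replacement.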
>>> and in src/sys/powerpc/aim/slb.c I added an implementation:
>>>
>>> +void hack_into_slb_if_needed(void* vap); // HACK!!!
>>> +void hack_into_slb_if_needed(void* vap) // HACK!!!
>>> +{ // HACK!!!
>>> +       struct slb *cache= PCPU_GET(aim.slb);
>>> +       vm_offset_t va= (vm_offset_t)vap;
>>> +       uint64_t slbv= kernel_va_to_slbv(va);
>>> +       uint64_t esid= va>>ADDR_SR_SHFT;
>>> +       uint64_t slbe= (esid<
>>> +       int i;
>>> +
>>> +       for (i = 0; i < n_slbs; i++) {
>>> +               if (i == USER_SLB_SLOT)
>>> +                       continue;
>>> +               if (cache[i].slbe == (slbe | i))
>>> +                       break;
>>> +       }
>>> +
>>> +       if (i==n_slbs)
>>> +               slb_insert_kernel(slbe,slbv);
>>> +} // HACK!!!
>>> +
>>>
>>> So far I've not had any boot hang-ups after this.
>>>
>>> Given the random nature of the hang-ups it will be a
>>> while before I conclude for sure how reliable this
>>> change makes booting, but so far so good.
>>>
>>> (I recognize that the "break" could be "return"
>>> and then the "if (i==n_slbs)" would not be
>>> needed.)
>>>
>>>
>>> Other issues not fixed by this:
>>>
>>> This does not change the buf*daemon* randomly getting
>>> hung up (and so timing out on shutdown). This appears
>>> to be the same issue that leads to the fans sometimes
>>> starting to run full-rate because of pmac_thermal
>>> being hung up.
>>>
>>> For buf*daemon* "top -SHIopid" before shutdown shows
>>> just the ones that will not hang up. The same goes for
>>> seeing beforehand for pmac_thermal vs. the fans.
>>>
>>> ===
>>> Mark Millard
>>
>> Hi Mark,
>>
>> Fantastic work tracking this down! So the problem is we now can fault
>> when accessing KVA space. I think we should allow this, otherwise we
>> can hamper performance with reduced KVA size. I'll have to think
>> about how best to do this. Would you be willing to test patches I
>> come up with?
>
> I'll try to test whatever updates you want but there may be some
> issues with timeliness.
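The parenthetical remark in the quoted patch, that the "break" could be a "return", can be sketched in user space as follows. This is a hedged illustration, not the kernel code: toy_entry, toy_slbe_for, toy_slb_has and all TOY_ constants are hypothetical stand-ins, and their numeric values are illustrative assumptions rather than the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; the values are illustrative assumptions only. */
#define TOY_N_SLBS	8
#define TOY_USER_SLOT	0
#define TOY_SEG_SHIFT	28
#define TOY_VALID	0x08000000ULL

struct toy_entry {
	uint64_t slbe;
};

/* Tag a slot would carry for va's segment, before the slot index is or'ed in. */
static uint64_t
toy_slbe_for(uint64_t va)
{
	uint64_t esid = va >> TOY_SEG_SHIFT;

	return ((esid << TOY_SEG_SHIFT) | TOY_VALID);
}

/* Early-return search: true as soon as a non-user slot matches. */
static bool
toy_slb_has(const struct toy_entry *cache, uint64_t va)
{
	uint64_t slbe = toy_slbe_for(va);

	assert(cache != NULL);
	for (int i = 0; i < TOY_N_SLBS; i++) {
		if (i == TOY_USER_SLOT)
			continue;	/* the user slot is never considered */
		if (cache[i].slbe == (slbe | (uint64_t)i))
			return (true);
	}
	return (false);
}
```

With this shape a caller would do the insertion only when toy_slb_has() returns false, so no separate comparison of the loop index against the slot count is needed afterwards.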
>
>
>
> The reason for the "sometimes" boot-failure is that the entry in the
> slb for the PCB/stack for the CPU being added has sometimes been
> replaced already before the CPU the pcb is for has been sufficiently
> configured to allow automatic handling --and other times has not
> yet been replaced: the random slb replacement mechanism.
>
> There already is code to handle slb entry replacements but it does
> not work for a CPU still being set up (at the stage of the
> sometimes failure). At least that is what I expect for:
>
> # grep -r "handle_kernel_slb_spill" /usr/src/sys/powerpc/
> /usr/src/sys/powerpc/aim/trap_subr64.S: bl handle_kernel_slb_spill
> /usr/src/sys/powerpc/powerpc/trap.c: void handle_kernel_slb_spill(int, register_t, register_t);
> /usr/src/sys/powerpc/powerpc/trap.c:handle_kernel_slb_spill(int type, register_t dar, register_t srr0)
>
> So my hack was to separately do the potential replacement in that
> early time frame to allow the configuration for the CPU to get
> far enough along for the existing mechanism to work. (At least
> that is what I expect that I did.)
>
> So far I've had no boot failures of any kind with the hack.
> I've removed the hacks for reporting information and things
> still work.
>
> But I've not tried anything extensive after booting because
> things like buf*daemon* threads and pmac_thermal are randomly
> hanging up in/at:
>
> mi_switch+0x134 sleepq_switch+0x2ec sleepq_timedwait+0x48 _sleep+0x41c
> (mi_switch seems to have called sched_switch based on the
> "+0x134" and the code in that area --but sched_switch is not
> listed)
>
> I've no clue what is safe when one or more buf*daemon* threads
> make no progress.
>
> For shutdown that frequently leads to timeouts for stopping some
> buf*daemon* threads (when all 8 time out it takes about 8 minutes).
> The buf*daemon* threads that fail are the ones that "top -SHIopid" no
> longer shows.
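The "sometimes" in the quoted explanation is what random replacement would be expected to produce. As a rough sketch, and assuming purely for illustration that each replacement picks its victim uniformly and independently among a fixed number of candidate slots (nothing here was measured on the G5, and toy_survival_probability is a hypothetical helper), the chance that one particular entry is still present after a number of replacements can be computed like this:

```c
#include <assert.h>

/*
 * Hypothetical helper: probability that one given slot is never chosen
 * in `replacements` independent uniform picks among `slots` candidates,
 * i.e. (1 - 1/slots) raised to the power `replacements`.
 */
static double
toy_survival_probability(unsigned slots, unsigned replacements)
{
	double p = 1.0;

	assert(slots > 0);
	for (unsigned k = 0; k < replacements; k++)
		p *= 1.0 - 1.0 / (double)slots;
	return (p);
}
```

For example, with 64 candidate slots and 10 replacements in the window (both numbers chosen only for illustration) this gives about 0.85, so most boots would get through and an occasional one would not. That is at least consistent with hang-ups that are random and infrequent.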
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)