Date: Wed, 1 May 2019 14:35:56 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Justin Hibbits <chmeeedalf@gmail.com>
Cc: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject: Re: How many segments does it take to span from VM_MIN_KERNEL_ADDRESS through VM_MAX_SAFE_KERNEL_ADDRESS? 128 in moea64_late_bootstrap
Message-ID: <C01CF848-890B-407D-876A-9C48F5F3CD40@yahoo.com>
In-Reply-To: <212E50E5-7EB1-4381-A662-D5EACB1E5D46@yahoo.com>
References: <3C69CF7C-7F33-4C79-92C0-3493A1294996@yahoo.com> <6159F4A6-9431-4B99-AA62-451B8DF08A6E@yahoo.com> <20190501094029.542c5f46@titan.knownspace> <212E50E5-7EB1-4381-A662-D5EACB1E5D46@yahoo.com>

[This just reports about the experiment, but not from an
official head version or snapshot: preliminary information
in the interest of time. It hangs, but in a different
place/stage than cpudep_ap_bootstrap, matching Dennis Clarke's
2019-Feb-14 reports about hangups, from before my patches
were available.]

On 2019-May-1, at 11:51, Mark Millard <marklmi at yahoo.com> wrote:

> On 2019-May-1, at 07:40, Justin Hibbits <chmeeedalf at gmail.com> wrote:
>
>> On Tue, 30 Apr 2019 21:45:00 -0700
>> Mark Millard <marklmi@yahoo.com> wrote:
>>
>>> [I realized another implication about another point of
>>> potential slb-misses in cpudep_ap_bootstrap: the
>>> address in sprg0 on the cpu might end up not able to be
>>> dereferenced.]
>>>
>>> On 2019-Apr-30, at 20:58, Mark Millard <marklmi at yahoo.com> wrote:
>>>
>>>> [At the end this note shows why the old VM_MAX_KERNEL_ADDRESS
>>>> led to no slb-miss exceptions in cpudep_ap_bootstrap.]
>>>>
>>>> There is code in moea64_late_bootstrap that looks like:
>>>>
>>>>         virtual_avail = VM_MIN_KERNEL_ADDRESS;
>>>>         virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS;
>>>>
>>>>         /*
>>>>          * Map the entire KVA range into the SLB. We must not fault there.
>>>>          */
>>>>         #ifdef __powerpc64__
>>>>         for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
>>>>                 moea64_bootstrap_slb_prefault(va, 0);
>>>>         #endif
>>
>> What happens if you revert all your patches,
>
> Most of the patches in Bugzilla 233863 are not for this
> issue at all and are not tied to starting the non-bsp
> cpus. (The one for improving how close the Time Base
> registers are is tied to starting these cpus.) Only the
> aim/mp_cpudep.c and aim/slb.c changes seem relevant.
>
> Are you worried about some form of interaction that means
> I need to avoid patches for other issues?
>
> Note: for now I'm staying at using head -r345758 as the
> basis for my experiments.
>
>> and change this loop to
>> stop at n_slb?
>> So something more akin to:
>>
>> int i = 0;
>>
>> for (va = virtual_avail; va < virtual_end && i < n_slb - 1;
>>     va += SEGMENT_LENGTH, i++);
>> ...
>>
>> If it reliably boots with that, then that's fine. We can prefault as
>> much as we can and leave the rest for on-demand.
>
> I'm happy to experiment with this loop without my hack
> for forcing the slb entry to exist in cpudep_ap_bootstrap.
>
> But, it seems to presume that the pc_curpcb's will
> all always point into the lower address range spanned
> when cpudep_ap_bootstrap is executing on the cpu.
> Does some known property limit the pc_curpcb->
> references to such? Only that would be sure to
> avoid an slb-miss at that stage. Or is this just an
> alternate hack or a means of getting evidence, not a
> proposed solution?
>
> (Again, I'm happy to disable my hack that forces the
> slb entry and to try the loop suggested.)

Note: I've not started any experiments for isync's related
to instructions such as slbmte yet: that was all just
inspection and reading about requirements so far.

So to disable my slb-force-no-miss hack in cpudep_ap_bootstrap
I reverted it:

# svnlite revert /usr/src/sys/powerpc/aim/mp_cpudep.c /usr/src/sys/powerpc/aim/slb.c
Reverted 'sys/powerpc/aim/mp_cpudep.c'
Reverted 'sys/powerpc/aim/slb.c'

(hack_into_slb_if_needed(...) was implemented in mp_cpudep.c
and used in slb.c before reverting.)

And the patch for the loop looks like:

        virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS;
 
        /*
-        * Map the entire KVA range into the SLB. We must not fault there.
+        * Map the lower-address part of the KVA range into the SLB. We must not fault there.
         */
        #ifdef __powerpc64__
-       for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
+       i = 0;
+       for (va = virtual_avail; va < virtual_end && i<n_slbs-1; va += SEGMENT_LENGTH, i++)
                moea64_bootstrap_slb_prefault(va, 0);
        #endif

So I've built, installed, and have tested some: it did not
go well overall.

Using:

OK set debug.verbose_sysinit=1

to show better context about where the hangs occur, shows:
(Typed from a screen picture.)

subsystem a800000
   boot_run_interrupt_driven_config_hooks(0)...
. . . (omitted) . . .
   done.
   vt_upgrade(&vt_consdev). . .

The "vt_upgrade(&vt_consdev). . ." never says done
when booting hangs with the above changes.

Trying to boot a bunch of times did produce one
completed boot, all 4 cpus working. Otherwise I'm
using kernel.old to manage to complete a boot.

I'll note that "vt_upgrade(&vt_consdev). . ." is where
Dennis Clarke reported the hangups that he was seeing,
without any of my patches being available back then:
2019-Feb-14.

You wrote in another reply:

> The idea with this is if you can test with stock -CURRENT (or
> post-VM_KERNEL_MAXADDR change), to eliminate any other variables. This
> is *only* for testing that it brings up the APs, not that they're
> properly synced. That will happen with other changes. This is a
> proposed solution. From my understanding, we typically allocate from
> low to high for KVA allocations, so keeping the low addresses in memory
> long enough to bring up the APs to sanity is the goal, so the commit
> would be along the lines of "Prefault as much of KVA as we can fit into
> the SLB".

This will have the sleep-gets-stuck problem, likely normally
happening quickly after booting and logging in (presuming a
boot). The resulting boots for such are not always all that
useful after various threads hang up.

Also, getting such an almost-exactly-head-revision variant
set up without messing up my current context will take
some time: I'm not set up for such. I currently have no
access to a cross-build environment; the activity is self
hosted on a 2-socket/2-cores-each G5. So I will have to
build from a context that has patches (or is too old).
Thus the preliminary results above that I could produce
quickly that are not from the context that you asked for.

But it also appears that "vt_upgrade(&vt_consdev). . ." would
not be tied to cpudep_ap_bootstrap and evaluating:

sp = pcpup->pc_curpcb->pcb_sp

Still, I'll work on having a gcc-4.2.1-based just-head context
built, not that it would install and boot in that state. So
I will have to build from a context that has patches, using a
different source tree for the "self-hosted cross build" to do
your kind of experiment. But I'd then be ready for "self-hosted
cross built" experiments.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)