Date:      Wed, 1 May 2019 09:40:29 -0500
From:      Justin Hibbits <chmeeedalf@gmail.com>
To:        Mark Millard <marklmi@yahoo.com>
Cc:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: How many segments does it take to span from VM_MIN_KERNEL_ADDRESS through VM_MAX_SAFE_KERNEL_ADDRESS? 128 in moea64_late_bootstrap
Message-ID:  <20190501094029.542c5f46@titan.knownspace>
In-Reply-To: <6159F4A6-9431-4B99-AA62-451B8DF08A6E@yahoo.com>
References:  <3C69CF7C-7F33-4C79-92C0-3493A1294996@yahoo.com> <6159F4A6-9431-4B99-AA62-451B8DF08A6E@yahoo.com>

On Tue, 30 Apr 2019 21:45:00 -0700
Mark Millard <marklmi@yahoo.com> wrote:

> [I realized an implication about another point of
> potential slb-misses in cpudep_ap_bootstrap: the
> address in sprg0 on the cpu might end up not being
> dereferenceable.]
> 
> On 2019-Apr-30, at 20:58, Mark Millard <marklmi at yahoo.com> wrote:
> 
> > [At the end this note shows why the old VM_MAX_KERNEL_ADDRESS
> > led to no slb-miss exceptions in cpudep_ap_bootstrap.]
> > 
> > There is code in moea64_late_bootstrap that looks like:
> > 
> >        virtual_avail = VM_MIN_KERNEL_ADDRESS;
> >        virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS;
> > 
> >        /*
> >         * Map the entire KVA range into the SLB. We must not
> >         * fault there.
> >         */
> >        #ifdef __powerpc64__
> >        for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
> >                moea64_bootstrap_slb_prefault(va, 0);
> >        #endif

What happens if you revert all your patches, and change this loop to
stop at n_slb?  So something more akin to:

	int i = 0;

	for (va = virtual_avail; va < virtual_end && i < n_slb - 1;
	    va += SEGMENT_LENGTH, i++)
		moea64_bootstrap_slb_prefault(va, 0);

If it reliably boots with that, then that's fine.  We can prefault as
much as we can and leave the rest for on-demand.

> > 
> > where (modern):
> > 
> > #define VM_MIN_KERNEL_ADDRESS           0xe000000000000000UL
> > #define VM_MAX_SAFE_KERNEL_ADDRESS      VM_MAX_KERNEL_ADDRESS
> > #define VM_MAX_KERNEL_ADDRESS           0xe0000007ffffffffUL
> > #define       SEGMENT_LENGTH  0x10000000UL
> > 
> > So:
> > 
> > 0xe000000000000000UL: VM_MIN_KERNEL_ADDRESS
> > 0x0000000010000000UL: SEGMENT_LENGTH
> > 0xe0000007ffffffffUL: VM_MAX_KERNEL_ADDRESS
> > 
> > So I see the loop as doing moea64_bootstrap_slb_prefault
> > 128 times (decimal, 0x00..0x7f at the appropriate
> > byte in va).
> > 
> > (I do not see why this loop keeps going once the slb
> > kernel slots are all full. Nor is it obvious to me
> > why the larger va values should be the ones more
> > likely to still be covered. But I'm going a different
> > direction below.)
> > 
> > That also means that the code does random replacement (based
> > on mftb()%n_slbs, but avoiding USER_SLB_SLOT) 128-(64-1),
> > or 65 times. The slb_insert_kernel use in 
> > moea64_bootstrap_slb_prefault does that:
> > 
...
> > 
> > I expect that the above explains the variability in
> > whether cpudep_ap_bootstrap 's:
> > 
> > sp = pcpup->pc_curpcb->pcb_sp
> > 
> > gets a slb fault for the pc_curpcb dereference
> > vs. not.
> 
> 
> Note: the random replacements could also make
> dereferencing pcpup-> (aka (get_pcpu())->) end
> up with a slb-miss, where:
> 
> static __inline struct pcpu *
> get_pcpu(void)
> {
>         struct pcpu *ret;
>  
>         __asm __volatile("mfsprg %0, 0" : "=r"(ret));
>         
>         return (ret);
> }
> 
> If the slb entry covering address ranges accessed
> based on sprg0 is ever replaced, no code based on
> getting sprg0's value to find the matching pcpu
> information is going to work, *including in the
> slb spill trap code* [GET_CPUINFO(%r?)].

Keep in mind that the PCPU pointer is in the DMAP, since it's in the
kernel image.  It's not in KVA.  However, some structures pointed to by
pcpu are in KVA, and those are what are faulting.

> 
> Does a kernel slb entry need to be reserved for the
> CPU, one that is never replaced, so that sprg0 can
> always be used to find the pcpu information?
> 
> This is something my hack did not deal with.
> And, in fact, trying to force 2 entries to exist
> at the same time, one for "dereferencing sprg0"
> and one for dereferencing the pc_curpcb so found
> is currently messy, given the way other things
> work.

It should not need to be.  The kernel SLB miss handler runs entirely in real
mode, and __pcpu[] is in the DMAP, so real addresses match up directly
with DMAP addresses (modulo 0xc000000000000000).

> 
> The lack of handling may explain the (rare) hangups
> with the existing hack present.
> 
> 
> > I also expect that the old VM_MAX_KERNEL_ADDRESS value
> > explains the lack of slb-misses in old times:
> > 
> > 0xe000000000000000UL: VM_MIN_KERNEL_ADDRESS
> > 0x0000000010000000UL: SEGMENT_LENGTH
> > 0xe0000001c7ffffffUL: VM_MAX_KERNEL_ADDRESS
> > 
> > So 0x00..0x1c is 29 alternatives (decimal). That
> > fits in 64-1 slots, or even 32-1 slots: no
> > random replacements happened above or elsewhere.
> > That, in turn, meant no testing of slb-miss
> > handling back then.
> > 
> > 
> > [Other list messages suggest missing context synchronizing
> > instructions for slbmte and related instructions. The
> > history is not evidence about that, given the lack of
> > slb-misses.]  

Possibly.  However, we may want direct control of the slots at boot
time so as to make sure none of the range we're prefaulting gets
replaced by the pseudo-random slot chooser.

- Justin


