Date: Fri, 29 Nov 2019 12:25:42 -0600 From: Alan Cox <alc@rice.edu> To: Konstantin Belousov <kostikbel@gmail.com>, Andrew Turner <andrew@fubar.geek.nz> Cc: meloun.michal@gmail.com, Alan Cox <alc@FreeBSD.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r355145 - head/sys/arm64/arm64 Message-ID: <2b74e0f9-ee3e-4025-4099-4b77458654e9@rice.edu> In-Reply-To: <20191128135226.GR10580@kib.kiev.ua> References: <201911272033.xARKXowX014908@repo.freebsd.org> <df22055b-2f80-2f9f-2b45-f66281435846@gmail.com> <EBA9203E-17DF-455D-A491-EB4AEE0E37DF@fubar.geek.nz> <20191128135226.GR10580@kib.kiev.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
On 11/28/19 7:52 AM, Konstantin Belousov wrote: > On Thu, Nov 28, 2019 at 09:17:15AM +0000, Andrew Turner wrote: >> >>> On 28 Nov 2019, at 08:48, Michal Meloun <meloun.michal@gmail.com> wrote: >>> >>> >>> >>> On 27.11.2019 21:33, Alan Cox wrote: >>>> Author: alc >>>> Date: Wed Nov 27 20:33:49 2019 >>>> New Revision: 355145 >>>> URL: https://svnweb.freebsd.org/changeset/base/355145 >>>> >>>> Log: >>>> There is no reason why we need to pin the underlying thread to its current >>>> processor in pmap_invalidate_{all,page,range}(). These functions are using >>>> an instruction that broadcasts the TLB invalidation to every processor, so >>>> even if a thread migrates in the middle of one of these functions every >>>> processor will still perform the required TLB invalidations. >>> I think this is not the right assumption. The problem is not in TLB >>> operations themselves, but in following 'dsb' and / or 'isb'. 'dsb' >>> ensures that all TLB operation transmitted by the local CPU is performed >>> and visible to other observers. But it does nothing with TLBs emitted by >>> other CPUs. >>> For example, if a given thread is rescheduled after all TLB operations >>> but before 'dsb' or 'isb' is performed, then the requested >>> synchronization does not occur at all. >> The tibi instructions need a context synchronisation point. One option is the dsb & isb instructions, another is an exception entry. >> >> For a thread to be rescheduled it requires the timer interrupt to fire. As an exception entry is a context synchronisation point and an interrupt will cause an exception entry there will be such a point after the the tibi instruction. >> > D5.10.2. TLB maintenance instructions, 'Ordering and completion of > TLB maintenance instructions' states that DSB on the PE that issued > TLBI is required. It does not state that arbitrary even causing > SynchronizeContext() is enough. > > Also I was not able to find any explanation of SynchronizeContext(). The entry for "Context Synchronization Events" in the Glossary on page 7460 provides the best explanation that I've found. The first three listed events are an isb instruction, taking an exception, and returning from an exception. However, there is nothing here to suggest that taking or returning from an exception can take the place of a dsb instruction. (On a related note, I'll observe that Linux does not perform an isb instruction during TLB invalidation unless it is changing a leaf in the kernel page table. For changes to user-space page tables, they appear to be assuming that the return to user-space will suffice to resync the user-space instruction stream.) > Curiously, on IA32 exceptions are not specified to issue a serialization > point, although rumors say that on all produced microarchitectures they are. This issue has similarities to the one that we discussed in https://reviews.freebsd.org/D22007. For example, on a context switch we will perform a dsb instruction in pmap_activate_int() unless we are switching to a thread within the same address space. Moreover, that dsb instruction will be performed before the lock on the old thread is released. So, in this case, we are guaranteed that any previously issued tlbi instructions are completed before the thread can be restarted on another processor. However, if we are simply switching between threads within the same address space, we are not performing a dsb instruction. And, my error was believing that the memory barriers inherent to the locking operations on the thread during context switches would be sufficient to ensure that all of the tlbi instructions on the initial processor would have completed before the dsb instruction on the eventual processor finishing the pmap_invalidate_*() completed. After further reading, I'm afraid that we have a similar issue with cache management functions, like cpu_icache_sync_range(). Quoting K11.5.2, "The ARMv8 architecture requires a PE that executes an instruction cache maintenance instruction to execute a DSB instruction to ensure completion of the maintenance operation." So, we have a similar problem if we are preempted during the cpu_icache_sync_range() calls from pmap_enter(), pmap_enter_quick(), and pmap_sync_icache(). Unless we are switching to a different address space, we are not guaranteed to perform a dsb instruction on the initial processor. Moreover, we are configuring the processor in locore.S so that user-space can directly perform "ic ivau" instructions. (See SCTLR_UCI in sctlr_set.) So, I'm inclined to say that we should handle this issue analogously to r354630 on amd64: Index: arm64/arm64/pmap.c =================================================================== --- arm64/arm64/pmap.c (revision 355145) +++ arm64/arm64/pmap.c (working copy) @@ -5853,8 +5853,11 @@ pmap_activate_int(pmap_t pmap)        KASSERT(PCPU_GET(curpmap) != NULL, ("no active pmap"));        KASSERT(pmap != kernel_pmap, ("kernel pmap activation")); -      if (pmap == PCPU_GET(curpmap)) +      if (pmap == PCPU_GET(curpmap)) { +              /* XXXExplain why. */ +              dsb(ish);                return (false); +      }        /*         * Ensure that the store to curpmap is globally visible before the
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?2b74e0f9-ee3e-4025-4099-4b77458654e9>