Date: Thu, 21 Feb 2013 13:53:52 +0100 From: Svatopluk Kraus <onwahe@gmail.com> To: John Baldwin <jhb@freebsd.org> Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-current@freebsd.org Subject: Re: [patch] i386 pmap sysmaps_pcpu[] atomic access Message-ID: <CAFHCsPXEDiaKSkHNbycs%2B1DKkL7BcYJBk5eN%2BKpsLY8B%2B-u5gQ@mail.gmail.com> In-Reply-To: <201302201022.29294.jhb@freebsd.org> References: <CAFHCsPUVTM9jfrnzY72YsPszLWkg-UaJcycTR4xXcS%2BfPzS1Vg@mail.gmail.com> <20130219185129.GC2598@kib.kiev.ua> <CAFHCsPV0p13yjs65i6P17JocoGhQCbVEaaTb443C=c9AJNO%2Bww@mail.gmail.com> <201302201022.29294.jhb@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Feb 20, 2013 at 4:22 PM, John Baldwin <jhb@freebsd.org> wrote: > On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: >> On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov >> <kostikbel@gmail.com> wrote: >> > On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: >> >> On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov >> >> <kostikbel@gmail.com> wrote: >> >> Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was >> >> simple. SMP case is more complex and rather new for me. Recently, I >> >> was solving a problem with PCPU stuff. For example, PCPU_GET is >> >> implemented by one instruction on i386 arch. So, a need of atomicity >> >> with respect to interrupts can be overlooked. On load-store archs, the >> >> implementation which works in SMP case is not so simple. And what >> >> works in UP case (single PCPU), not works in SMP case. Believe me, >> >> mysterious and sporadic 'mutex not owned' assertions and others ones >> >> caused by curthreads mess, it takes a while ... >> > Note that PCPU_GET() is not needed to be atomic. The reason is that the code >> > which uses its result would not be atomic (single-instruction or whatever, see >> > below). Thus, either the preemption should not matter since the action with >> > the per-cpu data is advisory, as is in the case of i386 pmap you noted. >> > or some external measures should be applied in advance to the containing >> > region (which you proposed, by bracing with sched_pin()). >> >> So, it's advisory in the case of i386 pmap... Well, if you can live >> with that, I can too. >> >> > >> > Also, note that it is not interrupts but preemption which is concern. >> >> Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, >> however, sched_pin() and critical_enter() and their counterparts are >> implemented with help of curthread. And curthread definition falls to >> PCPU_GET(curthread) if not defined in other way. So, curthread should >> be atomic with respect to interrupts and in general, PCPU_GET() should >> be too. Note that spinlock_enter() definitions often (always) use >> curthread too. Anyhow, it's defined in MD code and can be defined for >> each arch separately. > > curthread is a bit magic. :) If you perform a context switch during an > interrupt (which will change 'curthread') you also change your register state. > When you resume, the register state is also restored. This means that while > something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' > never is. However, it is true that actually reading curthread has to be > atomic. If your read of curthread looks like: > > mov <pcpu reg>, r0 > add <offset of pc_curthread>, r0 > ld r0, r1 > > Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' > (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that > pc_curthread is the first thing in 'struct pcpu' (and always will be, you could > add a CTASSERT to future-proof). In that case, you can remove the 'add' > instruction and instead just do: > > ld <pcpu reg>, r1 > > which is fine. Just for the record. There are three extra (coprocessor) registers per a core in arm11mpcore (armv6k). Unfortunately only one is Privileged. The other ones are respectively User Read only and User Read Write. For now, we are using the Privileged one to store pcpu pointer (curthread is correctly set during each context switch). Thus, using a coprocessor register means that reading of curthread is not a single instruction implementation now. After we figured out the curthread issue in SMP case, using a fixed (processor) register for pcpu is an option. Meanwhile, we disable interrupts before reading of curthread and enable them after. The same is done for other PCPU stuff. For now we have not stable system enough for profiling, however, when it will be, it would be interesting to learn how various implementations of curthread reading impact system performance. Svata
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAFHCsPXEDiaKSkHNbycs%2B1DKkL7BcYJBk5eN%2BKpsLY8B%2B-u5gQ>