From: John Baldwin
To: Olivier Houchard
Cc: Konstantin Belousov, freebsd-current@freebsd.org, Andrew Turner,
    Svatopluk Kraus
Subject: Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
Date: Mon, 25 Feb 2013 16:26:14 -0500
Message-Id: <201302251626.14793.jhb@freebsd.org>

On Monday, February 25, 2013 7:47:47 am Olivier Houchard wrote:
> On Mon, Feb 25, 2013 at 07:09:20PM +1300, Andrew Turner wrote:
> > On Thu, 21 Feb 2013 10:43:49 -0500
> > John Baldwin wrote:
> > > On Thursday, February 21, 2013 7:53:52 am Svatopluk Kraus wrote:
> > > > On Wed, Feb 20, 2013 at 4:22 PM, John Baldwin wrote:
> > > > > On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote:
> > > > >> On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov wrote:
> > > > >> > On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote:
> > > > >> >> On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov wrote:
> > > > >> >>
> > > > >> >> Well, I'm taking part in porting FreeBSD to ARM11mpcore. The
> > > > >> >> UP case was simple. The SMP case is more complex and rather
> > > > >> >> new to me. Recently, I was solving a problem with the PCPU
> > > > >> >> stuff. For example, PCPU_GET is implemented by one
> > > > >> >> instruction on the i386 arch, so the need for atomicity with
> > > > >> >> respect to interrupts can be overlooked. On load-store archs,
> > > > >> >> an implementation which works in the SMP case is not so
> > > > >> >> simple, and what works in the UP case (single PCPU) does not
> > > > >> >> work in the SMP case. Believe me, tracking down the
> > > > >> >> mysterious and sporadic 'mutex not owned' assertions (and
> > > > >> >> other ones caused by a curthread mess) takes a while ...
> > > > >> >
> > > > >> > Note that PCPU_GET() does not need to be atomic. The reason is
> > > > >> > that the code which uses its result would not be atomic
> > > > >> > (single-instruction or whatever, see below). Thus, either the
> > > > >> > preemption should not matter, since the action on the per-cpu
> > > > >> > data is advisory, as in the case of the i386 pmap you noted,
> > > > >> > or some external measures should be applied in advance to the
> > > > >> > containing region (which you proposed, by bracing with
> > > > >> > sched_pin()).
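
(As an aside, bracing with sched_pin() looks roughly like the sketch
below. The array and function names are made up for illustration; this
is not the actual i386 pmap sysmaps_pcpu[] code.)

	#include <sys/param.h>
	#include <sys/systm.h>
	#include <sys/proc.h>
	#include <sys/pcpu.h>
	#include <sys/sched.h>

	static int example_pcpu_slot[MAXCPU];	/* hypothetical per-CPU data */

	static void
	example_update_slot(int v)
	{
		/*
		 * Pin the thread so it cannot migrate to another CPU
		 * between reading cpuid and touching the per-CPU slot.
		 */
		sched_pin();
		example_pcpu_slot[PCPU_GET(cpuid)] = v;
		sched_unpin();
	}
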
> > > > >>
> > > > >> So, it's advisory in the case of the i386 pmap... Well, if you
> > > > >> can live with that, I can too.
> > > > >>
> > > > >> > Also, note that it is not interrupts but preemption which is
> > > > >> > the concern.
> > > > >>
> > > > >> Yes and no. In theory, yes, preemption is the concern. In
> > > > >> FreeBSD, however, sched_pin() and critical_enter() and their
> > > > >> counterparts are implemented with the help of curthread, and
> > > > >> the curthread definition falls back to PCPU_GET(curthread) if
> > > > >> it is not defined in another way. So, curthread should be
> > > > >> atomic with respect to interrupts and, in general, PCPU_GET()
> > > > >> should be too. Note that spinlock_enter() definitions often
> > > > >> (always) use curthread too. Anyhow, it's defined in MD code and
> > > > >> can be defined for each arch separately.
> > > > >
> > > > > curthread is a bit magic. :)  If you perform a context switch
> > > > > during an interrupt (which will change 'curthread'), you also
> > > > > change your register state. When you resume, the register state
> > > > > is also restored. This means that while something like
> > > > > 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread'
> > > > > never is. However, it is true that actually reading curthread
> > > > > has to be atomic. If your read of curthread looks like:
> > > > >
> > > > >	mov	<pcpu_reg>, r0
> > > > >	add	<curthread_offset>, r0
> > > > >	ld	r0, r1
> > > > >
> > > > > then that will indeed break. Alpha used a fixed register for
> > > > > 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to
> > > > > depend on the fact that pc_curthread is the first thing in
> > > > > 'struct pcpu' (and always will be; you could add a CTASSERT to
> > > > > future-proof). In that case, you can remove the 'add'
> > > > > instruction and instead just do:
> > > > >
> > > > >	ld	<pcpu_reg>, r1
> > > > >
> > > > > which is fine.
> > > >
> > > > Just for the record, there are three extra (coprocessor) registers
> > > > per core in arm11mpcore (armv6k). Unfortunately, only one of them
> > > > is privileged; the other two are user read-only and user
> > > > read-write, respectively. For now, we are using the privileged one
> > > > to store the pcpu pointer (curthread is correctly set during each
> > > > context switch). Thus, with a coprocessor register, reading
> > > > curthread is not a single-instruction operation at the moment.
> > > > Now that we have figured out the curthread issue in the SMP case,
> > > > using a fixed (processor) register for pcpu is an option.
> > > > Meanwhile, we disable interrupts before reading curthread and
> > > > enable them afterwards, and the same is done for the other PCPU
> > > > stuff. The system is not stable enough for profiling yet; however,
> > > > when it is, it would be interesting to learn how the various
> > > > implementations of reading curthread impact system performance.
> > >
> > > curthread is read _a lot_, so I would expect this to hurt. What we
> > > did on Alpha was to use a fixed register for pcpu access, but we
> > > used the equivalent of a coprocessor register to also store the
> > > value so we could set that fixed register on entry to the kernel
> > > (userland was free to use the register for its own purposes).
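
(For illustration, the interim disable-interrupts approach described
above might look roughly like the sketch below. It assumes the pcpu
pointer is kept in the privileged-only thread ID register (TPIDRPRW,
i.e. cp15, c13, c0, 4 on ARMv6K) and uses the MD intr_disable()/
intr_restore() helpers; the function name is made up.)

	#include <sys/param.h>
	#include <sys/systm.h>
	#include <sys/proc.h>
	#include <sys/pcpu.h>
	#include <machine/cpufunc.h>

	static __inline struct thread *
	example_get_curthread(void)
	{
		struct pcpu *pc;
		struct thread *td;
		register_t s;

		/*
		 * Block interrupts so this multi-instruction sequence
		 * cannot be interrupted and resumed on another CPU.
		 */
		s = intr_disable();
		__asm __volatile("mrc p15, 0, %0, c13, c0, 4" : "=r" (pc));
		td = pc->pc_curthread;
		intr_restore(s);
		return (td);
	}
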
> >
> > The current code on ARM is not atomic: it loads the base address of
> > the pcpu data from the coprocessor and then loads curthread from
> > that address. One solution I discussed with Olivier Houchard is to
> > keep the data in the coprocessor but then load it into a dedicated
> > register when we enter the kernel.
> >
> > Reading curthread would then be a single load instruction thanks to
> > ARM's ldr using an immediate offset [1], e.g. to load from r12 into
> > r1 it would be 'ldr r1, [r12]'.
> >
>
> Wouldn't it be doable to either use the extra coprocessor register for
> curthread, setting it when entering the kernel as John suggested, or
> just use the privileged one for curthread, change get_pcpu() to be
> __pcpu[cpuid], and then make sure pcpu accesses are atomic, either by
> disabling interrupts or by using ldrex/strex?  I take it curthread
> (and curpcb, which we can get from curthread) is by far the most used,
> and it wouldn't hurt that badly to pessimize the other pcpu variables?

I think you can store pcpu in the coprocessor register. If you always
copy it to a dedicated register on kernel entry (say r12), then
'ldr r1, [r12]' to read curthread should be atomic. (Just put a
CTASSERT() that offsetof(struct pcpu, pc_curthread) == 0 in arm's
machdep.c along with a comment that 'curthread' depends on this.) You
can then use a custom curthread function like we do on amd64, etc. to
hardcode it as a single instruction (and mark it __pure2 while you are
at it).

-- 
John Baldwin
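
(For reference, the scheme John describes, modeled on amd64's
__curthread(), might look roughly like the sketch below. The dedicated
register r12 follows the 'ldr r1, [r12]' example above, and the file
placement is an assumption taken from the discussion, not committed
code.)

	/*
	 * In sys/arm/include/pcpu.h (sketch), replacing the generic
	 * PCPU_GET(curthread)-based definition of 'curthread'.
	 */
	static __inline __pure2 struct thread *
	__curthread(void)
	{
		struct thread *td;

		/*
		 * Assumes the pcpu pointer was copied into r12 on kernel
		 * entry and that pc_curthread is at offset 0.
		 */
		__asm("ldr %0, [r12]" : "=r" (td));
		return (td);
	}
	#define	curthread	(__curthread())

	/*
	 * In sys/arm/arm/machdep.c (sketch): 'curthread' above depends on
	 * pc_curthread staying first in struct pcpu.
	 */
	CTASSERT(offsetof(struct pcpu, pc_curthread) == 0);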