From: Bruce Evans
Date: Wed, 26 Jun 2013 00:53:48 +1000 (EST)
To: Gleb Smirnoff
Cc: Konstantin Belousov, svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org,
    src-committers@FreeBSD.org, Bruce Evans
Subject: Re: svn commit: r252032 - head/sys/amd64/include
Message-ID: <20130626000446.R1793@besplex.bde.org>
In-Reply-To: <20130625132814.GL1214@FreeBSD.org>
References: <20130621090207.F1318@besplex.bde.org>
 <20130621064901.GS1214@FreeBSD.org> <20130621184140.G848@besplex.bde.org>
 <20130621135427.GA1214@FreeBSD.org> <20130622110352.J2033@besplex.bde.org>
 <20130622124832.S2347@besplex.bde.org> <20130622174921.I3112@besplex.bde.org>
 <20130623073343.GY91021@kib.kiev.ua> <20130624081056.GD1214@FreeBSD.org>
 <20130624211428.O2235@besplex.bde.org> <20130625132814.GL1214@FreeBSD.org>

On Tue, 25 Jun 2013, Gleb Smirnoff wrote:

> On Mon, Jun 24, 2013 at 11:16:33PM +1000, Bruce Evans wrote:
> B> > K> This is quite an interesting idea, but I still have not decided if
> B> > K> it is acceptable.  The issue is that we could add the carry to the
> B> > K> other processor's counter, if the preemption kicks in at the right
> B> > K> time between the two instructions.  I did not find any argument why
> B> > K> that would be wrong; the races for the fetch operation seem to be
> B> > K> the same with either local update method.
> B> >
> B> > This would be wrong since the update isn't locked.  Thus, if we are
> B> > put on another CPU between the two instructions, and the second
> B> > instruction updates the other CPU's counter simultaneously with the
> B> > original CPU we were on, then we lose an update.
> B>
> B> Hmm, this is subtle.  The update is never lost, but is applied to a
> B> different counter, non-atomically at a later time.  Non-atomicity
> B> only matters when there is a carry, since the counter only goes
> B> transiently backwards in that case.  For example, initial state:
> B>
> B> CPU1 counter:  00000000 ffffffff
> B> CPU2 counter:  00000000 fffffffe
> B>
> B> Start adding 1 to the first counter, doing it non-atomically by
> B> incrementing the low word first.
> B>
> B> CPU1 counter:  00000000 00000000  (carry in CPU1 eflags)
> B> CPU2 counter:  00000000 fffffffe
> B>
> B> Get preempted at this point:
> B>
> B> CPU1 counter:  00000000 00000000
> B> CPU2 counter:  00000000 fffffffe  (carry in CPU2 eflags)
>
> Nope, the non-atomicity isn't specific to the case where we carry a
> bit from the least significant part to the most significant one.

Sure, but the other parts already work correctly.

> The non-atomicity is much simpler: two CPUs write to the same memory
> address w/o using a lock prefix.

No, they can't, since these are per-CPU counters.  Except for broken
per-CPU counters and invalid accesses by CPUs that don't own the
counter in upper layer code.

> Thus, any update mechanism that first calculates an absolute per-CPU
> address, then writes to it w/o a lock, requires a critical section.
> Preemption between the calculation and the write, followed by later
> execution on another CPU, leads to two CPUs writing to the same address.

Only broken per-CPU counters do that.  I already pointed out that all
arches except amd64 and i386 have buggy PCPU_ADD() and other PCPU_*()
access functions because they do exactly that.  ia64, powerpc and
sparc64 even document that they do this in an XXX comment.

Actually, ia64 and sparc64 and perhaps some others are not quite this
bad, since their pcpu pointer is in a special register that should get
switched by context switches.  It still takes a lot of magic for the
compiler to use this register for the accesses and not copy it to
another register that doesn't get context switched.  Their pcpu access
bugs may be limited to the compiler possibly generating non-atomic
load-modify-store instructions.

Their XXX comment has wording bugs either way.  It says that "this"
operation should be made atomic, but the scope of the comment is 4
operations.  The first operation is PCPU_ADD(), and it used to be the
only one that does a load-modify-store.  Now it is followed by the
PCPU_INC() operation, which also does a load-modify-store.  Aligned
word-sized operations have a chance of being atomic enough for the
other operations.

Bruce
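[Editorial note: to make the carry scenario quoted above concrete, here is a
minimal userland model of it.  This is an illustrative sketch only: the
struct split_counter layout, the cpu_cnt[] array and the explicit `carry'
variable are inventions for the example and not the real counter(9) code;
in the scheme under discussion the carry lives in eflags between the two
instructions and migrates with the thread.]

	/*
	 * Model of a 64-bit per-CPU counter kept as two 32-bit halves and
	 * incremented low word first, with a "migration" between the two
	 * steps.  The carry ends up in the other CPU's counter, but the
	 * update is never lost: the total over all counters is correct.
	 */
	#include <stdint.h>
	#include <stdio.h>

	struct split_counter {
		uint32_t lo;
		uint32_t hi;
	};

	static struct split_counter cpu_cnt[2] = {
		{ 0xffffffff, 0x00000000 },	/* CPU1 in the example above */
		{ 0xfffffffe, 0x00000000 },	/* CPU2 */
	};

	static uint64_t
	counter_read(const struct split_counter *c)
	{
		return ((uint64_t)c->hi << 32 | c->lo);
	}

	int
	main(void)
	{
		uint32_t carry;

		/* Step 1, on CPU1: add 1 to the low word; note the carry. */
		cpu_cnt[0].lo += 1;
		carry = (cpu_cnt[0].lo == 0);	/* stands in for eflags.CF */

		/*
		 * Preemption and migration happen here; eflags (and so the
		 * carry) travels with the thread to CPU2.
		 */

		/* Step 2, now on CPU2: the carry lands in CPU2's high word. */
		cpu_cnt[1].hi += carry;

		printf("CPU1 %08x%08x  CPU2 %08x%08x  total %ju\n",
		    cpu_cnt[0].hi, cpu_cnt[0].lo, cpu_cnt[1].hi, cpu_cnt[1].lo,
		    (uintmax_t)(counter_read(&cpu_cnt[0]) +
		    counter_read(&cpu_cnt[1])));
		return (0);
	}

Running it shows CPU1 transiently "behind" (its high word not yet bumped is
modelled by the carry being applied elsewhere), while the printed total is
the old total plus 1, which is the point of the example.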
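[Editorial note: and a sketch of the other point in the exchange: an update
that first computes an absolute per-CPU address and then does an unlocked
read-modify-store must be wrapped in a critical section.  This is a
hypothetical kernel-style fragment, not code from the tree; pcpu_counter[]
and counter_add_generic() are made-up names, while critical_enter(),
critical_exit(), curcpu and MAXCPU are the standard FreeBSD primitives.]

	#include <sys/param.h>
	#include <sys/systm.h>
	#include <sys/pcpu.h>

	/* Hypothetical per-CPU storage: one 64-bit counter slot per CPU. */
	extern uint64_t pcpu_counter[MAXCPU];

	static __inline void
	counter_add_generic(uint64_t inc)
	{
		/*
		 * The absolute address &pcpu_counter[curcpu] is fixed before
		 * the read-modify-store.  Without the critical section, a
		 * preemption after the address is taken, followed by a
		 * migration, lets two CPUs store to the same slot without a
		 * lock -- the race described above.  Disabling preemption
		 * for the whole update closes that window; per-CPU accessors
		 * that re-resolve the current CPU in the access itself (as
		 * the amd64/i386 segment-prefixed ones do) don't need it.
		 */
		critical_enter();
		pcpu_counter[curcpu] += inc;
		critical_exit();
	}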