From: Attilio Rao
Reply-To: attilio@FreeBSD.org
To: Andrew Turner
Date: Tue, 28 Oct 2014 15:33:06 +0100
Subject: Re: atomic ops
In-Reply-To: <20141028142510.10a9d3cb@bender.lan>
References: <20141028025222.GA19223@dft-labs.eu> <20141028142510.10a9d3cb@bender.lan>
Content-Type: text/plain;
charset=UTF-8
Cc: "freebsd-arch@freebsd.org", Adrian Chadd, Mateusz Guzik, Konstantin Belousov, Alan Cox

On Tue, Oct 28, 2014 at 3:25 PM, Andrew Turner wrote:
> On Tue, 28 Oct 2014 14:18:41 +0100
> Attilio Rao wrote:
>
>> On Tue, Oct 28, 2014 at 3:52 AM, Mateusz Guzik
>> wrote:
>> > As was mentioned some time ago, our situation related to atomic ops
>> > is not ideal.
>> >
>> > atomic_load_acq_* and atomic_store_rel_* (at least on amd64) provide
>> > full memory barriers, which is stronger than needed.
>> >
>> > Moreover, load is implemented as lock cmpxchg on var address, so it
>> > is additionally slower especially when cpus compete.
>>
>> I already explained this once privately: full memory barriers are not
>> stronger than needed.
>> FreeBSD has different semantics than Linux. We historically enforce a
>> full barrier on _acq() and _rel() rather than just a read and write
>> barrier, hence we need a different implementation than Linux.
>> There is code that relies on this property, like the locking
>> primitives (release a mutex, for instance).
>
> On 32-bit ARM prior to ARMv8 (i.e. all chips we currently support)
> there are only full barriers. On both 32 and 64-bit ARMv8 ARM has added
> support for load-acquire and store-release atomic instructions. For the
> use in atomic instructions we can assume these only operate on the
> address passed to them.
>
> It is unlikely we will use them in the 32-bit port however I would like
> to know the expected semantics of these atomic functions to make sure
> we get them correct in the arm64 port.
> I have been advised by one of
> the ARM Linux kernel maintainers on the problems they have found using
> these instructions but have yet to determine what our atomic functions
> guarantee.

For FreeBSD the "reference doc" is atomic(9). It clearly states:

     The second variant of each operation includes a read memory barrier.
     This barrier ensures that the effects of this operation are completed
     before the effects of any later data accesses. As a result, the
     operation is said to have acquire semantics as it acquires a
     pseudo-lock requiring further operations to wait until it has
     completed. To denote this, the suffix ``_acq'' is inserted into the
     function name immediately prior to the ``_'' suffix. For example, to
     subtract two integers ensuring that any later writes will happen
     after the subtraction is performed, use atomic_subtract_acq_int().

     The third variant of each operation includes a write memory barrier.
     This ensures that all effects of all previous data accesses are
     completed before this operation takes place. As a result, the
     operation is said to have release semantics as it releases any
     pending data accesses to be completed before its operation is
     performed. To denote this, the suffix ``_rel'' is inserted into the
     function name immediately prior to the ``_'' suffix. For example, to
     add two long integers ensuring that all previous writes will happen
     first, use atomic_add_rel_long().

The bottom line of all this is that a read memory barrier ensures that
the effects of the operation you are performing (the load, in the case
of atomic_load_acq_int(), for example) are completed before any later
data accesses. "Data accesses" covers *all* operations: reads, writes,
etc. This is very different from what Linux assumes for its rmb()
barrier, for example, which orders only loads.

So for FreeBSD there is no _acq -> rmb() analogy and there is no
_rel -> wmb() analogy.

This must be kept firmly in mind when trying to optimize the atomic_*()
operations.
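The ordering that atomic(9) describes can be illustrated with a small
userland sketch. This is not the kernel API: it assumes C11
<stdatomic.h> and POSIX threads as stand-ins, on the understanding that
a C11 acquire load keeps later reads *and* writes from moving before it,
and a C11 release store keeps earlier reads *and* writes from moving
after it. The names payload, ready and run_handoff() are hypothetical
and exist only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical example state: plain data published through a flag. */
static int payload;        /* ordinary, non-atomic data            */
static atomic_int ready;   /* 0 = not published, 1 = published     */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /*
     * Release store: all earlier data accesses (here, the write to
     * payload) are completed before the flag becomes visible.
     */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *seen = arg;

    /*
     * Acquire load: no later data access, load or store, may be
     * performed before this load has completed.
     */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;
    *seen = payload;       /* ordered after the acquire load */
    return NULL;
}

/* Runs one producer/consumer hand-off and returns what was observed. */
int run_handoff(void)
{
    pthread_t p, c;
    int seen = -1;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Calling run_handoff() from a main() of your own should return the value
the producer wrote. If the two accesses to ready were relaxed instead,
nothing would order the payload write against the flag, which is the
kind of guarantee the _acq and _rel variants are documented to provide.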
Attilio

-- 
Peace can only be achieved by understanding - A. Einstein