From: Benjamin Kaduk
Date: Thu, 15 Oct 2015 23:51:22 +0000 (UTC)
To: doc-committers@freebsd.org, svn-doc-all@freebsd.org, svn-doc-head@freebsd.org
Subject: svn commit: r47579 - head/en_US.ISO8859-1/htdocs/news/status

Author: bjk
Date: Thu Oct 15 23:51:21 2015
New Revision: 47579
URL: https://svnweb.freebsd.org/changeset/doc/47579

Log:
  Add the atomics report from kib

Modified:
  head/en_US.ISO8859-1/htdocs/news/status/report-2015-07-2015-09.xml

Modified: head/en_US.ISO8859-1/htdocs/news/status/report-2015-07-2015-09.xml
==============================================================================
--- head/en_US.ISO8859-1/htdocs/news/status/report-2015-07-2015-09.xml Thu Oct 15 23:18:59 2015 (r47578)
+++ head/en_US.ISO8859-1/htdocs/news/status/report-2015-07-2015-09.xml Thu Oct 15 23:51:21 2015 (r47579)
@@ -1207,4 +1207,147 @@

Atomics

Konstantin Belousov kib@FreeBSD.org
Alan Cox alc@FreeBSD.org
Bruce Evans bde@FreeBSD.org

Atomic operations serve two fundamental purposes. First, they are the building blocks for expressing synchronization algorithms in a single, machine-independent way using high-level languages. In essence, atomics abstract the different primitives supported by the various architectures on which &os; runs, making it easier to develop and reason about lockless code by hiding hardware-level details.


Second, atomics provide the barrier operations that allow software to control the effects on memory of out-of-order and speculative execution in modern processors, as well as of compiler optimizations. This capability is especially important to multithreaded software, such as the &os; kernel, when running on systems where multiple processors communicate through a shared main memory.


Each machine architecture defines a memory model, which specifies the possible effects on memory of out-of-order and speculative execution. More precisely, it specifies the extent to which the machine may visibly reorder memory accesses in order to optimize performance. Unfortunately, there are almost as many models as architectures. Some architectures, for instance IA32 or Sparcv9 TSO, are relatively strongly ordered. Others, like PowerPC or ARM, are very relaxed. In effect, atomics define a very relaxed abstract memory model for &os;'s machine-independent code that can be efficiently realized on any of these architectures.


However, most &os; development and testing still happens on x86 machines. Combined with x86's strongly ordered memory model, this leads to errors in the use of atomics, and of barriers in particular. In other words, the code is not properly written to &os;'s abstract memory model, but the strong ordering of the x86 architecture hides this fact. The architectures affected by code that uses atomics incorrectly are less popular or have limited availability, and the resulting bugs are hard to diagnose.


The goal of this project is to audit and upgrade the usage of lockless facilities, hopefully fixing bugs before they are observed in the wild.


&os; defines its own set of atomic operations, like many other operating systems. But unlike other operating systems, &os; models its atomics and barriers on the release consistency model, also known as the acquire/release model. This is the same model that is used by the C11 and C++11 language standards as well as the new 64-bit ARM architecture. Despite syntactic differences, C11 and &os; atomics share essentially the same semantics. Consequently, the many tutorials about the C11 memory model, and algorithms expressed with C11 atomics, can be reused under &os; with little effort.
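
The acquire/release pairing described above can be illustrated with a minimal message-passing sketch in portable C11. This is not code from the &os; tree: the names payload, ready, producer and message_passing_demo are hypothetical, and POSIX threads are assumed to be available. Under &os;'s own API the two marked accesses would correspond to the atomic_store_rel() and atomic_load_acq() families.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;		/* plain data, published through the flag */
static atomic_int ready;	/* 0 = not published, 1 = published */

static void *
producer(void *arg)
{
	(void)arg;
	payload = 42;
	/* Release: the payload write cannot be reordered after this store. */
	atomic_store_explicit(&ready, 1, memory_order_release);
	return (NULL);
}

/* Hypothetical driver: returns the payload as seen by the consumer. */
static int
message_passing_demo(void)
{
	pthread_t td;
	int seen;

	payload = 0;
	atomic_store_explicit(&ready, 0, memory_order_relaxed);
	if (pthread_create(&td, NULL, producer, NULL) != 0)
		return (-1);
	/* Acquire: pairs with the release store in producer(). */
	while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
		;
	seen = payload;		/* guaranteed to observe 42 */
	pthread_join(td, NULL);
	return (seen);
}
```

Calling message_passing_demo() from main() yields 42 once the consumer has observed the flag. On x86 the same code usually also works with both accesses weakened to relaxed, which is exactly the kind of hidden bug the report describes.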


One facility of C11 that was missing from &os; atomics was fences. Fences are bidirectional barrier operations which could not be expressed by the existing atomic+barrier accesses. They were added in r285283.
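
As a rough illustration of what a fence adds, the following C11 sketch publishes two values with relaxed stores and a single release fence, instead of attaching the barrier to one particular access. All names here are hypothetical and POSIX threads are assumed. The &os; counterparts of the two fences are, to the best of our reading of atomic(9), atomic_thread_fence_rel() and atomic_thread_fence_acq().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int data_a, data_b, flag;

static void *
fence_writer(void *arg)
{
	(void)arg;
	atomic_store_explicit(&data_a, 1, memory_order_relaxed);
	atomic_store_explicit(&data_b, 2, memory_order_relaxed);
	/* Release fence: orders both data stores before the flag store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&flag, 1, memory_order_relaxed);
	return (NULL);
}

/* Hypothetical driver: returns data_a + data_b as seen by the reader. */
static int
fence_demo(void)
{
	pthread_t td;
	int sum;

	atomic_store(&data_a, 0);
	atomic_store(&data_b, 0);
	atomic_store(&flag, 0);
	if (pthread_create(&td, NULL, fence_writer, NULL) != 0)
		return (-1);
	while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
		;
	/* Acquire fence: orders the flag load before both data loads. */
	atomic_thread_fence(memory_order_acquire);
	sum = atomic_load_explicit(&data_a, memory_order_relaxed) +
	    atomic_load_explicit(&data_b, memory_order_relaxed);
	pthread_join(td, NULL);
	return (sum);
}
```

Here fence_demo() returns 3: once the reader has seen the flag and executed its acquire fence, both data stores are visible to it.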


Due to the strong memory model implemented by x86 processors, atomic_load_acq() and atomic_store_rel() can be implemented by plain load and store instructions with only a compiler barrier; no additional ordering constraints are required. This simplification of atomic_store_rel() was done some time ago in r236456. The atomic_load_acq() change was done in r285934, after careful review of all its uses in the kernel and user-space to ensure that no hidden dependency on a stronger implementation was left.
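
A minimal sketch of what such an x86-only implementation can look like is given below. It is an illustration under assumptions, not the code from the commits named above: the names compiler_membar, x86_load_acq_32 and x86_store_rel_32 are hypothetical, the inline assembly uses GCC/Clang syntax, and the ordering argument holds only on x86, where the hardware already keeps loads ordered with later accesses and stores ordered with earlier ones.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier: emits no machine instruction (GCC/Clang syntax). */
#define	compiler_membar()	__asm__ __volatile__("" : : : "memory")

/* Acquire load on x86: a plain load, then stop compiler reordering. */
static inline uint32_t
x86_load_acq_32(volatile uint32_t *p)
{
	uint32_t v;

	v = *p;
	compiler_membar();
	return (v);
}

/* Release store on x86: stop compiler reordering, then a plain store. */
static inline void
x86_store_rel_32(volatile uint32_t *p, uint32_t v)
{
	compiler_membar();
	*p = v;
}
```

On a weakly ordered architecture the same two functions would need real barrier instructions, which is why machine-independent code should rely on the abstract acquire/release semantics only.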


The only reordering of memory accesses that is allowed on x86 is that loads may be reordered with older stores to different locations. This results from the use of store buffers at the micro-architectural level. So, to ensure sequentially consistent behavior on x86, a store/load barrier needs to be issued, which can be done with an MFENCE instruction or by any locked RMW operation. The latter approach is recommended by the optimization guides from Intel and AMD. It was noted that careful selection of the scratch memory location, which is modified by the locked RMW operation, can reduce the cost of the barrier by avoiding false data dependencies. The corresponding optimization was committed in r284901.
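
The store/load reordering described above is the classic "store buffering" pattern. The C11 sketch below, with hypothetical names and POSIX threads assumed, places a sequentially consistent fence between the store and the load in each thread. On x86, compilers typically lower such a fence to MFENCE or a locked RMW instruction. With the fences in place, the outcome in which both threads read 0 is forbidden by the C11 model, so the counter returned by count_both_zero() should stay at 0. A run like this can only fail to find a violation; it does not prove correctness.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;
static int r0, r1;	/* results, read only after both threads are joined */

static void *
thread0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	/* Store/load barrier: the load below may not pass the store above. */
	atomic_thread_fence(memory_order_seq_cst);
	r0 = atomic_load_explicit(&y, memory_order_relaxed);
	return (NULL);
}

static void *
thread1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	r1 = atomic_load_explicit(&x, memory_order_relaxed);
	return (NULL);
}

/* Hypothetical driver: counts runs in which both threads read 0. */
static int
count_both_zero(int iterations)
{
	pthread_t a, b;
	int bad = 0;

	for (int i = 0; i < iterations; i++) {
		atomic_store(&x, 0);
		atomic_store(&y, 0);
		if (pthread_create(&a, NULL, thread0, NULL) != 0)
			return (-1);
		if (pthread_create(&b, NULL, thread1, NULL) != 0) {
			pthread_join(a, NULL);
			return (-1);
		}
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		if (r0 == 0 && r1 == 0)
			bad++;
	}
	return (bad);
}
```

Removing the two fences makes the both-zero outcome legal, and it can then be observed even on x86, since this is the one reordering that the x86 store buffers permit.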


The atomic(9) man page was often a cause of confusion due to both erroneous and ambiguous statements. The most significant of these issues were addressed in changes r286513 and r286784.


Some examples of our preemptive fixes to the misuse of atomics that would only become evident on weakly ordered machines are:

  • A very important lockless algorithm, used in both the kernel and libc, is the timekeeping functionality implemented in kern/kern_tc.c and the userspace __vdso_gettimeofday. This algorithm relied on x86 TSO behavior. It was fixed in r284178 and r285286.
  • The kern/kern_intr.c lockless updates to the it_need indicator were corrected in r285607.
  • kern/subr_smp.c:smp_rendezvous_cpus() did not guarantee that updates done on other CPUs were visible to the caller. This was fixed in r285771.
  • The pthread_once() implementation was fixed in r287556 to include the barriers it was missing.
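
To give a feel for the kind of lockless algorithm the first item refers to, here is a simplified generation-count (seqlock-style) sketch in C11. It is loosely modelled on the idea of a generation field guarding a time snapshot and is not the kern_tc.c code: the structure timehands_sketch, its fields and the th_* functions are hypothetical, a single writer is assumed, and the real code is considerably more involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct timehands_sketch {
	atomic_uint gen;		/* 0 means "update in progress" */
	atomic_uint_fast64_t sec;
	atomic_uint_fast64_t frac;
};

static void
th_init(struct timehands_sketch *th)
{
	atomic_init(&th->gen, 1);
	atomic_init(&th->sec, 0);
	atomic_init(&th->frac, 0);
}

/* Single writer: invalidate, update the data, then publish a new gen. */
static void
th_write(struct timehands_sketch *th, uint64_t sec, uint64_t frac)
{
	unsigned ogen;

	ogen = atomic_load_explicit(&th->gen, memory_order_relaxed);
	atomic_store_explicit(&th->gen, 0, memory_order_relaxed);
	/* Keep the invalidation ordered before the data stores. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&th->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&th->frac, frac, memory_order_relaxed);
	if (++ogen == 0)	/* skip the reserved value 0 on wrap */
		ogen = 1;
	/* Release: the data stores are visible before the new gen. */
	atomic_store_explicit(&th->gen, ogen, memory_order_release);
}

/* Reader: retry until a stable, non-zero generation brackets the reads. */
static unsigned
th_read(struct timehands_sketch *th, uint64_t *sec, uint64_t *frac)
{
	unsigned gen;

	do {
		gen = atomic_load_explicit(&th->gen, memory_order_acquire);
		*sec = atomic_load_explicit(&th->sec, memory_order_relaxed);
		*frac = atomic_load_explicit(&th->frac, memory_order_relaxed);
		/* Keep the data loads ordered before the re-check of gen. */
		atomic_thread_fence(memory_order_acquire);
	} while (gen == 0 ||
	    gen != atomic_load_explicit(&th->gen, memory_order_relaxed));
	return (gen);
}
```

On x86 the fences and the acquire/release annotations cost nothing beyond compiler barriers, so a version without them appears to work there. On weakly ordered machines, a reader could then pair new data with an old generation, or the reverse.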
The FreeBSD Foundation (Konstantin Belousov's work)