Use C++11 atomics
This CL moves the Chrome code base to C++11 atomics instead of inline
assembly for the toolchains that support C++11's <atomic>. The patch is
constructed to modify as little code as possible, and will be followed up with
other patches to clean up the rest of the code base.
This change should allow compilers to emit better code when dealing with atomics
(LLVM has recently seen substantial improvements thanks to morisset's work),
make the code more portable, and eventually let us delete a bunch of code. In
subsequent changes, atomicity will be based on memory locations instead of
individual accesses, making the code less error-prone and easier to analyze.
This patch stems from a fix I recently made to V8 to build under NaCl and
PNaCl. This was gating V8's C++11 migration: they needed to move to the
LLVM-based NaCl toolchain, which didn't work with atomicops' inline assembly. V8
currently uses the __sync_* primitives for the NaCl/PNaCl build, which imply
sequential consistency and are therefore suboptimal. Before doing further work I
want to fix Chrome's own headers, and trickle these changes to V8.
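The cost of that fallback can be seen in a small sketch: the __sync_* builtins
always behave as if sequentially consistent, whereas C++11 lets a plain counter
bump request a weaker ordering when no synchronization rides on it (function
names below are illustrative, not from this CL):

```cpp
#include <atomic>

// Illustrative only: the __sync_* builtin always implies a full barrier,
// while the C++11 form can name a weaker ordering.
long BumpWithSync(long* counter) {
  return __sync_add_and_fetch(counter, 1);  // full barrier, always
}

long BumpRelaxed(std::atomic<long>* counter) {
  // fetch_add returns the previous value, so add the increment back to
  // mirror add_and_fetch's post-increment result.
  return counter->fetch_add(1, std::memory_order_relaxed) + 1;
}
```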
Future changes:
* The atomicops headers are copied in 8 locations [0] in the Chrome code base,
and not all of them are up to date. Those should all be updated in their
individual repositories (some are in third_party).
* The signature of functions should be updated to take an atomic location
instead of the pointer argument, and all call sites should be updated (there
are currently 127 such calls [1] in chromium/src in non-implementation and
non-test files).
* Atomic operations should be used directly.
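As a rough sketch of the second bullet, the post-migration signatures might
take the atomic location directly, with no casting. This is a hypothetical
shape for illustration, not part of this CL:

```cpp
#include <atomic>
#include <cstdint>

using Atomic32 = int32_t;

// Hypothetical post-migration API: the location is declared atomic at the
// call site, so the implementation no longer reinterprets raw pointers.
inline Atomic32 Acquire_Load(const std::atomic<Atomic32>& location) {
  return location.load(std::memory_order_acquire);
}

inline void Release_Store(std::atomic<Atomic32>& location, Atomic32 value) {
  location.store(value, std::memory_order_release);
}
```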
A few things worth noting:
* The atomicops_internals_portable.h file contains the main implementation, and
has extra notes on implementation choices.
* The CL removes x86 CPU features detection from the portable implementation:
- Because the headers are copied in a few places and the x86 CPU feature
detection uses a static global that parses cpuid, the
atomicops_internals_x86_gcc.cc file was only built once, but its struct
interface was declared external and used in protobuf and
tcmalloc. Removing the struct from Chrome's own base makes the linker sad
because of the two uses. This has two implications:
# Don't specialize atomics for SSE2 versus non SSE2. This is a non-issue
since Chrome requires a minimum of SSE2. The code is therefore faster
(skip a branch).
# The code doesn't detect the AMD Opteron Rev E memory barrier bug from
AMD errata 147 [2]. This bug affects Opterons 1xx/2xx/8xx that shipped
in 2003-2004. In general compilers should work around this bug, not
library code. GCC has had this workaround since 2009 [3], but it is
still an open bug in LLVM [4]. See comment #27 on the code review: this
CPU is used by 0.02% of Chrome users, and for the bug to manifest those
users would have to hit the race with a compiler that lacks the
workaround. This also makes the code faster by skipping a branch.
* The CL moves the declaration of the x86 CPU features struct to atomicops.h,
and matches its fields with the ones duplicated across the code base. This
is transitional: it fixes a link failure because tcmalloc relied on the
x86_gcc.{h,cc} declaration and implementation, and it fixes an ODR violation
where the struct didn't always have 3 members (and the other members were
sometimes accessed, uninitialized).
* tsan already rewrites all atomics to its own primitives, so its header is
now unnecessary. The implementation takes care to detect and error
out if that turns out not to be true.
* MemoryBarrier works around a libstdc++ bug in older versions [5]. The
workaround is exactly what a non-buggy libstdc++ does: call out to the
compiler's builtin.
* The assembly generated by GCC 4.8 and LLVM 3.6 for x86-32/x86-64/ARM when
using this portable implementation can be found in [6].
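Since the portable header reinterprets plain Atomic32* as
std::atomic<Atomic32>*, its soundness rests on layout compatibility. A minimal
standalone version of the guard, mirroring the static_assert in the new file:

```cpp
#include <atomic>
#include <cstdint>

using Atomic32 = int32_t;

// The transitional cast from Atomic32* to std::atomic<Atomic32>* is only
// sound when the two types share size (and, in practice, representation).
// If this ever fires on some toolchain, the portable header must not be
// used there.
static_assert(sizeof(std::atomic<Atomic32>) == sizeof(Atomic32),
              "incompatible 32-bit atomic layout");
```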
[0]: find . -name "atomicops.h"
./base/atomicops.h
./v8/src/base/atomicops.h
./native_client_sdk/src/libraries/sdk_util/atomicops.h
./third_party/webrtc/base/atomicops.h
./third_party/tcmalloc/chromium/src/base/atomicops.h
./third_party/tcmalloc/vendor/src/base/atomicops.h
./third_party/re2/util/atomicops.h
./third_party/protobuf/src/google/protobuf/stubs/atomicops.h
[1]: git grep Barrier_ | grep -v atomicops | grep -v unittest | wc -l
[2]: http://support.amd.com/us/Processor_TechDocs/25759.pdf
[3]: https://gcc.gnu.org/ml/gcc-patches/2009-10/txt00046.txt
[4]: http://llvm.org/bugs/show_bug.cgi?id=5934
[5]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51038
[6]: https://code.google.com/p/chromium/issues/detail?id=423074
TEST=ninja -C out/Release base_unittests
TEST=trybots
BUG=246514
BUG=234215
BUG=420970
BUG=423074
R= [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Review URL: https://codereview.chromium.org/636783002
Cr-Commit-Position: refs/heads/master@{#299485}
diff --git a/base/atomicops_internals_portable.h b/base/atomicops_internals_portable.h
new file mode 100644
index 0000000..b25099f
--- /dev/null
+++ b/base/atomicops_internals_portable.h
@@ -0,0 +1,227 @@
+// Copyright (c) 2014 The Chromium Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style license that can be
+// found in the LICENSE file.
+
+// This file is an internal atomic implementation, use atomicops.h instead.
+//
+// This implementation uses C++11 atomics' member functions. The code base is
+// currently written assuming atomicity revolves around accesses instead of
+// C++11's memory locations. The burden is on the programmer to ensure that all
+// memory locations accessed atomically are never accessed non-atomically (tsan
+// should help with this).
+//
+// TODO(jfb) Modify the atomicops.h API and user code to declare atomic
+// locations as truly atomic. See the static_assert below.
+//
+// Of note in this implementation:
+// * All NoBarrier variants are implemented as relaxed.
+// * All Barrier variants are implemented as sequentially-consistent.
+// * Compare exchange's failure ordering is always the same as the success one
+// (except for release, which fails as relaxed): using a weaker ordering is
+// only valid under certain uses of compare exchange.
+// * Acquire store doesn't exist in the C11 memory model; it is instead
+//   implemented as a relaxed store followed by a sequentially consistent
+//   fence.
+// * Release load doesn't exist in the C11 memory model; it is instead
+//   implemented as a sequentially consistent fence followed by a relaxed load.
+// * Atomic increment is expected to return the post-incremented value, whereas
+// C11 fetch add returns the previous value. The implementation therefore
+// needs to increment twice (which the compiler should be able to detect and
+// optimize).
+
+#ifndef BASE_ATOMICOPS_INTERNALS_PORTABLE_H_
+#define BASE_ATOMICOPS_INTERNALS_PORTABLE_H_
+
+#include <atomic>
+
+namespace base {
+namespace subtle {
+
+// This implementation is transitional and maintains the original API for
+// atomicops.h. This requires casting memory locations to the atomic types, and
+// assumes that the API and the C++11 implementation are layout-compatible,
+// which isn't true for all implementations or hardware platforms. The static
+// assertion should detect this issue; were it to fire, this header
+// shouldn't be used.
+//
+// TODO(jfb) If this header manages to stay committed then the API should be
+// modified, and all call sites updated.
+typedef volatile std::atomic<Atomic32>* AtomicLocation32;
+static_assert(sizeof(*(AtomicLocation32) nullptr) == sizeof(Atomic32),
+ "incompatible 32-bit atomic layout");
+
+inline void MemoryBarrier() {
+#if defined(__GLIBCXX__)
+ // Work around libstdc++ bug 51038 where atomic_thread_fence was declared but
+ // not defined, leading to the linker complaining about undefined references.
+ __atomic_thread_fence(std::memory_order_seq_cst);
+#else
+ std::atomic_thread_fence(std::memory_order_seq_cst);
+#endif
+}
+
+inline Atomic32 NoBarrier_CompareAndSwap(volatile Atomic32* ptr,
+ Atomic32 old_value,
+ Atomic32 new_value) {
+ ((AtomicLocation32)ptr)
+ ->compare_exchange_strong(old_value,
+ new_value,
+ std::memory_order_relaxed,
+ std::memory_order_relaxed);
+ return old_value;
+}
+
+inline Atomic32 NoBarrier_AtomicExchange(volatile Atomic32* ptr,
+ Atomic32 new_value) {
+ return ((AtomicLocation32)ptr)
+ ->exchange(new_value, std::memory_order_relaxed);
+}
+
+inline Atomic32 NoBarrier_AtomicIncrement(volatile Atomic32* ptr,
+ Atomic32 increment) {
+ return increment +
+ ((AtomicLocation32)ptr)
+ ->fetch_add(increment, std::memory_order_relaxed);
+}
+
+inline Atomic32 Barrier_AtomicIncrement(volatile Atomic32* ptr,
+ Atomic32 increment) {
+ return increment + ((AtomicLocation32)ptr)->fetch_add(increment);
+}
+
+inline Atomic32 Acquire_CompareAndSwap(volatile Atomic32* ptr,
+ Atomic32 old_value,
+ Atomic32 new_value) {
+ ((AtomicLocation32)ptr)
+ ->compare_exchange_strong(old_value,
+ new_value,
+ std::memory_order_acquire,
+ std::memory_order_acquire);
+ return old_value;
+}
+
+inline Atomic32 Release_CompareAndSwap(volatile Atomic32* ptr,
+ Atomic32 old_value,
+ Atomic32 new_value) {
+ ((AtomicLocation32)ptr)
+ ->compare_exchange_strong(old_value,
+ new_value,
+ std::memory_order_release,
+ std::memory_order_relaxed);
+ return old_value;
+}
+
+inline void NoBarrier_Store(volatile Atomic32* ptr, Atomic32 value) {
+ ((AtomicLocation32)ptr)->store(value, std::memory_order_relaxed);
+}
+
+inline void Acquire_Store(volatile Atomic32* ptr, Atomic32 value) {
+ ((AtomicLocation32)ptr)->store(value, std::memory_order_relaxed);
+ MemoryBarrier();
+}
+
+inline void Release_Store(volatile Atomic32* ptr, Atomic32 value) {
+ ((AtomicLocation32)ptr)->store(value, std::memory_order_release);
+}
+
+inline Atomic32 NoBarrier_Load(volatile const Atomic32* ptr) {
+ return ((AtomicLocation32)ptr)->load(std::memory_order_relaxed);
+}
+
+inline Atomic32 Acquire_Load(volatile const Atomic32* ptr) {
+ return ((AtomicLocation32)ptr)->load(std::memory_order_acquire);
+}
+
+inline Atomic32 Release_Load(volatile const Atomic32* ptr) {
+ MemoryBarrier();
+ return ((AtomicLocation32)ptr)->load(std::memory_order_relaxed);
+}
+
+#if defined(ARCH_CPU_64_BITS)
+
+typedef volatile std::atomic<Atomic64>* AtomicLocation64;
+static_assert(sizeof(*(AtomicLocation64) nullptr) == sizeof(Atomic64),
+ "incompatible 64-bit atomic layout");
+
+inline Atomic64 NoBarrier_CompareAndSwap(volatile Atomic64* ptr,
+ Atomic64 old_value,
+ Atomic64 new_value) {
+ ((AtomicLocation64)ptr)
+ ->compare_exchange_strong(old_value,
+ new_value,
+ std::memory_order_relaxed,
+ std::memory_order_relaxed);
+ return old_value;
+}
+
+inline Atomic64 NoBarrier_AtomicExchange(volatile Atomic64* ptr,
+ Atomic64 new_value) {
+ return ((AtomicLocation64)ptr)
+ ->exchange(new_value, std::memory_order_relaxed);
+}
+
+inline Atomic64 NoBarrier_AtomicIncrement(volatile Atomic64* ptr,
+ Atomic64 increment) {
+ return increment +
+ ((AtomicLocation64)ptr)
+ ->fetch_add(increment, std::memory_order_relaxed);
+}
+
+inline Atomic64 Barrier_AtomicIncrement(volatile Atomic64* ptr,
+ Atomic64 increment) {
+ return increment + ((AtomicLocation64)ptr)->fetch_add(increment);
+}
+
+inline Atomic64 Acquire_CompareAndSwap(volatile Atomic64* ptr,
+ Atomic64 old_value,
+ Atomic64 new_value) {
+ ((AtomicLocation64)ptr)
+ ->compare_exchange_strong(old_value,
+ new_value,
+ std::memory_order_acquire,
+ std::memory_order_acquire);
+ return old_value;
+}
+
+inline Atomic64 Release_CompareAndSwap(volatile Atomic64* ptr,
+ Atomic64 old_value,
+ Atomic64 new_value) {
+ ((AtomicLocation64)ptr)
+ ->compare_exchange_strong(old_value,
+ new_value,
+ std::memory_order_release,
+ std::memory_order_relaxed);
+ return old_value;
+}
+
+inline void NoBarrier_Store(volatile Atomic64* ptr, Atomic64 value) {
+ ((AtomicLocation64)ptr)->store(value, std::memory_order_relaxed);
+}
+
+inline void Acquire_Store(volatile Atomic64* ptr, Atomic64 value) {
+ ((AtomicLocation64)ptr)->store(value, std::memory_order_relaxed);
+ MemoryBarrier();
+}
+
+inline void Release_Store(volatile Atomic64* ptr, Atomic64 value) {
+ ((AtomicLocation64)ptr)->store(value, std::memory_order_release);
+}
+
+inline Atomic64 NoBarrier_Load(volatile const Atomic64* ptr) {
+ return ((AtomicLocation64)ptr)->load(std::memory_order_relaxed);
+}
+
+inline Atomic64 Acquire_Load(volatile const Atomic64* ptr) {
+ return ((AtomicLocation64)ptr)->load(std::memory_order_acquire);
+}
+
+inline Atomic64 Release_Load(volatile const Atomic64* ptr) {
+ MemoryBarrier();
+ return ((AtomicLocation64)ptr)->load(std::memory_order_relaxed);
+}
+
+#endif // defined(ARCH_CPU_64_BITS)
+}  // namespace subtle
+}  // namespace base
+
+#endif // BASE_ATOMICOPS_INTERNALS_PORTABLE_H_