CppCon 2019 has ended
Wednesday, September 18 • 14:00 - 15:00
Abusing Your Memory Model for Fun and Profit

The most efficient concurrent C++ data structures used in the wild today usually achieve breakneck performance by either constraining their workload or constraining correctness to a particular memory model. The audience will learn about the Wild West of abusing memory models for performance and simplification through real-world examples. Non-blocking data structures and their benefits often come at the cost of increased latency because they require additional complexity in the common case. There are plenty of exceptions to this if the requirements of the data structure are relaxed, such as supporting only a bounded level of write or read concurrency, or if correctness is constrained to a particular memory model. For this reason, well-designed specialized non-blocking data structures guarantee improved resiliency, throughput and latency in all cases compared to alternatives relying on traditional concurrency primitives. Specialized concurrent structures are commonplace in the Linux kernel and other performance-critical systems.

You will learn about concepts foundational to understanding your underlying hardware's memory model and to abusing memory models for fun and profit:
* Cache coherency
* Store buffers
* Pipelines and speculative execution

This talk provides real-world examples that exploit the x86-TSO model to their advantage:
* A general technique to turn literally any open-addressed hash table into a concurrent hash table with low to negligible (near 0) cost. The transformation makes your hash table wait-free for writers and mostly wait-free for readers (lock-free in hypothetical worst cases) and is practical for languages such as C++. The mechanism is superior to the previously popular Azul lock-free hash table and, even more importantly, practical for any non-garbage-collected environment. The overhead is negligible on TSO and low on non-TSO.
* Blazingly fast event counters. An extremely efficient replacement for condition variables is introduced that is faster than any other alternative. It is implemented without requiring any heavyweight atomic operations on the fast path by exploiting properties of the x86-TSO model.
* Scalable memory management: Exploit the ordering and visibility constraints of the underlying architecture for blazingly fast implementations of RCU and other safe memory reclamation schemes.
* and more.
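
As a rough illustration of the kind of fast path the examples above rely on, here is a minimal sketch of an insert-only, single-writer, multi-reader open-addressed table. It is not the speakers' implementation: the class name `SwmrTable`, the reserved empty key 0, the hash constant, and the fixed capacity are all assumptions made for this sketch, and deletion and resizing (the hard parts) are left out. The point it shows is that the writer publishes a slot with a release store and readers check it with an acquire load; on x86-TSO both compile to plain `mov` instructions, with no locked instruction or fence on either path.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical sketch: insert-only table, ONE writer thread, any number of readers.
// Key 0 is reserved to mean "empty slot" (an assumption of this sketch).
template <std::size_t Capacity>
class SwmrTable {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

    struct Slot {
        std::atomic<std::uint64_t> key{0};
        std::atomic<std::uint64_t> value{0};
    };
    std::array<Slot, Capacity> slots_;

    // Simple multiplicative hash; any reasonable hash would do here.
    static std::size_t hash(std::uint64_t k) {
        return static_cast<std::size_t>((k * 0x9E3779B97F4A7C15ull) >> 32) & (Capacity - 1);
    }

public:
    // Writer only. Returns false if the key already exists or the table is full.
    bool insert(std::uint64_t k, std::uint64_t v) {
        assert(k != 0 && "key 0 is reserved for empty slots");
        std::size_t idx = hash(k);
        for (std::size_t i = 0; i < Capacity; ++i, idx = (idx + 1) & (Capacity - 1)) {
            Slot& s = slots_[idx];
            std::uint64_t cur = s.key.load(std::memory_order_relaxed);
            if (cur == k) return false;
            if (cur == 0) {
                // Write the value first, then publish the key with release
                // semantics so a reader that sees the key also sees the value.
                s.value.store(v, std::memory_order_relaxed);
                s.key.store(k, std::memory_order_release);
                return true;
            }
        }
        return false;
    }

    // Any thread. Bounded linear probe, so the lookup never waits on the writer.
    std::optional<std::uint64_t> find(std::uint64_t k) const {
        std::size_t idx = hash(k);
        for (std::size_t i = 0; i < Capacity; ++i, idx = (idx + 1) & (Capacity - 1)) {
            const Slot& s = slots_[idx];
            std::uint64_t cur = s.key.load(std::memory_order_acquire);
            if (cur == 0) return std::nullopt;  // hit an empty slot: key is absent
            if (cur == k) return s.value.load(std::memory_order_relaxed);
        }
        return std::nullopt;
    }
};
```

Because slots are never reused in this sketch, a reader that observes a key can trust the value next to it. Supporting deletion or in-place updates would need extra machinery (for example a per-slot or per-table version counter), which is outside what this sketch attempts.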

Samy Al Bahra

CTO, Backtrace
Samy Al Bahra is the cofounder of Backtrace, where he is helping build a modern debugging platform for today’s complex applications. Prior to Backtrace, Samy was a principal engineer at AppNexus, where he played a lead role in the architecture and development of many mission-critical...

Paul Khuong

Vice President, Not Google
After toiling on his dissertation about mathematical optimisation methods for large-scale network design, Paul Khuong has spent the majority of his professional and hobbyist life reverse engineering and modernising C, C++, and Common Lisp legacy systems. When not engaged in code archaeology...

Wednesday September 18, 2019 14:00 - 15:00 MDT
Aurora C
  • Concurrency