Conditions for Strong Synchronization in Concurrent Data Types

Maged Michael, IBM TJ Watson Research Center
Joint work with Martin Vechev and Vijay Saraswat
Joint work with Hagit Attiya, Rachid Guerraoui, Danny Hendler, Petr Kuznetsov, and Martin Vechev
Dagstuhl Seminar on Consistency in Distributed Systems, February 2013

Idempotent Work Stealing
Joint work with Martin Vechev and Vijay Saraswat

Work Stealing
Work stealing is a load-balancing technique. Three operations:
- Put a task in own work set
- Take a task from own work set
- Steal a task from another thread's work set
[Figure: threads T1 and T2 with their work sets; labels Put, Take, Steal, and "No More Work".]

Idempotent Work Stealing
Observation: Some application semantics can tolerate the repetition of tasks. Such tasks are idempotent tasks.
- Conventional work stealing: each inserted task is eventually extracted exactly once.
- Idempotent work stealing: each inserted task is eventually extracted at least once.
Example: a take and a steal extract the same task t from T1's work set, so t is inserted once but extracted twice.
[Figure: T1 takes t from T1's work set while T2 steals the same t.]

Work Stealing Algorithms
Arora+ 1998, Frigo+ 1998, Hendler+ 2002, 2006, Chase-Lev 2005
Prior algorithms require a store-load ordering in the owner's critical path.
Example from Chase-Lev 2005, the owner's take operation:
    public Object popBottom() {
        long b = this.bottom;
        CircularArray a = this.activeArray;
        b = b - 1;
        this.bottom = b;      // store
        long t = this.top;    // load
        ...
Store-load fence instructions and atomic instructions are typically slower than regular memory access instructions.

Opportunity
- Design idempotent work stealing to exploit relaxed application semantics.
- The owner's critical path uses no store-load fences and no atomic operations.

Idempotent Work Stealing Algorithms
Guarantees:
- No lost tasks
- No garbage tasks extracted
- Owner never extracts the same task twice
Three algorithms with three extraction policies:
- LIFO: owner and thieves extract tasks from the tail
- FIFO: owner and thieves extract tasks from the head
- Double-ended: owner extracts from the tail; thieves extract from the head
[Figure: three task arrays with head (H) and tail (T) marked, showing where Put, Take, and Steal operate under each policy.]

LIFO Algorithm
Structures:
    anchor: <integer,integer>   // <tail,tag>, stored in one packed word (e.g., tail = 2, tag = 98765)
    tasks: task array
Owner operations (Put, Take): no StoreLoad order and no atomic ops.
Put(task)
    1 <t,g> := anchor
    2 if (t == tasks.size) EXPAND...
    3 tasks.array[t] := task
    4 anchor := <t+1,g+1>
Take()
    1 <t,g> := anchor
    2 if (t == 0) return EMPTY
    3 task := tasks.array[t-1]
    4 anchor := <t-1,g>
    5 return task
Steal()   // only thieves need atomic ops
    1 <t,g> := anchor
    2 if (t == 0) return EMPTY
      Order read in 1 before read in 3
    3 a := tasks
    4 task := a.array[t-1]
      Order read in 4 before CAS in 5
    5 if !CAS(anchor,<t,g>,<t-1,g>) CONFLICT...
    6 return task
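
The LIFO pseudocode above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `IdempotentLifo`, the 32/32-bit split of the packed anchor, the fixed capacity (an exception stands in for the elided EXPAND step), the retry loop on CONFLICT, and the use of `null` for EMPTY are all hypothetical choices. Acquire/release access modes stand in for the ordering notes on the slide; the only atomic read-modify-write is the thief's `compareAndSet`.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Sketch of the LIFO idempotent work-stealing set (names and layout are assumptions). */
final class IdempotentLifo<T> {
    // Packed anchor <tail,tag>: tail in the high 32 bits, tag in the low 32 bits (assumed layout).
    private final AtomicLong anchor = new AtomicLong(pack(0, 0));
    private final AtomicReferenceArray<T> tasks;

    IdempotentLifo(int capacity) {
        tasks = new AtomicReferenceArray<>(capacity);
    }

    private static long pack(int tail, int tag) {
        return ((long) tail << 32) | (tag & 0xFFFFFFFFL);
    }
    private static int tail(long a) { return (int) (a >>> 32); }
    private static int tag(long a)  { return (int) a; }

    /** Owner only: no StoreLoad fence and no atomic read-modify-write. */
    void put(T task) {
        long a = anchor.getAcquire();
        int t = tail(a), g = tag(a);
        if (t == tasks.length()) {
            // Assumption: the slide's EXPAND step is elided, so this sketch just fails.
            throw new IllegalStateException("full");
        }
        tasks.setRelease(t, task);
        // Release store: the task write above becomes visible before the new anchor.
        anchor.setRelease(pack(t + 1, g + 1));
    }

    /** Owner only: returns null for EMPTY. */
    T take() {
        long a = anchor.getAcquire();
        int t = tail(a), g = tag(a);
        if (t == 0) return null;
        T task = tasks.getAcquire(t - 1);
        anchor.setRelease(pack(t - 1, g));   // plain overwrite; may hide a concurrent steal
        return task;
    }

    /** Thieves: returns null for EMPTY; retrying on CONFLICT is an assumption. */
    T steal() {
        while (true) {
            long a = anchor.getAcquire();            // read in 1 ordered before the task read
            int t = tail(a), g = tag(a);
            if (t == 0) return null;
            T task = tasks.getAcquire(t - 1);        // task read ordered before the CAS
            if (anchor.compareAndSet(a, pack(t - 1, g))) return task;
        }
    }
}
```

In a single-threaded run, putting 1, 2, 3 and then calling `take()`, `steal()`, `take()` yields 3, 2, 1, since both owner and thieves extract from the tail.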

LIFO Algorithm: Losing Steals
How steals may be lost:
Put(Y)
    1 <t,g> := anchor            // t,g == 2,100
    2 if (t == capacity) EXPAND...
    3 tasks[t] := task           // tasks[2] == Y
      Order write in 3 before write in 4
    4 anchor := <t+1,g+1>        // anchor == 3,101
[Figure: the task array holds W and X with tail 2 and tag 100; a thief steals X, moving the tail to 1; the owner then completes Put(Y), setting the tail to 3 and the tag to 101.]
The owner read the anchor before the thief's CAS and overwrote it afterwards, so X is back in the work set: the steal of X is lost.
Similarly, steals concurrent with a slow take may be lost.
But steals not concurrent with owner operations are never lost.

FIFO Algorithm
Structures:
    head: integer
    tail: integer
    tasks: task array
No packed tags. No size limit.
Owner operations (Put, Take): no StoreLoad order and no atomic ops.
Put(task)
    1 h := head
    2 t := tail
    3 if (t == h + tasks.size) EXPAND...
    4 tasks.array[t % tasks.size] := task
    5 tail := t + 1
Take()
    1 h := head
    2 t := tail
    3 if (t == h) return EMPTY
    4 task := tasks.array[h % tasks.size]
    5 head := h + 1
    6 return task
Steal()   // only thieves need atomic ops
    1 h := head
      Order read in 1 before read in 2
    2 t := tail
    3 if (t == h) return EMPTY
      Order read in 1 before read in 4
    4 a := tasks
    5 task := a.array[h % a.size]
      Order read in 5 before CAS in 6
    6 if !CAS(head,h,h+1) CONFLICT...
    7 return task
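
A Java sketch of the FIFO variant follows the same pattern. As before, this is illustrative only: the class name `IdempotentFifo`, the fixed capacity replacing the elided EXPAND step, the retry on CONFLICT, and `null` for EMPTY are assumptions. Because `head` and `tail` are separate 64-bit counters, no packed tag is needed.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Sketch of the FIFO idempotent work-stealing set (names are assumptions). */
final class IdempotentFifo<T> {
    private final AtomicLong head = new AtomicLong();   // next index to extract
    private final AtomicLong tail = new AtomicLong();   // next index to fill
    private final AtomicReferenceArray<T> tasks;

    IdempotentFifo(int capacity) {
        tasks = new AtomicReferenceArray<>(capacity);
    }

    private int slot(long index) {
        return (int) (index % tasks.length());
    }

    /** Owner only: no StoreLoad fence and no atomic read-modify-write. */
    void put(T task) {
        long h = head.getAcquire();
        long t = tail.getAcquire();
        if (t == h + tasks.length()) {
            // Assumption: the slide's EXPAND step is elided, so this sketch just fails.
            throw new IllegalStateException("full");
        }
        tasks.setRelease(slot(t), task);
        tail.setRelease(t + 1);              // publish the task before the new tail
    }

    /** Owner only: extracts from the head; returns null for EMPTY. */
    T take() {
        long h = head.getAcquire();
        long t = tail.getAcquire();
        if (t == h) return null;
        T task = tasks.getAcquire(slot(h));
        head.setRelease(h + 1);              // plain overwrite; may hide a concurrent steal
        return task;
    }

    /** Thieves: extract from the head; retrying on CONFLICT is an assumption. */
    T steal() {
        while (true) {
            long h = head.getAcquire();      // read of head ordered before read of tail
            long t = tail.getAcquire();
            if (t == h) return null;
            T task = tasks.getAcquire(slot(h));   // task read ordered before the CAS
            if (head.compareAndSet(h, h + 1)) return task;
        }
    }
}
```

Here owner and thieves both extract the oldest task, so putting 1, 2, 3 and then calling `take()`, `steal()`, `take()` yields 1, 2, 3 in a single-threaded run.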

Double-Ended Algorithm
Structures:
    anchor: <integer,integer,integer>   // <head,size,tag>, stored in one packed word (e.g., head = 1, size = 2, tag = 9876); limited max size
    tasks: task array
Tasks occupy the array positions from head up to head + size.
Owner operations (Put, Take): no StoreLoad order and no atomic ops.
Put(task)
    1 <h,s,g> := anchor
    2 if (s == tasks.size) EXPAND...
    3 tasks.array[(h+s) % tasks.size] := task
      Order write in 3 before write in 4
    4 anchor := <h,s+1,g+1>
Take()
    1 <h,s,g> := anchor
    2 if (s == 0) return EMPTY
    3 task := tasks.array[(h+s-1) % tasks.size]
    4 anchor := <h,s-1,g>
    5 return task
Steal()
    1 <h,s,g> := anchor
    2 if (s == 0) return EMPTY
      Order read in 1 before read in 3
    3 a := tasks
    4 task := a.array[h % a.size]
    5 h2 := (h+1) % MAXSIZE
      Order read in 4 before CAS in 6
    6 if !CAS(anchor,<h,s,g>,<h2,s-1,g>) CONFLICT
    7 return task
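
The double-ended variant packs three fields into the anchor. The Java sketch below is one hedged way to do that: the 20/20/24-bit layout, the class name `IdempotentDeque`, the power-of-two capacity requirement, the fixed capacity replacing EXPAND, the retry on CONFLICT, and `null` for EMPTY are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Sketch of the double-ended idempotent work-stealing set (layout is an assumption). */
final class IdempotentDeque<T> {
    // Assumed packed layout: head (20 bits) | size (20 bits) | tag (24 bits).
    private static final int FIELD_BITS = 20;
    private static final int TAG_BITS = 24;
    private static final int MAX_SIZE = 1 << FIELD_BITS;
    private static final long FIELD_MASK = MAX_SIZE - 1;
    private static final long TAG_MASK = (1L << TAG_BITS) - 1;

    private final AtomicLong anchor = new AtomicLong(pack(0, 0, 0));
    private final AtomicReferenceArray<T> tasks;

    IdempotentDeque(int capacity) {
        // A power-of-two capacity keeps h % capacity consistent when head wraps at MAX_SIZE.
        if (Integer.bitCount(capacity) != 1 || capacity >= MAX_SIZE) {
            throw new IllegalArgumentException("capacity must be a power of two below " + MAX_SIZE);
        }
        tasks = new AtomicReferenceArray<>(capacity);
    }

    private static long pack(int head, int size, long tag) {
        return ((long) head << (FIELD_BITS + TAG_BITS)) | ((long) size << TAG_BITS) | (tag & TAG_MASK);
    }
    private static int head(long a) { return (int) (a >>> (FIELD_BITS + TAG_BITS)); }
    private static int size(long a) { return (int) ((a >>> TAG_BITS) & FIELD_MASK); }
    private static long tag(long a) { return a & TAG_MASK; }

    /** Owner only: inserts at the tail (head + size). */
    void put(T task) {
        long a = anchor.getAcquire();
        int h = head(a), s = size(a);
        if (s == tasks.length()) {
            // Assumption: the slide's EXPAND step is elided, so this sketch just fails.
            throw new IllegalStateException("full");
        }
        tasks.setRelease((h + s) % tasks.length(), task);
        anchor.setRelease(pack(h, s + 1, tag(a) + 1));   // task write ordered before anchor write
    }

    /** Owner only: extracts from the tail; returns null for EMPTY. */
    T take() {
        long a = anchor.getAcquire();
        int h = head(a), s = size(a);
        if (s == 0) return null;
        T task = tasks.getAcquire((h + s - 1) % tasks.length());
        anchor.setRelease(pack(h, s - 1, tag(a)));
        return task;
    }

    /** Thieves: extract from the head; retrying on CONFLICT is an assumption. */
    T steal() {
        while (true) {
            long a = anchor.getAcquire();
            int h = head(a), s = size(a);
            if (s == 0) return null;
            T task = tasks.getAcquire(h % tasks.length());
            int h2 = (h + 1) % MAX_SIZE;
            if (anchor.compareAndSet(a, pack(h2, s - 1, tag(a)))) return task;
        }
    }
}
```

With this layout the owner works at the tail and thieves at the head: after putting 1, 2, 3, a `steal()` returns 1 while the next two `take()` calls return 3 and then 2.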

Conditions for Strong Synchronization
Joint work with Hagit Attiya, Rachid Guerraoui, Danny Hendler, Petr Kuznetsov, and Martin Vechev

Motivation
There are algorithms with good features except for their requirement of strong synchronization:
- Read After Write (RAW) order: StoreLoad order
- Atomic Write After Read (AWAR)
Strong synchronization is typically slower than regular instructions.
Are there conditions under which avoiding both RAW and AWAR is impossible?

Strong Non-Commutativity
Given a sequential specification $Spec$, a complete invocation $s_1$ of a method $m_1$ is strongly non-commutative (SNC) if there exist a method $m_2$ and histories $base$ and $s_2$ such that:
- $s_2$ is a complete invocation of $m_2$
- the processes executing $s_1$ and $s_2$ differ
- $base$ is a complete sequential history
- $base \in Spec$, $base \cdot s_1 \in Spec$, $base \cdot s_2 \in Spec$
- $base \cdot s_1 \cdot s_2 \notin Spec$, $base \cdot s_2 \cdot s_1 \notin Spec$
That is, $s_1$ influences $s_2$ AND $s_2$ influences $s_1$.
Any SNC invocation must contain strong synchronization.
Acknowledgement to Sebastian Burckhardt for suggesting the improved form of the SNC definition.
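
The definition above can be exercised mechanically for a small specification. The Java sketch below checks whether a given triple (base, s1, s2) is a witness for a set whose Add(v) returns true iff v was not already present. Everything here is a hypothetical illustration: the names `SncSetDemo`, `AddCall`, `inSpec`, and `isSncWitness` are invented, membership in Spec is decided by replaying the history on a `HashSet`, and the requirement that s1 and s2 run on different processes is assumed rather than modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Checks SNC witnesses for a set with Add(v) : boolean (illustrative names). */
final class SncSetDemo {
    /** One complete invocation of Add(v) together with the response it returned. */
    static final class AddCall {
        final int value;
        final boolean response;
        AddCall(int value, boolean response) {
            this.value = value;
            this.response = response;
        }
    }

    /** A sequential history is in Spec iff replaying it reproduces every recorded response. */
    static boolean inSpec(List<AddCall> history) {
        Set<Integer> state = new HashSet<>();
        for (AddCall call : history) {
            if (state.add(call.value) != call.response) return false;
        }
        return true;
    }

    /** Returns base followed by the given calls, leaving base unchanged. */
    static List<AddCall> append(List<AddCall> base, AddCall... calls) {
        List<AddCall> history = new ArrayList<>(base);
        history.addAll(Arrays.asList(calls));
        return history;
    }

    /** True iff (base, s1, s2) satisfies the membership conditions of the SNC definition. */
    static boolean isSncWitness(List<AddCall> base, AddCall s1, AddCall s2) {
        return inSpec(base)
            && inSpec(append(base, s1))
            && inSpec(append(base, s2))
            && !inSpec(append(base, s1, s2))
            && !inSpec(append(base, s2, s1));
    }
}
```

With an empty base, two invocations of Add(5) that both return true form a witness, because whichever runs second would have to return false. Add(5) : true paired with Add(6) : true is not a witness, since the two commute. Note that the checker only tests one candidate triple; failing to find a witness this way does not by itself show that an invocation is not SNC.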

Common Examples of SNC Invocations
- Set:
    Add(v) : true       // returns true iff v was not in the set
    Remove(v) : true    // returns true iff v was in the set
- LIFO Stack: Pop() : Nonempty Value
- FIFO Queue: Dequeue() : Nonempty Value
- CAS Data Type: CAS(expected,newval) : true
- Work Stealing:
    Take() : Nonempty Task
    Steal() : Nonempty Task
- Counter: FetchAndAdd(v) : value

Avoiding SNC: Limited Concurrency
E.g., single-consumer FIFO queue:
- Successful multi-consumer dequeue is SNC
- Successful single-consumer dequeue is not SNC

Avoiding SNC: Limited API
E.g., Set Add without return value:
- Set Add : true is SNC
- Set Add : void is not SNC
E.g., Counter Add without returning value:
- FetchAndAdd : integer is SNC
- AtomicAdd : void is not SNC
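
The counter case can be illustrated with a small replay check, sketched below in Java. The names `CounterSncDemo`, `Call`, `inSpec`, and `isSncWitness` are hypothetical; a `null` response models the void AtomicAdd, and a non-null response models the old value returned by FetchAndAdd. The check only shows that a particular pair of invocations is or is not a witness; it is not a proof that AtomicAdd has no witness at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Compares FetchAndAdd (returns the old value) with a void AtomicAdd (illustrative names). */
final class CounterSncDemo {
    /** One complete invocation: add delta; response is the old value, or null for void. */
    static final class Call {
        final long delta;
        final Long response;
        Call(long delta, Long response) {
            this.delta = delta;
            this.response = response;
        }
    }

    /** Replays the history on a counter starting at 0 and compares recorded responses. */
    static boolean inSpec(List<Call> history) {
        long value = 0;
        for (Call call : history) {
            if (call.response != null && call.response != value) return false;
            value += call.delta;
        }
        return true;
    }

    /** Returns base followed by the given calls, leaving base unchanged. */
    static List<Call> append(List<Call> base, Call... calls) {
        List<Call> history = new ArrayList<>(base);
        history.addAll(Arrays.asList(calls));
        return history;
    }

    /** True iff (base, s1, s2) satisfies the membership conditions of the SNC definition. */
    static boolean isSncWitness(List<Call> base, Call s1, Call s2) {
        return inSpec(base)
            && inSpec(append(base, s1))
            && inSpec(append(base, s2))
            && !inSpec(append(base, s1, s2))
            && !inSpec(append(base, s2, s1));
    }
}
```

Two FetchAndAdd(1) invocations that both return 0 form a witness, because each one's return value reveals whether the other ran first. Two void AtomicAdd(1) invocations do not, since every ordering is legal when nothing is returned.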

Avoiding SNC: Idempotent Types
E.g., idempotent work stealing:
- Conventional (non-idempotent) Take is SNC (SNC with Steal)
- Idempotent Take is not SNC

Implications
- Algorithm design: guidance on when avoiding RAW/AWAR is futile
- Hardware design: added motivation to lower the overheads of RAW/AWAR
- API design: sometimes return values in APIs dictate RAW/AWAR
- Specifications and correctness conditions: motivation to examine the requirements for linearizability
- Formal verification and algorithm synthesis: avoid useless work on non-RAW/AWAR algorithms when RAW/AWAR is required
THANK YOU