Concurrency and Parallelism
G. Castagna (CNRS), Cours de Programmation Avancée
Outline

- Concurrency
- Preemptive multi-threading
- Locks, Conditional Variables, Monitors
- Doing without mutual exclusion
- Cooperative multi-threading
- Channeled communication
- Software Transactional Memory
Concurrent vs parallel

Concurrency: do many unrelated things at once. The goals are expressiveness, responsiveness, and multitasking.

Parallelism: get a faster answer with multiple CPUs.

Here we will focus on the concurrency part (at least for this year).
Threads

Threads are sequential computations that share memory. Threads of control (also called lightweight processes) execute concurrently in the same memory space. Threads communicate by in-place modification of shared data structures, or by sending and receiving data on communication channels.

Two kinds of threads:
1. Native threads (a.k.a. OS threads). They are directly handled by the OS. Compatible with multiprocessors and low-level processor capabilities; better handling of input/output; compatible with native code.
2. Green threads (a.k.a. light threads or user-space threads). They are handled by the virtual machine. More lightweight: context switches are much faster, and many more threads can coexist. They are portable but must be executed in the VM, and input/output must be asynchronous.
Which threads for whom

Green threads, to be used if you:
- don't want to wait for user input or blocking operations;
- need a lot of threads and need to switch from one to another rapidly;
- don't care about using multiple CPUs, since "the machine spends most of its time waiting on the user anyway".
Typical usage: a web server.

Native threads, to be used if you:
- don't want to wait for long-running computations;
- need long-running computations to advance at the same time or, better, to run in parallel on multiple processors and actually finish faster.
Typical usage: heavy computations.
Haskell mixed solution

Native threads, 1:1: there is a one-to-one correspondence between application-level threads and kernel threads.

Green threads, N:1: the program's threads are managed in user space. The kernel is not aware of them, so all application-level threads are mapped to a single kernel thread.

Haskell and Erlang solution — hybrid threads, N:M (with N ≥ M): an intermediate solution: spawn a whole bunch of lightweight green threads, but let the runtime schedule these threads onto a smaller number of native threads.
- Can exploit multi-core, multi-processor architectures.
- Avoids blocking all the threads on a blocking call.
- Hard to implement, in particular the scheduling: when using blocking system calls you somehow need to notify the kernel to block only one green thread, not the whole kernel thread.
Multi-threading

Two kinds of multi-threading:
1. Preemptive threading: a scheduler handles thread execution. Each thread is given a maximum time quantum and is interrupted either because it finished its time slice or because it requested a slow operation (e.g., I/O, a page-faulting memory access, ...).
2. Cooperative threading: each thread keeps control until either it explicitly hands it over to another thread or it executes an asynchronous operation (e.g., I/O).

Possible combinations:
1. Green threads are mostly preemptive, but several implementations of cooperative green threads are available (e.g., the lwt library in OCaml and the Coro module in Perl).
2. OS threads are nearly always preemptive, since on a cooperative OS all applications would have to be programmed fairly and pass the hand to other applications from time to time.
Shared memory model / process synchronization

Threads/processes are defined to achieve a common goal together, therefore they do not live in isolation: to ensure that the goal is achieved, threads/processes must synchronize.

The purpose of process synchronization is to enforce constraints such as:
- Serialization: this part of thread A must happen before this part of thread B.
- Mutual exclusion: no two threads can execute this concurrently.

Several software tools are available to build synchronization policies for shared-memory accesses: semaphores; locks / mutexes / spinlocks; condition variables; barriers; monitors.
Concurrent events

Two events are concurrent if we cannot tell by looking at the program which will happen first.

    Thread A: (a1) x = 5   (a2) print x
    Thread B: (b1) x = 7

Possible outcomes:
- output 5 and final value x = 7 (e.g., a1 a2 b1)
- output 7 and final value x = 7 (e.g., a1 b1 a2)
- output 5 and final value x = 5 (e.g., b1 a1 a2)

    Thread A: x = x + 1
    Thread B: x = x + 1

If initially x = 0 then both x = 1 and x = 2 are possible outcomes.

Reason: the increment may not be atomic — first x is read into a temporary, then the incremented value is written back. For instance, in some assembly languages, LDA $44; ADC #$01; STA $44 instead of INC $44.
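The non-atomicity of the increment can be observed directly in CPython: the single source line x = x + 1 compiles to separate load, add, and store bytecode instructions, and a thread switch can happen between any two of them (a small illustration of ours; exact opcode names vary between CPython versions, but the global load and store are distinct instructions in all of them):

```python
import dis

x = 0

def incr():
    global x
    x = x + 1          # one source line, several bytecode instructions

# List the bytecode of incr: the read (LOAD_GLOBAL) and the write
# (STORE_GLOBAL) are separate steps, so another thread can run in between.
ops = [ins.opname for ins in dis.get_instructions(incr)]
print(ops)
```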
Model of execution

We must define the model of execution. On some machines x++ is atomic, but let us not count on it: we do not want to write specialized code for each different piece of hardware. Assume (rather pessimistically) that:
- the result of concurrent writes is undefined;
- the result of a concurrent read-write is undefined;
- concurrent reads are OK;
- threads can be interrupted at any time (preemptive multi-threading).

To solve synchronization problems, let us first consider a very simple and universal software synchronization tool: semaphores.
Semaphores

Semaphores are:
- Simple. The concept is only a little bit harder than that of a variable.
- Versatile. You can solve pretty much all synchronization problems with semaphores.
- Error-prone. They are so low-level that they tend to be error-prone.

We start with them because they are good for learning to think about synchronization. However, they are not the best choice for common use cases (you had better use specialized tools for specific problems, such as mutexes, condition variables, monitors, etc.).

Definition (Dijkstra 1965). A semaphore is an integer s ≥ 0 with two operations P and S:
- P(s): if s > 0 then s-- else the caller is suspended
- S(s): if there is a suspended process, then resume it, else s++
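Dijkstra's definition can be sketched directly with Python's lower-level primitives (the class name and harness are ours, not the course's; a real implementation needs care about fairness of wake-ups):

```python
import threading
import time

class DijkstraSemaphore:
    """A sketch of Dijkstra's semaphore: P suspends while the counter
    is zero, S resumes a waiter or increments the counter."""
    def __init__(self, n=0):
        assert n >= 0
        self._count = n
        self._cond = threading.Condition()

    def P(self):                       # wait
        with self._cond:
            while self._count == 0:
                self._cond.wait()      # caller is suspended
            self._count -= 1

    def S(self):                       # signal
        with self._cond:
            self._count += 1
            self._cond.notify()        # resume one suspended caller, if any

# Demo: a thread blocked on P resumes only after S is called.
s = DijkstraSemaphore(0)
order = []
t = threading.Thread(target=lambda: (s.P(), order.append("resumed")))
t.start()
time.sleep(0.2)                        # let the thread block on P
order.append("signalled")
s.S()
t.join()
print(order)                           # ['signalled', 'resumed']
```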
Semaphore objects

In Python, a semaphore is a class encapsulating an integer:
- Semaphore(n) initializes the counter to n (the default is 1).
- acquire(): if the internal counter is larger than zero on entry, decrement it by one and return immediately. If it is zero on entry, block, waiting until some other thread has called release(). The order in which blocked threads are awakened is not specified.
- release(): if another thread is waiting for the counter to become larger than zero again, wake up that thread; otherwise increment the internal counter.

Variations found in other languages:
- wait() / signal() (I will use this pair, because of the signaling pattern);
- a negative counter, counting the processes waiting at the semaphore.

Notice: there is no get method to return the value of the counter. Why?
Semaphores to enforce serialization

Problem:

    Thread A: statement a1
    Thread B: statement b1

How do we enforce the constraint «a1 before b1»?

The signaling pattern:

    sem = Semaphore(0)

    Thread A:             Thread B:
    statement a1          sem.wait()
    sem.signal()          statement b1

You can think of Semaphore(0) as a locked lock.
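The signaling pattern maps directly onto Python's threading.Semaphore, with acquire/release playing the roles of wait/signal (a minimal runnable sketch; the thread names are ours):

```python
import threading

sem = threading.Semaphore(0)   # a "locked lock": the first acquire blocks
log = []

def thread_b():
    sem.acquire()              # wait until A has signalled
    log.append("b1")

def thread_a():
    log.append("a1")
    sem.release()              # signal: let B proceed

b = threading.Thread(target=thread_b)
a = threading.Thread(target=thread_a)
b.start()                      # B starts first, but must wait for A
a.start()
a.join(); b.join()
print(log)                     # always ['a1', 'b1'], whatever the scheduling
```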
Semaphores to enforce mutual exclusion

Problem:

    Thread A: x = x + 1
    Thread B: x = x + 1

Concurrent execution is non-deterministic. How can we avoid concurrency?

Solution:

    mutex = Semaphore(1)

    Thread A:             Thread B:
    mutex.wait()          mutex.wait()
    x = x + 1             x = x + 1
    mutex.signal()        mutex.signal()

Code between wait and signal is atomic.
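The same pattern, made runnable with a threading.Semaphore used as a mutex (the iteration count is our own choice; without the wait/signal pair the final value could be anything up to 2·N):

```python
import threading

mutex = threading.Semaphore(1)
x = 0
N = 50_000

def worker():
    global x
    for _ in range(N):
        mutex.acquire()        # wait
        x = x + 1              # critical section: read-modify-write
        mutex.release()        # signal

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(x)                       # deterministically 2 * N
```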
More synchronization problems: readers and writers

Problem: threads are either writers or readers.
- Only one writer can write at a time.
- A reader cannot read concurrently with a writer.
- Any number of readers can read concurrently.

Solution:

    readers = 0
    mutex = Semaphore(1)
    roomempty = Semaphore(1)

    Writer threads:
    roomempty.wait()
    # critical section for writers
    roomempty.signal()

    Reader threads:
    mutex.wait()
    readers += 1
    if readers == 1:
        roomempty.wait()      # first reader in locks the room
    mutex.signal()
    # critical section for readers
    mutex.wait()
    readers -= 1
    if readers == 0:
        roomempty.signal()    # last reader out unlocks it
    mutex.signal()
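The same solution can be exercised with threading.Semaphore (the shared "database" and the thread counts below are our own illustration):

```python
import threading

readers = 0
mutex = threading.Semaphore(1)      # protects the readers counter
roomempty = threading.Semaphore(1)  # 1 iff nobody is inside
data = []                           # the shared structure being read/written

def writer(v):
    roomempty.acquire()
    data.append(v)                  # critical section for writers
    roomempty.release()

def reader(seen):
    global readers
    mutex.acquire()
    readers += 1
    if readers == 1:
        roomempty.acquire()         # first reader in locks the room
    mutex.release()
    seen.append(len(data))          # critical section for readers
    mutex.acquire()
    readers -= 1
    if readers == 0:
        roomempty.release()         # last reader out unlocks it
    mutex.release()

seen = []
ts = [threading.Thread(target=writer, args=(v,)) for v in range(5)]
ts += [threading.Thread(target=reader, args=(seen,)) for _ in range(5)]
for t in ts: t.start()
for t in ts: t.join()
print(sorted(data), len(seen))      # all writes landed, all reads completed
```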
Readers and writers: common patterns

Let us look for some common patterns.

The scoreboard pattern (readers):
- check in;
- update the state on the scoreboard (the number of readers);
- perform some conditional behavior;
- check out.

The turnstile pattern (writers):
- threads go through the turnstile serially;
- if one blocks, all wait;
- when it passes, it unblocks;
- other threads (i.e., the readers) can lock the turnstile.
Readers and writers: the lightswitch pattern

Readers, while checking in and out, implement the lightswitch pattern:
- the first person that enters the room switches the light on (acquires the lock);
- the last person that exits the room switches the light off (releases the lock).

Implementation:

    class Lightswitch:
        def __init__(self):
            self.counter = 0
            self.mutex = Semaphore(1)

        def lock(self, semaphore):
            self.mutex.wait()
            self.counter += 1
            if self.counter == 1:
                semaphore.wait()
            self.mutex.signal()

        def unlock(self, semaphore):
            self.mutex.wait()
            self.counter -= 1
            if self.counter == 0:
                semaphore.signal()
            self.mutex.signal()
Before:

    readers = 0
    mutex = Semaphore(1)
    roomempty = Semaphore(1)

    Writer threads:
    roomempty.wait()
    # critical section for writers
    roomempty.signal()

    Reader threads:
    mutex.wait()
    readers += 1
    if readers == 1:
        roomempty.wait()      # first reader in locks the room
    mutex.signal()
    # critical section for readers
    mutex.wait()
    readers -= 1
    if readers == 0:
        roomempty.signal()    # last reader out unlocks it
    mutex.signal()

After:

    readlightswitch = Lightswitch()
    roomempty = Semaphore(1)

    Writer threads:
    roomempty.wait()
    # critical section for writers
    roomempty.signal()

    Reader threads:
    readlightswitch.lock(roomempty)
    # critical section for readers
    readlightswitch.unlock(roomempty)
Programming golden rules

When programming becomes too complex:
1. abstract common patterns;
2. split the problem into more elementary problems.

The previous case was an example of abstraction. Next we are going to see an example of modularization, where we combine our elementary patterns to solve more complex problems.
The unisex bathroom problem

A woman at Xerox was working in a cubicle in the basement, and the nearest women's bathroom was two floors up. She proposed to the Uberboss that they convert the men's bathroom on her floor to a unisex bathroom. The Uberboss agreed, provided that the following synchronization constraints could be maintained:
1. there cannot be men and women in the bathroom at the same time;
2. there should never be more than three employees squandering company time in the bathroom.
You may assume that the bathroom is equipped with all the semaphores you need.
The unisex bathroom problem: solution hint

    empty = Semaphore(1)
    maleswitch = Lightswitch()
    femaleswitch = Lightswitch()
    malemultiplex = Semaphore(3)
    femalemultiplex = Semaphore(3)

- empty is 1 if the room is empty and 0 otherwise.
- maleswitch allows men to bar women from the room: when the first male enters, the lightswitch locks empty, barring women; when the last male exits, it unlocks empty, allowing women to enter. Women do likewise using femaleswitch.
- malemultiplex and femalemultiplex ensure that there are no more than three men and three women in the system at a time (semaphores initialized to 3, used as multiplexes).
The unisex bathroom problem: a solution

    Female threads:                  Male threads:
    femaleswitch.lock(empty)         maleswitch.lock(empty)
    femalemultiplex.wait()           malemultiplex.wait()
    # bathroom code here             # bathroom code here
    femalemultiplex.signal()         malemultiplex.signal()
    femaleswitch.unlock(empty)       maleswitch.unlock(empty)

Any problem with this solution? It allows starvation: a long line of women can arrive and enter while there is a man waiting, and vice versa.

Find a better solution. Hint: use a turnstile to control access to the lightswitches: when a man arrives and the bathroom is already occupied by women, block the turnstile so that more women cannot check the light and enter.
The no-starve unisex bathroom problem

    turnstile = Semaphore(1)
    empty = Semaphore(1)
    maleswitch = Lightswitch()
    femaleswitch = Lightswitch()
    malemultiplex = Semaphore(3)
    femalemultiplex = Semaphore(3)

    Female threads:                  Male threads:
    turnstile.wait()                 turnstile.wait()
    femaleswitch.lock(empty)         maleswitch.lock(empty)
    turnstile.signal()               turnstile.signal()
    femalemultiplex.wait()           malemultiplex.wait()
    # bathroom code here             # bathroom code here
    femalemultiplex.signal()         malemultiplex.signal()
    femaleswitch.unlock(empty)       maleswitch.unlock(empty)

Actually, we could have used the same multiplex for both females and males.
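The no-starve solution can be checked empirically: the sketch below ports the Lightswitch class to threading.Semaphore and adds an occupancy monitor of our own that records a violation if the two constraints are ever broken (the monitor and thread mix are assumptions for testing, not part of the course's solution):

```python
import threading
import time

class Lightswitch:
    def __init__(self):
        self.counter = 0
        self.mutex = threading.Semaphore(1)
    def lock(self, semaphore):
        self.mutex.acquire()
        self.counter += 1
        if self.counter == 1:
            semaphore.acquire()     # first one in locks the room
        self.mutex.release()
    def unlock(self, semaphore):
        self.mutex.acquire()
        self.counter -= 1
        if self.counter == 0:
            semaphore.release()     # last one out unlocks it
        self.mutex.release()

turnstile = threading.Semaphore(1)
empty = threading.Semaphore(1)
switches = {"M": Lightswitch(), "F": Lightswitch()}
multiplex = {"M": threading.Semaphore(3), "F": threading.Semaphore(3)}

occupants = []                      # occupancy monitor (ours, for checking)
check = threading.Lock()
violations = []

def use_bathroom(gender):
    turnstile.acquire()
    switches[gender].lock(empty)
    turnstile.release()
    multiplex[gender].acquire()
    with check:                     # enter: record and verify the invariant
        occupants.append(gender)
        if len(set(occupants)) > 1 or occupants.count(gender) > 3:
            violations.append(list(occupants))
    time.sleep(0.001)               # linger so occupancies actually overlap
    with check:                     # leave
        occupants.remove(gender)
    multiplex[gender].release()
    switches[gender].unlock(empty)

ts = [threading.Thread(target=use_bathroom, args=(g,))
      for g in "MFMFMFMF" * 5]
for t in ts: t.start()
for t in ts: t.join()
print(violations)                   # []: never mixed, never more than three
```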
Summary so far

- Solutions are composed of patterns.
- Patterns can be encapsulated as objects or modules.
- The unisex bathroom problem is a good example of the use of both abstraction and modularity (lightswitches and turnstiles).

Unfortunately, patterns often interact and interfere, so it is hard to be confident in solutions (formal verification and testing are not production-ready yet). This is especially true for semaphores, which are very low-level: they can be used to implement more complex synchronization patterns, which makes interference much more likely.

Before discussing more general problems of shared-memory synchronization, let us introduce some higher-level and more specialized tools that, being more specific, make interference less likely:
- locks
- condition variables
- monitors
Locks

Locks are like those on a room door:
- Lock acquisition: a person enters the room and locks the door. Nobody else can enter.
- Lock release: the person in the room exits, unlocking the door.
Persons are threads; rooms are critical regions.

A person that finds a door locked can either wait there or come back later (when somebody lets them know that the room is available). Similarly, there are two possibilities for a thread that failed to acquire a lock:
1. It keeps trying. This kind of lock is a spinlock. Meaningful only on multiprocessors, spinlocks are common in high-performance computing (where most of the time each thread is scheduled on its own processor anyway).
2. It is suspended until somebody signals that the lock is available. This is the only meaningful kind of lock on a uniprocessor. It is also called a mutex (but often mutex is used as a synonym for lock).
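The "keeps trying" behaviour can be sketched on top of a non-blocking acquire (a toy of our own: a real spinlock uses atomic hardware instructions, and busy-waiting only pays off when the waiter runs on its own processor):

```python
import threading

class SpinLock:
    """Toy spinlock: acquire busy-waits on a non-blocking trylock."""
    def __init__(self):
        self._flag = threading.Lock()
    def acquire(self):
        while not self._flag.acquire(blocking=False):
            pass                    # keep trying (busy-wait)
    def release(self):
        self._flag.release()

lock = SpinLock()
x = 0

def worker():
    global x
    for _ in range(1000):
        lock.acquire()
        x += 1                      # critical section
        lock.release()

ts = [threading.Thread(target=worker) for _ in range(2)]
for t in ts: t.start()
for t in ts: t.join()
print(x)                            # 2000: mutual exclusion held
```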
Difference between a mutex and a binary semaphore

A mutex is different from a binary semaphore (i.e., a semaphore initialized to 1), since it combines the notion of exclusivity of manipulation (as for semaphores) with extra features such as exclusivity of possession (only the process which has taken a mutex can free it) or priority-inversion protection. The differences between mutexes and semaphores are operating-system/language dependent, though mutexes are implemented by specialized, faster routines.

Example. What follows can be done with semaphores but not with a mutex, since B unlocks a lock taken by A (cf. the signaling pattern):

    Thread A:                 Thread B:
    some stuff                some other stuff
    wait(s)                   signal(s)
    (* A can continue *)
Mutexes

Since, for what concerns mutual exclusion, semaphores are a simplified version of mutexes, mutexes have operations very similar to the former:
- an init or create operation;
- a wait or lock operation that tries to acquire the lock and suspends the thread if it is not available;
- a signal or unlock operation that releases the lock and possibly wakes a thread waiting for the lock;
- sometimes a trylock, that is, a non-blocking locking operation that returns an error or false if the lock is not available.

A mutex is reentrant if the same thread can acquire the lock multiple times. However, the lock must be released the same number of times, or else other threads will be unable to acquire it.

Nota bene: a reentrant mutex has some similarities to a counting semaphore: the number of lock acquisitions is the counter, but only one thread can successfully perform multiple locks (exclusivity of possession).
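Python's threading.RLock is such a reentrant mutex; with a plain threading.Lock the nested acquisition below would deadlock (the function names are ours):

```python
import threading

rlock = threading.RLock()
trace = []

def inner():
    with rlock:            # second acquisition by the same thread: allowed
        trace.append("inner")

def outer():
    with rlock:            # first acquisition
        trace.append("outer")
        inner()            # re-enters; a plain Lock would deadlock here
    # fully released only after a matching number of releases

outer()
print(trace)               # ['outer', 'inner']
```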
Implementation of locks

Some examples of lock implementations:

- Using special hardware instructions like test-and-set or compare-and-swap.

- Peterson's algorithm (a spinlock, deadlock-free), in Python:

      flag = [0, 0]
      turn = 0    # initially the priority is for thread 0

      Thread 0:                     Thread 1:
      flag[0] = 1                   flag[1] = 1
      turn = 1                      turn = 0
      while flag[1] and turn:       while flag[0] and not turn:
          pass                          pass
      # critical section            # critical section
      flag[0] = 0                   flag[1] = 0

  Here flag[i] == 1 means thread i wants to enter, and turn == i means it is thread i's turn to enter, if it wishes.

- Lamport's bakery algorithm (deadlock- and starvation-free): every thread modifies only its own variables and accesses other threads' variables only by reading.
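Peterson's algorithm can be exercised under CPython, whose global interpreter lock makes memory accesses sequentially consistent; on hardware with a weaker memory model the algorithm would additionally need memory fences. The harness below is our own (shared cells are kept in lists so both threads write the same objects):

```python
import sys
import threading

sys.setswitchinterval(0.0005)   # switch often so the busy-waits stay short

flag = [0, 0]
turn = [0]                      # shared cell for the turn variable
count = [0]                     # counter protected only by the algorithm
N = 500

def peterson(i):
    other = 1 - i
    for _ in range(N):
        flag[i] = 1             # announce intent to enter
        turn[0] = other         # give priority to the other thread
        while flag[other] and turn[0] == other:
            pass                # spin until it is safe to enter
        count[0] += 1           # critical section: plain, non-atomic increment
        flag[i] = 0             # exit protocol

ts = [threading.Thread(target=peterson, args=(i,)) for i in (0, 1)]
for t in ts: t.start()
for t in ts: t.join()
print(count[0])                 # 2 * N: mutual exclusion held
```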
64 Condition Variables Locks provides a passive form of synchronization: they allow waiting for shared data to be free, but do not allow waiting for the data to have a particular state. Condition variables are the solution to this problem. Definition A condition variable is an atomic waiting and signaling mechanism which allows a process or thread to atomically stop execution and release a lock until a signal is received. Rationale It allows a thread to sleep inside a critical region without risk of deadlock. Three main operations: wait() releases the lock, gives up the CPU until signaled and then re-acquire the lock. signal() wakes up a thread waiting on the condition variable, if any. broadcast() wakes up all threads waiting on the condition. 511/571 G. Castagna (CNRS) Cours de Programmation Avancée 511 / 571
65 Condition Variables
Note: the term "condition variable" is misleading: it does not rely on a variable but rather on signaling at the system level. The term comes from the fact that condition variables are most often used to notify changes in the state of shared variables, as in:
Notify a writer thread that a reader thread has filled its data set.
Notify consumer processes that a producer thread has updated a shared data set.
Semaphores and condition variables
Semaphores and condition variables both offer wait and signal operations, and their purposes are somewhat similar, but they differ:
With a semaphore, the signal operation increments the value of the semaphore even if there is no blocked process. The signal is remembered.
If there are no processes blocked on the condition variable, the signal function does nothing. The signal is not remembered.
With a semaphore you must be careful about deadlocks.
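The "remembered vs not remembered" distinction can be observed directly in Python (a sketch; the timeout values are ours):

```python
import threading

# A semaphore remembers a signal sent when nobody is waiting...
sem = threading.Semaphore(0)
sem.release()                        # signal with no blocked thread
got_it = sem.acquire(timeout=1.0)    # ...so this acquire succeeds at once

# ...whereas a condition variable does not.
cond = threading.Condition()
with cond:
    cond.notify()                    # signal with no waiting thread: lost
with cond:
    woke = cond.wait(timeout=0.2)    # times out: the earlier notify is gone

print(got_it, woke)   # True False
```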
67 Monitors
Motivation: Semaphores are incredibly versatile. The problem with them is that they are dual-purpose: they can be used both for mutual exclusion and for scheduling constraints. This makes the code hard to read, and hard to get right.
In the previous slides we introduced a separate construct for each purpose: mutexes and condition variables. Monitors group them together (keeping each distinct from the other) to protect some shared data:
Definition (Monitor): a lock and zero or more condition variables for managing concurrent access to shared data by defining some given operations.
69 Monitors
Example: a synchronized queue, in pseudo-code:

    monitor SynchQueue {
      lock = Lock.create
      condition = Condition.create

      addToQueue(item) {
        lock.acquire();
        put item on queue;
        condition.signal();
        lock.release();
      }

      removeFromQueue() {
        lock.acquire();
        while nothing on queue do
          condition.wait(lock)   // release lock; go to sleep; re-acquire lock
        done
        remove item from queue;
        lock.release();
        return item
      }
    }
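The pseudo-code above maps directly onto Python's threading primitives (a sketch; the class and method names mirror the slide but the Python rendering is ours):

```python
import threading

class SynchQueue:
    """Monitor: a lock plus one condition variable protecting a list."""
    def __init__(self):
        self.lock = threading.Lock()
        self.non_empty = threading.Condition(self.lock)
        self.items = []

    def add_to_queue(self, item):
        with self.lock:
            self.items.append(item)
            self.non_empty.notify()      # wake one waiting consumer

    def remove_from_queue(self):
        with self.lock:
            while not self.items:        # Mesa-style: re-check on wakeup
                self.non_empty.wait()    # releases lock; re-acquires it
            return self.items.pop(0)

q = SynchQueue()
out = []
t = threading.Thread(target=lambda: out.append(q.remove_from_queue()))
t.start()
q.add_to_queue(42)
t.join()
print(out)   # [42]
```

The `while` (rather than `if`) around `wait` matters for Python, whose condition variables are Mesa-style, as the next slide explains.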
70 Different kinds of Monitors
One needs to be careful about the precise definition of signal and wait:
Mesa-style (Nachos, most real operating systems):
The signaler keeps the lock and the processor.
The waiter is simply put on the ready queue, with no special priority (in other words, the waiter may have to wait for the lock).
Hoare-style (most textbooks):
The signaler gives up the lock and the CPU to the waiter; the waiter runs immediately.
The waiter gives the lock and the processor back to the signaler when it exits the critical section or waits again.
The code above for the synchronized queue happens to work with either style, but for many programs it matters which one you are using:
With Hoare-style monitors, the "while" in removeFromQueue can be changed to an "if", because the waiter is woken up only if an item is on the list.
With Mesa-style monitors, the waiter may need to wait again after being woken up, because some other thread may have acquired the lock and removed the item before the original waiting thread got to the front of the ready queue.
72 Preemptive threads in OCaml
Four main modules:
Module Thread: lightweight threads (abstract type Thread.t)
Module Mutex: locks for mutual exclusion (abstract type Mutex.t)
Module Condition: condition variables to synchronize between threads (abstract type Condition.t)
Module Event: first-class synchronous channels (abstract types 'a Event.channel and 'a Event.event)
Two implementations:
System threads. Uses OS-provided threads: POSIX threads for Unix, Win32 threads for Windows. Supports both bytecode and native code.
Green threads. Time-sharing and context switching at the level of the bytecode interpreter. Works on OSes without multi-threading, but cannot be used with native-code programs.
Nota Bene: Threads always run on a single processor (because of OCaml's GC). There is no advantage from multi-processors (apart from explicit execution of C code or system calls): threads are just for structuring purposes.
75 Module Thread
create : ('a -> 'b) -> 'a -> Thread.t
  Thread.create f e creates a new thread of control, in which the function application f e is executed concurrently with the other threads of the program.
kill : Thread.t -> unit
  kill p prematurely terminates the thread p.
join : Thread.t -> unit
  join p suspends the execution of the calling thread until the thread p terminates.
delay : float -> unit
  delay d suspends the execution of the calling thread for d seconds.

    # let f () = for i = 0 to 10 do Printf.printf "(%d)" i done;;
    val f : unit -> unit = <fun>
    # Printf.printf "begin "; Thread.join (Thread.create f ()); Printf.printf " end";;
    begin (0)(1)(2)(3)(4)(5)(6)(7)(8)(9)(10) end- : unit = ()
76 Module Mutex
create : unit -> Mutex.t      Return a new mutex.
lock : Mutex.t -> unit        Lock the given mutex.
try_lock : Mutex.t -> bool    Non-blocking lock.
unlock : Mutex.t -> unit      Unlock the given mutex.
Dining philosophers
Five philosophers sit at a table, doing one of two things: eating or meditating. While eating they are not thinking, and while thinking they are not eating. The five philosophers sit at a circular table with a large bowl of rice in the center. A chopstick is placed between each pair of adjacent philosophers, and a philosopher needs two chopsticks to eat. Each philosopher can only use the chopsticks on his immediate left and immediate right.

    # let b =
        let b0 = Array.make 5 (Mutex.create ()) in
        for i = 1 to 4 do b0.(i) <- Mutex.create () done;
        b0;;
    val b : Mutex.t array = [|<abstr>; <abstr>; <abstr>; <abstr>; <abstr>|]
78 Dining philosophers

    # let meditation = Thread.delay
      and eating = Thread.delay;;

      let philosopher i =
        let ii = (i + 1) mod 5 in
        while true do
          meditation 3.;
          Mutex.lock b.(i);
          Printf.printf "Philo (%d) takes his left-hand chopstick" i;
          Printf.printf " and meditates a little while more\n";
          meditation 0.2;
          Mutex.lock b.(ii);
          Printf.printf "Philo (%d) takes his right-hand chopstick\n" i;
          eating 0.5;
          Mutex.unlock b.(i);
          Printf.printf "Philo (%d) puts down his left-hand chopstick" i;
          Printf.printf " and goes back to meditating\n";
          meditation 0.15;
          Mutex.unlock b.(ii);
          Printf.printf "Philo (%d) puts down his right-hand chopstick\n" i
        done;;
79 Dining philosophers
We can test this little program by executing:

    for i = 0 to 4 do ignore (Thread.create philosopher i) done;
    while true do Thread.delay 5. done;;

Problems:
Deadlock: all philosophers can take their left-hand chopstick at the same time, so the program is stuck.
Starvation: to avoid deadlock, the philosophers can put down a chopstick if they do not manage to take the second one. This is highly courteous, but it still allows two philosophers to gang up against a third to stop him from eating.
Exercise: think about solutions to avoid deadlock and starvation.
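One classical answer to the deadlock part of the exercise (a sketch in Python rather than OCaml; the round count is ours): impose a global order on the chopsticks and have every philosopher pick up the lower-numbered one first. A cycle of waiting then becomes impossible, because the philosopher between chopsticks 0 and 4 grabs 0 first instead of 4.

```python
import threading

N = 5
chopsticks = [threading.Lock() for _ in range(N)]
meals = [0] * N   # how many times each philosopher ate

def philosopher(i, rounds=20):
    left, right = i, (i + 1) % N
    first, second = min(left, right), max(left, right)  # global order
    for _ in range(rounds):
        with chopsticks[first]:          # always lower-numbered first
            with chopsticks[second]:
                meals[i] += 1            # eating

threads = [threading.Thread(target=philosopher, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()
print(meals)   # every philosopher finished all rounds
```

Lock ordering prevents deadlock but not starvation; avoiding starvation needs an extra fairness mechanism (e.g., a waiter/arbiter or tickets).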
80 Module Condition
create : unit -> Condition.t
  Returns a new condition variable.
wait : Condition.t -> Mutex.t -> unit
  wait c m atomically unlocks the mutex m and suspends the calling process on the condition variable c. The process will restart after the condition variable c has been signaled. The mutex m is locked again before wait returns.
signal : Condition.t -> unit
  signal c restarts one of the processes waiting on the condition variable c.
broadcast : Condition.t -> unit
  broadcast c restarts all processes waiting on the condition variable c.
Typical usage pattern:

    Mutex.lock m;
    while (* some predicate P over D is not satisfied *) do
      Condition.wait c m   (* we put the wait in a while loop: why? *)
    done;
    (* modify D *)
    if (* the predicate P over D is now satisfied *) then Condition.signal c;
    Mutex.unlock m
83 Example: a Monitor

    module SynchQueue = struct
      type 'a t = {
        queue : 'a Queue.t;
        lock : Mutex.t;
        non_empty : Condition.t
      }

      let create () = {
        queue = Queue.create ();
        lock = Mutex.create ();
        non_empty = Condition.create ()
      }

      let add e q =
        Mutex.lock q.lock;
        if Queue.length q.queue = 0 then Condition.broadcast q.non_empty;
        Queue.add e q.queue;
        Mutex.unlock q.lock

      let remove q =
        Mutex.lock q.lock;
        while Queue.length q.queue = 0 do
          Condition.wait q.non_empty q.lock
        done;
        let x = Queue.take q.queue in
        Mutex.unlock q.lock;
        x
    end
84 Monitors
OCaml does not provide explicit constructs for monitors; they must be implemented using mutexes and condition variables. Other languages, for instance Java, provide monitors directly.
Monitors in Java:
In Java, a monitor is any object in which at least one method is declared synchronized.
When a thread is executing a synchronized method of some object, the other threads are blocked if they call any synchronized method of that object.

    class Account {
      float balance;
      synchronized void deposit(float amt) { balance += amt; }
      synchronized void withdraw(float amt) {
        if (balance < amt) throw new OutOfMoneyError();
        balance -= amt;
      }
    }
86 Outline
39 Concurrency
40 Preemptive multi-threading
41 Locks, Conditional Variables, Monitors
42 Doing without mutual exclusion
43 Cooperative multi-threading
44 Channeled communication
45 Software Transactional Memory
87 What's wrong with locking
Locking has many pitfalls for the inexperienced programmer:
Priority inversion: a lower-priority thread is preempted while holding a lock needed by higher-priority threads.
Convoying: a thread holding a lock is descheduled due to a time-slice interrupt or page fault, causing other threads requiring that lock to queue up. When it is rescheduled, it may take some time to drain the queue. The overhead of repeated context switches and the underutilization of scheduling quanta degrade overall performance.
Deadlock: threads that lock the same objects in different orders. Deadlock avoidance is difficult if many objects are accessed at the same time and they are not statically known.
Debugging: lock-related problems are difficult to debug since, being time-related, they are difficult to reproduce.
Fault-tolerance: if a thread (or process) is killed or fails while holding a lock, what happens? (cf. Thread.delete)
88 Programming with locks is not easy and requires difficult decisions:
Taking too few locks leads to race conditions.
Taking too many locks inhibits concurrency.
Locking at too coarse a level inhibits concurrency.
Taking locks in the wrong order leads to deadlock.
Error recovery is hard (e.g., how to handle the failure of threads holding locks?).
A major problem: composition
Lock-based programs do not compose. For example, consider a hash table with thread-safe insert and delete operations. Now suppose that we want to delete one item A from table t1 and insert it into table t2, but the intermediate state (in which neither table contains the item) must not be visible to other threads. Unless the implementer of the hash table anticipates this need, there is simply no way to satisfy this requirement.
89 Locks are non-compositional
Consider the previous (correct) Java bank Account class:

    class Account {
      float balance;
      synchronized void deposit(float amt) { balance += amt; }
      synchronized void withdraw(float amt) {
        if (balance < amt) throw new OutOfMoneyError();
        balance -= amt;
      }
    }

Now suppose we want to add the ability to transfer funds from one account to another.
90 Locks are non-compositional
Simply calling withdraw and deposit to implement transfer causes a race condition:

    class Account {
      float balance;
      ...
      void badTransfer(Account other, float amt) {
        other.withdraw(amt);
        // here checkBalances sees a bad total balance
        this.deposit(amt);
      }
    }

    class Bank {
      Account[] accounts;
      float global_balance;
      boolean checkBalances() {
        return (sum(accounts) == global_balance);
      }
    }
91 Locks are non-compositional
Synchronizing transfer can cause deadlock:

    class Account {
      float balance;
      synchronized void deposit(float amt) { balance += amt; }
      synchronized void withdraw(float amt) {
        if (balance < amt) throw new OutOfMoneyError();
        balance -= amt;
      }
      synchronized void badTrans(Account other, float amt) {
        // can deadlock with a parallel reverse transfer
        this.deposit(amt);
        other.withdraw(amt);
      }
    }
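The same lock-ordering idea used for the philosophers also repairs this particular deadlock (a Python sketch of the Java example; choosing `id()` as the ordering key is our assumption):

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amt):
    # Lock both accounts in a globally consistent order, so that two
    # opposite transfers running in parallel cannot deadlock.
    first, second = sorted((src, dst), key=id)
    with first.lock:
        with second.lock:
            src.balance -= amt
            dst.balance += amt

a, b = Account(100.0), Account(100.0)
t1 = threading.Thread(target=lambda: [transfer(a, b, 1.0) for _ in range(1000)])
t2 = threading.Thread(target=lambda: [transfer(b, a, 1.0) for _ in range(1000)])
t1.start(); t2.start(); t1.join(); t2.join()
print(a.balance + b.balance)   # 200.0: the total is conserved
```

Note that this fix is still non-compositional: it works because `transfer` knows about both locks; a caller composing `transfer` with other locked operations faces the same problem again, which is the motivation for the lock-free alternatives below.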
92 Concurrency without locks
We need to synchronize threads without resorting to locks.
1 Cooperative threading — disallow concurrent access. The threads themselves relinquish control once they are at a stopping point.
2 Channeled communication — disallow shared memory. The threads do not share memory; all data is exchanged by explicit communications that take place on channels.
3 Software transactional memory — handle conflicts when they happen. Each thread declares the blocks that must be performed atomically. If the execution of an atomic block causes any conflict, the modifications are rolled back and the block is re-executed.
97 Concurrency without locks
1 Cooperative threading: the threads themselves relinquish control once they are at a stopping point.
  Pros: the programmer manages the interleaving; no concurrent access happens.
  Cons: the burden is on the programmer: the system may not be responsive (e.g., Classic Mac OS 5.x to 9.x). Does not scale on multi-processors. Not always compositional.
2 Channeled communication: the threads do not share memory; all data is exchanged by explicit communications that take place on channels.
  Pros: compositional. Easily scales to multi-processor and distributed programming (if asynchronous).
  Cons: awkward when threads concurrently work on complex and large data structures.
3 Software transactional memory: if the execution of an atomic block causes any conflict, the modifications are rolled back and the block is re-executed.
  Pros: very compositional. A no-brainer for the programmer.
  Cons: very new, poorly mastered. Feasibility depends on conflict likelihood.
98 Concurrency without locks
Besides the previous solutions there is also a more drastic one (not as general as the previous ones, but it composes with them):
Lock-free programming
Threads access shared data without using synchronization primitives such as mutexes. The operations that access the data ensure the absence of conflicts.
Idea: instead of providing operations for the mutual exclusion of accesses, define the access operations so that they take concurrent accesses into account.
Pros: a no-brainer if the data structures available in your lock-free library fit your problem. It has precisely the granularity needed (if you work on a queue with locks, should you use a lock for the whole queue?).
Cons: requires specialized operations for each data structure. Not modular, since composition may require a different, more complex data structure. Works with simple data structures but is hard to generalize to complex operations. Hard to implement in the absence of hardware support (e.g., compare_and_swap).
99 Lock-free programming: an example
Non-blocking linked list: insertion.
[Figure: an ordered linked list between Head and Tail sentinels; a new element 30 is to be inserted, with the two compared links pos and next marked in red.]
The linked list is ordered. We want to insert a new element new at the right position:
1 Find the right position pos and set the field next of the new element to it.
2 Compare and swap: compare the two links pos and next. If they are the same, no conflicting operation has been performed, so atomically swap the successor of the second element with the pointer to the new element: comp_and_swap(pos, next, new).
3 Otherwise, a conflicting modification was performed: retry the insert.
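The insertion step can be sketched in Python with a simulated compare-and-swap (Python has no hardware CAS, so we model the primitive with a lock; the Node class and all names are ours):

```python
import threading

class Node:
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

_cas_lock = threading.Lock()

def compare_and_swap(node, expected, new):
    """Model of a hardware CAS on a node's next field."""
    with _cas_lock:
        if node.next is expected:
            node.next = new
            return True
        return False

def insert(head, value):
    """Lock-free insertion into an ordered list: retry on CAS failure."""
    while True:
        pos = head
        while pos.next is not None and pos.next.value < value:
            pos = pos.next
        nxt = pos.next
        new = Node(value, nxt)          # new node points to the successor
        if compare_and_swap(pos, nxt, new):
            return                      # no conflict: done
        # a concurrent update changed pos.next in the meantime: retry

head = Node(None)                       # Head sentinel
for v in [10, 30, 20]:
    insert(head, v)
out, n = [], head.next
while n: out.append(n.value); n = n.next
print(out)   # [10, 20, 30]
```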
107 Lock-free programming
Non-blocking linked list: deletion is more difficult.
[Figure: a linked list between Head and Tail, with elements 10 and 20; we want to delete the element that follows 10.]
First we try a naive solution. Imagine we want to delete the first element:
1 Put a pointer to where the next field of the 10 element will point.
2 Use compare and swap to check whether in the meanwhile any update was done between the 10 and the 20 element: if the test succeeds, update the next field of the 10 element (otherwise, restart).
3 Eventually, the 20 element is garbage collected.
HOWEVER, if before the compare and swap another thread inserts an element (say 30) after the 20 element, then: the compare and swap succeeds, the insertion succeeds, and the insertion is lost.
We need to modify this solution.
120 Lock-free programming
Non-blocking linked list: deletion.
[Figure: the element to delete is marked with *.]
If we want to delete the first element, we must proceed as follows:
1 Mark the element to delete: a marked node can still be traversed, but it will be ignored by concurrent insertions/deletions.
2 Record the pointer to the element next to the one to be deleted.
3 Compare the old and new pointers to the successor of the 10 element and swap them (atomic compare and swap).
4 Eventually, the deleted node is garbage collected.
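The four steps above can be sketched with the same simulated CAS (a deliberate simplification of the real scheme, which packs the mark into the next pointer itself; the `marked` flag and all names are ours):

```python
import threading

class Node:
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt
        self.marked = False   # logically deleted when True

_cas_lock = threading.Lock()

def cas_next(node, expected, new):
    """Simulated CAS on a node's next field, failing on marked nodes."""
    with _cas_lock:
        if node.next is expected and not node.marked:
            node.next = new
            return True
        return False

def delete_after(prev):
    """Delete the node after prev: mark it first, then unlink with CAS."""
    victim = prev.next
    victim.marked = True                 # step 1: concurrent ops ignore it
    succ = victim.next                   # step 2: record the successor
    if not cas_next(prev, victim, succ): # step 3: atomic unlink
        victim.marked = False            # conflict: undo the mark, give up
        return False
    return True                          # step 4: victim left to the GC

head = Node(None, Node(10, Node(20, Node(30))))
delete_after(head.next)                  # delete the 20 element
out, n = [], head.next
while n: out.append(n.value); n = n.next
print(out)   # [10, 30]
```

In the real algorithm (Harris's lock-free list) insertions also check the mark with their CAS, which is exactly what prevents the lost insertion of the previous slides.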
Lock-free programming: summary

Lock-free programming requires specifically designed data structures. As such, it is less generally applicable than the techniques we describe next. It may also not fit modular development, since a structure composed of lock-free data structures may still fail to avoid global conflicts.

However, when it works it comes for free, and it can be combined with any of the techniques that follow, thus reducing the logical complexity of process synchronization.
127 Outline 39 Concurrency 40 Preemptive multi-threading 41 Locks, Conditional Variables, Monitors 42 Doing without mutual exclusion 43 Cooperative multi-threading 44 Channeled communication 45 Software Transactional Memory 537/571 G. Castagna (CNRS) Cours de Programmation Avancée 537 / 571
Cooperative multi-threading in OCaml: Lwt

Cooperative multi-threading in OCaml is available through the Lwt module.

Rationale
Instead of using a few large monolithic threads, define (1) many small interdependent threads, and (2) the interdependence relation between them.

A thread executes without interruption until a cooperation point, where it passes control to another thread.
A cooperation point is reached either by explicitly passing control (function yield), or by calling a cooperative function (e.g., read, sleep).
Lwt uses a non-preemptive scheduler, which is an event loop. At each cooperation point the thread passes control to the scheduler, which hands it to another thread.

Nota bene
Do not call blocking functions, otherwise all threads will be blocked. In particular do not use Unix.sleep and Unix.read; use the corresponding cooperative versions Lwt_unix.sleep and Lwt_unix.read instead.
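A small sketch of the cooperative model (it assumes the lwt package, and uses the modern API names Lwt.pause and Lwt_main.run; the slides use the older Lwt_unix.yield and Lwt_unix.run): two threads that interleave only at their explicit cooperation points.

```ocaml
(* Each thread runs until it reaches a cooperation point (Lwt.pause),
   where it hands control back to the scheduler. *)
let rec loop name n =
  if n = 0 then Lwt.return ()
  else begin
    print_string (name ^ " ");
    Lwt.bind (Lwt.pause ()) (fun () -> loop name (n - 1))
  end

let () =
  Lwt_main.run (Lwt.join [ loop "A" 3; loop "B" 3 ]);
  print_newline ()
(* prints: A B A B A B *)
```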
Lwt threads

A thread computing a value of type 'a is a value of the abstract type 'a Lwt.t.

Each thread is in one of the following states:
1 Terminated: it successfully computed its value
2 Suspended: the computation is not over and will resume later
3 Failed: the computation failed (with an exception)

Examples:
Lwt.return : 'a -> 'a Lwt.t
  Immediately returns a terminated thread whose computed value is the one passed as argument.
Lwt_unix.sleep : float -> unit Lwt.t
  Immediately returns a suspended thread that will return () after some time (if the scheduler reschedules it).
Lwt.fail : exn -> 'a Lwt.t
  Immediately returns a failed thread whose exception is the one passed as argument.
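The three states can be observed directly with Lwt.state, which returns Return, Fail, or Sleep (a small check, assuming the lwt package; Lwt.task is used here to build a suspended thread without needing Lwt_unix):

```ocaml
let () =
  assert (Lwt.state (Lwt.return 3) = Lwt.Return 3);    (* terminated *)
  assert (Lwt.state (Lwt.fail Exit) = Lwt.Fail Exit);  (* failed *)
  let suspended, _wakener = Lwt.task () in             (* not resolved yet *)
  assert (Lwt.state suspended = Lwt.Sleep)             (* suspended *)
```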
Lwt threads

Lwt threads form a monad:

Lwt.bind : 'a Lwt.t -> ('a -> 'b Lwt.t) -> 'b Lwt.t

The expression Lwt.bind p f (which can also be written p >>= f) immediately returns a thread of type 'b Lwt.t, defined as follows:
  If the thread p is terminated, then it passes its result to f.
  If the thread p is suspended, then f is saved in the list of functions waiting for the result of p. When p terminates, the scheduler activates these functions one after the other.
  If the thread p is failed, then so is the whole expression.

Nota bene
Bind is not a cooperation point: it does not imply any suspension.
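A consequence of the first case: binding on an already-terminated thread runs the continuation immediately, before the scheduler is even started (a small check, assuming the lwt package):

```ocaml
let () =
  let r = ref 0 in
  let _t : unit Lwt.t =
    Lwt.bind (Lwt.return 21) (fun v -> r := 2 * v; Lwt.return ()) in
  assert (!r = 42)  (* the continuation already ran: no suspension *)
```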
Example

$ rlwrap ocaml
        Objective Caml version
# #use "topfind";;
# #require "lwt";;
# Lwt.bind (Lwt.return 3) (fun x -> print_int x; Lwt.return ());;
3- : unit Lwt.t = <abstr>
# Lwt.bind (Lwt_unix.sleep 3.0) (fun () -> print_endline "hello"; Lwt.return ());;
- : unit Lwt.t = <abstr>

(Lwt.return 3) >>= (fun x -> print_int x; Lwt.return ()) immediately returns a thread of type unit Lwt.t after having printed 3. Notice the use of Lwt.return () for typing purposes.
(Lwt_unix.sleep 3.0) >>= (fun () -> print_endline "hello"; Lwt.return ()) immediately returns a thread of type unit Lwt.t and does nothing else.
To see the last thread behave as expected (and print "hello" after three seconds) we have to run the scheduler, that is the function

Lwt_unix.run : 'a Lwt.t -> 'a

Lwt_unix.run t manages a queue of waiting threads without preemption. It terminates when the thread t does. When we run the scheduler we see the computation above end, since the scheduler reactivates Lwt_unix.sleep 3.0, which can then pass the hand to the next thread:

# Lwt_unix.run (Lwt_unix.sleep 4.);;
hello
- : unit = ()
Lwt_unix

The main function provided by Lwt_unix is run:
  It manages a queue of threads ready to be executed. As long as this queue is not empty, it runs them in order.
  It maintains a table of open file descriptors together with the threads waiting on them, and inserts these threads in the queue as soon as they have received the data they were waiting for.
  It inserts in the queue the threads that exceeded their sleep time.
  It iterates, and stops when its argument thread terminates.

Besides the scheduler, Lwt_unix provides cooperative versions of most of the functions in the Unix module:
  Lwt_unix.yield : unit -> unit Lwt.t forces a cooperation point, adding the thread to the scheduler queue.
  Lwt_unix.read works as Unix.read, but while the latter blocks immediately, the former immediately returns a new thread which:
    tries to read the file;
    if data is available, returns a result;
    otherwise, sleeps in the queue associated to the file descriptor.
do-notation for Lwt

It is possible to use the lwt ... in notation (a syntax extension) to mimic Haskell's do-notation. So

Lwt_chan.input_line ch >>= fun s ->
Lwt_unix.sleep 3. >>= fun () ->
print_endline s;
Lwt.return ()

can be written as

lwt s = Lwt_chan.input_line ch in
lwt () = Lwt_unix.sleep 3. in
print_endline s;
Lwt.return ()
Example

A thread that writes "hello" every ten seconds:

let rec f () =
  print_endline "hello";
  Lwt_unix.sleep 10. >>= f
in f ()

Join of threads

Let f and g be functions that return threads (e.g., of type unit -> 'a Lwt.t):

let first_thread = f () in         (* launch the first thread *)
let second_thread = g () in        (* launch the second thread *)
lwt fst_result = first_thread in   (* wait for the first thread's result *)
lwt snd_result = second_thread in  (* wait for the second thread's result *)
...
Two versions of a cooperative List.map

Two versions of map, running a thread for each value in the list. Differences?

# let rec map f l =
    match l with
    | [] -> return []
    | v :: r ->
        let t = f v in
        let rt = map f r in
        t >>= fun v' ->
        rt >>= fun l' ->
        return (v' :: l');;
val map : ('a -> 'b Lwt.t) -> 'a list -> 'b list Lwt.t = <fun>

All the threads run in parallel.

let rec map2 f l =
  match l with
  | [] -> return []
  | v :: r ->
      f v >>= fun v' ->
      map2 f r >>= fun l' ->
      return (v' :: l')
val map2 : ('a -> 'b Lwt.t) -> 'a list -> 'b list Lwt.t = <fun>

The threads run sequentially: map2 waits for the first thread to end before running the second.
149 Outline 39 Concurrency 40 Preemptive multi-threading 41 Locks, Conditional Variables, Monitors 42 Doing without mutual exclusion 43 Cooperative multi-threading 44 Channeled communication 45 Software Transactional Memory 547/571 G. Castagna (CNRS) Cours de Programmation Avancée 547 / 571
Communication on channels

Two kinds of communication:
1 Synchronous communication: sending a message is a blocking action. Module Event of OCaml's threads library.
2 Asynchronous communication: sending a message is a non-blocking action: messages are buffered and their order is preserved. Erlang-style concurrency.
Synchronous communication: OCaml's Event

type 'a channel
  The type of communication channels carrying values of type 'a.
new_channel : unit -> 'a channel
  Return a new channel.
type +'a event
  The type of communication events returning a result of type 'a.
send : 'a channel -> 'a -> unit event
  send ch v returns the event consisting in sending the value v over the channel ch. The result value of this event is ().
receive : 'a channel -> 'a event
  receive ch returns the event consisting in receiving a value from the channel ch. The result value of this event is the value received.
choose : 'a event list -> 'a event
  choose evl returns the event that is the parallel composition of all the events in the list evl.
Synchronous communication: OCaml's Event

The functions send and receive are not blocking functions. These primitives build the elementary events "sending a message" and "receiving a message", but they do not have an immediate effect: they just create a data structure describing the action to be done. To make an event happen, a thread must synchronize with another thread wishing to make the complementary event happen. The sync primitive allows a thread to wait for the occurrence of the event passed as argument.

sync : 'a event -> 'a
  Synchronize on an event: offer all the communication possibilities specified in the event to the outside world, and block until one of the communications succeeds. The result value of that communication is returned.
poll : 'a event -> 'a option
  Non-blocking version of Event.sync: offer all the communication possibilities specified in the event to the outside world, and if one can take place immediately, perform it and return Some r, where r is the result value of that communication. Otherwise, return None without blocking.
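The rendezvous semantics of sync can be seen in a minimal sketch (it requires OCaml's threads library, e.g. compiling with the threads.posix package): the sender blocks inside sync until the receiver synchronizes on the complementary event.

```ocaml
let () =
  let ch = Event.new_channel () in
  (* The sender thread blocks in sync until someone receives. *)
  let sender = Thread.create (fun () -> Event.sync (Event.send ch 42)) () in
  let v = Event.sync (Event.receive ch) in  (* rendezvous happens here *)
  Thread.join sender;
  assert (v = 42)
```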
Example: access a shared variable via communications

# let ch = Event.new_channel ();;
# let v = ref 0;;
# let reader () = Event.sync (Event.receive ch);;
# let writer () = Event.sync (Event.send ch ("S" ^ string_of_int !v));;
# let loop_reader s d () =
    for i = 1 to 10 do
      let r = reader () in
      print_string (s ^ ":" ^ r ^ "; ");
      flush stdout;
      Thread.delay d
    done;
    print_newline ();;
# let loop_writer d () =
    for i = 1 to 10 do
      incr v;
      writer ();
      Thread.delay d
    done;;
# Thread.create (loop_reader "A" 1.1) ();;
# Thread.create (loop_reader "B" 1.5) ();;
# Thread.create (loop_reader "C" 1.9) ();;
# loop_writer 1. ();;
A:S1; C:S2; B:S3; A:S4; C:S5; B:S6; A:S7; C:S8; B:S9; A:S10;
- : unit = ()
# loop_writer 1. ();;
C:S11; B:S12; A:S13; C:S14; B:S15; A:S16; C:S17; B:S18; A:S19; C:S20;
- : unit = ()
Example: single position queue

We modify the Queue example so that mutual exclusion is obtained by channels rather than mutexes. We keep the same interface.

module Cell = struct
  type 'a t = { add_ch: 'a Event.channel; rmv_ch: 'a Event.channel }
  let create () =
    let ach = Event.new_channel ()
    and rch = Event.new_channel () in
    let rec empty () =
      let e = Event.sync (Event.receive ach) in
      full e
    and full e = empty (Event.sync (Event.send rch e)) in
    ignore (Thread.create empty ());
    { add_ch = ach; rmv_ch = rch }
  let add e q = ignore (Event.sync (Event.send q.add_ch e))
  let remove q = Event.sync (Event.receive q.rmv_ch)
end (* multi-position buffer *)

A non-blocking variant of add replaces Event.sync with Event.poll:

  let add e q = ignore (Event.poll (Event.send q.add_ch e))
Asynchronous communication: Erlang's concurrency

Erlang is an open-source general-purpose concurrent programming language and runtime system designed by Ericsson to support distributed, fault-tolerant, soft-real-time, non-stop applications.

The sequential subset of Erlang is a strict and dynamically typed functional language. For concurrency it follows the Actor model.

Three primitives:
  spawn: spawn a new process
  send: asynchronously send a message to a given process
  receive: read one of the received messages

Concurrency is structured around processes. Erlang processes are not OS processes: they are much lighter (they scale up to hundreds of millions of processes). Like OS processes, and unlike OS threads or green threads, they share no state with one another.

Process communication is done via asynchronous message passing: every process has a mailbox, a queue of messages that have been sent by other processes and not yet consumed.
Process creation

Pid = spawn(Module, FunctionName, ArgumentList)

Spawns a new process that executes the function FunctionName in the module Module with arguments ArgumentList, and immediately returns its identifier.

Asynchronous send

Pid ! Message

Puts the message Message in the mailbox of the process whose identifier is Pid. So foo(12) ! bar(baz) will first evaluate foo(12) to get the process identifier and bar(baz) to get the message to send, and returns immediately (with the value of the message), without waiting for the message either to arrive at the destination or to be received.

Selective receive

receive
  Pattern1 -> Actions1;
  Pattern2 -> Actions2;
  ...
end

Selects in the mailbox the first message that matches a pattern, removes it from the mailbox, and executes its actions. Otherwise it suspends.
An example

% Create a process and invoke the function
% web:start_server(Port, MaxConnect)
ServerProcess = spawn(web, start_server, [Port, MaxConnect]),

% [Distribution support]
% Create a remote process and invoke the function
% web:start_server(Port, MaxConnect) on machine RemoteNode
RemoteProcess = spawn(RemoteNode, web, start_server, [Port, MaxConnect]),

% Send a message to ServerProcess (asynchronously). The message consists
% of a tuple with the atom "pause" and the number "10".
ServerProcess ! {pause, 10},

% Receive messages sent to this process
receive
  {data, DataContent} -> handle(DataContent);
  {hello, Text} -> io:format("got hello message: ~s", [Text]);
  {goodbye, Text} -> io:format("got goodbye message: ~s", [Text])
end.
Erlang-style concurrency

This style of concurrency has been adopted in several other languages:
  F#: cf. the MailboxProcessor class.
  Scala: uses the same syntax (and semantics) as Erlang, but instead of processes we have actor objects that run in separate threads.
  Retlang, an Erlang-inspired library for .NET, and Jetlang, its Java counterpart.
  Others: Termite Scheme, the Coro module for Perl, ...
161 Outline 39 Concurrency 40 Preemptive multi-threading 41 Locks, Conditional Variables, Monitors 42 Doing without mutual exclusion 43 Cooperative multi-threading 44 Channeled communication 45 Software Transactional Memory 557/571 G. Castagna (CNRS) Cours de Programmation Avancée 557 / 571
Compositionality

The main problem with locks is that they are not compositional:

action1 = withdraw a 100
action2 = deposit b 100

action3 = do
  withdraw a 100
  <-- inconsistent state here
  deposit b 100

Solution: expose all locks

action3 = do
  lock a
  lock b
  withdraw a 100
  deposit b 100
  release a
  release b

Problems
  Risk of deadlocks
  Unfeasible for more complex cases
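The "expose all locks" workaround can be sketched in OCaml with the Mutex module (threads library); the account type and field names are ours, not from the slides. Note that avoiding the deadlock requires every caller to acquire the locks in the same global order, a whole-program discipline that is exactly what does not compose:

```ocaml
type account = { id : int; mutable balance : int; lock : Mutex.t }

(* A transfer composed from two lock-protected updates. To prevent the
   deadlock mentioned above, locks are always acquired in a fixed global
   order, here by account id. *)
let transfer src dst amount =
  let first, second = if src.id < dst.id then src, dst else dst, src in
  Mutex.lock first.lock;
  Mutex.lock second.lock;
  src.balance <- src.balance - amount;   (* withdraw *)
  dst.balance <- dst.balance + amount;   (* deposit *)
  Mutex.unlock second.lock;
  Mutex.unlock first.lock
```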
Software transactional memory

Idea
Borrow ideas from database people: ACID transactions.
  Atomic: all changes commit or all changes roll back; changes appear to happen at a single moment in time.
  Consistent: operate on a snapshot of memory, using the newest values at the beginning of the transaction.
  Isolated: changes are only visible to other threads after commit.
  Durable: does not apply to STM (changes do not persist on disk), but we can adapt it: changes are lost if the software crashes or the hardware fails (cf. locks).

Add ideas from (pure) functional programming:
  Computations are first-class values.
  What side effects can happen, and where they can happen, is controlled.

Software Transactional Memory: first ideas in 1993; new developments in 2005 (Simon Peyton Jones, Simon Marlow, Tim Harris, Maurice Herlihy).
Atomic transactions

Write sequential code, and wrap atomically around it:

action3 = atomically {
  withdraw a 100
  deposit b 100
}

How does it work?
  Execute the body without locks.
  Each memory access is logged to a thread-local transaction log. No actual update is performed in memory.
  At the end, we try to commit the log to memory.
  The commit may fail; then we retry the whole atomic block.

This is optimistic concurrency.
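The optimistic read-compute-commit-retry cycle can be sketched, reduced to a single shared variable, with OCaml's Atomic module (the function name is ours; a real STM logs accesses to many variables and commits the whole log at once):

```ocaml
(* Read a snapshot without locking, compute a tentative result, and
   commit only if the variable is unchanged in the meantime; otherwise
   retry, just as an aborted transaction is re-executed. *)
let rec atomically_modify (v : 'a Atomic.t) (f : 'a -> 'a) : unit =
  let snapshot = Atomic.get v in                          (* read *)
  let result = f snapshot in                              (* compute *)
  if not (Atomic.compare_and_set v snapshot result) then  (* commit *)
    atomically_modify v f                                 (* retry *)
```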
Caveats

Simon Peyton Jones's missile program:

action3 = atomically {
  withdraw a 100
  launchNuclearMissiles
  deposit b 100
}

No side effects allowed!

More in detail: the logging capability is implemented by specific transactional variables. It is absolutely forbidden:
  to read a transactional variable outside an atomic block;
  to write to a transactional variable outside an atomic block;
  to make actual changes (e.g., file or network access, use of non-transactional variables) inside an atomic block...

These constraints are enforced by the type system.
STM in Haskell

Fully-fledged implementation of STM: Control.Concurrent.STM. Also implemented in the language in Clojure and Perl 6. Implementations for C++, Java, C#, and F# are being developed as libraries... where it is difficult to solve all the problems. In Haskell it is easy, thanks to controlled side effects:

type STM a
instance Monad STM

atomically :: STM a -> IO a
retry      :: STM a
orElse     :: STM a -> STM a -> STM a

type TVar a
newTVar   :: a -> STM (TVar a)
readTVar  :: TVar a -> STM a
writeTVar :: TVar a -> a -> STM ()

Side effects must be performed on specific transactional variables (TVar).
STM: TVar

Threads in STM Haskell communicate by reading and writing transactional variables:

type Resource = TVar Int

putR :: Resource -> Int -> STM ()
putR r i = do
  v <- readTVar r
  writeTVar r (v + i)

main = do
  ...
  atomically (putR r 13)
  ...

Operationally, atomically takes the tentative updates and actually applies them to the TVars involved. The system maintains a per-thread transaction log that records the tentative accesses made to TVars.
STM: retry

retry :: STM a

retry aborts the current memory transaction because it has seen values in TVars which mean that it should not continue (e.g., the TVars represent a shared buffer that is now empty). The implementation may block the thread until one of the TVars that it has read from has been updated.

atomically {
  if n_items == 0
  then retry
  else ...remove from queue...
}

In summary:
  retry says "abandon the current transaction and re-execute it from scratch".
  The implementation waits until n_items changes.
  No condition variables, no lost wake-ups!
Blocking composes

atomically {
  x = queue1.getItem();
  queue2.putItem(x)
}

If either getItem or putItem retries, the whole transaction retries. So the transaction waits until queue1 is not empty AND queue2 is not full. There is no need to re-code getItem or putItem. (Lock-based code does not compose.)
STM: orElse

orElse :: STM a -> STM a -> STM a

orElse tries two alternative paths:
  If the first retries, it runs the second.
  If both retry, the whole orElse retries.

atomically {
  x = queue1.getItem();
  queue2.putItem(x)
  `orElse`
  queue3.putItem(x)
}

So the transaction waits until queue1 is non-empty, AND EITHER queue2 is not full OR queue3 is not full, without touching getItem or putItem.

Note:
m1 `orElse` (m2 `orElse` m3) = (m1 `orElse` m2) `orElse` m3
retry `orElse` m = m
m `orElse` retry = m
Compositionality

All transactions are flat: calling transactional code from within the current transaction is the normal case, and simply extends the current transaction.
STM in Haskell

Safe transactions through type safety:
  A very specific monad STM (distinct from IO).
  We can only access TVars, and TVars can only be accessed in the STM monad.
  Referential transparency inside blocks.
  Explicit retry: expressiveness.
  Compositional choice: expressiveness.

Problems
  Overhead: transaction bookkeeping has a cost.
  Starvation: could the system thrash by continually colliding and re-executing? No: a transaction can be forced to re-execute only if another succeeds in committing. That gives a strong progress guarantee. But a particular thread could perhaps starve.
  Performance: potential for many retries, resulting in wasted work.
  Tools: support is currently lacking for learning which memory locations experienced write conflicts, and for learning how often each transaction is retried and why.
Problems in C++/Java/C#

  Semantics of retry
  I/O in atomic blocks
  Access to transactional variables outside of atomic blocks
  Access to regular variables inside of atomic blocks
A useful analogy

Switching from manual to automatic gear:

Memory management
  malloc, free, manual refcounting  ->  garbage collection

Concurrency
  mutexes, semaphores, condition variables  ->  software transactional memory
186 Manual/auto tradeoffs Same kind of tradeoffs: 571/571 G. Castagna (CNRS) Cours de Programmation Avancée 571 / 571
187 Manual/auto tradeoffs Same kind of tradeoffs: Memory management Manual: Performance, footprint Auto: Safety against memory leaks, corruption 571/571 G. Castagna (CNRS) Cours de Programmation Avancée 571 / 571
188 Manual/auto tradeoffs Same kind of tradeoffs: Memory management Manual: Performance, footprint Auto: Safety against memory leaks, corruption Concurrency Manual: Fine tuning for high contention Auto: Safety against deadlock, corruption 571/571 G. Castagna (CNRS) Cours de Programmation Avancée 571 / 571
189 Manual/auto tradeoffs Same kind of tradeoffs: Memory management Manual: Performance, footprint Auto: Safety against memory leaks, corruption Concurrency Manual: Fine tuning for high contention Auto: Safety against deadlock, corruption Rationale In both cases you pay in terms of performance and of a certain "I do not quite know what is going on" feeling, but they allow you to build larger, more complex systems that won't break because of wrong by-hand management. 571/571 G. Castagna (CNRS) Cours de Programmation Avancée 571 / 571