System Software Winter 2015

From this document you will learn the answers to the following questions:

What is a program that multiple threads accessing a single integer variable Shared Variable?

What variable must be atomic among readers?

Similar documents
Parallel Programming

Semaphores and Monitors: High-level Synchronization Constructs

! Past few lectures: Ø Locks: provide mutual exclusion Ø Condition variables: provide conditional synchronization

W4118 Operating Systems. Instructor: Junfeng Yang

Lecture 6: Semaphores and Monitors

Communication Motivation

Understanding Hardware Transactional Memory

Monitors, Java, Threads and Processes

Multi-core architectures. Jernej Barbic , Spring 2007 May 3, 2007

Concurrent programming in Java

Shared Address Space Computing: Programming

Distributed Systems [Fall 2012]

Last Class: Semaphores

SSC - Concurrency and Multi-threading Java multithreading programming - Synchronisation (I)

Lecture 6: Introduction to Monitors and Semaphores

Chapter 6 Concurrent Programming

COS 318: Operating Systems. Semaphores, Monitors and Condition Variables

CS414 SP 2007 Assignment 1

Processes and Non-Preemptive Scheduling. Otto J. Anshus

Outline of this lecture G52CON: Concepts of Concurrency

Chapter 5 Process Scheduling

Facing the Challenges for Real-Time Software Development on Multi-Cores

Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6

(51) Int Cl.: G10L 15/26 ( )

How To Write A Multi Threaded Software On A Single Core (Or Multi Threaded) System

MUK-IT 63. Roundtable. Herzlich Willkommen bei der Software AG. Anton Hofmeier VP Sales Terracotta DACH / MdGL

Thread Synchronization and the Java Monitor

3C03 Concurrency: Condition Synchronisation

OPERATING SYSTEMS PROCESS SYNCHRONIZATION

CPU SCHEDULING (CONT D) NESTED SCHEDULING FUNCTIONS

Priority Inversion Problem and Deadlock Situations

Software and the Concurrency Revolution

Ports utilisés. Ports utilisés par le XT1000/5000 :

Chapter 6, The Operating System Machine Level

SWARM: A Parallel Programming Framework for Multicore Processors. David A. Bader, Varun N. Kanade and Kamesh Madduri

6 Synchronization with Semaphores

Course Development of Programming for General-Purpose Multicore Processors

Operating Systems Concepts: Chapter 7: Scheduling Strategies

Monitors and Semaphores

System Software and TinyAUTOSAR

Comp 204: Computer Systems and Their Implementation. Lecture 12: Scheduling Algorithms cont d

Road Map. Scheduling. Types of Scheduling. Scheduling. CPU Scheduling. Job Scheduling. Dickinson College Computer Science 354 Spring 2010.

Processor Scheduling. Queues Recall OS maintains various queues

Scaling Networking Applications to Multiple Cores

Performance Evaluation and Optimization of A Custom Native Linux Threads Library

Overview Motivating Examples Interleaving Model Semantics of Correctness Testing, Debugging, and Verification

Concurrent Programming

1Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Mutual Exclusion using Monitors

POSIX. RTOSes Part I. POSIX Versions. POSIX Versions (2)

Programming real-time systems with C/C++ and POSIX

Parallel Algorithm Engineering

Operatin g Systems: Internals and Design Principle s. Chapter 10 Multiprocessor and Real-Time Scheduling Seventh Edition By William Stallings

Process Scheduling CS 241. February 24, Copyright University of Illinois CS 241 Staff

Implementing Data Models and Reports with Microsoft SQL Server

SYSTEM ecos Embedded Configurable Operating System

TIn 1: Lecture 3: Lernziele. Lecture 3 The Belly of the Architect. Basic internal components of the Pointers and data storage in memory

Outline. Review. Inter process communication Signals Fork Pipes FIFO. Spotlights

4. Monitors. Monitors support data encapsulation and information hiding and are easily adapted to an object-oriented environment.

CS Fall 2008 Homework 2 Solution Due September 23, 11:59PM

CPU Scheduling Outline

REAL TIME OPERATING SYSTEMS. Lesson-10:

A Survey of Parallel Processing in Linux

Deciding which process to run. (Deciding which thread to run) Deciding how long the chosen process can run

Aktives Service-, Asset- und Lizenzmanagement mit Altiris

Embedded Systems: map to FPGA, GPU, CPU?

Linux Process Scheduling Policy

Semantic Web. Semantic Web: Resource Description Framework (RDF) cont. Resource Description Framework (RDF) W3C Definition:

Multicore Programming with LabVIEW Technical Resource Guide

Control 2004, University of Bath, UK, September 2004

J2EE-Application Server

Multi-Threading Performance on Commodity Multi-Core Processors

Java Concurrency Framework. Sidartha Gracias

sel4: from Security to Safety Gernot Heiser, Anna Lyons NICTA and UNSW Australia

Java Memory Model: Content

Chapter 1 FUNDAMENTALS OF OPERATING SYSTEM

CPU Scheduling. Basic Concepts. Basic Concepts (2) Basic Concepts Scheduling Criteria Scheduling Algorithms Batch systems Interactive systems

A Comparison Of Shared Memory Parallel Programming Models. Jace A Mogill David Haglin

supernova, a multiprocessor-aware synthesis server for SuperCollider

Kernel Synchronization and Interrupt Handling

Middleware. Peter Marwedel TU Dortmund, Informatik 12 Germany. technische universität dortmund. fakultät für informatik informatik 12

Why Computers Are Getting Slower (and what we can do about it) Rik van Riel Sr. Software Engineer, Red Hat

Operating Systems OBJECTIVES 7.1 DEFINITION. Chapter 7. Note:

Linux scheduler history. We will be talking about the O(1) scheduler

Embedded Systems. 6. Real-Time Operating Systems

Tasks Schedule Analysis in RTAI/Linux-GPL

MEGWARE HPC Cluster am LRZ eine mehr als 12-jährige Zusammenarbeit. Prof. Dieter Kranzlmüller (LRZ)

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

CommVault Simpana 7.0 Software Suite. und ORACLE Momentaufnahme. Robert Romanski Channel SE

Implementing AUTOSAR Scheduling and Resource Management on an Embedded SMT Processor

Transcription:

System Software Synchronization Das Kontinuum Aktivität (Zeit) Abfolge ausgeführter Instruktionen Kontrollfluß oder Thread Unterstützung für mehrere Threads Concurreny Control Raum Menge der adressierbaren Speicherzellen Adressraum oder Adress space Unterstützung für mehrere Adreßräume (c) Peter Sturm, Uni Trier 1

Adreßraum CPU definiert ein theoretisches, oberes Limit Anzahl der Adreß-Pins (address bus) Beispiele 32 bit Architektur (Intel Pentium, AMD Athlon XP, ) 32 Adressleitungen = 4 GByte Adreßraum 64 bit Architektur (AMD Athlon 64, Intel Itanium, ) 64 Leitungen = 4194304 TByte theoretischer Adreßraum Real limitiert auf ~40 Pins (Server 48 Pins) Virtuelle Prozessoren (c) Peter Sturm, Uni Trier 2

State Transistions Resign Which thread should be assigned = Scheduling Add Ready Assign Running Terminate Ready Blocked Block Round-Robin (RR) RR = FCFS mit Preemption Weit verbreiteter Ansatz Überschreiten eines Zeitquantums führt zur Preemption Gängige Werte: 10 ms and 20 ms (c) Peter Sturm, Uni Trier 3

Multi-Level Scheduling Bereit-Liste wird unterteilt 1 st Level Welche Teilliste? Meist Prioritäten 2 nd Level Welcher Thread in gewählter Liste? Häufig RR Ready Assign Gordon Moore Gene Amdahl AMDAHL UND MOORE (c) Peter Sturm, Uni Trier 4

Moores Gesetz Annahmen: Verdopplung Kernzahl alle 18 Monate 4-Kern-Prozessor in 2009 Moores Gesetz The Fifth Paradigm Nach Kurzweil (1999) und Moravec (1998) (c) Peter Sturm, Uni Trier 5

2014? Weiter? 2014 64 Cores 2018 1024 Cores 2028 8192 Cores Many Cores Enough Cores J (c) Peter Sturm, Uni Trier 6

Nicht-parallele Software wird nie mehr schneller laufen als heute! Amdahls Gesetz Annahmen: Zu 95% parallelisierbare Software Gleichbleibende Taktfrequenz Speedup gegenüber SingleCore (c) Peter Sturm, Uni Trier 7

Amdahl: SingleCore s = 0.2 T MultiCore ( s, n) = T SingleCore ( s) + T SingleCore 1 s n (1-s) = 0.8 t Exe = 1 s Amdahl: DualCore s = 0.2 (1-s) = 0.8 t Exe = 0.6 s (c) Peter Sturm, Uni Trier 8

Amdahl: OctCore s = 0.2 (1-s) = 0.8 t Exe = 0.3 s Amdahl: 32-Core s = 0.2 (1-s) = 0.8 t Exe = 0.225 s (c) Peter Sturm, Uni Trier 9

Motivation? Let s try! Example program Multiple threads Accessing a single integer variable Shared Variable (c) Peter Sturm, Uni Trier 10

Motivation (c) Peter Sturm, Uni Trier 11

Konkurrenz und Kooperation Konkurrenz Kampf um Ressourcen Ziel: Alles für mich Implizite Synchronisation Virtuelle Ressourcen OS serialisiert Wechselseitiger Ausschluß Kooperation Kooperierende Threads Teilt sich Ressourcen & Aufgaben Ziele Speedup Reaktionsgeschwindigkeit Beliebig komplexe Synchronisationsprobleme (c) Peter Sturm, Uni Trier 12

Mutual Exclusion Wechselseitiger Ausschluß Atomarität für eine Gruppe von Befehlen Höhere Synchronisationslösungen ableitbar Paßt zum Scheduling Blockiert-Liste wenn belegt IPC Interprozeßkommunikation Synchronisation A wartet auf Bedingung B erfüllt Bedingung und informiert A 1-Bit-Nachricht Kommunikation Austausch von Daten Vielfältige Verfahren (c) Peter Sturm, Uni Trier 13

Abstand In einem Prozeß (Adreßraum) Zugriff auf gemeinsamen Speicher Zwischen Prozessen auf einem Rechner Zwischen Prozessen auf verschiedenen Rechnern Verteilte Systeme Internet, IP, TCP, UDP SEMAPHORE (c) Peter Sturm, Uni Trier 14

Semaphore P Semaphore Fundamental synchronization primitive Two basic functions (on a semaphore s) s.p() May block in case semaphore already occupied s.v() Never blocks; releases semaphore and may free a blocked thread Dijkstra (1968), THE Multiprogramming System P = passeren (request entry) V = vrygeven (release) Semantic Semaphores are counting signals s.p(): Thread waits for some signal s s.v(): Thread sends a signal s Semaphor = Integer (s.value) Initialized with a s.value 0 void P () s.value--; if (s.value < 0) Block calling thread; Of course, P and V themselves are critical sections and must be protected! void V () s.value++; if (s.value < 1) Put a thread waiting in s into ready queue; (c) Peter Sturm, Uni Trier 15

Mutual Exclusion With Semaphore Semaphore s := 1; Thread 0 Thread 1 s.p(); s.p(); Critical Section Critical Section s.v(); s.v(); PRODUCER/CONSUMER (c) Peter Sturm, Uni Trier 16

Producers Motivation Buffer Erzeuger und Verbraucher Consumers Puffer fester Größe Geschwindigkeitsunterschiede Wie löst man das Problem? public class Buffer public Buffer ( int n ) Outline of Buffer /// Inserts another good into the buffer. /// May block until free space is available. public void Produce ( int good ) /// Consumes a good stored inside the buffer. /// May signal blocked producer threads. public int Consume () return 42; (c) Peter Sturm, Uni Trier 17

public class Producer public Producer ( Buffer b ) buffer = b; my_id = this.gethashcode(); ThreadStart ts = new ThreadStart(Run); my_thread = new Thread(ts); my_thread.start(); Producer (C#) private void Run () Console.WriteLine("Producer 0: started...",my_id); int good = this.gethashcode() * 1000000; while (true) buffer.produce(good); Console.WriteLine("Producer 0: good 1 stored",my_id,good); good++; private Buffer buffer; private Thread my_thread; private int my_id; Consumer (C#) public class Consumer public Consumer ( Buffer b ) buffer = b; my_id = this.gethashcode(); ThreadStart ts = new ThreadStart(Run); my_thread = new Thread(ts); my_thread.start(); private void Run () Console.WriteLine("Consumer 0: started...",my_id); while (true) int good = buffer.consume(); Console.WriteLine("Consumer 0: good 1 retrieved",my_id,good); private Buffer buffer; private Thread my_thread; private int my_id; (c) Peter Sturm, Uni Trier 18

public class Buffer public Buffer ( int n ) this.n = n; slots = new int[n]; mutex = new Semaphore(1); slots_available = new Semaphore(n); goods_available = new Semaphore(0); private int n; private int [] slots; private int free = 0; private int used = 0; private Semaphore mutex; private Semaphore slots_available; private Semaphore goods_available; The Buffer (C#) public void Produce ( int good ) slots_available.p(); mutex.p(); slots[free] = good; free = (free+1) % n; mutex.v(); goods_available.v(); public int Consume () goods_available.p(); mutex.p(); int good = slots[used]; used = (used+1) % n; mutex.v(); slots_available.v(); return good; The Optimal Buffer (C#) public class Buffer public Buffer ( int n ) this.n = n; slots = new int[n]; mutex_p = new Semaphore(1); mutex_c = new Semaphore(1); slots_available = new Semaphore(n); goods_available = new Semaphore(0); public void Produce ( int good ) slots_available.p(); mutex_p.p(); slots[free] = good; free = (free+1) % n; mutex_p.v(); goods_available.v(); private int n; private int [] slots; private int free = 0; private int used = 0; private Semaphore mutex_p, mutex_c; private Semaphore slots_available; private Semaphore goods_available; public int Consume () goods_available.p(); mutex_c.p(); int good = slots[used]; used = (used+1) % n; mutex_c.v(); slots_available.v(); return good; (c) Peter Sturm, Uni Trier 19

READER/WRITER The Problem Shared Data A shared data structure is used by multiple threads Writer threads These threads modify the shared data and require mutual exclusion Reader threads Since readers don t modify data, they can access the shared data structure simultaneously (c) Peter Sturm, Uni Trier 20

1. Mutual Exclusion As a starting point, implement a simple solution based on mutual exclusion for readers and writers. Of course, this version doesn t exploit the potential for parallelism by allowing multiple readers to access the shared data simultaneously. Start from scratch (C#) Inspect the complete example program (C#) Works as Expected But only one reader inside sanctum at any given time (c) Peter Sturm, Uni Trier 21

Starting Point: Mutual Exclusion Shared Data Semaphore Sanctum = new Semaphore(1); while (true) Sanctum.P(); // Change data Sanctum.V(); while (true) Sanctum.P(); // Read data Sanctum.V(); Writer Reader 2. Reader Preference Improve the previous program to enable multiple readers to access shared datas concurrently Hint Only the first reader must acquire semaphore Subsequent readers just enter Only the last reader leaving must release semaphore We accept the risk of starvation for writers This is a still faulty version of the program J (c) Peter Sturm, Uni Trier 22

Reader Preference (Faulty) Shared Data Semaphore Sanctum = new Semaphore(1); int readers_inside = 0; while (true) Sanctum.P(); // Change data Sanctum.V(); while (true) if (readers_inside == 0) Sanctum.P(); readers_inside++; // Read data readers_inside--; if (readers_inside == 0) Sanctum.V(); Writer Reader Why Does It Fail? while (true) if (readers_inside == 0) Sanctum.P(); readers_inside++; // Read data readers_inside--; if (readers_inside == 0) Sanctum.V(); Reader Multiple readers may access the shared variable readers_inside concurrently Testing and setting variable must be atomic among readers Mutual exclusion required (c) Peter Sturm, Uni Trier 23

2. Reader Preference (Correct) Shared Data Semaphore Sanctum = new Semaphore(1); Semaphore RMutex = new Semaphore(1); int readers_inside = 0; while (true) Sanctum.P(); // Change data Sanctum.V(); Writer while (true) RMutex.P(); if (readers_inside == 0) Sanctum.P(); readers_inside++; RMutex.V(); // Read data RMutex.P(); readers_inside--; if (readers_inside == 0) Sanctum.V(); RMutex.V(); Reader Obviously Reader Preference (c) Peter Sturm, Uni Trier 24

But Writer May Starve It takes a long time for a single writer, if there are 20 readers continuously accessing shared data Sometimes forever L 3. Writer Preference The moment, a single writer is requesting to enter the shared data section, no other reader might enter anymore The writer has to wait, until all readers inside left the sanctum Subsequent writers are preferred We accept a possible starvation among readers Hint Similar to reader preference, we count the number of writers willing to enter Another semaphore locks readers in case of writer interest (c) Peter Sturm, Uni Trier 25

3. Writer Preference Shared Data Semaphore Sanctum = new Semaphore(1); Semaphore RMutex = new Semaphore(1); Semaphore WMutex = new Semaphore(1); Semaphore PreferWriter = new Semaphore(1); int readers_inside = 0; int writers_interested = 0; while (true) WMutex.P(); if (writers_interested == 0) PreferWriter.P(); writers_interested++; WMutex.V(); Sanctum.P(); // Change data Sanctum.V(); WMutex.P(); writers_interested--; if (writers_interested == 0) PreferWriter.V(); WMutex.V(); Writer while (true) PreferWriter.P(); RMutex.P(); if (readers_inside == 0) Sanctum.P(); readers_inside++; RMutex.V(); PreferWriter.V(); // Read data RMutex.P(); readers_inside--; if (readers_inside == 0) Sanctum.V(); RMutex.V(); Reader Works Great (c) Peter Sturm, Uni Trier 26

and Writer Doesn t Starve 4. Writer Preference (2 nd Edition) Any ideas to improve the program with writer preference even further? Hint A single writer may compete with multiple readers when calling PreferWriter.P() Is it possible to limit that to at most one reader? (c) Peter Sturm, Uni Trier 27

4. Writer Preference (Improved) Shared Data Semaphore Sanctum = new Semaphore(1); Semaphore RMutex = new Semaphore(1); Semaphore WMutex = new Semaphore(1); Semaphore PreferWriter = new Semaphore(1); Semaphore ReaderQueue = new Semaphore(1); int readers_inside = 0; int writers_interested = 0; while (true) WMutex.P(); if (writers_interested == 0) PreferWriter.P(); writers_interested++; WMutex.V(); Sanctum.P(); // Change data Sanctum.V(); WMutex.P(); writers_interested--; if (writers_interested == 0) PreferWriter.V(); WMutex.V(); Writer while (true) ReaderQueue.P(); PreferWriter.P(); ReaderQueue.V(); RMutex.P(); if (readers_inside == 0) Sanctum.P(); readers_inside++; RMutex.V(); PreferWriter.V(); // Read data RMutex.P(); readers_inside--; if (readers_inside == 0) Sanctum.V(); RMutex.V(); Reader 4. Writer Preference (Improved but faulty) Shared Data Semaphore Sanctum = new Semaphore(1); Semaphore RMutex = new Semaphore(1); Semaphore WMutex = new Semaphore(1); Semaphore PreferWriter = new Semaphore(1); Semaphore ReaderQueue = new Semaphore(1); int readers_inside = 0; int writers_interested = 0; while (true) WMutex.P(); if (writers_interested == 0) PreferWriter.P(); writers_interested++; WMutex.V(); Sanctum.P(); // Change data Sanctum.V(); WMutex.P(); writers_interested--; if (writers_interested == 0) PreferWriter.V(); WMutex.V(); Writer while (true) ReaderQueue.P(); PreferWriter.P(); RMutex.P(); if (readers_inside == 0) Sanctum.P(); readers_inside++; RMutex.V(); ReaderQueue.V(); PreferWriter.V(); // Read data RMutex.P(); readers_inside--; if (readers_inside == 0) Sanctum.V(); RMutex.V(); Reader (c) Peter Sturm, Uni Trier 28

Works Even Better and Writer Still Doesn t Starve (c) Peter Sturm, Uni Trier 29

One Semaphore too much? Why not throw semaphore ReaderQueue away and move RMutex before PreferWriter? Wouldn t that serialize all readers too? while (true) ReaderQueue.P(); PreferWriter.P(); RMutex.P(); if (readers_inside == 0) Sanctum.P(); readers_inside++; RMutex.V(); PreferWriter.V(); ReaderQueue.V(); // Read data RMutex.P(); readers_inside--; if (readers_inside == 0) Sanctum.V(); RMutex.V(); Reader IMPLEMENTATION DETAILS (c) Peter Sturm, Uni Trier 30

Implementation S: value queue TCB TCB TCB P Execution of P must be atomic Thread k calls P V Execution of V must be atomic Calling thread normally doesn t block s.value--; if (s.value < 0) tail(s.queue) := k; block(k); s.value++; if (s.value <= 0) ready(head(s.queue)) delhead(s.queue) Binary Semaphores Value must be 0 or 1 Easy to implement with general semaphores Beware of V operation: Simple increment invalidates the semaphore value P V if (s.value = 0) tail(s.qqueue) := k; block(k); s.value := 0; if (s.qqueue!= 0) ready(head(s.queue)); delhead(s.queue) else s.value := 1; (c) Peter Sturm, Uni Trier 31

Nested (binary) Semaphores Bsemaphor Mutex(1); f1 (...) Mutex.P();... Mutex.V(); f2 (...) Mutex.P(); f1(); Mutex.V(); fn (...) Mutex.P();... Mutex.V(); Make multiple critical sections (procedures) thread-safe Mutual exclusion among library functions Reentrance Most semaphore implementations are smart or friendly Semaphores and Scheduling Priorities Priority P1 P2 P3 P4 P5 P6 P1 s.p() s.p() s.v() Time P1 and P6 are synchronized using semaphore s Priority inversion Apparent priority of P1 is less than priority of P6 Dangerous for real-time systems (c) Peter Sturm, Uni Trier 32

Priority Inheritance Priority P1 P2 P3 P4 P5 P6 s.p() s.v() Time Low priority thread inherits higher priority until calling V Any modern OS supports inheritance Not necessarily dequeue head of blocked list (FCFS) Implementing Priority Inheritance S: owner (o) PrioSave (p) value (c) queue (q) TCB TCB TCB TCB Limited to binary semaphores P operation No blocking: Store ownership and current priority Blocking: Transfer higher priority to owner thread if needed V operation Restore original priority Put blocked thread with highest priority in ready queue (c) Peter Sturm, Uni Trier 33