Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6


Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6 Winter Term 2008 / 2009 Jun.-Prof. Dr. André Brinkmann Andre.Brinkmann@uni-paderborn.de Universität Paderborn PC²

Agenda
- Multiprocessor and Real-Time Scheduling (based on Stallings, Chapter 10)
- Linux Scheduling (taken from "Scheduling in Kernel 2.6", Mahendra M, Infosys)
  - Process scheduling
  - O(1) scheduler design, performance
  - Pre-emption

Classifications of Multiprocessor Systems
- Loosely coupled or distributed multiprocessor, or cluster: each processor has its own memory and I/O channels
- Functionally specialized processors (such as I/O processors), controlled by a master processor
- Tightly coupled multiprocessing: processors share main memory and are controlled by one operating system

Independent Parallelism
- Separate application or job
- No synchronization among processes
- Example: time-sharing system

Coarse and Very Coarse-Grained Parallelism
- Synchronization among processes at a very gross level
- Good for concurrent processes running on a multiprogrammed uniprocessor
- Can be supported on a multiprocessor with little change

Medium-Grained Parallelism
- A single application is a collection of threads
- Threads usually interact frequently

Fine-Grained Parallelism
- Highly parallel applications
- Specialized and fragmented area

Thread Structure for Rendering Module (figure)

Scheduling Design Issues
- Assignment of processes to processors
- Use of multiprogramming on individual processors
- Actual dispatching of a process

Assignment of Processes to Processors
- Treat processors as a pooled resource and assign processes to processors on demand, or
- Permanently assign each process to a processor (sometimes mislabelled group or gang scheduling, which is a different concept)
  - Dedicated short-term queue for each processor
  - Less overhead
  - A processor could be idle while another processor has a backlog

Assignment of Processes to Processors
- Global queue: schedule to any available processor

Assignment of Processes to Processors: Master/Slave Architecture
- Key kernel functions always run on a particular processor
- The master is responsible for scheduling; a slave sends service requests to the master
- Disadvantages: failure of the master brings down the whole system, and the master can become a performance bottleneck

Assignment of Processes to Processors: Peer Architecture
- The kernel can execute on any processor
- Each processor does self-scheduling
- Complicates the operating system: it must make sure two processors do not choose the same process

Synchronization Granularity and Processes (table)

Process Scheduling
- Single queue for all processes
- Multiple queues are used for priorities
- All queues feed the common pool of processors

Thread Scheduling
- Threads execute separately from the rest of the process
- An application can be a set of threads that cooperate and execute concurrently in the same address space

Multiprocessor Thread Scheduling
- Load sharing: processes are not assigned to a particular processor
- Gang scheduling: a set of related threads is scheduled to run on a set of processors at the same time

Multiprocessor Thread Scheduling
- Dedicated processor assignment: threads are assigned to a specific processor
- Dynamic scheduling: the number of threads can be altered during the course of execution

Comparison of One and Two Processors (figures)

Load Sharing
- Load is distributed evenly across the processors
- No centralized scheduler required
- Uses global queues

Disadvantages of Load Sharing
- The central queue needs mutual exclusion
- Preempted threads are unlikely to resume execution on the same processor
- If all threads are in the global queue, all threads of a program will not gain access to the processors at the same time

Gang Scheduling
- Simultaneous scheduling of the threads that make up a single process
- Useful for applications whose performance severely degrades when any part of the application is not running
- Threads often need to synchronize with each other

Example Scheduling Groups (figure)

Dedicated Processor Assignment
- When an application is scheduled, its threads are assigned to processors
- Some processors may be idle
- No multiprogramming of processors

Application Speedup (figure)

Dynamic Scheduling
- The number of threads in a process is altered dynamically by the application
- The operating system adjusts the load to improve utilization
  - Assign idle processors
  - New arrivals may be assigned to a processor that is used by a job currently using more than one processor
  - Hold the request until a processor is available
  - Assign a processor to a job in the list that currently has no processors (i.e., to all waiting new arrivals)

Agenda
- Multiprocessor and Real-Time Scheduling (based on Stallings, Chapter 10)
- Linux Scheduling (taken from "Scheduling in Kernel 2.6", Mahendra M, Infosys)
  - Process scheduling
  - O(1) scheduler design, performance
  - Pre-emption

Considerations in Scheduler Design
- Fairness: prevent starvation of tasks
- Scheduling latency: reduce the delay between a task waking up and actually running
- Time taken for scheduler decisions
- Interrupt latency: delay in processing hardware interrupts

Process Scheduler Goals
- Good interactive performance during high load
- Fairness
- Priorities
- SMP efficiency
- SMP affinity: issues of random bouncing between CPUs taken care of
- No more 'timeslice squeeze'
- RT scheduling
(From /usr/src/linux-2.6.x/Documentation/sched-design.txt)

Goals (contd.)
- Full O(1) scheduling: a great shift away from the O(n) scheduler
- Perfect SMP scalability
  - Per-CPU runqueues and locks, no global lock/runqueue
  - All operations like wakeup, schedule, context switch etc. run in parallel
- Batch scheduling (bigger timeslices, round-robin)
- No scheduling storms
- O(1) RT scheduling

Design
- 140 priority levels; the lower the value, the higher the priority (e.g. priority level 110 has a higher priority than 130)
- Two priority-ordered 'priority arrays' per CPU
  - 'Active' array: tasks which have timeslice left
  - 'Expired' array: tasks which have run out of timeslice
- Both are accessed through pointers from the per-CPU runqueue and are switched via a simple pointer swap

Per-CPU Runqueue
- Each CPU on the system has its own runqueue
- The runqueue points to the active and expired arrays; each priority level (1 to 140) holds a doubly linked list of tasks (Task 1, Task 2, ..., Task M)
- A migration thread is attached to each runqueue
(From kernel/sched.c; figure)

O(1) Algorithm (Constant-Time Algorithm)
- The highest priority level with at least one task in it is selected; this takes a fixed time (say t1)
- The first task (head of the doubly linked list) at this priority level is allowed to run; this takes a fixed time (say t2)
- The time taken to select a new process is therefore fixed, i.e. constant irrespective of the number of tasks: a deterministic algorithm

Linux Data Structures (figure)

2.6 vs. 2.4
Kernel 2.4 had:
- A global runqueue: all CPUs had to wait for other CPUs to finish execution
- An O(n) scheduler: in 2.4, the scheduler went through the entire global runqueue to determine the next task to run. This was an O(n) algorithm, where 'n' is the number of processes; the time taken was proportional to the number of active processes in the system. This led to large performance hits during heavy workloads.

Scheduling Policies in 2.6
- 140 priority levels
  - 0-99: RT priorities (MAX_RT_PRIO = 100)
  - 100-139: user task priorities (MAX_PRIO = 140)
- Three different scheduling policies: one for normal tasks, two for real-time tasks
- Normal tasks
  - Each task is assigned a nice value: PRIO = MAX_RT_PRIO + NICE + 20
  - Each task is assigned a timeslice
  - Tasks at the same priority are round-robined, ensuring priority plus fairness

Policies (contd.)
RT tasks (static priority):
- FIFO RT tasks run until they relinquish the CPU voluntarily; priority levels are maintained; they are not time-sliced
- RR RT tasks are assigned a timeslice and run until it is exhausted; once all RR tasks of a given priority level exhaust their timeslices, the timeslices are refilled and they continue running; priority levels are maintained
- The above can be unfair towards lower-priority tasks, so sane application design is expected

Interactivity Estimator
- Dynamically scales a task's priority based on its interactivity
- Interactive tasks receive a priority bonus [-5], and hence a larger timeslice; CPU-bound tasks receive a priority penalty [+5]
- Interactivity is estimated using a running sleep average: interactive tasks are I/O bound and wait for events to occur, so sleeping tasks are I/O bound or interactive
- The actual bonus/penalty is determined by comparing the sleep average against a constant maximum sleep average
- Does not apply to RT tasks

Recalculation of Priorities
When a task finishes its timeslice:
- Its interactivity is estimated; interactive tasks can be inserted into the 'Active' array again
- Otherwise, its priority is recalculated and it is inserted at the new priority level in the 'Expired' array

Fine-Grained Timeslice Distribution
- Priority is recalculated only after a timeslice expires; interactive tasks may become non-interactive during their large timeslices, starving other processes
- To prevent this, timeslices are divided into chunks of 20 ms: a task of equal priority may preempt the running task every 20 ms
- The preempted task is requeued and round-robined within its priority level
- Priority recalculation also happens every 20 ms

For Programmers
From /usr/src/linux-2.6.x/kernel/sched.c:
- void schedule(): the main scheduling function; upon return, the highest-priority process will be active
- struct runqueue: the main per-CPU runqueue data structure
- struct task_struct: the main per-process data structure

For Programmers (contd.)
Process control methods:
- void set_user_nice(...): sets the nice value of task p to the given value
- int setscheduler(...): sets the scheduling policy and parameters for a given pid
- rt_task(p): returns true if the task is real-time, false if not
- yield(): places the current process at the end of the runqueue and calls schedule()

Handling SMP (Multiple CPUs)
- One runqueue per CPU: each CPU handles its own processes and does not have to wait until tasks on other CPUs finish their timeslices
- A 'migration' thread runs for every CPU
- void load_balance(): attempts to pull tasks from one CPU to another to balance CPU usage if needed; called explicitly if the runqueues are imbalanced, and periodically by the timer tick
- Processes can be made affine to a particular CPU