

Scheduling
Don Porter, CSE 506

Housekeeping

- Paper reading assigned for next Thursday
- Lab 2 due next Friday

Lecture goals

- Undergrad review
- Understand the low-level building blocks of a scheduler
- Understand competing policy goals
- Understand the O(1) scheduler (CFS next lecture)
- Familiarity with standard Unix scheduling APIs

What is cooperative multitasking? Processes voluntarily yield the CPU when they are done.

What is preemptive multitasking? The OS only lets tasks run for a limited time, then forcibly context-switches the CPU.

Pros/cons?

- Cooperative gives applications more control -- so much that one task can hog the CPU forever
- Preemptive gives the OS more control, at the cost of more overhead and complexity

Where can we preempt a process? In other words, what are the logical points at which the OS can regain control of the CPU?

- System calls: before, during (more next time on this), or after
- Interrupts: the timer interrupt ensures a maximum time slice

(Linux) Terminology

- mm_struct represents an address space in the kernel
- task represents a thread in the kernel
- A task points to 0 or 1 mm_structs; kernel threads just borrow the previous task's mm, as they only execute in kernel address space
- Many tasks can point to the same mm_struct (multi-threading)
- Quantum: CPU timeslice

Outline

- Policy goals
- Low-level mechanisms
- O(1) scheduler
- CPU topologies
- Scheduling interfaces

Policy goals

- Fairness: everything gets a fair share of the CPU
- Real-time deadlines: CPU time before a deadline is more valuable than time after
- Latency vs. throughput: timeslice length matters! GUI programs should feel responsive; CPU-bound jobs want long timeslices for better throughput
- User priorities: virus scanning is nice, but I don't want it slowing things down
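The task/mm relationship above can be sketched in C. This is a heavily simplified illustration, not the real kernel layout; the struct and field names here are stand-ins for the actual `task_struct`/`mm_struct` definitions.

```c
#include <stddef.h>

/* Illustrative sketch: an mm_struct represents an address space;
 * a task represents a thread and points to 0 or 1 mm_structs.
 * Field names are simplified, not the real kernel's. */
struct mm_struct {
    unsigned long pgd;   /* page table root (loaded into cr3 on x86) */
    int users;           /* number of tasks sharing this address space */
};

struct task {
    struct mm_struct *mm;         /* NULL for pure kernel threads */
    struct mm_struct *active_mm;  /* mm actually in use (possibly borrowed) */
    int time_slice;               /* remaining quantum, in ticks */
};

/* A kernel thread has no mm of its own; it borrows the previous
 * task's, since it only touches kernel addresses anyway. */
void enter_kernel_thread(struct task *kthread, struct task *prev)
{
    kthread->mm = NULL;
    kthread->active_mm = prev->active_mm;
    kthread->active_mm->users++;
}
```

Two tasks of the same multi-threaded program would simply point `mm` at the same `mm_struct`.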

No perfect solution

- We are optimizing multiple variables; like memory allocation, this is best-effort
- Some workloads prefer some scheduling strategies
- Nonetheless, some solutions are generally better than others

Context switching: what is it?

- Swap out the address space and the running thread

Address space:

- Need to change page tables: update the cr3 register on x86
- Simplified by the convention that the kernel is at the same address range in all processes
- What would be hard about mapping the kernel in different places?

Other context switching tasks

- Swap out other register state: segments, debugging registers, MMX, etc.
- If descheduling a process for the last time, reclaim its memory
- Switch thread stacks

Switching threads -- the programming abstraction:

    /* Do some work */
    schedule();  /* Something else runs */
    /* Do more work */

How to switch stacks?

- Store register state on the stack in a well-defined format
- Carefully update the stack registers to the new stack
- Tricky: can't use stack-based storage for this step!

(Slide diagram: ebp/esp pointing into the saved register areas of Thread 1 (prev) and Thread 2 (next) on their respective stacks.)

Example:

    /* eax is next->thread_info.esp */
    /* push general-purpose regs */
    push ebp
    mov esp, eax
    pop ebp
    /* pop other regs */

Weird code to write. Inside schedule(), you end up with code like:

    switch_to(me, next, &last);
    /* possibly clean up last */

Where does last come from?

- It is an output of switch_to, written on my stack by the previous thread (not me)!
- Pick a register (say ebx); before the context switch, this is a pointer to last's location on the stack
- Pick a second register (say eax) to store the pointer to the currently running task (me)
- Make sure to push ebx after eax
- After switching stacks:

    pop ebx   /* eax still points to old task */
    mov (ebx), eax   /* store eax at the location ebx points to */
    pop eax   /* update eax to new task */
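A user-space analogue of this stack switching can be written with POSIX ucontext, which hides the assembly: swapcontext() saves the caller's registers and stack pointer and resumes another context, much as switch_to saves prev's state and loads next's. This is only an analogy to the kernel mechanism, not its implementation.

```c
#include <ucontext.h>

/* Two contexts: "main" and a second thread of control with its own
 * stack. Each swapcontext() call is a voluntary "schedule()". */
ucontext_t main_ctx, thread_ctx;
char thread_stack[64 * 1024];
int step;

void thread_body(void)
{
    step = 1;                            /* running on thread_stack */
    swapcontext(&thread_ctx, &main_ctx); /* yield back to main */
    step = 3;                            /* resumed later, same stack, same spot */
}

void run_demo(void)
{
    getcontext(&thread_ctx);
    thread_ctx.uc_stack.ss_sp = thread_stack;
    thread_ctx.uc_stack.ss_size = sizeof(thread_stack);
    thread_ctx.uc_link = &main_ctx;      /* where to go when body returns */
    makecontext(&thread_ctx, thread_body, 0);

    swapcontext(&main_ctx, &thread_ctx); /* switch to the thread's stack */
    step = 2;                            /* back on main's stack */
    swapcontext(&main_ctx, &thread_ctx); /* resume thread where it left off */
}
```

The key point mirrors the slide: each context's register state (including the stack pointer) is saved where the *other* side of the switch will later find it.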

Outline

- Policy goals
- Low-level mechanisms
- O(1) scheduler
- CPU topologies
- Scheduling interfaces

Strawman scheduler

- Organize all processes as a simple list
- In schedule(): pick the first one on the list to run next; put the suspended task at the end of the list
- Problem? Only allows round-robin scheduling; can't prioritize tasks

Even straw-ier man

- Naive approach to priorities: scan the entire list on each run, or periodically reshuffle the list
- Problems: forking -- where does the child go? What if you only use part of your quantum (e.g., blocking on I/O)?

O(1) scheduler

- Goal: decide who to run next in constant time, independent of the number of processes in the system
- Still maintain the ability to prioritize tasks, handle partially unused quanta, etc.
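The strawman can be sketched in a few lines of C; the names here are illustrative. Note the rotation step is O(n) in the number of tasks, and nothing distinguishes a high-priority task from a low-priority one -- exactly the two problems the slide points out.

```c
#include <stddef.h>

/* Strawman: one singly-linked list of all runnable tasks. */
struct rr_task {
    int pid;
    struct rr_task *next;
};

struct rr_task *runlist;   /* head of the single runnable list */

/* Pick the first task to run next and move it to the end of the
 * list -- pure round-robin, no priorities possible. */
struct rr_task *strawman_schedule(void)
{
    struct rr_task *t = runlist, **tail;
    if (!t || !t->next)
        return t;          /* zero or one task: nothing to rotate */
    runlist = t->next;
    for (tail = &runlist; *tail; tail = &(*tail)->next)
        ;                  /* walk to the end -- O(n)! */
    *tail = t;
    t->next = NULL;
    return t;
}
```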

O(1) bookkeeping

- runqueue: a list of runnable processes; blocked processes are not on any runqueue
- A runqueue belongs to a specific CPU; each task is on exactly one runqueue, and a task is only scheduled on its runqueue's CPU unless migrated
- 2 * 40 * #CPUs runqueues: 40 dynamic priority levels (more later), and 2 sets of runqueues per CPU -- one active and one expired

O(1) intuition

- Take the first task off the lowest-numbered runqueue in the active set (confusingly, a lower priority value means higher priority)
- When a task is done, put it on the appropriate runqueue in the expired set
- Once the active set is completely empty, swap which set of runqueues is active and expired
- Constant time, since there is a fixed number of queues to check, and we only take the first item from a non-empty queue

How is this better than a sorted list? Remember the partial quantum use problem?

- A process uses half of its timeslice and then blocks on disk
- Once the disk I/O is done, where do we put the task? Simple: it goes in the active runqueue at its priority
- Higher-priority tasks go to the front of the line once they become runnable

Time slice tracking

- If a process blocks and then becomes runnable, how do we know how much time it had left?
- Each task tracks the ticks left in its time_slice field
- On each clock tick: current->time_slice--
- If the time slice goes to zero: move the task to the expired queue, refill its time slice, and schedule someone else
- An unblocked task can use the balance of its time slice
- Forking halves the time slice with the child
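The constant-time pick can be sketched with a bitmap of non-empty queues, so selection is a "find first set bit" regardless of how many tasks exist. This is a simplified model (queue contents are elided; we track counts only), not the kernel's actual data structures.

```c
#include <string.h>

#define NPRIO 40   /* 40 dynamic priority levels, as in the slides */

struct prio_array {
    unsigned long long bitmap;   /* bit p set => queue p is non-empty */
    int count[NPRIO];            /* tasks queued at each priority */
};

struct rq {
    struct prio_array arrays[2]; /* one active set, one expired set */
    struct prio_array *active, *expired;
};

void rq_init(struct rq *rq)
{
    memset(rq, 0, sizeof(*rq));
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

void enqueue(struct prio_array *a, int prio)
{
    a->count[prio]++;
    a->bitmap |= 1ULL << prio;
}

/* Dequeue from the lowest-numbered non-empty active queue, swapping
 * active and expired when active runs dry. Returns the priority
 * picked, or -1 if the CPU is idle. */
int pick_next(struct rq *rq)
{
    if (!rq->active->bitmap) {             /* active empty: swap sets */
        struct prio_array *tmp = rq->active;
        rq->active = rq->expired;
        rq->expired = tmp;
    }
    if (!rq->active->bitmap)
        return -1;                         /* truly idle */
    int prio = __builtin_ctzll(rq->active->bitmap);  /* find first set bit */
    if (--rq->active->count[prio] == 0)
        rq->active->bitmap &= ~(1ULL << prio);
    return prio;
}
```

Every step is a fixed number of word operations -- no scan over tasks -- which is the whole point of the O(1) design.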

More on priorities

- 100 = highest priority; 139 = lowest priority; 120 = base priority
- nice value: a user-specified adjustment to the base priority
- Selfish (not nice) = -20 ("I want to go first"); really nice = +19 ("I will go last")

Base time slice:

    time = (140 - prio) * 20 ms   if prio < 120
           (140 - prio) * 5 ms    if prio >= 120

- Higher-priority tasks get longer time slices, and run first

Goal: responsive UIs

- Most GUI programs are I/O bound on the user, so they are unlikely to use an entire time slice
- Users get annoyed when they type a key and it takes a long time to appear
- Idea: give UI programs a priority boost -- go to the front of the line, run briefly, block on I/O again

Which ones are the UI programs? Idea: infer from sleep time.

- By definition, I/O-bound applications spend most of their time waiting on I/O
- We can monitor I/O wait time and infer which programs are GUI (and disk-intensive), and give these applications a priority boost
- Note that this behavior can be dynamic. Ex: a GUI configures DVD ripping, then the program is CPU-bound; scheduling should match program phases
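The base time slice formula above can be checked with a small helper; for instance, a default task (priority 120, nice 0) gets (140 - 120) * 5 = 100 ms.

```c
/* Base time slice, in milliseconds, for a priority in [100, 139],
 * following the formula on the slide: higher-priority
 * (lower-numbered) tasks get longer slices. */
int base_timeslice_ms(int prio)
{
    if (prio < 120)
        return (140 - prio) * 20;
    return (140 - prio) * 5;
}
```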

Dynamic priority:

    dynamic priority = max(100, min(static priority - bonus + 5, 139))

- The bonus is calculated based on sleep time
- Dynamic priority determines a task's runqueue
- This is a heuristic to balance the competing goals of CPU throughput and latency in dealing with infrequent I/O; it may not be optimal

Rebalancing tasks

- As described, once a task ends up in one CPU's runqueue, it stays on that CPU forever
- What if all the processes on CPU 0 exit, and all of the processes on CPU 1 fork more children? We need to periodically rebalance
- Balance overheads against benefits: figuring out where to move tasks isn't free

Idea: idle CPUs rebalance

- If a CPU is out of runnable tasks, it should take load from busy CPUs
- Busy CPUs shouldn't lose time finding idle CPUs to take their work, if possible
- There may not be any idle CPUs; rather than pay the overhead of figuring out whether other idle CPUs exist, just have busy CPUs rebalance much less frequently

Average load

- How do we measure how busy a CPU is? Average number of runnable tasks over time
- Available in /proc/loadavg
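The clamp in the dynamic priority formula is easy to get wrong, so here it is as code. The sleep-time-to-bonus heuristic itself is elided; treating the bonus as a plain input parameter is a simplification for illustration.

```c
/* dynamic priority = max(100, min(static priority - bonus + 5, 139)),
 * per the slide. The bonus comes from a sleep-time heuristic not
 * modeled here; a larger bonus means a more interactive task. */
int dynamic_prio(int static_prio, int bonus)
{
    int p = static_prio - bonus + 5;
    if (p < 100) p = 100;
    if (p > 139) p = 139;
    return p;
}
```

Note the clamp matters at both ends: a very interactive high-priority task cannot climb above 100, and a CPU hog cannot sink below 139.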

Rebalancing strategy

- Read the loadavg of each CPU and find the one with the highest loadavg
- (Hand waving) Figure out how many tasks we could take
- If it's worth it, lock that CPU's runqueues and take them; if not, try again later

Locking note

- If CPU A locks CPU B's runqueue to take some work, CPU B must also lock its own runqueues in the common case that no one is rebalancing
- Cf. Hoard and per-CPU heaps
- Idiosyncrasy: runqueue locks are acquired by one task and released by another; usually this would indicate a bug!

Why not rebalance?

- Intuition: things may run slower on another CPU. Why might this happen?
- NUMA (Non-Uniform Memory Access), hyper-threading, multi-core cache behavior

Vs: Symmetric Multi-Processor (SMP)

- Performance on all CPUs is basically the same: all CPUs are similar and equally close to memory

(Slide diagram: four CPUs connected to a single shared Memory.)
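The "is it worth it?" decision can be sketched as below. The threshold and names are illustrative, standing in for the kernel's much more elaborate imbalance calculation; the point is only that an idle CPU compares loadavg values and declines to migrate when the imbalance is small.

```c
#define NR_CPUS 4

double loadavg[NR_CPUS];   /* per-CPU load averages, as in /proc/loadavg */

/* An idle (or lightly loaded) CPU scans the others for the busiest
 * one. Returns the CPU to steal from, or -1 if no other CPU exceeds
 * our own load by more than `threshold` -- in which case migration
 * isn't worth the locking and cache costs. */
int find_busiest_cpu(int self, double threshold)
{
    int busiest = -1;
    double max = loadavg[self] + threshold;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu != self && loadavg[cpu] > max) {
            max = loadavg[cpu];
            busiest = cpu;
        }
    }
    return busiest;
}
```

Only after this check succeeds would the real scheduler take both runqueue locks and pull tasks over.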

NUMA

- (Slide diagram: two nodes, each with two CPUs and their own Memory)
- Want to keep execution near its memory; migration costs are higher

Hyper-threading

- Precursor to multi-core: a few more transistors than Intel knew what to do with, but not enough to build a second core on a chip yet
- Duplicate architectural state (registers, etc.), but not execution resources (ALU, floating point, etc.)
- OS view: 2 logical CPUs
- CPU view: a pipeline bubble in one logical CPU can be filled with operations from another, yielding higher utilization

Hyper-threaded scheduling

- Imagine 2 hyper-threaded CPUs: 4 logical CPUs, but only 2 CPUs' worth of power
- Suppose I have 2 tasks: they will do much better on 2 different physical CPUs than sharing one physical CPU; they would also contend for space in the cache

Multi-core

- More levels of caches
- Migration among CPUs sharing a cache is preferable. Why? More likely to keep data in cache
- Less of a problem for threads in the same program. Why?

Scheduling domains

- General abstraction for CPU topology: a tree of CPUs
- Each leaf node contains a group of "close" CPUs
- When an idle CPU rebalances, it starts at a leaf node and works up to the root
- Most rebalancing happens within the leaf; there is a higher threshold to rebalance across a parent

Outline

- Policy goals
- Low-level mechanisms
- O(1) scheduler
- CPU topologies
- Scheduling interfaces

Setting priorities

- setpriority(which, who, niceval) and getpriority()
  - which: process, process group, or user ID (PID, PGID, or UID)
  - niceval: -20 to +19 (recall earlier)
- nice(niceval): historical interface (backwards compatible); equivalent to setpriority(PRIO_PROCESS, getpid(), niceval)

Scheduler affinity

- sched_setaffinity and sched_getaffinity
- Can specify a bitmap of CPUs on which this task can be scheduled -- better not be 0!
- Useful for benchmarking: ensure each thread runs on a dedicated CPU
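These interfaces can be exercised together in a few lines. The sketch below assumes an unprivileged Linux process starting at nice 0 (unprivileged processes may only raise their nice value) and assumes CPU 0 is in the process's allowed set; `demo` is an illustrative name, not a standard function.

```c
#define _GNU_SOURCE          /* for sched_setaffinity / CPU_SET on glibc */
#include <sched.h>
#include <sys/resource.h>
#include <unistd.h>

/* Make ourselves nicer (equivalent to nice(5) from nice 0), then pin
 * ourselves to CPU 0. Returns the resulting nice value, or -1 on
 * error. sched_setaffinity is Linux-specific. */
int demo(void)
{
    if (setpriority(PRIO_PROCESS, getpid(), 5) != 0)
        return -1;

    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);       /* the mask must not be empty! */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        return -1;

    return getpriority(PRIO_PROCESS, getpid());
}
```

Pinning each benchmark thread to its own CPU this way removes migration noise from measurements, per the slide's benchmarking tip.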

yield

- Moves a runnable task to the expired runqueue
- Unless it is real-time (more later) -- then it just moves to the end of the active runqueue
- Several other real-time-related APIs exist

Summary

- Understand competing scheduling goals
- Understand how context switching is implemented
- Understand the O(1) scheduler + rebalancing
- Understand various CPU topologies and scheduling domains
- Scheduling system calls