CPU Inheritance Scheduling
Bryan Ford, Sai Susarla
Department of Computer Science, Computer Systems Laboratory
University of Utah
October 30, 1996
flux@cs.utah.edu
http://www.cs.utah.edu/projects/flux/

Key Concepts
- Threads schedule each other by donating the CPU using a directed yield primitive.
- One root scheduler thread per processor sources all CPU time.
- Kernel dispatcher manages threads, events, and CPU donation without making any scheduling policy decisions.

The Dispatcher
- Implements thread sleep, wakeup, schedule, etc.
- Runs in the context of the currently running thread.
- Has no notion of thread priority, CPU usage, clocks, or timers.
- Dispatcher wakes a scheduler thread when:
  - An event of interest to the scheduler occurs.
  - The scheduler's client blocks.

Scheduling Example
[Figure: a scheduler thread on the CPU with its ready queues and a port for scheduling requests, donating the CPU to application threads in App 1 and App 2. Legend: running thread, ready threads, waiting thread, CPU donation.]

The schedule() operation
schedule(thread, port, sensitivity)
Sensitivity levels:
- ON_BLOCK: Wake the scheduler any time its client thread blocks.
- ON_SWITCH: Wake the scheduler only when a different client is requesting the CPU.
- ON_CONFLICT: Wake the scheduler only when two or more clients are runnable at the same time.

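The three sensitivity levels can be read as a wakeup predicate evaluated by the dispatcher. The following is a minimal sketch of that literal reading, not the prototype's code: the enum spelling, the function name `should_wake_scheduler`, and its boolean and count parameters are all assumptions made for illustration.

```python
from enum import Enum


class Sensitivity(Enum):
    """Sensitivity levels named on the slide (spelling assumed)."""
    ON_BLOCK = 1
    ON_SWITCH = 2
    ON_CONFLICT = 3


def should_wake_scheduler(sensitivity, *, client_blocked,
                          different_client_requesting, runnable_clients):
    """Hypothetical predicate: should the dispatcher wake the scheduler?

    client_blocked: the scheduler's current client thread has blocked.
    different_client_requesting: a client other than the current one
        is asking for the CPU.
    runnable_clients: number of the scheduler's clients that are runnable.
    """
    if sensitivity is Sensitivity.ON_BLOCK:
        # Wake any time the client thread blocks.
        return client_blocked
    if sensitivity is Sensitivity.ON_SWITCH:
        # Wake only when a different client is requesting the CPU.
        return different_client_requesting
    if sensitivity is Sensitivity.ON_CONFLICT:
        # Wake only when two or more clients are runnable at once.
        return runnable_clients >= 2
    raise ValueError(f"unknown sensitivity: {sensitivity!r}")


# A single runnable client that blocks wakes an ON_BLOCK scheduler,
# but leaves an ON_CONFLICT scheduler asleep.
blocked_only = dict(client_blocked=True,
                    different_client_requesting=False,
                    runnable_clients=0)
wake_on_block = should_wake_scheduler(Sensitivity.ON_BLOCK, **blocked_only)
wake_on_conflict = should_wake_scheduler(Sensitivity.ON_CONFLICT, **blocked_only)
```

Under this reading, a less sensitive level lets the dispatcher handle more events without a context switch into the scheduler thread.
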
Implicit Donation
- Works like schedule(), except done implicitly; e.g.:
  - Thread attempting to lock a held mutex donates to current owner.
  - Client thread donates to server thread for the duration of an RPC.
- Analogous to priority inheritance in traditional systems.
[Figure labels: CPU, S0, T0 (high-priority), T1 (low-priority).]

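The mutex case can be sketched as a chain of donations that the dispatcher follows until it reaches a thread that can actually run. This is an illustrative model only: the `Thread` and `Mutex` classes, the `donating_to` field, and `resolve_donation` are hypothetical names, and treating S0 as the scheduler that picks T0 is an assumption about the figure.

```python
class Thread:
    def __init__(self, name):
        self.name = name
        # Thread currently receiving this thread's CPU time, if any.
        self.donating_to = None


def resolve_donation(thread):
    """Follow the donation chain to the thread that ends up running."""
    seen = set()
    while thread.donating_to is not None:
        if thread.name in seen:
            raise RuntimeError("donation cycle (deadlock) at " + thread.name)
        seen.add(thread.name)
        thread = thread.donating_to
    return thread


class Mutex:
    def __init__(self):
        self.owner = None
        self.waiters = []

    def lock(self, thread):
        """Return True if acquired; otherwise donate to the current owner."""
        if self.owner is None:
            self.owner = thread
            return True
        thread.donating_to = self.owner  # implicit donation
        self.waiters.append(thread)
        return False

    def unlock(self):
        """Hand the mutex to the first waiter; the rest donate to it."""
        self.owner = self.waiters.pop(0) if self.waiters else None
        if self.owner is not None:
            self.owner.donating_to = None
        for waiter in self.waiters:
            waiter.donating_to = self.owner


s0, t0, t1 = Thread("S0"), Thread("T0"), Thread("T1")
mutex = Mutex()
mutex.lock(t1)        # low-priority T1 takes the mutex
mutex.lock(t0)        # high-priority T0 blocks and donates to T1
s0.donating_to = t0   # scheduler S0 chooses T0
running = resolve_donation(s0)   # CPU time flows S0 -> T0 -> T1
```

Because S0's choice of T0 is forwarded to T1, the low-priority owner runs with the high-priority thread's CPU time until it releases the mutex, which is the effect priority inheritance aims for.
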
Multiprocessor Scheduling
[Figure: scheduler threads with ready queues on CPU 0 and CPU 1, scheduling the threads of App 1 and App 2.]

Benefits
- Hierarchical, stackable scheduling policies
- Application-specific scheduling policies
- Automatic priority inheritance
- Modular CPU usage control
- Accurate CPU usage accounting
- Naturally extends to multiprocessors
- Supports processor affinity policies and scheduler activations

Prototype Implementation
- Implemented as a fancy threads package in a BSD process.
- Schedulers implemented:
  - Fixed-priority round-robin and FIFO
  - Rate monotonic
  - Lottery

Scheduling Hierarchy
[Figure: tree of schedulers and threads. Nodes as labeled:]
- Root Scheduler: fixed-priority
- Real-time Scheduler: rate-monotonic; threads RM1, RM2 (real-time periodic threads)
- Timesharing Class: lottery scheduling; thread LS1
- Web browser: lottery scheduling; threads JAVA1, JAVA2 (Java applet threads)
- Background: round-robin; threads RR1, RR2
- FIFO Scheduler: non-preemptive; threads FIFO1, FIFO2 (cooperating threads)

Results
Three measures:
- Scheduling behavior (correctness)
- Overhead
- Implementation complexity

Multi-policy Scheduling Behavior
[Plot: accumulated CPU usage (sec) versus time (clock ticks, 1 to 100). Series: rate-monotonic thread 1, RM1 (50%); rate-monotonic thread 2, RM2 (25%); lottery thread, LS1 (interactive, bursty); round-robin threads 1 and 2, RR1 and RR2 (insatiable, compute).]

Modular Control of CPU Usage
[Plot: relative CPU time allocation (percent) versus time (clock ticks, 200 to 9800). Series: applet threads 1 and 2, FIFO threads 1 and 2, round-robin threads 1 and 2.]

Real-time Scheduling Behavior
[Histogram: number of occurrences versus mutex lock latency for real-time thread (clock ticks, 0 to 15). Series: CPU donation on mutex contention; no CPU donation.]

Performance
- Dispatcher overhead
  - Sensitivity to hierarchy depth
  - Base cost
- Context switching overhead
  - Number of additional context switches
  - Cost of context switches

Dispatcher Micro-benchmarks

Scheduling Hierarchy Depth    Dispatch Time (µs)
Root scheduler only                  8.0
2-level scheduling                  11.2
3-level scheduling                  14.0
4-level scheduling                  16.2
8-level scheduling                  24.4

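To make the sensitivity to hierarchy depth concrete, the incremental cost per added scheduling level can be computed from the measurements. This is a small worked sketch, assuming the dispatch times pair with the depths in increasing order (8.0 for the root scheduler alone up to 24.4 for eight levels) and are in microseconds; the names `DISPATCH_US` and `per_level_cost` are made up for this example.

```python
# Dispatch time in microseconds keyed by hierarchy depth
# (1 = root scheduler only). The pairing of values to depths is an
# assumption: times are matched to rows in increasing order.
DISPATCH_US = {1: 8.0, 2: 11.2, 3: 14.0, 4: 16.2, 8: 24.4}


def per_level_cost(times):
    """Return [((lo, hi), extra microseconds per level between lo and hi)]."""
    depths = sorted(times)
    return [((lo, hi), (times[hi] - times[lo]) / (hi - lo))
            for lo, hi in zip(depths, depths[1:])]


steps = per_level_cost(DISPATCH_US)
overall = (DISPATCH_US[8] - DISPATCH_US[1]) / (8 - 1)

for (lo, hi), cost in steps:
    print(f"depth {lo} -> {hi}: {cost:.2f} us per added level")
print(f"overall: {overall:.2f} us per added level")
```

Under that pairing, each extra level adds between about 2 and 3.2 microseconds, on top of a base cost of 8.0 microseconds.
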
Context switch overhead
- In prototype, measure what proportion of context switches are to scheduler threads (i.e., extra).
- On a real OS, measure rate of context switches in various workloads.
- Project slowdown in two OSs, based on expected rate and speed of context switches.

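Context Switches for Simple Tests
[Table. Columns: Client/Server, Parallel Database, Real-time, General. Rows: user invocations per thread (RM1, RM2, RM3, LS1, JAVA1, FIFO1, RR1, RR2, RR3, RR4) and scheduler invocations per scheduler (Root scheduler, Rate monotonic, Lottery scheduler, Applet scheduler, FIFO scheduler, Round-robin scheduler). Only the summary rows can be realigned with their columns:]

                          Client/Server  Parallel Database  Real-time  General
User invocations                    492                957       1193      165
Scheduler invocations               346                956       1303      218
Total context switches              838               1913       2496      383
Scheduler %                         41%                50%        52%      56%

[Remaining cell values in extraction order, column alignment lost. Thread rows: 57 19 25 322 622 101 46 9 26 114 3 238 242 249 17 234 243 147. Scheduler rows: 262 43 30 2 956 1237 1 142 18 8 3 8.]
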
Statistics for Common Applications
[Table. Columns: gzip, gcc, tar, configure. Rows: Runtime (sec), Context switches/sec, Traps/sec, System calls/sec, Device interrupts/sec. Values in extraction order, column alignment lost: 26.4 35.3 11 32 4275093337 10562 23651 517 9.6 81 22 3470 1807 1055 26.0 202.]

[Plot: overall slowdown (percent, 0 to 10) versus additional overhead per context switch (microsec, 1 to 1000). Series: Microkernel: configure (13000 csw/s); Microkernel: gcc (3500 csw/s); Microkernel: gzip (930 csw/s); FreeBSD: configure (202 csw/s); FreeBSD: gcc (32 csw/s); FreeBSD: gzip (11 csw/s).]

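A projection of this kind can be approximated with simple arithmetic: the fraction of each second lost is the context switch rate times the extra cost per switch. The sketch below assumes that linear model, which may differ from the exact method behind the plot; the function name `projected_slowdown_percent` is hypothetical, and the rates are the ones given in the plot legend.

```python
# Context switch rates (switches per second) taken from the plot legend.
CSW_PER_SEC = {
    ("Microkernel", "configure"): 13000,
    ("Microkernel", "gcc"): 3500,
    ("Microkernel", "gzip"): 930,
    ("FreeBSD", "configure"): 202,
    ("FreeBSD", "gcc"): 32,
    ("FreeBSD", "gzip"): 11,
}


def projected_slowdown_percent(csw_per_sec, extra_us_per_csw):
    """Assumed linear model: percent of CPU time spent on extra overhead.

    rate [1/s] * overhead [us] * 1e-6 [s/us] * 100 [%] = rate * overhead / 10000
    """
    return csw_per_sec * extra_us_per_csw / 10_000.0


# Projected slowdown for each workload at 10 microseconds of added
# overhead per context switch.
at_10_us = {key: projected_slowdown_percent(rate, 10.0)
            for key, rate in CSW_PER_SEC.items()}

for (system, workload), percent in at_10_us.items():
    print(f"{system}: {workload}: {percent:.3f}% slowdown")
```

Under this model the slowdown scales directly with the context switch rate, so the same per-switch overhead costs far more on the microkernel workloads than on the FreeBSD ones.
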
Code Complexity
- Dispatcher: 550 raw lines of code, 160 semicolons
- Example schedulers: each is 100-200 semicolons

Related Work
Existing multi-policy systems:
- Multi-class systems: Mach, NT
- Aegis Exokernel

Related Work
Existing hierarchical scheduling policies:
- KeyKOS meters
- Lottery/stride scheduling
- Start-time Fair Queuing (SFQ)
CPU inheritance scheduling is not a policy.

Status
- Works, but needs to be tried in a real OS.
- Fluke kernel implementation in progress.
- Source for prototype will be available from the OSDI and Flux project web pages: http://www.cs.utah.edu/projects/flux/

Conclusion
CPU inheritance scheduling:
- Provides flexible CPU scheduling, and supports many existing policies and mechanisms
- Is efficient enough for common uses
- Is straightforward to implement (in user mode)
- Supports the Fluke nested process model