Multimedia Multiprocessor Systems: Analysis, Design and Management. Akash Kumar
2 Modern Multimedia Embedded Systems

3 Trends in Multimedia Systems
- Increasing number of features, i.e. applications
- Simultaneously active applications
- Power becoming increasingly important
- Short time-to-market: new devices released every few months
- Multiple standards to be supported
- Multiprocessors being used increasingly

4 Challenges in Multimedia System Design
- Ensuring all applications can meet their performance requirements
- Handling the huge number of use-cases, i.e. combinations of applications
  - Each possible set of applications leads to a new use-case
  - For 10 applications there are over a thousand use-cases
- Limiting the design time
  - Late launch of products directly hurts profits
  - Increased design time implies higher design costs
- Dealing with dynamism in the applications
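The "over a thousand" figure follows if a use-case is taken to be any non-empty subset of the applications, which gives 2^10 - 1 = 1023 for ten applications. A minimal sketch of that count; the function names are illustrative and not from the slides:

```python
from itertools import combinations


def count_use_cases(num_apps: int) -> int:
    """Number of non-empty subsets of num_apps applications."""
    return 2 ** num_apps - 1


def enumerate_use_cases(apps):
    """Yield every non-empty combination of the given applications."""
    for size in range(1, len(apps) + 1):
        yield from combinations(apps, size)


if __name__ == "__main__":
    print(count_use_cases(10))                   # 1023
    print(len(list(enumerate_use_cases("ABC")))) # 7
```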
5 Contributions
- Analysis
  - Accurately predict performance of multiple applications executing concurrently
  - Basic and iterative probabilistic techniques
- Design
  - Synthesizing MPSoC for multiple applications
  - Synthesizing MPSoC for multiple use-cases
- Management
  - Resource manager for MPSoC systems
  - Admission control and budget enforcement

6 Assumptions
- Heterogeneous MPSoCs are increasingly used
  - Different levels of parallelism in applications
  - Microprocessors (µProc) are better for control flow
  - DSPs are better for signal processing
  - Dedicated hardware blocks are needed for certain parts: improves efficiency and saves power
- Applications modeled as SDF
- First-come-first-serve arbiter at cores
- Non-preemptive system: tasks cannot be stopped

7 Non-Preemptive Systems
- State space needed is smaller
- Lower implementation cost
- Less overhead at run-time: cache pollution, memory size

8 Design Flow
[Figure: design flow. Labels: application specifications (A, B, C) and hardware specification; performance analysis (Chapter 3) with throughput analysis results; system design and synthesis (Chapters 5 and 6) for use-cases 1 to 3; admission control and budget enforcement (Chapter 4) on a platform with arbiters and a resource manager (RM).]

9 Outline
- Introduction
  - Multimedia multiprocessor systems
  - Introduction to SDF
- Analysis
  - Basic probabilistic performance prediction
  - Iterative probabilistic performance prediction
- Design
  - Synthesizing MPSoC for multiple applications
  - Synthesizing MPSoC for multiple use-cases
- Management
  - Resource management for MPSoC systems

10 Synchronous Dataflow Graphs
- First proposed in 1987 by Edward Lee
- SDF graphs (SDFGs, Synchronous Data Flow Graphs) are used extensively
  - DSP applications
  - Multimedia applications
- Similar to task graphs with dependencies
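11 Synchronous Dataflow Graphs
[Figure: example SDF graph labelled A 2 α 3 B 1 β 2 C, annotated with the terms actor, rate, token, channel and execution time; shown before and after firing A.]

12 Synchronous Dataflow Graphs
[Figure: the same graph, A 2 α 3 B 1 β 2 C, shown before and after firing B.]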
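The firing rule in these two figures can be sketched in a few lines. The sketch assumes the figure reads as A producing 2 tokens on channel α, B consuming 3 from α and producing 1 on β, and C consuming 2 from β; that reading, the empty initial channels and all names below are assumptions, not taken from the slides. It also computes the repetition vector, i.e. the smallest firing counts that return every channel to its initial fill.

```python
from fractions import Fraction
from math import lcm

# Channels as (producer, production_rate, consumer, consumption_rate).
# Assumed reading of the figure: A -2-> alpha -3-> B -1-> beta -2-> C.
CHANNELS = [("A", 2, "B", 3), ("B", 1, "C", 2)]


def repetition_vector(channels):
    """Smallest positive firing counts that balance every channel."""
    ratio = {channels[0][0]: Fraction(1)}
    pending = list(channels)
    while pending:
        progressed = False
        for ch in list(pending):
            src, prod, dst, cons = ch
            if src in ratio and dst not in ratio:
                ratio[dst] = ratio[src] * prod / cons
            elif dst in ratio and src not in ratio:
                ratio[src] = ratio[dst] * cons / prod
            elif src in ratio and dst in ratio:
                if ratio[src] * prod != ratio[dst] * cons:
                    raise ValueError("inconsistent rates")
            else:
                continue  # neither end reached yet, retry later
            pending.remove(ch)
            progressed = True
        if not progressed:
            raise ValueError("graph is not connected")
    scale = lcm(*(r.denominator for r in ratio.values()))
    return {actor: int(r * scale) for actor, r in ratio.items()}


def fire(actor, tokens, channels):
    """Fire `actor` once: consume from its inputs, produce on its outputs."""
    for i, (src, prod, dst, cons) in enumerate(channels):
        if dst == actor and tokens[i] < cons:
            raise RuntimeError(f"{actor} is not ready")
    new_tokens = list(tokens)
    for i, (src, prod, dst, cons) in enumerate(channels):
        if dst == actor:
            new_tokens[i] -= cons
        if src == actor:
            new_tokens[i] += prod
    return new_tokens


if __name__ == "__main__":
    print(repetition_vector(CHANNELS))
    state = [0, 0]  # tokens on alpha and beta
    for actor in "AABABC":
        state = fire(actor, state, CHANNELS)
    print(state)
```

Under these assumptions one iteration of the graph fires A three times, B twice and C once, after which both channels are empty again.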
13 Synchronous Dataflow Graphs: Example
[Figure: SDF model of an H263 decoder with actors VLD, IQ, IDCT and Reconstruction; the rate and execution-time annotations are only partially preserved.]

14 Synchronous Dataflow Graphs
- Advantages
  - Easily allows performance analysis of single applications
  - Communication buffers can be easily modeled
- Disadvantages
  - Sharing of resources is hard to model
  - Only static resource arbitration can be modeled: infinite possibilities with multiple applications
  - Difficult to analyze performance of multiple applications executing concurrently
  - Unable to handle dynamism in the application
15 Problem: Predicting Multiple Application Performance
- Two applications (A and B), each with three actors
- Mapped on a heterogeneous platform (P1, P2, P3)
- Non-preemptive scheduler
[Figure: mapping and scheduling of A and B onto P1, P2 and P3.]

16 Considering Only Actors on a Processor
[Table: iteration count for each task for 3,000 cycles. Columns: A, B, Total. Rows: Only Actors, Individual Graph, Worst Case, Static, Priority Based (A pref., B pref.). Values not preserved.]

17 Considering Only Applications
[Table: same layout as on slide 16; values not preserved.]

18 Worst Case Waiting Time
[Figure: A and B on P1, P2 and P3; the waiting time of A is calculated.]

19 Worst Case Waiting Time
[Figure: A and B on P1, P2 and P3, continued.]

20 Worst Case Waiting Time
- Unrealistic!
- Lower bound
[Table: same layout as on slide 16; values not preserved.]

21 Static Order Arbitration
- Add ordering dependencies (edges)
[Figure: schedule of A and B on P1, P2 and P3 over t0 to t3, reaching steady state.]

22 Problem: Predicting Performance
[Table: same layout as on slide 16; values not preserved.]

23 Problem: Predicting Performance
- Priority based
[Figure: schedule of A and B on P1, P2 and P3 over t0 to t3, reaching steady state.]

24 Problem: Predicting Performance
[Table: same layout as on slide 16; values not preserved.]
25 Problem
- No good techniques exist to analyze the performance of multiple applications on non-preemptive heterogeneous systems
- Use a probabilistic approach to estimate the performance of multiple applications running on an MPSoC platform

26 Analyzing Multiple Applications Performance
- When resources need to be shared, actor execution may be delayed
- Determining this waiting time is the key: $t_{resp} = t_{exec} + t_{wait}$, where $t_{wait}$ is the unknown
27 Probability Distribution
- Compute the probability distribution of a resource being blocked by an actor
- $x$ denotes the time other actors have to wait for the respective resources to be freed by actors of A
- $E(x) = \int x \, P(x) \, dx$ provides the expected time an actor will need to wait when sharing resources with actors of A
[Figure: plot of P(x) for application A with a worked example of E(x); the numeric values are only partially preserved.]

28 Updated Response Time
[Figure: applications A and B with updated response times; the value 58 appears for each.]
29 Basic P^3 Algorithm
- Compute throughput of all applications
- Compute the probability of blocking a resource
- Estimate the waiting time for all actors
- Update the response time for all actors: response time = execution time + waiting time
- Re-compute the application throughput

30 Basic P^3 Algorithm
- Exponential complexity
- So if actors a_i and b_i are mapped on the same resource, b_i will on average need to wait for [expression not preserved]

31 Complexity Reduction
- Overall complexity is O(n^n), where n is the number of actors mapped on a processing resource
- Higher-order probability products
- Limit the equation to only second or fourth order
- Complexity reduces significantly

Algorithm      Complexity
Original       O(n^n)
Second-order   O(n^2)
Fourth-order   O(n^4)

32 Probabilistic Performance Prediction (P^3)
- Basic P^3 technique
  - Looks at all possible combinations of other actors blocking a particular actor
  - Results in exponential possibilities
- Iterative P^3 technique
  - Looks at how an actor can contribute to the waiting time of other actors
  - Results in linear complexity
  - Iterating over the algorithm while updating throughput improves the estimate further
33 Determining the Waiting Time
Three states of an actor:
- Not ready: data not present
  - Actors arriving in this state are not affected by this actor
- Ready and waiting: data present, but resource is busy
  - Actors arriving in this state have to wait for the full execution of this actor
- Ready and executing: data and resource available
  - Waiting time for other actors depends on where the actor is in its execution
  - Uniform distribution assumed

34 A's Waiting Time Due to B
[Figure: actors A, B, C and D share a processor through an arbiter; three cases are shown: B not in queue, B waiting in queue, B being served.]

35 Updated Probability Distribution
$E(x) = P_w \cdot t_{exec} + P_e \cdot \frac{t_{exec}}{2}$
[Figure: plot of P(x) against x from 0 to t_exec, with probability 1 - P_w - P_e when the actor is not ready, P_w when the actor is in the queue, and P_e when the actor is executing.]

36 Updated Probability Distribution: Conservative
$E(x) = P_w \cdot t_{exec} + P_e \cdot t_{exec} = (P_w + P_e) \cdot t_{exec}$
[Figure: plot of P(x) against x from 0 to t_exec, with probability 1 - P_w - P_e when the actor is not ready, P_w when the actor is in the queue, and P_e when the actor is executing.]
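The two expressions for the expected waiting time caused by a single actor can be written down directly. This is a minimal sketch; the function and parameter names are illustrative. `p_wait` and `p_exec` stand for P_w and P_e, the probabilities that the blocking actor is queued or executing when another actor arrives.

```python
def expected_wait(p_wait: float, p_exec: float, t_exec: float) -> float:
    """E(x) = P_w * t_exec + P_e * t_exec / 2.

    A queued actor costs its full execution time; an executing actor
    costs half of it on average (uniform distribution assumed).
    """
    _check(p_wait, p_exec)
    return p_wait * t_exec + p_exec * t_exec / 2.0


def conservative_wait(p_wait: float, p_exec: float, t_exec: float) -> float:
    """E(x) = (P_w + P_e) * t_exec: always charge the full execution time."""
    _check(p_wait, p_exec)
    return (p_wait + p_exec) * t_exec


def _check(p_wait: float, p_exec: float) -> None:
    # The "not ready" state takes the remaining probability 1 - P_w - P_e.
    if p_wait < 0 or p_exec < 0 or p_wait + p_exec > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")


if __name__ == "__main__":
    print(expected_wait(0.25, 0.5, 100.0))      # 50.0
    print(conservative_wait(0.25, 0.5, 100.0))  # 75.0
```

The conservative variant never yields less than the first one, since it replaces t_exec / 2 with t_exec for the executing state.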
37 Iterative Probability
- Iterate until the analysis estimate stabilizes
- Updating the throughput in one iteration:
  - Compute throughput of all applications
  - Compute the probability of blocking a resource, both while waiting and executing
  - Estimate the waiting time for all actors
  - Update the response time for all actors: response time = execution time + waiting time
  - Re-compute the application throughput
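The loop above can be sketched under strong simplifying assumptions that are not from the slides: each application is treated as a chain whose period is the sum of its actors' response times (a stand-in for a real SDF throughput analysis), P_e and P_w are taken as t_exec / period and t_wait / period, contributions of different actors are simply added, and actors of the same application do not block each other. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Actor:
    app: str             # application the actor belongs to
    proc: str            # processor the actor is mapped on
    t_exec: float        # execution time
    t_wait: float = 0.0  # current waiting-time estimate


def period(actors, app):
    """Assumed throughput model: the app's actors run back to back."""
    return sum(a.t_exec + a.t_wait for a in actors if a.app == app)


def iterate_p3(actors, tol=1e-9, max_iter=1000):
    """Repeat the five steps until every application period stabilizes."""
    apps = sorted({a.app for a in actors})
    periods = {app: period(actors, app) for app in apps}
    for _ in range(max_iter):
        # Probability that each actor blocks its processor, per state.
        p_exec = [a.t_exec / periods[a.app] for a in actors]
        p_wait = [a.t_wait / periods[a.app] for a in actors]
        # Waiting time: sum of E(x) over competing actors on the same processor.
        new_wait = []
        for a in actors:
            wait = 0.0
            for j, b in enumerate(actors):
                if b.app != a.app and b.proc == a.proc:
                    wait += p_wait[j] * b.t_exec + p_exec[j] * b.t_exec / 2.0
            new_wait.append(wait)
        # Update response times (t_exec + t_wait) and re-compute the periods.
        for a, wait in zip(actors, new_wait):
            a.t_wait = wait
        new_periods = {app: period(actors, app) for app in apps}
        if all(abs(new_periods[x] - periods[x]) <= tol for x in apps):
            return new_periods
        periods = new_periods
    return periods


if __name__ == "__main__":
    demo = [Actor("A", "P1", 100.0), Actor("B", "P1", 100.0)]
    print(iterate_p3(demo))
```

For two single-actor applications with execution time t on one processor, the fixed point of this simplified model satisfies w^2 = t^2 / 2, so each period settles near t(1 + 1/sqrt(2)), about 170.7 for t = 100.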
38 Experimental Results
- SDF3 tool used to generate random graphs
  - Ten graphs generated
  - Each had 8-10 actors
- Over 1000 use-cases generated
- Simulations performed using POOSL (Parallel Object Oriented Specification Language)
- 28 hours for simulation
- 10 min for analysis using all approaches

39 Iterative Analysis: all applications together
[Figure: application period (normalized to original) for applications A to J; series: Original, Simulation, Worst case, WCSim, Basic, Iterative.]

40 Iterative Analysis: all applications together
[Figure: application period (normalized to simulated) for applications A to J; series: Simulation, Basic, Iterative, Conservative.]

41 Case-study with Mobile Phone Applications
[Figure: period of applications (normalized to original period) for H263 Decoder, H263 Encoder, JPEG Decoder, Modem and Voice Call; series: Simulation, Iterative Analysis, Conservative Analysis, Worst Case, Basic - Fourth Order.]

42 FPGA Implementation Results
[Table: clock cycles, time in ms at 100 MHz, and average and maximum error (%) per algorithm/stage; only the complexity column is preserved.]

Algorithm/Stage             Complexity
Load from CF Card           O(N.n.k)
Throughput Computation      O(N.n.k)
Worst Case                  O(m.M)
Second Order                O(m^2.M)
Fourth Order                O(m^4.M)
Iterative - 1 Iteration     O(m.M)
Iterative - 1 Iteration*    O(m.M + N.n.k)
Iterative - 5 Iterations*   O(m.M + N.n.k)
Iterative - 10 Iterations*  O(m.M + N.n.k)

N: number of applications; n: number of actors in an application; k: number of throughput equations for an application; m: number of actors mapped on a processor; M: number of processors.
Callout: 2.8 ms with 100 MHz.
Copyright 2010 Akash Kumar
44 Problem
- Current design practice for multiple applications is manual or semi-automated, which is
  - Error prone
  - Time consuming

45 Current Tools: Example Xilinx
- Automatic tool chain limited to single processors
- No support for multiple applications
- Design space exploration is manual

46 Solution: Multi-Application Multi-Processor Synthesis (MAMPS)
- A design flow that takes in application specification(s)
  - Generates the entire MPSoC hardware
  - Creates the software models for it
  - Real C programs can also be run
- Provides two main benefits
  - Fast design space exploration
  - Support for multiple applications

47 MAMPS Overview

48 MAMPS Software Arbitration
- Static scheduling
- Dynamic scheduling

49 MAMPS Example: H263 Decoder
[Figure: SDF graph of the H263 decoder with actors VLD, IQ, IDCT and Reconstruction; the annotations are only partially preserved.]

50 MAMPS Example: H263 Decoder
[Figure: generated platform. Labels: Pro 0 (VLD), Pro 1 (IQ), Pro 2 (IDCT), Pro 3 (Recon), FIFO links, bus, timer, UART, CF card, DDR RAM.]

51 Standalone Automated DSE
- Data collection

52 DSE Case Study
- Buffer-throughput trade-off
- JPEG and H263 decoders
53 DSE Case Study: Design Time

                        Manual Design   Generating Single Design   Complete DSE
Hardware Generation     ~2 days         40 ms                      40 ms
Software Generation     ~3 days         60 ms                      60 ms
Hardware Synthesis      35:40 min       35:40 min                  35:40 min
Software Synthesis      0:25 min        0:25 min                   10:00 min
Total time              ~5 days         36:05 min                  45:40 min
Iterations              (values not preserved)
Average time/iteration  ~5 days         36:05 min                  1:54 min
Speed-up                -               1x                         19x
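The 19x speed-up can be checked from the per-iteration times in the table. A small sketch; the helper name is illustrative, and the iteration count it prints is only what the totals imply, since the slide's own Iterations entries are not available here.

```python
def to_seconds(mm_ss: str) -> int:
    """Convert a 'minutes:seconds' string such as '36:05' to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


single_design = to_seconds("36:05")   # one generated design
dse_total = to_seconds("45:40")       # complete design space exploration
dse_per_iteration = to_seconds("1:54")

speedup = single_design / dse_per_iteration
implied_iterations = dse_total / dse_per_iteration

if __name__ == "__main__":
    print(f"speed-up per iteration: {speedup:.1f}x")
    print(f"iterations implied by the totals: {implied_iterations:.1f}")
```

The saving comes from synthesizing the hardware once (35:40 min) and re-running only the software synthesis for each explored design point.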
54 MAMPS: Used by the Following People
- Ahsan Shabbir, TUe
- Michiel Rooijakkers, TUe
- Thom Gielen, TUe and NUS, Singapore
- Abhinav Krishna, NUS, Singapore
- Priyantha Desilva, NUS, Singapore
- Shakith Fernando, NUS, Singapore
- Zhonglei, TU München, Germany
- James Young, Brigham Young University
- Amit Kumar Singh, Nanyang Technological University, Singapore
- Guan Yu, IMEC, Belgium

55 Handling Multiple Use-cases
- For rapid prototyping, hardware synthesis time is the bottleneck
  - Limits the design space exploration
- For a real system, more use-cases imply
  - More memory to store the configurations
  - Increased switching
- Use-case merging and partitioning
  - Reduces the number of partitions
  - Reduces the synthesis time
  - Better for DSE and run-time memory

56 Use-case Merging
[Figure: use-cases A and B, one using Proc 0 to Proc 3 and the other Proc 0 to Proc 2, and the merged design with Proc 0 to Proc 3.]
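One way to picture merging and partitioning: a merged design is the superset of the hardware each use-case needs, and use-cases are packed into as few merged designs as fit on the device. The sketch below uses the textbook first-fit heuristic with a made-up area model; the cost function, the capacity and all names are assumptions for illustration and may differ from the author's algorithms.

```python
def merge(hw_a, hw_b):
    """Superset hardware: union of processors and of point-to-point links."""
    return {"procs": hw_a["procs"] | hw_b["procs"],
            "links": hw_a["links"] | hw_b["links"]}


def cost(hw, proc_cost=10, link_cost=1):
    """Hypothetical area model: fixed cost per processor and per link."""
    return proc_cost * len(hw["procs"]) + link_cost * len(hw["links"])


def first_fit_partition(use_cases, capacity):
    """Place each use-case in the first partition whose merged design fits."""
    partitions = []  # each entry: {"members": [names], "hw": merged hardware}
    for name, hw in use_cases.items():
        if cost(hw) > capacity:
            raise ValueError(f"use-case {name} does not fit on the device")
        for part in partitions:
            merged = merge(part["hw"], hw)
            if cost(merged) <= capacity:
                part["members"].append(name)
                part["hw"] = merged
                break
        else:
            # No existing partition can absorb this use-case: open a new one.
            partitions.append({"members": [name], "hw": hw})
    return partitions


USE_CASES = {
    "A": {"procs": {0, 1, 2, 3}, "links": {(0, 1), (1, 2), (2, 3)}},
    "B": {"procs": {0, 1, 2}, "links": {(0, 1), (0, 2)}},
    "C": {"procs": {4, 5}, "links": {(4, 5)}},
}

if __name__ == "__main__":
    for part in first_fit_partition(USE_CASES, capacity=50):
        print(part["members"], cost(part["hw"]))
```

With these example numbers A and B share one design, because B reuses A's processors and adds a single link, while C needs a partition of its own.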
57 Use-case Partitioning
[Figure: use-case partitioning; content not preserved.]

58 Use-case Merging and Partitioning Results
[Table: number of partitions and time (ms), without and with reduction, for random graphs and for the mobile phone case; rows per case: Without Merging, Greedy, First-Fit; additional rows: Optimal Partitions, Reduction Factor. Surviving entries: "Out of Memory" (twice) and "112"; other values not preserved.]
60 Dynamism in Applications
- Multimedia applications are often dynamic
- SDF assumes worst-case execution time: not realistic
  - Analysis results may be pessimistic, leading to waste of resources and energy
- Dynamic execution time may lead to unpredictable application performance

61 Unpredictability: Variation in Execution Time
[Figure: schedule of A and B on P1, P2 and P3 over t0 to t3, reaching steady state.]

62 Resource Manager
- Budget enforcement
  - When running, each application signals the RM when it completes an iteration
  - RM keeps track of each application's progress
  - Operation modes: polling mode, interrupt mode
  - Suspends an application if needed
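A polling-mode resource manager along these lines could look as follows. The budget policy is an assumption for illustration: an application that has completed more iterations than its required rate allows is suspended at the next poll and resumed once it is back within budget. Class and method names are hypothetical.

```python
class ResourceManager:
    """Polling-mode budget enforcement, sketched under an assumed policy."""

    def __init__(self, required_per_interval):
        # Required iterations per polling interval, per application.
        self.required = dict(required_per_interval)
        self.completed = {app: 0 for app in self.required}
        self.suspended = set()
        self.intervals = 0

    def iteration_done(self, app):
        """Signal from a running application: one more iteration completed."""
        if app in self.suspended:
            raise RuntimeError(f"{app} is suspended")
        self.completed[app] += 1

    def poll(self):
        """Run once per polling interval; returns the suspended applications."""
        self.intervals += 1
        for app, rate in self.required.items():
            budget = rate * self.intervals
            if self.completed[app] > budget:
                self.suspended.add(app)      # better than required: hold back
            else:
                self.suspended.discard(app)  # within budget: (re)enable
        return set(self.suspended)


if __name__ == "__main__":
    rm = ResourceManager({"A": 2, "B": 1})
    for _ in range(3):
        rm.iteration_done("A")
    rm.iteration_done("B")
    print(rm.poll())  # A ran ahead of its budget
    rm.iteration_done("B")
    print(rm.poll())  # A is back within budget
```

In a non-preemptive system a suspension like this can only take effect between actor firings, which is why the polling interval matters for how closely the budget is tracked.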
63 Budget Enforcement (Polling)
[Figure: resource manager timeline under polling, annotated "New job enters!", "Performance goes down!", "Better than required!", "job suspended!" and "job resumed!".]

64 Performance without Resource Manager

65 Performance with RM I (2.5m cycles)

66 Performance with RM II (0k cycles)
67 Conclusions
- Modern multimedia systems support a number of applications executing concurrently
- A number of challenges remain for designers
- Probabilistic performance prediction presented for multiple applications executing concurrently
  - The approach is fast, yet accurate: ideal for DSE
- A design methodology is proposed that takes application specification(s) and generates the MPSoC platform
  - Handles multiple use-cases by merging and partitioning
- Resource manager presented: admission control and budget enforcement

68 Future Work
- Support for hard real-time applications: both analysis and design flow
- Provide soft real-time guarantees: analysis
- Mixing hard and soft real-time tasks
- Extend MAMPS to CSDF and SADF models
- Achieving predictability in suspension
- Considering use-case usage when partitioning the use-cases
69 Relevant Publications: Journals (first author)
- Akash Kumar et al. Multi-processor Systems Synthesis for Multiple Use-Cases of Multiple Applications on FPGA. Transactions on Design Automation of Electronic Systems (TODAES), ACM.
- Akash Kumar et al. Analyzing Composability of Applications on MPSoC Platforms. Journal of Systems Architecture (JSA), Elsevier.
- Akash Kumar et al. Iterative Probabilistic Performance Prediction for Multi-Application Multi-Processor Systems. Transactions on Computer Aided Design (TCAD), IEEE.

70 Relevant Publications: Conferences (first author)
- Akash Kumar et al. Global Analysis of Resource Arbitration for MPSoC. Digital Systems Design (DSD), IEEE.
- Akash Kumar et al. Resource Manager for Non-preemptive Heterogeneous Multiprocessor System-on-chip. Embedded Systems for Real-Time Multimedia (Estimedia), IEEE.
- Akash Kumar et al. An FPGA Design Flow for Reconfigurable Network-Based Multi-Processor Systems-on-Chip. Design Automation and Test in Europe (DATE), IEEE.
- Akash Kumar et al. A Probabilistic Approach to Model Resource Contention for Performance Estimation of Multi-featured Media Devices. Design Automation Conference (DAC), ACM/IEEE.
- Akash Kumar et al. Multi-processor System-level Synthesis for Multiple Applications on Platform FPGA. Field Programmable Logic (FPL), IEEE.
More informationVon der Hardware zur Software in FPGAs mit Embedded Prozessoren. Alexander Hahn Senior Field Application Engineer Lattice Semiconductor
Von der Hardware zur Software in FPGAs mit Embedded Prozessoren Alexander Hahn Senior Field Application Engineer Lattice Semiconductor AGENDA Overview Mico32 Embedded Processor Development Tool Chain HW/SW
More informationSoftware Stacks for Mixed-critical Applications: Consolidating IEEE 802.1 AVB and Time-triggered Ethernet in Next-generation Automotive Electronics
Software : Consolidating IEEE 802.1 AVB and Time-triggered Ethernet in Next-generation Automotive Electronics Soeren Rumpf Till Steinbach Franz Korf Thomas C. Schmidt till.steinbach@haw-hamburg.de September
More informationAchieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging
Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging In some markets and scenarios where competitive advantage is all about speed, speed is measured in micro- and even nano-seconds.
More informationOperating Systems, 6 th ed. Test Bank Chapter 7
True / False Questions: Chapter 7 Memory Management 1. T / F In a multiprogramming system, main memory is divided into multiple sections: one for the operating system (resident monitor, kernel) and one
More informationLoad balancing in a heterogeneous computer system by self-organizing Kohonen network
Bull. Nov. Comp. Center, Comp. Science, 25 (2006), 69 74 c 2006 NCC Publisher Load balancing in a heterogeneous computer system by self-organizing Kohonen network Mikhail S. Tarkov, Yakov S. Bezrukov Abstract.
More informationA Configurable Hardware Scheduler for Real-Time Systems
A Configurable Hardware Scheduler for Real-Time Systems Pramote Kuacharoen, Mohamed A. Shalan and Vincent J. Mooney III Center for Research on Embedded Systems and Technology School of Electrical and Computer
More informationfakultät für informatik informatik 12 technische universität dortmund Data flow models Peter Marwedel Informatik 12 TU Dortmund Germany
12 Data flow models Peter Marwedel Informatik 12 TU Dortmund Germany Models of computation considered in this course Communication/ local computations Communicating finite state machines Data flow model
More informationtheguard! ApplicationManager System Windows Data Collector
theguard! ApplicationManager System Windows Data Collector Status: 10/9/2008 Introduction... 3 The Performance Features of the ApplicationManager Data Collector for Microsoft Windows Server... 3 Overview
More informationDigitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai 2007. Jens Onno Krah
(DSF) Soft Core Prozessor NIOS II Stand Mai 2007 Jens Onno Krah Cologne University of Applied Sciences www.fh-koeln.de jens_onno.krah@fh-koeln.de NIOS II 1 1 What is Nios II? Altera s Second Generation
More informationMultiprocessor Scheduling and Scheduling in Linux Kernel 2.6
Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6 Winter Term 2008 / 2009 Jun.-Prof. Dr. André Brinkmann Andre.Brinkmann@uni-paderborn.de Universität Paderborn PC² Agenda Multiprocessor and
More informationOperating System Aspects. Real-Time Systems. Resource Management Tasks
Operating System Aspects Chapter 2: Basics Chapter 3: Multimedia Systems Communication Aspects and Services Multimedia Applications and Communication Multimedia Transfer and Control Protocols Quality of
More informationComputer System Design. System-on-Chip
Brochure More information from http://www.researchandmarkets.com/reports/2171000/ Computer System Design. System-on-Chip Description: The next generation of computer system designers will be less concerned
More informationPartial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation
Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation Florent Berthelot, Fabienne Nouvel, Dominique Houzet To cite this version: Florent Berthelot,
More informationW4118 Operating Systems. Instructor: Junfeng Yang
W4118 Operating Systems Instructor: Junfeng Yang Outline Introduction to scheduling Scheduling algorithms 1 Direction within course Until now: interrupts, processes, threads, synchronization Mostly mechanisms
More informationImplementing Parameterized Dynamic Load Balancing Algorithm Using CPU and Memory
Implementing Parameterized Dynamic Balancing Algorithm Using CPU and Memory Pradip Wawge 1, Pritish Tijare 2 Master of Engineering, Information Technology, Sipna college of Engineering, Amravati, Maharashtra,
More informationIntel DPDK Boosts Server Appliance Performance White Paper
Intel DPDK Boosts Server Appliance Performance Intel DPDK Boosts Server Appliance Performance Introduction As network speeds increase to 40G and above, both in the enterprise and data center, the bottlenecks
More informationCPU SCHEDULING (CONT D) NESTED SCHEDULING FUNCTIONS
CPU SCHEDULING CPU SCHEDULING (CONT D) Aims to assign processes to be executed by the CPU in a way that meets system objectives such as response time, throughput, and processor efficiency Broken down into
More informationCHAPTER 4: SOFTWARE PART OF RTOS, THE SCHEDULER
CHAPTER 4: SOFTWARE PART OF RTOS, THE SCHEDULER To provide the transparency of the system the user space is implemented in software as Scheduler. Given the sketch of the architecture, a low overhead scheduler
More informationRapid System Prototyping with FPGAs
Rapid System Prototyping with FPGAs By R.C. Coferand Benjamin F. Harding AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Newnes is an imprint of
More informationProcess Scheduling CS 241. February 24, 2012. Copyright University of Illinois CS 241 Staff
Process Scheduling CS 241 February 24, 2012 Copyright University of Illinois CS 241 Staff 1 Announcements Mid-semester feedback survey (linked off web page) MP4 due Friday (not Tuesday) Midterm Next Tuesday,
More informationHARDWARE IMPLEMENTATION OF TASK MANAGEMENT IN EMBEDDED REAL-TIME OPERATING SYSTEMS
HARDWARE IMPLEMENTATION OF TASK MANAGEMENT IN EMBEDDED REAL-TIME OPERATING SYSTEMS 1 SHI-HAI ZHU 1Department of Computer and Information Engineering, Zhejiang Water Conservancy and Hydropower College Hangzhou,
More informationCPU Shielding: Investigating Real-Time Guarantees via Resource Partitioning
CPU Shielding: Investigating Real-Time Guarantees via Resource Partitioning Progress Report 1 John Scott Tillman jstillma@ncsu.edu CSC714 Real-Time Computer Systems North Carolina State University Instructor:
More informationSystem-Level Performance Analysis for Designing On-Chip Communication Architectures
768 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 20, NO. 6, JUNE 2001 System-Level Performance Analysis for Designing On-Chip Communication Architectures Kanishka
More informationVirtualized Execution and Management of Hardware Tasks on a Hybrid ARM-FPGA Platform
J Sign Process Syst (2014) 77:61 76 DOI 10.1007/s11265-014-0884-1 Virtualized Execution and Management of Hardware Tasks on a Hybrid ARM-FPGA Platform Abhishek Kumar Jain Khoa Dang Pham Jin Cui Suhaib
More informationLoad Balancing & DFS Primitives for Efficient Multicore Applications
Load Balancing & DFS Primitives for Efficient Multicore Applications M. Grammatikakis, A. Papagrigoriou, P. Petrakis, G. Kornaros, I. Christophorakis TEI of Crete This work is implemented through the Operational
More informationArchitectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng
Architectural Level Power Consumption of Network Presenter: YUAN Zheng Why Architectural Low Power Design? High-speed and large volume communication among different parts on a chip Problem: Power consumption
More informationXtratuM integration on bespoke TTNoCbased. Assessment of the HW/SW.
XtratuM integration on bespoke TTNoCbased HW. Assessment of the HW/SW. Deliverable 6.5.3 Project acronym : MultiPARTES Project Number: 287702 Version: v1.1 Due date of deliverable: September 2014 Submission
More informationMaking Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association
Making Multicore Work and Measuring its Benefits Markus Levy, president EEMBC and Multicore Association Agenda Why Multicore? Standards and issues in the multicore community What is Multicore Association?
More informationICS 143 - Principles of Operating Systems
ICS 143 - Principles of Operating Systems Lecture 5 - CPU Scheduling Prof. Nalini Venkatasubramanian nalini@ics.uci.edu Note that some slides are adapted from course text slides 2008 Silberschatz. Some
More informationInstitut d Electronique et des Télécommunications de Rennes. Equipe Image
1 D ÉLCTRONI QU T D NICATIONS D RNNS Institut d lectronique et des Télécommunications de Rennes March 13 2015 quipe Image 2 The team xpertise: ITR Image Team D ÉLCTRONI 10 teachers-researcher QU ~ T 15
More informationLecture 3 Theoretical Foundations of RTOS
CENG 383 Real-Time Systems Lecture 3 Theoretical Foundations of RTOS Asst. Prof. Tolga Ayav, Ph.D. Department of Computer Engineering Task States Executing Ready Suspended (or blocked) Dormant (or sleeping)
More informationComparison on Different Load Balancing Algorithms of Peer to Peer Networks
Comparison on Different Load Balancing Algorithms of Peer to Peer Networks K.N.Sirisha *, S.Bhagya Rekha M.Tech,Software Engineering Noble college of Engineering & Technology for Women Web Technologies
More informationDS1104 R&D Controller Board
DS1104 R&D Controller Board Cost-effective system for controller development Highlights Single-board system with real-time hardware and comprehensive I/O Cost-effective PCI hardware for use in PCs Application
More informationChapter 12: Multiprocessor Architectures. Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup
Chapter 12: Multiprocessor Architectures Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup Objective Be familiar with basic multiprocessor architectures and be able to
More informationA STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti 2, Nidhi Rajak 3
A STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti, Nidhi Rajak 1 Department of Computer Science & Applications, Dr.H.S.Gour Central University, Sagar, India, ranjit.jnu@gmail.com
More informationIntroduction. Application Performance in the QLinux Multimedia Operating System. Solution: QLinux. Introduction. Outline. QLinux Design Principles
Application Performance in the QLinux Multimedia Operating System Sundaram, A. Chandra, P. Goyal, P. Shenoy, J. Sahni and H. Vin Umass Amherst, U of Texas Austin ACM Multimedia, 2000 Introduction General
More informationTIME PREDICTABLE CPU AND DMA SHARED MEMORY ACCESS
TIME PREDICTABLE CPU AND DMA SHARED MEMORY ACCESS Christof Pitter Institute of Computer Engineering Vienna University of Technology, Austria cpitter@mail.tuwien.ac.at Martin Schoeberl Institute of Computer
More informationProcessor Scheduling. Queues Recall OS maintains various queues
Processor Scheduling Chapters 9 and 10 of [OS4e], Chapter 6 of [OSC]: Queues Scheduling Criteria Cooperative versus Preemptive Scheduling Scheduling Algorithms Multi-level Queues Multiprocessor and Real-Time
More information