Optimizing Configuration and Application Mapping for MPSoC Architectures

École Polytechnique de Montréal, Canada
Email: Sebastien.Le-Beux@polymtl.ca

Multi-Processor Systems on Chip (MPSoC) Design Trends
- SoC design faces two contradictory trends:
  - Rising platform development cost
  - Reducing the product market window
- Directions to tackle these challenges:
  - Exploit domain-specific, reusable MPSoC platforms
  - Several interconnected processors
  - Networks-on-Chip
[Figure: Proc 1, Proc 2, ..., Proc n connected by a Network-on-Chip]

Multi-Processor SoC (MPSoC) Design Trends
- System-on-Chip (SoC): higher performance ("More Moore")
- System-in-Package (SiP): extended functionalities ("More Than Moore")
- 3D Integration: combining SoC and SiP
[Figure: logic transistors per chip (in millions), logarithmic scale from 0.001 to 10,000]

Figure source: Balinga, Banerjee

Multi-Processor SoC (MPSoC) Design Trends: 3D Integration Technology
- Promising paradigm for heterogeneous systems
- Multiple tiers, multiple technologies
- Functions will use the best technology available
- Example: electronics for computing, optics for communication
[Figure: an optical layer and electrical layers; Proc 1, Proc 2, ..., Proc n connected by a Network-on-Chip]

Configuration Parameters in MPSoC Design
- Number of processors in the architecture
- Number of layers (tiers)
- Type of Network-on-Chip
- Technology used for each layer
- Application mapping
[Figure: tasks T0 to T11 mapped onto Proc 1, Proc 2, ..., Proc n connected by a Network-on-Chip]

Challenges for MPSoC Design
- Huge solution space
- System-level approaches for design-space exploration are mandatory
[Figure: tasks T0 to T11 mapped onto Proc 1, Proc 2, ..., Proc n connected by a Network-on-Chip]

Outline
- System-level approach for optimizing configuration and application mapping for MPSoC architectures
- Case studies
- Conclusions

System-Level Exploration Flow
- Inputs: application model (tasks T0 to T11), architecture model (Proc 1, Proc 2, ..., Proc n and a Network-on-Chip), optimization metrics
- Multi-objective optimization (exploration engine) combined with performance analysis
- Result: set of Pareto solutions (promising mappings and configurations)
- Visualization and debug
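The "set of Pareto solutions" produced by the flow can be illustrated with a small dominance filter. This is a minimal sketch, not the exploration engine used in the talk: the function names `dominates` and `pareto_front` and the candidate values are hypothetical, and every metric is assumed to be minimized.

```python
def dominates(a, b):
    """True if candidate a is no worse than b on every metric and better on one.

    Each candidate is a tuple of metrics to minimize, here assumed to be
    (execution time, critical delay, area cost).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [p for p in candidates if not any(dominates(q, p) for q in candidates)]


# Hypothetical metric values, for illustration only.
candidates = [
    (1004, 536, 3),
    (1004, 1004, 1),
    (1100, 536, 3),   # dominated by (1004, 536, 3)
    (900, 600, 4),
]
print(pareto_front(candidates))
# [(1004, 536, 3), (1004, 1004, 1), (900, 600, 4)]
```

The filter is quadratic in the number of candidates, which is acceptable for a sketch but would likely be replaced by a dedicated non-dominated sorting step in a real evolutionary optimizer.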

Application Model: Streaming Applications
- Regular and repeating computations
- Explicitly parallel, independent computations
- Explicit communication
[Figure: task graph with tasks T0 to T11]

Application Model
- Directed acyclic graph G = (T, E)
- T: set of tasks
  - Each task is annotated with an execution time
  - Expressed in the number of clock cycles (cc) required for the execution
- E: set of edges
  - Each edge is annotated with the amount of data transferred between the tasks it connects
  - Expressed in bytes (b)
[Figure: example task graph T0 to T11, with task execution times between 100 kcc and 537 kcc and edge data volumes between 12 kb and 32 kb]
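The annotated graph G = (T, E) maps naturally onto two dictionaries. The sketch below is an illustration under stated assumptions: the four-task graph and its numbers are hypothetical (they are not the annotations of the figure), and `critical_path_kcc` is a helper introduced here to show one thing the annotations allow, namely a computation-only lower bound on the time of one iteration.

```python
from collections import defaultdict

# Hypothetical 4-task graph. Times are in kilo clock cycles (kcc), data in kb.
# These values are illustrative and are NOT taken from the slide's figure.
exec_time_kcc = {"T0": 100, "T1": 536, "T2": 268, "T3": 100}
edges_kb = {("T0", "T1"): 12, ("T0", "T2"): 12, ("T1", "T3"): 24, ("T2", "T3"): 24}


def topological_order(tasks, edges):
    """Order tasks so that every edge points forward (Kahn's algorithm)."""
    indegree = {t: 0 for t in tasks}
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    ready = [t for t in tasks if indegree[t] == 0]
    order = []
    while ready:
        task = ready.pop()
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(indegree):
        raise ValueError("graph has a cycle, so it is not a DAG")
    return order


def critical_path_kcc(exec_time, edges):
    """Longest computation-only path through the DAG, in kcc.

    Communication time is ignored, so this is only a lower bound.
    """
    predecessors = defaultdict(list)
    for src, dst in edges:
        predecessors[dst].append(src)
    finish = {}
    for task in topological_order(exec_time, edges):
        start = max((finish[p] for p in predecessors[task]), default=0)
        finish[task] = start + exec_time[task]
    return max(finish.values())


print(critical_path_kcc(exec_time_kcc, edges_kb))  # 736
```

The edge weights are carried along but unused here; a communication-aware analysis would add a transfer time per edge that depends on the mapping and on the Network-on-Chip.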

Architecture Model: Planar Architectures
- Set of nodes interconnected through a Network-on-Chip (NoC)
- Node: a subsystem including a processor and its local memory
- NoC: composed of a set of links and switches
- Mainly bandwidth-oriented (streaming)
[Figure: 3x3 mesh]

Architecture Model: 3D Mesh Architectures
- Extrapolation of existing planar architectures
- Switches adapted for vertical routing
- Interconnection types: intra-layer and inter-layer
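The distinction between intra-layer and inter-layer links can be made concrete by enumerating the links of a mesh. This is a sketch under assumptions not stated in the slides: nodes are addressed by hypothetical (x, y, z) coordinates, every node has a vertical link to the node directly above it, and `hop_count` assumes minimal (shortest-path) routing.

```python
from itertools import product


def mesh_links(width, height, layers=1):
    """Return (intra_layer, inter_layer) link lists for a mesh.

    Nodes are (x, y, z) tuples; each bidirectional link is stored once.
    With layers=1 the result describes a planar mesh.
    """
    intra, inter = [], []
    for x, y, z in product(range(width), range(height), range(layers)):
        if x + 1 < width:
            intra.append(((x, y, z), (x + 1, y, z)))
        if y + 1 < height:
            intra.append(((x, y, z), (x, y + 1, z)))
        if z + 1 < layers:
            inter.append(((x, y, z), (x, y, z + 1)))
    return intra, inter


def hop_count(a, b):
    """Number of links on a shortest path between two mesh nodes."""
    return sum(abs(i - j) for i, j in zip(a, b))


planar_intra, planar_inter = mesh_links(3, 3)
print(len(planar_intra), len(planar_inter))      # 12 0

stacked_intra, stacked_inter = mesh_links(3, 3, layers=2)
print(len(stacked_intra), len(stacked_inter))    # 24 9
print(hop_count((0, 0, 0), (2, 2, 1)))           # 5
```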

3D MPSoC Including Optical Networks-on-Chip (ONoC)
- Architecture defined by extrapolation of two planar approaches:
  1. Electrical Network-on-Chip
  2. Optical Network-on-Chip
     - Wavelength Division Multiplexing (WDM)
     - Low latency (< 1 ns)
     - Limited by optical/electrical interfaces
- Objective: exploration of the design space by considering technological constraints

3D MPSoC Including Optical Networks-on-Chip
- Optical Network Interface (ONI): receiver, driver, SER
- Interconnect Ratio: IR = (No. of ONIs) / (Total No. of nodes)
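As a quick numeric illustration of the Interconnect Ratio, the helper below evaluates the definition directly. The function name and the example counts are hypothetical; the slides do not state which node counts were used.

```python
def interconnect_ratio(num_oni, total_nodes):
    """IR = number of Optical Network Interfaces / total number of nodes."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if num_oni < 0:
        raise ValueError("num_oni must not be negative")
    return num_oni / total_nodes


# Hypothetical example: 4 ONIs in a 16-node architecture.
print(interconnect_ratio(4, 16))  # 0.25
```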

Optimization Metrics
1. Execution time: the time required to execute a complete iteration of an application
2. Critical delay: the delay between executions of application iterations; it defines the throughput of the system
3. Area cost
Execution time and critical delay are evaluated with a communication-oriented, event-based simulator.
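The difference between the first two metrics can be shown with a deliberately simplified analytical model. It is not the event-based simulator mentioned in the talk: it assumes a linear chain of tasks, ignores communication entirely, and assumes that in steady state the busiest processor sets the delay between iterations. All names and numbers are hypothetical.

```python
def execution_time_kcc(stage_times_kcc):
    """Time for one iteration when the stages run back to back (latency)."""
    return sum(stage_times_kcc)


def critical_delay_kcc(stage_times_kcc, mapping, num_procs):
    """Delay between successive iterations under pipelined execution.

    mapping[i] is the processor index running stage i. The busiest
    processor bounds how often a new iteration can start.
    """
    load = [0] * num_procs
    for time_kcc, proc in zip(stage_times_kcc, mapping):
        load[proc] += time_kcc
    return max(load)


def throughput_iter_per_s(critical_delay, clock_hz):
    """Iterations per second for a critical delay given in kcc."""
    return clock_hz / (critical_delay * 1000)


# Hypothetical 4-stage chain mapped onto 3 processors.
stages = [100, 536, 268, 100]
mapping = [0, 1, 2, 0]
print(execution_time_kcc(stages))              # 1004
print(critical_delay_kcc(stages, mapping, 3))  # 536
```

Mapping all four stages onto one processor would leave the execution time unchanged in this model but raise the critical delay to 1004 kcc, which is why the two metrics are optimized separately.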

Exploration Engine
- Automatic multi-objective optimization
- Exploits evolutionary algorithms
- Chromosome-like representation
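One common way to build a chromosome-like representation of a mapping is a list in which gene i holds the processor assigned to task i. The slides do not give the actual encoding, so the sketch below is an assumption: it covers only the task-to-processor mapping (not the architecture configuration parameters), and the operator names and the mutation rate are hypothetical.

```python
import random


def random_chromosome(num_tasks, num_procs, rng):
    """Gene i is the index of the processor that runs task i."""
    return [rng.randrange(num_procs) for _ in range(num_tasks)]


def one_point_crossover(parent_a, parent_b, rng):
    """Child takes the head of parent_a and the tail of parent_b."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(chromosome, num_procs, rate, rng):
    """Reassign each task to a random processor with probability `rate`."""
    return [
        rng.randrange(num_procs) if rng.random() < rate else gene
        for gene in chromosome
    ]


rng = random.Random(0)  # fixed seed so runs are repeatable
parent_a = random_chromosome(12, 9, rng)  # 12 tasks (T0 to T11), 9 nodes (3x3 mesh)
parent_b = random_chromosome(12, 9, rng)
child = mutate(one_point_crossover(parent_a, parent_b, rng), 9, 0.1, rng)
print(child)
```

Every child produced this way is a valid mapping, since any list of processor indices of the right length assigns each task to exactly one processor; no repair step is needed for this encoding.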

Combining Mapping and HW-SW Partitioning

Demosaic Image Processing Application

Mapping Demosaic Application on Planar Architectures

Mapping Demosaic Application on 3D Architectures

ONoC Design Feasibility for Two-Layer Architectures

Conclusions
- Optimizing configuration and application mapping for MPSoC architectures
  - System-level
  - Automated
- Case studies
  - Configuration of planar and 3D MPSoC architectures
  - Demosaic application
[Figure: tasks T0 to T11 mapped onto Proc 1, Proc 2, ..., Proc n connected by a Network-on-Chip]

Optical Network-on-Chip: wavelength matrix, network topology

Throughput performance (random traffic generation)

Multi-ONoC

Multi-ONoC Layout