Institut d'Electronique et des Télécommunications de Rennes (IETR), Image Team


1 Institut d'Electronique et des Télécommunications de Rennes, Image Team, March

2 The team. Expertise: IETR Image Team: 10 teacher-researchers, ~15 PhD students & post-docs. Image: analysis, compression. Architecture: multi-core, embedded systems. Research themes: image analysis for semantic indexation and embedded vision; 2D/3D image and video coding; cryptography; architecture.

3 IETR Image: Architecture theme

4 Objectives. Dataflow-based methods and tools for optimizing signal processing applications on distributed and embedded platforms: throughput, latency, energy, memory, programming time.

5 Target Applications. MPEG decoders: MPEG-4 Part 2, AVC, SVC, HEVC, SHVC (MPEG participation). Computer vision and 3D processing: stereo vision, SLAM. Cryptography: chaotic-based cryptography. Telecommunications: 3GPP LTE eNodeB.

6 Target Platforms: Texas Instruments Keystone I and II; Zboard with Xilinx Zynq; Odroid with Samsung Exynos 5; Kalray MPPA.

7 Methods for optimizing throughput, latency, energy, memory and programming time: dataflow programming; SIMD & parallelism; data representation; energy-aware processing.

8 Software: Open SVC Decoder (C code, x86 ASM); OpenHEVC Decoder (C code, x86 & ARM ASM), FFmpeg, https://github.com/OpenHEVC/openHEVC; Orcc Compiler (Java, Xtend); PREESM Rapid Prototyping Tool (Java, Xtend).

9 Academic Partners

10 Industrial Partners

12 Introduction: Motivations. Software productivity gap (log scale): lines of code per chip x2 every 10 months; transistors per chip x2 every 18 months; lines of code per day x2 every 5 years. Source: ITRS & Hardware-dependent Software, Ecker et al., Springer.

13 Introduction: Hardware Complexity. [Figure: number of PEs per SoC, with an axis reaching 5000.] Source: ITRS System Drivers 2011.

14 What is PREESM? [Workflow diagram: Algorithm and Architecture descriptions feed PREESM and a C compiler; the outputs are a Simulator + Debugger + Profiler and a Multicore Runtime for a platform made of PEs, DSPs, peripherals and main memory.]

15 What is PREESM? (Parallel Real-time Embedded Executives Scheduling Method): a rapid prototyping framework; an open-source project; a set of Eclipse plug-ins. The PREESM tool and applications are available on GitHub.

16 PREESM: design of parallel algorithms; throughput/latency evaluation. Using PREESM to design an embedded system: to provide metrics (predictable memory footprints); to build a working prototype (code generation for multicore architectures, guaranteed deadlock-freeness, inter-core communications); for design-space exploration (seamless porting to a new architecture, legacy code reusability).

17 Inputs. [PREESM workflow diagram, as on slide 14.]

18 PREESM Inputs: algorithm descriptions using dataflow graphs. Synchronous Dataflow (SDF): actors and data ports, FIFO queues. [Example graph with actors A, B, C, D.] E. Lee and D. Messerschmitt, Synchronous data flow, Proceedings of the IEEE.

19 PREESM Inputs: algorithm descriptions using dataflow graphs. Data-driven execution: an actor is fired when its input FIFOs contain enough data tokens. [Example: graph A, B, C, D with rates 1 and 2; execution order on Core 1: A B C C D.] E. Lee and D. Messerschmitt, Synchronous data flow, Proceedings of the IEEE.
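The firing rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not PREESM code: the edge names and the token rates (B produces 2 tokens per firing, D consumes 2) are assumptions chosen so that the single-core order comes out as A B C C D, as on the slide.

```python
from collections import deque

# Hypothetical rates: edge -> (producer, tokens produced, consumer, tokens consumed)
EDGES = {
    "AB": ("A", 1, "B", 1),
    "BC": ("B", 2, "C", 1),
    "CD": ("C", 1, "D", 2),
}

def can_fire(actor, fifos):
    """An actor may fire when every input FIFO holds enough tokens."""
    return all(len(fifos[e]) >= cons
               for e, (_, _, dst, cons) in EDGES.items() if dst == actor)

def fire(actor, fifos):
    """Consume tokens on the input FIFOs, then produce tokens on the outputs."""
    for e, (_, _, dst, cons) in EDGES.items():
        if dst == actor:
            for _ in range(cons):
                fifos[e].popleft()
    for e, (src, prod, _, _) in EDGES.items():
        if src == actor:
            fifos[e].extend([actor] * prod)

def run_iteration(firings):
    """Data-driven execution of one graph iteration on a single core."""
    fifos = {e: deque() for e in EDGES}
    remaining = dict(firings)
    order = []
    while any(remaining.values()):
        ready = [a for a in remaining if remaining[a] > 0 and can_fire(a, fifos)]
        if not ready:
            raise RuntimeError("deadlock: no actor can fire")
        actor = ready[0]
        fire(actor, fifos)
        remaining[actor] -= 1
        order.append(actor)
    return order

order = run_iteration({"A": 1, "B": 1, "C": 2, "D": 1})
print(" ".join(order))
```

The number of firings per actor is passed in by hand here; in an SDF graph it follows from the rates.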

20 PREESM Inputs: algorithm descriptions using dataflow graphs. Expression of parallelism: task, data, pipeline. [Example: graph A, B, C, D with rates 1 and 2 and actor C fired twice (x2), mapped onto Core 1, Core 2 and Core 3.] E. Lee and D. Messerschmitt, Synchronous data flow, Proceedings of the IEEE.
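The "x2" annotation is a repetition count: in SDF, the number of firings of each actor per graph iteration follows from the production and consumption rates (the balance equations). The sketch below solves them for a connected graph. The rates are assumed for illustration, the function name is hypothetical, and `math.lcm` needs Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm

def repetition_vector(actors, edges):
    """Solve q[src] * prod == q[dst] * cons for every edge (src, prod, dst, cons)."""
    rate = {actors[0]: Fraction(1)}
    pending = list(edges)
    while pending:
        progressed = False
        for edge in list(pending):
            src, prod, dst, cons = edge
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
            elif src in rate and dst in rate:
                if rate[src] * prod != rate[dst] * cons:
                    raise ValueError("inconsistent rates: no valid schedule")
            else:
                continue  # neither end reached yet, retry on a later pass
            pending.remove(edge)
            progressed = True
        if not progressed:
            raise ValueError("graph is not connected")
    # Scale the rational solution to the smallest integer one.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}

# Assumed rates: B produces 2 tokens per firing, D consumes 2.
q = repetition_vector(
    ["A", "B", "C", "D"],
    [("A", 1, "B", 1), ("B", 2, "C", 1), ("C", 1, "D", 2)],
)
print(q)
```

With these rates C fires twice per iteration; when the two firings do not depend on each other they can be placed on different cores, which is the data parallelism the slide refers to.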

21 PREESM Inputs: PiSDF (Parameterized and Interfaced Synchronous Dataflow). [Example graph: actors Read Header, SetNbSlices, Read, Filter, Kernel and Send, with interfaces in and out; parameters Size and N (annotated "=4" and "=2" on the slide); data rates expressed as Size and Size/N.]

22 PREESM Inputs: PiSDF (Parameterized and Interfaced Synchronous Dataflow). PiSDF is: hierarchical & compositional; statically parameterizable; dynamically reconfigurable; lightweight in runtime overhead. PiSDF fosters: predictability; parallelism; developer-friendliness. K. Desnos, M. Pelcat, J.-F. Nezan, S. S. Bhattacharyya, S. Aridhi, PiMM: Parameterized and Interfaced Dataflow Meta-Model for MPSoCs Runtime Reconfiguration, SAMOS XIII.

23 PREESM Inputs: S-LAM (System-Level Architecture Model). Processing elements: Operator. Communication nodes: Parallel Node, Contention Node. Communication enablers: RAM, DMA. Links: Directed Data Link, Undirected Data Link, Set-up Link. M. Pelcat, J.-F. Nezan, J. Piat, J. Croizer and S. Aridhi, A System-Level Architecture Model for Rapid Prototyping of Heterogeneous Multicore Embedded Systems, DASIP 2009.

24 PREESM Inputs: S-LAM (System-Level Architecture Model). [Example: core1, core2 and core3 connected through a 1 Gbit/s communication node (CN), with a DMA and a RAM.] M. Pelcat, J.-F. Nezan, J. Piat, J. Croizer and S. Aridhi, A System-Level Architecture Model for Rapid Prototyping of Heterogeneous Multicore Embedded Systems, DASIP 2009.
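To make the role of such an architecture description concrete, here is a minimal Python sketch of the three-core example. It is not the S-LAM file format or the PREESM API: the class names, the single shared communication node and the cost model (size divided by link speed, with no set-up time or contention) are simplifying assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CommunicationNode:
    name: str
    bits_per_second: float

@dataclass(frozen=True)
class Architecture:
    operators: tuple          # names of the processing elements
    node: CommunicationNode   # single node shared by all operators (assumption)

    def transfer_seconds(self, src, dst, n_bytes):
        """Estimated time to move n_bytes from src to dst through the node."""
        for op in (src, dst):
            if op not in self.operators:
                raise KeyError(f"unknown operator: {op}")
        if src == dst:
            return 0.0        # same core: no inter-core transfer
        return n_bytes * 8 / self.node.bits_per_second

arch = Architecture(
    operators=("core1", "core2", "core3"),
    node=CommunicationNode("CN", 1e9),   # 1 Gbit/s, as on the slide
)
print(arch.transfer_seconds("core1", "core2", 125_000_000))
```

A scheduler can query such a model to decide whether moving an actor to another core is worth the communication cost.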

25 PREESM Inputs: S-LAM (System-Level Architecture Model). [Example: two DSPs (DSP 1 and DSP 2), each with core1, core2, core3, TCP2 and VCP2 elements, a DMA and an SCR at 2 GB/s, connected through RIO at 1 Gb/s.]

26 PREESM Inputs. Algorithm/architecture independence: PiSDF graphs are architecture-independent; S-LAM graphs are application-independent. Scenario: defines information and constraints for the deployment of a specific algorithm on a specific architecture (mapping constraints, heterogeneous timing constraints).

27 Deployment. [PREESM workflow diagram, as on slide 14.]

28 PREESM Deployment: mapping/scheduling for static graphs. State-of-the-art algorithms (FAST, List, ...); customizable accuracy (w.r.t. communications); latency and load-balancing optimization. [Gantt chart over core1, core2, core3 and core4.]
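As an illustration of the list-scheduling family named on this slide, the sketch below maps a small task graph onto identical cores with a plain greedy rule. It is a generic textbook variant, not the PREESM implementation and not FAST: communication costs are ignored, the priority list is simply the input order, and the task names and durations are made up.

```python
def list_schedule(durations, preds, n_cores):
    """Greedy list scheduling; returns ({task: (core, start)}, makespan)."""
    core_free = [0] * n_cores          # time at which each core becomes idle
    finish, placement = {}, {}
    remaining = list(durations)        # priority list: insertion order
    while remaining:
        # First task in the list whose predecessors have all been scheduled.
        task = next((t for t in remaining
                     if all(p in finish for p in preds.get(t, ()))), None)
        if task is None:
            raise ValueError("cyclic dependencies")
        ready_at = max((finish[p] for p in preds.get(task, ())), default=0)
        # Choose the core on which the task can start the earliest.
        core = min(range(n_cores), key=lambda c: max(core_free[c], ready_at))
        start = max(core_free[core], ready_at)
        finish[task] = start + durations[task]
        core_free[core] = finish[task]
        placement[task] = (core, start)
        remaining.remove(task)
    return placement, max(finish.values())

placement, makespan = list_schedule(
    durations={"A": 2, "B1": 3, "B2": 3, "C": 1},
    preds={"B1": ["A"], "B2": ["A"], "C": ["B1", "B2"]},
    n_cores=2,
)
print(placement, makespan)
```

On two cores the independent tasks B1 and B2 run side by side, so the latency is 6 time units instead of the 9 a single core would need.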

29 PREESM Deployment: mapping/scheduling for reconfigurable PiSDF. SPIDER: Synchronous Parameterized and Interfaced Dataflow Embedded Runtime. Master tasks: run jobs; map & schedule; manage graphs; monitor & trace. Slave task: run jobs. [Diagram: a master and slaves exchanging jobs, parameters and timings; data goes through a pool of data FIFOs.]

30 PREESM Deployment: memory optimizations for static graphs. Bounding the memory needs of an application graph to: evaluate the memory requirements; adjust the size of the architecture memory; assess the optimality of a memory allocation. [Figure: available-memory axis starting at 0, with insufficient memory below the lower bound, possible allocated memory between the lower and upper bounds, and wasted memory above the upper bound.]
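The three regions of the figure translate directly into a small check. The sketch below assumes that the upper bound is the sum of all buffer sizes (every buffer gets its own space, with no reuse); the lower bound is taken as an input because computing it needs an analysis the slide does not detail. Function names and numbers are illustrative.

```python
def upper_bound(buffer_sizes):
    """Assumed worst case: each buffer has dedicated space, no memory reuse."""
    return sum(buffer_sizes)

def classify(available, lower, upper):
    """Place an amount of available memory in one of the three regions."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    if available < lower:
        return "insufficient memory"
    if available <= upper:
        return "possible allocated memory"
    return "wasted memory"

upper = upper_bound([30, 20, 10])   # hypothetical buffer sizes
lower = 30                          # hypothetical lower bound
for available in (20, 45, 100):
    print(available, classify(available, lower, upper))
```

An allocation whose footprint equals the lower bound is optimal; anything provisioned beyond the upper bound can never be used by the application.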

31 PREESM Deployment: memory optimizations for static graphs. Graph-level memory reuse optimization. [Figure: an SDF graph A, B, C, D with repetition counts, its expansion into firings (A, B1, B2, C1, C2, D1, D2), an execution order on Core 1 and Core 2, and the buffers allocated between firings, with sizes ranging from 25 to 200.]

32 PREESM Deployment: memory optimizations for static graphs. Buffer merging technique for SDF graphs. [Example: A sends 30 to B (buffer AB); B sends 20 to C (buffer BC) and 10 to D (buffer BD). Without buffer merging, memory holds AB (30), BC (20) and BD (10) separately; with buffer merging, BC (20) and BD (10) share the space of AB (30).]

33 PREESM Deployment: memory optimizations for static graphs. Multiple input/output buffers merge. 48% less memory than state-of-the-art techniques. Techniques are independent of the host language. No modification of the SDF MoC or of the application graphs.
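The buffer-merging example of slide 32 can be checked with simple arithmetic. The sketch below only computes footprints; it assumes the merge is legal, that is, that actor B can write its outputs BC and BD into the memory of its input AB. The function names and the shape of the `merges` argument are hypothetical.

```python
def footprint_without_merging(buffers):
    """Every buffer is allocated in its own memory space."""
    return sum(buffers.values())

def footprint_with_merging(buffers, merges):
    """merges maps a host buffer to the buffers stored inside its space."""
    total = sum(buffers.values())
    for host, guests in merges.items():
        guest_size = sum(buffers[g] for g in guests)
        if guest_size > buffers[host]:
            raise ValueError(f"guests do not fit in {host}")
        total -= guest_size   # guests reuse the host's memory
    return total

buffers = {"AB": 30, "BC": 20, "BD": 10}   # sizes from slide 32
print(footprint_without_merging(buffers))
print(footprint_with_merging(buffers, {"AB": ["BC", "BD"]}))
```

For this graph the footprint drops from 60 to 30. The 48% figure on the slide comes from the authors' own experiments and is not reproduced by this toy example.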

34 PREESM Deployment: energy optimization platform. Odroid with Exynos 5 big.LITTLE: four Cortex-A7 cores and four Cortex-A15 cores.

35 PREESM Deployment: energy optimization setup. [Diagram: an image-processing application running on core1 to core4; a QoS P-value (P = 0, P = 0.5, P = 1); a Linux-based runtime (Åbo Akademi) with DVFS and DPM; the Odroid Exynos 5 big.LITTLE platform with four A7 and four A15 cores.]

36 PREESM Deployment: energy optimization results. 20% energy savings on a parallel Sobel filter followed by sequential post-processing, w.r.t. the Linux Completely Fair Scheduler and the on-demand governor. S. Holmbacka, E. Nogues, M. Pelcat, S. Lafond, and J. Lilius, Energy Efficiency and Performance Management of Parallel Dataflow Applications, DASIP 2014, Madrid.

37 Outputs. [PREESM workflow diagram, as on slide 14.]

38 PREESM Outputs: generation of self-timed multicore code. [Figure: graph A, B, C, D mapped onto operators o1 and o2; timeline showing actors A, B and D on o1 and actor C on o2.]
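In self-timed code, each core runs its actors in a fixed order and synchronizes with the other cores only through blocking communications. The Python sketch below mimics that structure with two threads and two queues. It is an analogy rather than PREESM output, and both the mapping (A, B and D on o1, C on o2) and the arithmetic inside each actor are assumptions made for illustration.

```python
import queue
import threading

def run_self_timed(iterations=3):
    b_to_c = queue.Queue()   # inter-core FIFO, o1 -> o2
    c_to_d = queue.Queue()   # inter-core FIFO, o2 -> o1
    results = []

    def operator1():
        # Fixed actor order on o1: A, B, send, receive, D.
        for i in range(iterations):
            a = i + 1                 # actor A: produce a sample
            b = a * 10                # actor B
            b_to_c.put(b)             # send to o2
            c = c_to_d.get()          # blocks until C has produced its token
            results.append(c - 1)     # actor D

    def operator2():
        # Fixed actor order on o2: receive, C, send.
        for _ in range(iterations):
            b = b_to_c.get()          # blocks until B has produced its token
            c_to_d.put(b + 1)         # actor C

    threads = [threading.Thread(target=operator1),
               threading.Thread(target=operator2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(run_self_timed(3))
```

No global clock or central scheduler is involved: the order of firings on each core is fixed in advance, and timing adapts to when tokens actually arrive.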

39 PREESM Outputs: code generation for multiple targets. Multi-C6x DSPs: TMS320C6678 from Texas Instruments; supports the activation of the DSP caches. Multi-x86 and multi-ARM CPUs: Linux and Windows, pthread. OMAP4 heterogeneous platform: dual-core ARM Cortex-A9, 2 Cortex-M3, and a C64xT DSP.

40 Demo Time. [PREESM workflow diagram, as on slide 14.]

41 Summary: PREESM features. Open-source tool: available on GitHub. Research-oriented tool: new models, optimizations, scheduling. Eclipse-based integrated tool: several plug-ins, metamodels. Extended web tutorials.

42 Questions?

Dataflow-Based Rapid Prototyping for Multicore DSP Systems

Dataflow-Based Rapid Prototyping for Multicore DSP Systems Dataflow-Based Rapid Prototyping for Multicore DSP Systems Technical Report PREESM/2014-05TR01, 2014 Maxime Pelcat, Karol Desnos, Julien Heulot, Clément Guy, Jean-François Nezan, Slaheddine Aridhi 1 Introduction

More information

Throughput constraint for Synchronous Data Flow Graphs

Throughput constraint for Synchronous Data Flow Graphs Throughput constraint for Synchronous Data Flow Graphs *Alessio Bonfietti Michele Lombardi Michela Milano Luca Benini!"#$%&'()*+,-)./&0&20304(5 60,7&-8990,.+:&;/&."!?@A>&"'&=,0B+C. !"#$%&'()* Resource

More information

A Generic Network Interface Architecture for a Networked Processor Array (NePA)

A Generic Network Interface Architecture for a Networked Processor Array (NePA) A Generic Network Interface Architecture for a Networked Processor Array (NePA) Seung Eun Lee, Jun Ho Bahn, Yoon Seok Yang, and Nader Bagherzadeh EECS @ University of California, Irvine Outline Introduction

More information

Building a RTOS for MPSoC Dataflow Programming

Building a RTOS for MPSoC Dataflow Programming Building a RTOS for MPSoC Dataflow Programming Yaset Oliva, Maxime Pelcat, Jean François Nezan, Jean-Christophe Prévotet, Slaheddine Aridhi To cite this version: Yaset Oliva, Maxime Pelcat, Jean François

More information

GEDAE TM - A Graphical Programming and Autocode Generation Tool for Signal Processor Applications

GEDAE TM - A Graphical Programming and Autocode Generation Tool for Signal Processor Applications GEDAE TM - A Graphical Programming and Autocode Generation Tool for Signal Processor Applications Harris Z. Zebrowitz Lockheed Martin Advanced Technology Laboratories 1 Federal Street Camden, NJ 08102

More information

Going Linux on Massive Multicore

Going Linux on Massive Multicore Embedded Linux Conference Europe 2013 Going Linux on Massive Multicore Marta Rybczyńska 24th October, 2013 Agenda Architecture Linux Port Core Peripherals Debugging Summary and Future Plans 2 Agenda Architecture

More information

Performance Monitor Based Power Management for big.little Platforms

Performance Monitor Based Power Management for big.little Platforms Performance Monitor Based Power Management for big.little Platforms Simon Holmbacka, Sébastien Lafond, Johan Lilius Department of Information Technologies, Åbo Akademi University 20530 Turku, Finland firstname.lastname@abo.fi

More information

An SP-based Programming Model for Consumer Electronics Streaming Applications

An SP-based Programming Model for Consumer Electronics Streaming Applications C P S A L http://scalp.ewi.tudelft.nl SP@CE An SP-based Programming Model for Consumer Electronics Streaming Applications Ana Lucia Varbanescu, Maik Nijhuis, Arturo Gonzalez-Escribano Herbert Bos, Henk

More information

Embedded Systems. 7. System Components

Embedded Systems. 7. System Components Embedded Systems 7. System Components Lothar Thiele 7-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic

More information

LabVIEW Based Embedded Design

LabVIEW Based Embedded Design LabVIEW Based Embedded Design Sadia Malik Ram Rajagopal Department of Electrical and Computer Engineering University of Texas at Austin Austin, TX 78712 malik@ece.utexas.edu ram.rajagopal@ni.com Abstract

More information

HW/SW Codesign. May Axel Jantsch Royal Institute of Technology ROYAL INSTITUTE OF TECHNOLOGY L ABORATORY E LECTRONIC S YSTEM D ESIGN. A.

HW/SW Codesign. May Axel Jantsch Royal Institute of Technology ROYAL INSTITUTE OF TECHNOLOGY L ABORATORY E LECTRONIC S YSTEM D ESIGN. A. HW/SW Codesign May 2001 Axel Jantsch Royal Institute of Technology HW/SW Codesign, May 2001, 1 (45) Overview Introduction Types of Codesign Main issues and challanges Methodology HW/SW Cosimulation HW/SW

More information

The Construction of a Retargetable Simulator for an Architecture Template

The Construction of a Retargetable Simulator for an Architecture Template The Construction of a Retargetable Simulator for an Template Bart Kienhuis, Ed Deprettere,Kees Vissers, Pieter van der Wolf Delft University of Technology Philips Research Laboratories Eindhoven E-mail:

More information

OPTIMIZE DMA CONFIGURATION IN ENCRYPTION USE CASE. Guillène Ribière, CEO, System Architect

OPTIMIZE DMA CONFIGURATION IN ENCRYPTION USE CASE. Guillène Ribière, CEO, System Architect OPTIMIZE DMA CONFIGURATION IN ENCRYPTION USE CASE Guillène Ribière, CEO, System Architect Problem Statement Low Performances on Hardware Accelerated Encryption: Max Measured 10MBps Expectations: 90 MBps

More information

MPSoC Designs: Driving Memory and Storage Management IP to Critical Importance

MPSoC Designs: Driving Memory and Storage Management IP to Critical Importance MPSoC Designs: Driving Storage Management IP to Critical Importance Design IP has become an essential part of SoC realization it is a powerful resource multiplier that allows SoC design teams to focus

More information

Zynq-7000 Extensible Processing Platform Press Backgrounder

Zynq-7000 Extensible Processing Platform Press Backgrounder Press Backgrounder March 1, 2011 Zynq-7000 Extensible Processing Platform Press Backgrounder The first question you may ask about the new Extensible Processing Platform is what exactly was the thinking

More information

Embedded Development Tools

Embedded Development Tools Embedded Development Tools Software Development Tools by ARM ARM tools enable developers to get the best from their ARM technology-based systems. Whether implementing an ARM processor-based SoC, writing

More information

Processor SDK Overview

Processor SDK Overview Processor SDK Overview Agenda Why Processor SDK? Cores Determine Software SDK Architectures TI Development Ecosystem Why Processor SDK? Processor SDK Overview Processor SDK Purpose The Processor SDK was

More information

Heterogeneous Computing in ARM Architecture. Media Processing Division ARM June 25 th 2013

Heterogeneous Computing in ARM Architecture. Media Processing Division ARM June 25 th 2013 Heterogeneous Computing in ARM Architecture Media Processing Division ARM June 25 th 2013 Agenda Trends in Heterogeneous Computing GPU Computing with ARM Mali -T600 series as example Heterogeneous System

More information

Designing and Embodiment of Software that Creates Middle Ware for Resource Management in Embedded System

Designing and Embodiment of Software that Creates Middle Ware for Resource Management in Embedded System , pp.97-108 http://dx.doi.org/10.14257/ijseia.2014.8.6.08 Designing and Embodiment of Software that Creates Middle Ware for Resource Management in Embedded System Suk Hwan Moon and Cheol sick Lee Department

More information

Multimedia Multiprocessor Systems: Analysis, Design and Management. Akash Kumar

Multimedia Multiprocessor Systems: Analysis, Design and Management. Akash Kumar Multimedia Multiprocessor Systems: Analysis, Design and Management Akash Kumar 2 Modern Multimedia Embedded Systems 3 Trends in Multimedia Systems Increasing number of features i.e. applications Simultaneously

More information

Integrated Development Environment (IDE) for the Enea OSE Real-Time Operating System

Integrated Development Environment (IDE) for the Enea OSE Real-Time Operating System Integrated Development Environment (IDE) for the Enea OSE Real-Time Operating System 1 operating system. Based on the standard open source Eclipse platform and C/C++ development tools, Enea Optima provides

More information

EC EMBEDDED SYSTEM

EC EMBEDDED SYSTEM Shri Angalamman College of Engineering & Technology (An ISO 9000:2008 Certified Institution) Siruganoor, Tiruchirappalli 621 105. EC 1306- EMBEDDED SYSTEM UNIT - I 1. Draw the block diagram for program

More information

INTEL IPP REALISTIC RENDERING MOBILE PLATFORM SOFTWARE DEVELOPMENT KIT

INTEL IPP REALISTIC RENDERING MOBILE PLATFORM SOFTWARE DEVELOPMENT KIT INTEL IPP REALISTIC RENDERING MOBILE PLATFORM SOFTWARE DEVELOPMENT KIT Department of computer science and engineering, Sogang university 2008. 7. 22 Deukhyun Cha INTEL PERFORMANCE LIBRARY: INTEGRATED PERFORMANCE

More information

Optimizing Configuration and Application Mapping for MPSoC Architectures

Optimizing Configuration and Application Mapping for MPSoC Architectures Optimizing Configuration and Application Mapping for MPSoC Architectures École Polytechnique de Montréal, Canada Email : Sebastien.Le-Beux@polymtl.ca 1 Multi-Processor Systems on Chip (MPSoC) Design Trends

More information

DESIGN METHODOLOGY FOR EMBEDDED COMPUTER VISION SYSTEMS

DESIGN METHODOLOGY FOR EMBEDDED COMPUTER VISION SYSTEMS DESIGN METHODOLOGY FOR EMBEDDED COMPUTER VISION SYSTEMS Sankalita Saha and Shuvra S. Bhattacharyya Abstract Computer vision has emerged as one of the most popular domains of embedded applications. The

More information

Low-Overhead Hard Real-time Aware Interconnect Network Router

Low-Overhead Hard Real-time Aware Interconnect Network Router Low-Overhead Hard Real-time Aware Interconnect Network Router Michel A. Kinsy! Department of Computer and Information Science University of Oregon Srinivas Devadas! Department of Electrical Engineering

More information

Resource Utilization of Middleware Components in Embedded Systems

Resource Utilization of Middleware Components in Embedded Systems Resource Utilization of Middleware Components in Embedded Systems 3 Introduction System memory, CPU, and network resources are critical to the operation and performance of any software system. These system

More information

Cisco Integrated Services Routers Performance Overview

Cisco Integrated Services Routers Performance Overview Integrated Services Routers Performance Overview What You Will Learn The Integrated Services Routers Generation 2 (ISR G2) provide a robust platform for delivering WAN services, unified communications,

More information

Overview. Surveillance Systems. The Smart Camera - Hardware

Overview. Surveillance Systems. The Smart Camera - Hardware Overview A Mobile AgentAgent-based System for Dynamic Task Allocation in Clusters of Embedded Smart Cameras Introduction The Smart Camera Michael Bramberger1,, Bernhard Rinner1, and Helmut Schwabach Surveillance

More information

Chapter 13: I/O Systems

Chapter 13: I/O Systems Chapter 13: I/O Systems Chapter 13: I/O Systems I/O Hardware Application I/O Interface Kernel I/O Subsystem Transforming I/O Requests to Hardware Operations Streams Performance 13.2 Objectives Explore

More information

ISSUES IN HARDWARE SOFTWARE DESIGN AND CO-DESIGN

ISSUES IN HARDWARE SOFTWARE DESIGN AND CO-DESIGN Embedded Software development Process and Tools: Lesson-6 ISSUES IN HARDWARE SOFTWARE DESIGN AND CO-DESIGN 1 1. Embedded system design 2 Two approaches for the embedded system design device programmer

More information

Real-Time Operating Systems for MPSoCs

Real-Time Operating Systems for MPSoCs Real-Time Operating Systems for MPSoCs Hiroyuki Tomiyama Graduate School of Information Science Nagoya University http://member.acm.org/~hiroyuki MPSoC 2009 1 Contributors Hiroaki Takada Director and Professor

More information

High Performance or Cycle Accuracy?

High Performance or Cycle Accuracy? CHIP DESIGN High Performance or Cycle Accuracy? You can have both! Bill Neifert, Carbon Design Systems Rob Kaye, ARM ATC-100 AGENDA Modelling 101 & Programmer s View (PV) Models Cycle Accurate Models Bringing

More information

Multiple Choice Questions. Chapter 1

Multiple Choice Questions. Chapter 1 Multiple Choice Questions Chapter 1 Each question has four choices. Choose most appropriate choice of the answer. 1. An embedded system must have (a) hard disk (b) processor and memory (c) operating system

More information

Using FPGA Prototyping Board as an SoC Verification and Integration Platform

Using FPGA Prototyping Board as an SoC Verification and Integration Platform Using FPGA Prototyping Board as an SoC Verification and Integration Platform 06-25-10 Abstract Size of new designs has grown so much that it easily allows creation of the entire system containing microprocessor

More information

RAPID PROTOTYPING PLATFORM FOR RECONFIGURABLE IMAGE PROCESSING

RAPID PROTOTYPING PLATFORM FOR RECONFIGURABLE IMAGE PROCESSING RAPID PROTOTYPING PLATFORM FOR RECONFIGURABLE IMAGE PROCESSING B.Kovář 1, J. Kloub 1, J. Schier 1, A. Heřmánek 1, P. Zemčík 2, A. Herout 2 (1) Institute of Information Theory and Automation Academy of

More information

Extending Your Skills to LabVIEW Real-Time and LabVIEW FPGA

Extending Your Skills to LabVIEW Real-Time and LabVIEW FPGA Extending Your Skills to LabVIEW Real-Time and LabVIEW FPGA Fanie Coetzer Field Sales Engineer Agenda 1. Using the LabVIEW Project 2. LabVIEW FPGA I/O Nodes 3. Working with the fixed-point data type 4.

More information

Stream Processing on GPUs Using Distributed Multimedia Middleware

Stream Processing on GPUs Using Distributed Multimedia Middleware Stream Processing on GPUs Using Distributed Multimedia Middleware Michael Repplinger 1,2, and Philipp Slusallek 1,2 1 Computer Graphics Lab, Saarland University, Saarbrücken, Germany 2 German Research

More information

A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing

A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing Liang-Teh Lee, Kang-Yuan Liu, Hui-Yang Huang and Chia-Ying Tseng Department of Computer Science and Engineering,

More information

Introduction to Embedded System Design using Zynq

Introduction to Embedded System Design using Zynq Introduction to Embedded System Design using Zynq Zynq Vivado 2015.4 Version This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able

More information

Virtual Network Provisioning and Fault-Management across Multiple Domains

Virtual Network Provisioning and Fault-Management across Multiple Domains Virtual Network Provisioning and Fault-Management across Multiple Domains Distinguished Speaker Series Democritus University of Thrace, Greece Panagiotis Papadimitriou November 2010 Introduction The Internet

More information

3. Name some network architectures prevalent in machines supporting the message passing paradigm. Ans: Ethernet, Infiniband, Tree

3. Name some network architectures prevalent in machines supporting the message passing paradigm. Ans: Ethernet, Infiniband, Tree Frequently asked questions Parallel Computing by Prof. Subodh Kumar, Department of Computer Science and Engineering, IIT Delhi, Frequently asked questions: 1. What is shared-memory architecture? Ans: A

More information

C for Process Networks

C for Process Networks C for Process Networks Stefan Schürmans, Weihua Sheng, Anastasia Stulova, Jeronimo Castrillon 3 rd Workshop on Mapping of Applications to MPSoCs, Schloss Rheinfels, June 29 th 2010 Institute for Integrated

More information

Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation

Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation Florent Berthelot, Fabienne Nouvel, Dominique Houzet To cite this version: Florent Berthelot,

More information

Energiatehokas laskenta Ubi-sovelluksissa

Energiatehokas laskenta Ubi-sovelluksissa Energiatehokas laskenta Ubi-sovelluksissa Jarmo Takala Tampereen teknillinen yliopisto Tietokonetekniikan laitos email: jarmo.takala@tut.fi Energy-Efficiency Comparison: VGA 30 frames/s, 512kbit/s Software

More information

Certified LabVIEW Embedded Systems Developer (CLED) Certification and Exam Preparation Guide

Certified LabVIEW Embedded Systems Developer (CLED) Certification and Exam Preparation Guide Certified LabVIEW Embedded Systems Developer (CLED) Certification and Exam Preparation Guide CLED Overview... 2 CLED Exam Eligibility Criteria... 3 CLED Exam Preparation Resources... 3 CLED Overview...

More information

Graphical Programming of All programmable SoC s Corné Westeneng Field Sales Engineer

Graphical Programming of All programmable SoC s Corné Westeneng Field Sales Engineer Graphical Programming of All programmable SoC s Corné Westeneng Field Sales Engineer We all have a challenge to solve 2 The LabVIEW RIO Architecture Analog Input Processor FPGA Analog Output Digital I/O

More information

ARM Cortex A9. Alyssa Colyette Xiao Ling Zhuang

ARM Cortex A9. Alyssa Colyette Xiao Ling Zhuang ARM Cortex A9 Alyssa Colyette Xiao Ling Zhuang Outline Introduction ARMv7-A ISA Cortex-A9 Microarchitecture o Single and Multicore Processor Advanced Multicore Technologies Integrating System on Chips

More information

Experience with the integration of distribution middleware into partitioned systems

Experience with the integration of distribution middleware into partitioned systems Experience with the integration of distribution middleware into partitioned systems Héctor Pérez Tijero (perezh@unican.es) J. Javier Gutiérrez García (gutierjj@unican.es) Computers and Real-Time Group,

More information

Software Driven Embedded Systems Design. A Use Case Analysis: Avoiding a hardware dependent software disaster using Virtual System Prototyping

Software Driven Embedded Systems Design. A Use Case Analysis: Avoiding a hardware dependent software disaster using Virtual System Prototyping Software Driven Embedded Systems Design A Use Case Analysis: Avoiding a hardware dependent software disaster using Virtual System Prototyping Overview Traditional System Development: A use case Traditional

More information

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com Best Practises for LabVIEW FPGA Design Flow 1 Agenda Overall Application Design Flow Host, Real-Time and FPGA LabVIEW FPGA Architecture Development FPGA Design Flow Common FPGA Architectures Testing and

More information

The Benefits of Using MIPS Processors for Consumer Audio Applications

The Benefits of Using MIPS Processors for Consumer Audio Applications The Benefits of Using MIPS Processors for Consumer Audio Applications by Rajesh Palani and Radhika Thekkath MIPS Technologies, Inc. Consumer devices such as mobile audio players, set-top boxes (STBs),

More information

Product Development Flow Including Model- Based Design and System-Level Functional Verification

Product Development Flow Including Model- Based Design and System-Level Functional Verification Product Development Flow Including Model- Based Design and System-Level Functional Verification 2006 The MathWorks, Inc. Ascension Vizinho-Coutry, avizinho@mathworks.fr Agenda Introduction to Model-Based-Design

More information

System Considerations

System Considerations System Considerations Interfacing Performance Power Size Ease-of Use Programming Interfacing Debugging Cost Device cost System cost Development cost Time to market Integration Peripherals Different Needs?

More information

Inspecting GNU Radio Applications with ControlPort and Performance Counters

Inspecting GNU Radio Applications with ControlPort and Performance Counters Inspecting GNU Radio Applications with ControlPort and Performance Counters Thomas W. Rondeau University of Pennsylvania Philadelphia, PA 19104, USA tom@trondeau.com Timothy O Shea University of Maryland

More information

Thèse. Memory Study and Dataflow Representations for Rapid Prototyping of Signal Processing Applications on MPSoCs

Thèse. Memory Study and Dataflow Representations for Rapid Prototyping of Signal Processing Applications on MPSoCs Thèse THESE INSA Rennes sous le sceau de l Université européenne de Bretagne pour obtenir le titre de DOCTEUR DE L INSA DE RENNES Spécialité : Traitement du Signal et des Images présentée par Karol Desnos

More information

Architectures and Platforms

Architectures and Platforms Hardware/Software Codesign Arch&Platf. - 1 Architectures and Platforms 1. Architecture Selection: The Basic Trade-Offs 2. General Purpose vs. Application-Specific Processors 3. Processor Specialisation

More information

Reconfigurable Computing for Embedded Systems, FPGA Devices and Software Components

Reconfigurable Computing for Embedded Systems, FPGA Devices and Software Components Reconfigurable Computing for Embedded Systems, FPGA Devices and Software Components Graham Bardouleau and James Kulp Mercury Computer Systems, Inc. Phone: 978-967-1653 Email Addresses: {gpb, jek}@mc.com

More information

Which ARM Cortex Core Is Right for Your Application: A, R or M?

Which ARM Cortex Core Is Right for Your Application: A, R or M? Which ARM Cortex Core Is Right for Your Application: A, R or M? Introduction The ARM Cortex series of cores encompasses a very wide range of scalable performance options offering designers a great deal

More information

The Effect and Technique of System Coherence in ARM Multicore Technology

The Effect and Technique of System Coherence in ARM Multicore Technology The Effect and Technique of System Coherence in ARM Multicore Technology John Goodacre Senior Program Manager ARM Division Cambridge, UK Cortex -A9 Microarchitecture (single core variant) Coresight / JTAG

More information

Case Study on MSC8144

Case Study on MSC8144 FTF-Orlando, June 25-28 2007 AN317: Porting Single-Core Applications to Multi-Core Platforms Case Study on MSC8144 Michael Kardonik Applications Engineer After This Presentation You Will Know basic approaches

More information

MAQAO Performance Analysis and Optimization Tool

Andres S. CHARIF-RUBIAL andres.charif@uvsq.fr Performance Evaluation Team, University of Versailles S-Q-Y http://www.maqao.org VI-HPS 18th Grenoble 18/22

Chapter 1 - Web Server Management and Cluster Topology

Objectives At the end of this chapter, participants will be able to understand: Web server management options provided by Network Deployment Clustered Application Servers Cluster creation and management

Packet Processing with PowerPC on the NetFPGA

CSE237B Final Report Erik Rubow December 11, 2009 1 Introduction The NetFPGA[2] community has made significant progress in making experimentation with high-speed

Implementing Video Image Processing Algorithms on FPGA

2015 The MathWorks, Inc. Video Image Processing and Computer Vision Video Image Processing Video in and out Gamma correction Color balancing Noise

THE MULTI-DATAFLOW COMPOSER TOOL: A RUNTIME RECONFIGURABLE HDL PLATFORM COMPOSER

Conference on Design and Architectures for Signal and Image Processing 2011, Electronic Chips & Systems design Initiative, November 2nd-4th, 2011, Tampere, Finland

MPSoC Virtual Platforms

CASTNESS 2007 Workshop Rainer Leupers Software for Systems on Silicon (SSS) RWTH Aachen University Institute for Integrated Signal Processing Systems Why focus on virtual platforms?

Design of AMBA AHB interface around OpenRISC 1200 processor and comparing the implementation with existing architecture

IJSRD - International Journal for Scientific Research & Development Vol. 1, Issue 3, 2013 ISSN (online): 2321-0613

Embedded vision with FPGA vs CUDA processing. Directions and platform proposal

Reconfigurable and High Performance Computing Lab INAOE Puebla, Mexico WASC 2014 20 June 2014 Dr. Miguel Arias Estrada ariasmo@inaoep.mx

White Paper. Real-time Capabilities for Linux SGI REACT Real-Time for Linux

Abstract This white paper describes the real-time capabilities provided by SGI REACT Real-Time for Linux software. REACT enables

Hardware Acceleration for Just-In-Time Compilation on Heterogeneous Embedded Systems

A. Carbon, Y. Lhuillier, H.-P. Charles CEA LIST DACLE division Embedded Computing Embedded Software Laboratories France

Synopsys experience with OpenVX for a Face Tracking Application

Pierre Paulin May 4th, 2016 Copyright 2016 Synopsys Outline Optimizing the OpenVX Graph Manager for Embedded Multi-core Architectures

F6 Model-driven Development Kit: Information Architecture Platform Last updated: June 12, 2013

The F6 IAP is a state-of-the-art software platform and toolsuite for the model-driven development of distributed

Kalray MPPA Massively Parallel Processing Array

Next-Generation Accelerated Computing February 2015 2015 Kalray, Inc. All Rights Reserved Accelerated Computing

Design a medical application for Android platform using model-driven development approach

J. Yepes, L. Cobaleda 2, J. Villa D, J. Aedo ARTICA, Microelectronic and Control Research Group 2 ARTICA, Software

MAXware: acceleration in HPC. R. Dimond, M. J. Flynn, O. Mencer and O. Pell Maxeler Technologies contact:

Contact: flynn@maxeler.com. HPC: the case for accelerators

Move From Design to Deployment Faster. ni.com

What's New in LabVIEW Real-Time and LabVIEW FPGA. Supporting Embedded Designers Through Integrated System Design Software Communication Interface Processing Elements

TrustZone, DSP and SIMD Extensions

ARCHITECTURE FOR MULTIMEDIA SYSTEMS ARM Cortex-A Series with Jazelle, TrustZone, DSP and SIMD Extensions Professor: Cristina Silvano Presented by: Vu Duc Xuan Quang 736324 Contents Cortex-A series

Adaptive resource remapping through live migration of virtual machines

Muhammad Atif Peter Strazdins* Research School of Computer Science The Australian National University Contents Introduction Related

SoC Platforms and CPU Cores

COE838: Systems on Chip Design http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University

Chapter 13: I/O Systems

I/O Hardware, Application I/O Interface, Kernel I/O Subsystem, Transforming I/O Requests to Hardware Operations, Streams, Performance. Objectives I/O Hardware

Existing Architectures & Systems: OpenTracker, DWARF, VRPN, Trackd

3rd Joint Advanced Student School (JASS 2005) Course 3 - : OpenTracker, DWARF, VRPN, Trackd nachev@in.tum.de Technische Universität München Content Introduction Important goals and requirements OpenTracker

System Design and Methodology/ Embedded Systems Design (Modeling and Design of Embedded Systems)

System Design&Methodologies Fö 1&2 Course Information TDTS30/TDDI08

On some Potential Research Contributions to the Multi-Core Enterprise

Oded Maler CNRS - VERIMAG Grenoble, France February 2009 Background This presentation is based on observations made in the Athole project

How to Choose a CPU Core for Multi-CPU SOC Designs

MIPS Technologies, Inc. June 2002 The use of multiple CPUs in SOC designs is becoming increasingly popular. Processor cores being considered for multi-CPU

OPART: Towards an Open Platform for Abstraction of Real-Time Communication in Cross-Domain Applications

Simplification of Developing Process in Real-time Networked Medical Systems Morteza Hashemi Farzaneh,

Address SoC routing congestion with 2.5D SiP

The System in Package in a 2D package is the best of both worlds approach that the electronics industry has come up with to resolve a design dilemma. By Ayan

PTask: Operating System Abstractions To Manage GPUs as Compute Devices

C.J. Rossbach, J. Currey - Microsoft Research B. Ray, E. Witchel - University of Texas M. Silberstein - Technion Presentation: Adam

Linux Performance Optimizations for Big Data Environments

Dominique A. Heger Ph.D. DHTechnologies (Performance, Capacity, Scalability) www.dhtusa.com Data Nubes (Big Data, Hadoop, ML) www.datanubes.com

LabVIEW programming II

FYS3240 PC-based instrumentation and microcontrollers Spring 2011 Lecture #3 Bekkeng 30.1.2011 Control flow vs. dataflow programming Dataflow Programming Overview With a dataflow

Xilinx Zynq. Development. Project Experience. Xilinx Zynq. Xilinx Zynq. CIC Offshore Solution. Xilinx Zynq

CIC Offshore Solution Xilinx Zynq Daiichi Systems Pvt. Ltd. Programming Tools Re-engineering/Refactoring Zynq Hardware & Software Embedded Firmware Embedded Application

12. Introduction to Virtual Machines

Modern Applications Challenges of Virtual Machine Monitors Historical Perspective Classification

Operating System Support for Multiprocessor Systems-on-Chip

Dr. Gabriel Marchesan Almeida. Agenda: Introduction. Adaptive System + Shop Architecture. Preliminary Results. Perspectives & Conclusions

NORTHEASTERN UNIVERSITY Graduate School of Engineering. Thesis Title: CRASH: Cognitive Radio Accelerated with Software and Hardware

Author: Jonathon Pendlum Department: Electrical and Computer Engineering

Automatic CUDA Code Synthesis Framework for Multicore CPU and GPU architectures

1 Hanwoong Jung, and 2 Youngmin Yi, 1 Soonhoi Ha 1 School of EECS, Seoul National University, Seoul, Korea {jhw7884, sha}@iris.snu.ac.kr

Accelerate Cloud Computing with the Xilinx Zynq SoC

Xcellence in New Applications. A novel reconfigurable hardware accelerator speeds the processing of applications based on the MapReduce

Digital Signal Processing

Lab 2: Embedded DSP implementation of energy-based voice activity detector Toon van Waterschoot, Marc Moonen ESAT Department of Electrical Engineering KU Leuven, Belgium

LabVIEW Real-Time and Embedded

FYS3240 PC-based instrumentation and microcontrollers Spring 2011 Lecture #10 Bekkeng, 11.5.2011 Embedded Computing An embedded system is a computer system designed to perform