High-Level Synthesis for FPGA Designs

Similar documents

Vivado Design Suite Tutorial

Model-based system-on-chip design on Altera and Xilinx platforms

Extending the Power of FPGAs. Salil Raje, Xilinx

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001

7a. System-on-chip design and prototyping platforms

Eli Levi Eli Levi holds B.Sc.EE from the Technion.Working as field application engineer for Systematics, Specializing in HDL design with MATLAB and

Digital Systems Design! Lecture 1 - Introduction!!

Xilinx Training Course Listing

Product Development Flow Including Model- Based Design and System-Level Functional Verification

LogiCORE IP AXI Performance Monitor v2.00.a

9 REASONS WHY THE VIVADO DESIGN SUITE ACCELERATES DESIGN PRODUCTIVITY

Codesign: The World Of Practice

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com

Rapid System Prototyping with FPGAs

Beschleunigen von Algorithmen mit High-Level Synthese auf Xilinx Zynq

Quartus II Software Design Series : Foundation. Digitale Signalverarbeitung mit FPGA. Digitale Signalverarbeitung mit FPGA (DSF) Quartus II 1

Architectures and Platforms

Reconfigurable System-on-Chip Design

Hunting Asynchronous CDC Violations in the Wild

Simplifying Embedded Hardware and Software Development with Targeted Reference Designs

Seeking Opportunities for Hardware Acceleration in Big Data Analytics

Aims and Objectives. E 3.05 Digital System Design. Course Syllabus. Course Syllabus (1) Programmable Logic

Modeling a GPS Receiver Using SystemC

How To Design An Image Processing System On A Chip

FPGA Design From Scratch It all started more than 40 years ago

Using Vivado Design Suite with Version Control Systems Author: Jim Wu

System-on. on-chip Design Flow. Prof. Jouni Tomberg Tampere University of Technology Institute of Digital and Computer Systems.

BUILD VERSUS BUY. Understanding the Total Cost of Embedded Design.

Getting Started with Embedded System Development using MicroBlaze processor & Spartan-3A FPGAs. MicroBlaze

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai Jens Onno Krah

White Paper FPGA Performance Benchmarking Methodology

All Programmable Logic. Hans-Joachim Gelke Institute of Embedded Systems. Zürcher Fachhochschule

Systems on Chip Design

International Journal of Advancements in Research & Technology, Volume 2, Issue3, March ISSN

Open Flow Controller and Switch Datasheet

Building an Embedded Processor System on a Xilinx Zync FPGA (Profiling): A Tutorial

Echtzeittesten mit MathWorks leicht gemacht Simulink Real-Time Tobias Kuschmider Applikationsingenieur

CFD Implementation with In-Socket FPGA Accelerators

AXI Performance Monitor v5.0

Implementation of emulated digital CNN-UM architecture on programmable logic devices and its applications

Exploiting Stateful Inspection of Network Security in Reconfigurable Hardware

Attention. restricted to Avnet s X-Fest program and Avnet employees. Any use

KEEP IT SYNPLE STUPID

How to download and install ISE WebPACK

Architekturen und Einsatz von FPGAs mit integrierten Prozessor Kernen. Hans-Joachim Gelke Institute of Embedded Systems Professur für Mikroelektronik

VHDL-Testbench as Executable Specification

FPGA Synthesis Example: Counter

Digital Systems. Role of the Digital Engineer

Introduction to Xilinx System Generator Part II. Evan Everett and Michael Wu ELEC Spring 2013

A DA Serial Multiplier Technique based on 32- Tap FIR Filter for Audio Application

Lesson 7: SYSTEM-ON. SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY. Chapter-1L07: "Embedded Systems - ", Raj Kamal, Publs.: McGraw-Hill Education

Introduction to Digital System Design

LMS is a simple but powerful algorithm and can be implemented to take advantage of the Lattice FPGA architecture.

EEM870 Embedded System and Experiment Lecture 1: SoC Design Overview

FPGA area allocation for parallel C applications

Dataflow Programming with MaxCompiler

FPGA Prototyping Primer

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines

synthesizer called C Compatible Architecture Prototyper(CCAP).

Design of a High Speed Communications Link Using Field Programmable Gate Arrays

Systolic Computing. Fundamentals

FPGA Clocking. Clock related issues: distribution generation (frequency synthesis) multiplexing run time programming domain crossing

System Generator for DSP

Custom design services

PCIe Core Output Products Generation (Generate Example Design)

Clocking Wizard v5.1

International Workshop on Field Programmable Logic and Applications, FPL '99

Embedded Vision on FPGAs The MathWorks, Inc. 1

Module-I Lecture-I Introduction to Digital VLSI Design Flow

Case Study: Improving FPGA Design Speed with Floorplanning

FPGA-based MapReduce Framework for Machine Learning

if-then else : 2-1 mux mux: process (A, B, Select) begin if (select= 1 ) then Z <= A; else Z <= B; end if; end process;

Implementation and Design of AES S-Box on FPGA

High-Speed Computing & Co-Processing with FPGAs

FPGA and ASIC Implementation of Rho and P-1 Methods of Factoring. Master s Thesis Presentation Ramakrishna Bachimanchi Director: Dr.

Basics of Simulation Technology (SPICE), Virtual Instrumentation and Implications on Circuit and System Design

Verification. Formal. OneSpin 360 LaunchPad Adaptive Formal Platform. May 2015

Modeling Registers and Counters

Electronic system-level development: Finding the right mix of solutions for the right mix of engineers.

DRAFT Gigabit network intrusion detection systems

White Paper. S2C Inc Technology Drive, Suite 620 San Jose, CA 95110, USA Tel: Fax:

System on Chip Design. Michael Nydegger

DDS. 16-bit Direct Digital Synthesizer / Periodic waveform generator Rev Key Design Features. Block Diagram. Generic Parameters.

FSMD and Gezel. Jan Madsen

MAJORS: Computer Engineering, Computer Science, Electrical Engineering

Tensilica Software Development Toolkit (SDK)

Example-driven Interconnect Synthesis for Heterogeneous Coarse-Grain Reconfigurable Logic

Software-Programmable FPGA IoT Platform. Kam Chuen Mak (Lattice Semiconductor) Andrew Canis (LegUp Computing) July 13, 2016

ELEC 5260/6260/6266 Embedded Computing Systems

FPGA Acceleration using OpenCL & PCIe Accelerators MEW 25

Isolation Design Flow for Xilinx 7 Series FPGAs or Zynq-7000 AP SoCs (ISE Tools)

System-on-Chip Design with Virtual Components

Achieving High Performance DDR3 Data Rates

RAPID PROTOTYPING OF DIGITAL SYSTEMS Second Edition

Transcription:

High-Level Synthesis for FPGA Designs BRINGING BRINGING YOU YOU THE THE NEXT NEXT LEVEL LEVEL IN IN EMBEDDED EMBEDDED DEVELOPMENT DEVELOPMENT Frank de Bont Trainer consultant Cereslaan 10b 5384 VT Heesch +31 (0)412 660088 info@core-vision.nl www.core-vision.nl 1

Agenda Need for High-Level Synthesis High-Level Synthesis System Integration Design Exploration High-Level Synthesis Flow Control & Datapath Extraction Scheduling and Binding Example FIR C-code Who is Core Vision 2

- 4 1 Need for High-Level Synthesis 54255**slide Algorithmic-based approaches are popular due to accelerated design time and time-to-market pressures Larger designs pose challenges in design and verification of hardware Industry trend is moving towards hardware acceleration to enhance performance and productivity CPU-intensive tasks are now offloaded to hardware accelerator Hardware accelerators require a lot of time to understand and design Vivado HLS tool converts algorithmic description written in C-based design flow into hardware description (RTL) Elevates the abstraction level from RTL to algorithms High-level synthesis is essential for maintaining design productivity for large designs 3

High-Level Synthesis: HLS High-level synthesis Creates an RTL implementation from source code C, C++, SystemC Coding style impacts hardware realization Limitations on certain constructs and access to libraries Extracts control and dataflow from the source code Implements the design based on defaults and user-applied directives Many implementations are possible from the same source description Smaller designs, faster designs, optimal designs Enables design exploration 4

-6 1 80692**slide System Integration 5

-7 1 54257**slide Design Exploration 6

High-Level Synthesis flow Hardware extraction from C code Control and datapath can be extracted from C code at the top level Same principles used in the example can be applied to sub-functions At some point in the top-level control flow, control is passed to a sub-function Sub-function can be implemented to execute concurrently with the top level and or other sub-functions 7

High-Level Synthesis Flow cont Scheduling and binding processes create hardware design from control flow graph considering the constraints and directives Scheduling process maps the operations into cycles Binding process determines which hardware resource, or core, is used for each operation Binding decisions are considered during scheduling because the decisions in the binding process can influence the scheduling of operations For example, using a pipelined multiplier instead of a standard combinational multiplier 8

HLS: Control Extraction 9

-1 1 0 54260**slide Control and Datapath Extraction 10

Scheduling and Binding Scheduling and binding Heart of HLS Scheduling determines in which clock cycle an operation will occur Takes into account the control, dataflow, and user directives Allocation of resources can be constrained (discussed in detail later) Binding determines which library cell is used for each operation Takes into account component delays and user directives 11

Scheduling ide Operations in the control flow graph are mapped into clock cycles Technology and user constraints impact the schedule Faster technology (or slower clock) can allow more operations to occur in the same clock cycle Code also impacts the schedule Code implications and data dependencies must be obeyed 12

Binding Binding is where operations are mapped to hardware Operators extracted from the C code are mapped to RTL cores Binding decision: to share Given the following schedule Binding must use two multipliers because both are in the same cycle It can decide to use an adder and subtractor or one addsub Binding decision: or not to share Given the following schedule Binding may decide to share the multipliers (each is used in a different cycle) Or it may decide the cost of sharing (MUXing) would impact timing and it may decide not to share them It may make this same decision in the first example above as well 13

Example FIR C-Code By default, loops are rolled Each C loop iteration implemented in the same state Each C loop iteration implemented with same resources Loops can be unrolled if their indices are statically determinable at elaboration time 14

Example FIR C-Code cont void fir (... acc = 0; loop: for ( i=3; i>= 0; i--){ if (i==0){ acc +=x*c[0]; shiftreg[0] = x; } else { shiftreg[i] = shiftreg[i-1]; acc += shiftreg[i]*c[i]; } } *y = acc; } Read on port X can occur anywhere from the start to iteration 4 Only constraint on RDx is that it occur before the final multiplication There are no advantages to reading any earlier (unless you want it registered) However, the final multiplication is very constrained 15

Example FIR C-Code cont Schedule after loop optimization With the loop unrolled (partial / full) Dependeny on loop iterations is gone Operations can occur in parallel Design finished faster but more operators Two multipliers and two adders Schedule after array optimization With the existing code and defaults Port C is by default dual port RAM Allows two reads per clock cycle Max number of simultaneous reads and writes void fir (... acc = 0; loop: for ( i=3; i>= 0; i--){ if (i==0){ acc +=x*c[0]; shiftreg[0] = x; } else { shiftreg[i] = shiftreg[i-1]; acc += shiftreg[i]*c[i]; } } *y = acc; } 16

Example FIR C-Code cont With the C port partitioned into (4) separate ports All reads and multiply can occur in one cycle If the timing allows The additions can also occur in the same cycle The write can be performed in the same cycles Optionally the port reads and writes could be registered This solution uses much more hardware resources in only one clock cycle void fir (... acc = 0; loop: for ( i=3; i>= 0; i--){ if (i==0){ acc +=x*c[0]; shiftreg[0] = x; } else { shiftreg[i] = shiftreg[i-1]; acc += shiftreg[i]*c[i]; } } *y = acc; } 17

Core Vision Our competences Core Vision has more than 100 man years of design experience in hardand software development. Our competence areas are: System Design FPGA Design Consultancy / Training Digital Signal Processing Embedded Real-time Software App development, IOS Android Data Acquisition, digital and analog Modeling & Simulation ASIC Conversion & Prototyping PCB design & Layout Doulos & Xilinx Training Partner 18

??? Cereslaan 10b 5384 VT Heesch +31 (0)412 660088 www.core-vision.nl Email : info@core-vision.nl 19

Training Program Essentials of FPGA Design 1 day Designing for Performance 2 days Advanced FPGA Implementation 2 days Design Techniques for Lower Cost 1 day Designing with Spartan-6 and Virtex-6 Family 3 days Essential Design with the PlanAhead Analysis Tool 1 day Advanced Design with the PlanAhead Analysis Tool 2 days Xilinx Partial Reconfiguration Tools and Techniques 2 days Designing with the 7 Series Families 2 days 20

Training Program Vivado Essentials of FPGA Design 2 days Vivado Design Suite Tool Flow 1 day Vivado Design Suite for ISE Users 1 day Vivado Avanced XDC and STA for ISE Users 2 days Vivado Advanced Tools & Techniques 2 days Vivado Static Timing Analysis and XDC 2 days Debugging Techniques Using Vivado Logic Analyzer 1 day Essential Tcl Scripting for Vivado Design Suite 1 day Vivado FPGA Design Methodology 1 day 21

Training Program Designing with Multi Gigabit Serial IO 3 days High Level Synthesis with Vivado 2 days C-Based HLS Coding for Hardware Designers 1 day C-Based HLS Coding for Software Designers 1 day DSP Design Using System Generator 2 days Essential DSP Implementation Techniques for Xilinx FPGAs 2 days 22

Training Program Embedded Systems Development 2 days Embedded Systems Software Development 2 days Advanced Features and Techniques of SDK 2 days Advanced Features and Techniques of EDK 2 days Zynq All Programmable SoC Systems Archicture 2 days C Language Programming with SDK 2 days 23

Training Program VHDL Design for FPGA 3 days Advanced VDHL 2 days Comprehensive VHDL 5 days Exprt VHDL Verification 3 days Expert VDHL Design 2 days Expert VHDL 5 days Essential Digital Design Techniques 2 days 24