Introduction to Xilinx System Generator Part II. Evan Everett and Michael Wu ELEC 433 - Spring 2013




Outline
- Introduction to FPGAs and Xilinx System Generator
- System Generator basics
- Fixed point data representation
- Sample times
- Tips for building models

Fixed Point Binary Numbers
- MATLAB generally uses high precision values
  - 64-bit floating point: huge dynamic range
  - Impractical in hardware
- System Generator uses fixed point numbers instead
  - Limited, but flexible, range & precision
  - Pro: smaller hardware
  - Con: requires attention to overflow & quantization

Fixed-Point Representation (unsigned)
- Type name: UFix8_5 (UFix = unsigned, 8 = total bits, 5 = fractional bits)
- 1 0 1 1 0 0 1 0 = 5.5625 (binary point: 101.10010)
- 8 total bits, 8-5 = 3 integer bits, 5 fractional bits

Fixed-Point Representation (signed)
- Type name: Fix8_4 (Fix = signed, 8 = total bits, 4 = fractional bits)
- 1 0 1 1 0 0 1 0 = -4.875 (binary point: 1011.0010)
- 8 total bits, 8-4 = 4 integer bits, 4 fractional bits
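The two bit patterns above can be checked with a short sketch. It assumes signed (Fix) values are two's complement; the helper name `fixed_to_float` is made up for illustration and is not part of System Generator.

```python
def fixed_to_float(bits: str, frac_bits: int, signed: bool) -> float:
    """Interpret a bit string as a fixed-point number with frac_bits fractional bits."""
    raw = int(bits, 2)
    # Assumption: signed values use two's complement, so a leading 1
    # means the raw integer is offset by 2**(number of bits).
    if signed and bits[0] == "1":
        raw -= 1 << len(bits)
    return raw / (1 << frac_bits)


ufix8_5 = fixed_to_float("10110010", frac_bits=5, signed=False)
fix8_4 = fixed_to_float("10110010", frac_bits=4, signed=True)
print(ufix8_5)  # 5.5625
print(fix8_4)   # -4.875
```

The same eight bits give two different values: only the declared type decides where the binary point sits and whether the top bit carries a negative weight.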

Range vs. Precision
[Figure: number wheels of all sixteen 4-bit codes, no fractional bits]
- Unsigned UFix4_0: 0000 = 0 up to 1111 = 15, in steps of 1
- Signed Fix4_0: 1000 = -8 up to 0111 = 7, in steps of 1

Range vs. Precision
[Figure: number wheels of all sixteen 4-bit codes, 1 fractional bit]
- Unsigned UFix4_1: 0000 = 0 up to 1111 = 7.5, in steps of 0.5
- Signed Fix4_1: 1000 = -4.0 up to 0111 = 3.5, in steps of 0.5

Range vs. Precision
[Figure: number wheels of all sixteen 4-bit codes, 2 fractional bits]
- Unsigned UFix4_2: 0000 = 0 up to 1111 = 3.75, in steps of 0.25
- Signed Fix4_2: 1000 = -2.0 up to 0111 = 1.75, in steps of 0.25
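The trade-off in the three wheels above (each extra fractional bit halves both the step and the range) can be reproduced with a small sketch. `fixed_range` is a hypothetical helper, and signed types are assumed to be two's complement.

```python
def fixed_range(total_bits: int, frac_bits: int, signed: bool):
    """Return (min, max, step) for a fixed-point type such as UFix4_2 or Fix4_1."""
    step = 2.0 ** -frac_bits  # weight of the least significant bit
    if signed:
        # Assumption: two's complement, so one more negative code than positive.
        lo = -(1 << (total_bits - 1)) * step
        hi = ((1 << (total_bits - 1)) - 1) * step
    else:
        lo = 0.0
        hi = ((1 << total_bits) - 1) * step
    return lo, hi, step


for frac in (0, 1, 2):
    print(f"UFix4_{frac}:", fixed_range(4, frac, signed=False))
    print(f" Fix4_{frac}:", fixed_range(4, frac, signed=True))
```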

Fixed Point Arithmetic
- Addition & multiplication are provided
  - Adders use general logic
  - Multipliers use dedicated blocks
- Division is expensive to implement and rarely used
  - Multi-cycle operation
  - Try to replace with shifts
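As a rough illustration of replacing division with shifts: dividing by a power of two needs no divider at all. The sketch below reuses the UFix8_5 bit pattern from the representation slide; `to_value` is a hypothetical helper. It shows two variants, assuming the divisor is 4: reinterpreting the binary point (no bits lost) and shifting the raw bits right (low bits truncated).

```python
def to_value(raw: int, frac_bits: int) -> float:
    """Value of an unsigned raw integer with frac_bits fractional bits."""
    return raw / (1 << frac_bits)


raw = 0b10110010                      # UFix8_5 pattern, value 5.5625
original = to_value(raw, 5)

# Divide by 4 by moving the binary point two places: same bits, now UFix8_7.
moved_point = to_value(raw, 7)

# Divide by 4 by shifting right two places and keeping 5 fractional bits:
# the two lowest bits fall off, so the result is truncated.
shifted = to_value(raw >> 2, 5)

print(original, moved_point, shifted)  # 5.5625 1.390625 1.375
```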

Fixed Point Arithmetic
- More bits are needed with each operation for full precision
  - N bit + M bit -> max(N,M)+1 bit (integer growth)
  - N bit × M bit -> N+M bit (integer and fractional growth)
- May not always want to expand bitwidth, but must understand risk of overflow and/or quantization
  - N bit + N bit -> N bit (overflow risk)
  - N bit × N bit -> N bit (integer overflow; fractional quantization)
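The growth rules above can be written out as a minimal sketch. It assumes both operands have the same signedness and tracks only total and fractional bit counts; the `Fmt` type and function names are invented for this example.

```python
from typing import NamedTuple


class Fmt(NamedTuple):
    total: int  # total number of bits
    frac: int   # number of fractional bits


def add_full_precision(a: Fmt, b: Fmt) -> Fmt:
    """Full-precision sum: align binary points, then add one integer (carry) bit."""
    frac = max(a.frac, b.frac)
    integer = max(a.total - a.frac, b.total - b.frac) + 1
    return Fmt(integer + frac, frac)


def mul_full_precision(a: Fmt, b: Fmt) -> Fmt:
    """Full-precision product: total and fractional widths both add."""
    return Fmt(a.total + b.total, a.frac + b.frac)


x = Fmt(4, 2)
print(add_full_precision(x, x))  # Fmt(total=5, frac=2)
print(mul_full_precision(x, x))  # Fmt(total=8, frac=4)
```

When both operands share the same number of fractional bits, the sum width reduces to max(N,M)+1 as stated on the slide.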

Fixed Point Quantization
- Occurs when available fractional bits are insufficient
- Truncate (default): just drop bits past LSB; more efficient
- Round: choose nearest representable value
- Example:
  - Full Precision (UFix_11_8): 1 0 1 1 0 0 1 0 1 0 1 = 5.58203125
  - Truncated (UFix_8_5): 1 0 1 1 0 0 1 0 = 5.5625 (Δ=0.01953125)
  - Rounded (UFix_8_5): 1 0 1 1 0 0 1 1 = 5.59375 (Δ=0.01171875)
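The quantization example can be reproduced numerically. This is a sketch, not the tool's implementation: `quantize` is a hypothetical helper for non-negative values, and rounding is assumed to resolve exact ties upward.

```python
import math


def quantize(value: float, frac_bits: int, mode: str = "truncate") -> float:
    """Reduce value to frac_bits fractional bits by truncating or rounding."""
    scaled = value * (1 << frac_bits)
    if mode == "truncate":
        raw = math.floor(scaled)        # drop everything past the new LSB
    elif mode == "round":
        raw = math.floor(scaled + 0.5)  # assumption: ties round upward
    else:
        raise ValueError(f"unknown mode: {mode}")
    return raw / (1 << frac_bits)


full = 0b10110010101 / 2**8             # UFix_11_8 value 5.58203125
truncated = quantize(full, 5, "truncate")
rounded = quantize(full, 5, "round")
print(truncated, full - truncated)      # 5.5625 0.01953125
print(rounded, rounded - full)          # 5.59375 0.01171875
```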

Fixed Point Overflow
- Occurs when available integer bits are insufficient
- Required bits increase with every operation
  - This can add up very fast
  - Think of a long FIR filter
- Most blocks have an Error on Overflow option
  - Great for debugging in simulation
  - Sim stops with error when overflow occurs
- Overflow in hardware is very hard to isolate: simulate to check first

Fixed Point Overflow: Overflow Options
- Full Precision: no overflow
    2.50   UFix_4_2    1010
  + 3.50   UFix_4_2    1110
  = 6.00   UFix_5_2   11000
- Notice the bit growth

Fixed Point Overflow: Overflow Options
- Wrap
    2.50   UFix_4_2   1010
  + 3.50   UFix_4_2   1110
  = 2.00   UFix_4_2   1000   (true sum: 6.00)
- Happens by default in hardware if you don't give enough bits
- Not always bad; sometimes this is intentional
- Often the source of nonsensical results

Fixed Point Overflow: Overflow Options
- Saturate
    2.50   UFix_4_2   1010
  + 3.50   UFix_4_2   1110
  = 3.75   UFix_4_2   1111   (true sum: 6.00)
- Stops at max/min to prevent overflow
- Sign of answer will be correct
- More expensive in hardware (requires comparator & mux for every operation)
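The three overflow options for 2.50 + 3.50 can be compared in a few lines. This sketch handles unsigned values only, and `fit_unsigned` is a hypothetical helper operating on the raw integer behind each UFix value.

```python
def fit_unsigned(raw: int, total_bits: int, mode: str) -> int:
    """Force a raw unsigned integer into total_bits using wrap or saturate."""
    max_raw = (1 << total_bits) - 1
    if mode == "wrap":
        return raw & max_raw               # keep only the low bits
    if mode == "saturate":
        return min(max(raw, 0), max_raw)   # clamp to the representable range
    raise ValueError(f"unknown mode: {mode}")


FRAC_BITS = 2
a, b = 0b1010, 0b1110                      # 2.50 and 3.50 as UFix_4_2
total = a + b                              # 0b11000, needs 5 bits

full_precision = total / 2**FRAC_BITS
wrapped = fit_unsigned(total, 4, "wrap") / 2**FRAC_BITS
saturated = fit_unsigned(total, 4, "saturate") / 2**FRAC_BITS
print(full_precision, wrapped, saturated)  # 6.0 2.0 3.75
```

Wrap costs nothing because the extra bit is simply not kept, while saturate needs the comparison and selection that the slide mentions.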

System Generator Clocking
- Both simulation and hardware are discrete time
- Model has a master system sample period
  - Related to FPGA clock in System Generator token
  - An x sec system period = 1 FPGA clock period

Multiple Clock Domains
- All clock domains are multiples of master System Period
- Every other clock period is derived from master FPGA clock period
- System sample period must be the smallest period in the model

System Generator Clocking
- Sample periods propagate with signals
  - Some blocks can override the propagation
  - Feedback loops often require explicit sample periods
- Most blocks are single rate (e.g., logic & arithmetic)
- Many blocks are multi-rate: upsample & downsample, interpolate & decimate, serial-parallel conversion

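Multiple Clock Domains Example
- Upsample, filter, and downsample a 25 MHz (40 ns) signal
- System Generator token: FPGA Clock period = 40 ns, System period = 1
- Sample period = 1 at the 25 MHz input
- [Figure: 25 MHz signal upsampled to 50 MHz, filtered, then downsampled back to 25 MHz]
- Error: sample rates not multiple of sample period (the 50 MHz signal has a 20 ns period, shorter than the 40 ns system period)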

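Multiple Clock Domains Example
- Upsample, filter, and downsample a 25 MHz (40 ns) signal
- System Generator token: FPGA Clock period = 20 ns, System period = 1
- Sample period = 2 for the 25 MHz signals (40 ns), 1 for the 50 MHz signals (20 ns)
- [Figure: 25 MHz signal upsampled to 50 MHz, filtered, then downsampled back to 25 MHz]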
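The difference between the two examples comes down to one check: every signal period must be a whole multiple of the FPGA clock period. The sketch below is only a hand calculation of that rule, not something System Generator exposes; the function names are hypothetical and rates are assumed to be whole numbers of MHz.

```python
from fractions import Fraction


def normalized_periods(clock_period_ns: int, rates_mhz: list[int]) -> dict:
    """Sample period of each rate, in units of the FPGA clock period."""
    # A rate of r MHz has a period of 1000/r ns; Fraction keeps this exact.
    return {r: Fraction(1000, r) / clock_period_ns for r in rates_mhz}


def rates_are_valid(clock_period_ns: int, rates_mhz: list[int]) -> bool:
    """True if every sample period is an integer number of clock periods."""
    periods = normalized_periods(clock_period_ns, rates_mhz)
    return all(p.denominator == 1 for p in periods.values())


rates = [25, 50]  # MHz, before and after the upsampler
print(normalized_periods(40, rates), rates_are_valid(40, rates))
print(normalized_periods(20, rates), rates_are_valid(20, rates))
```

With a 40 ns clock the 50 MHz signal would need a sample period of 1/2, which fails the check; with a 20 ns clock the periods become 2 and 1.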

Sample Times Example
- [Figure: example model with Upsample blocks and Downsample blocks labeled]
- Use sample time colors!

Resource Estimation
- Any model of any size can be simulated
- Device resource limitations affect HW implementation
- Sysgen provides Resource Estimator block
  - Adds up resource requirements before synthesis
  - Good estimate, but not always right!
  - Only post-place & route report is guaranteed
- Slice count usually matters most

System Generator Tips
- Show port data types and sample times
- Use variables instead of constants, initialize in a script
- Avoid explicit sample periods (except for feedback loops)
- Use keyboard shortcuts
  - ctrl-click to wire blocks
  - ctrl-drag to duplicate selected blocks
  - ctrl-d to update/error check model
- Use subsystems and give them meaningful names
- Too much precision is okay at first, use Error on Overflow to optimize later
- Avoid saturation and rounding options

System Generator Example
- Gateway In / Gateway Out: UFix10_0
- Accumulator: 10 bit output, Add, Wrap
- ROMs:
  - Output: Fix16_15
  - Depth: 1024
  - Initial Values: cos(2π[0:1023]/1024), sin(2π[0:1023]/1024)

System Generator Example
- ROMs:
  - Depth: 1024
  - Initial Values: cos(2π[0:1023]/1024), sin(2π[0:1023]/1024)
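A software sketch of the example's pieces may help when checking the model in simulation. Several things here are assumptions rather than details from the slides: that the 10-bit accumulator output addresses both ROMs, that the accumulator adds a fixed step each sample, and that table values which do not fit Fix16_15 (exactly +1.0) are saturated to the largest code.

```python
import math

DEPTH = 1024        # ROM depth from the slide
FRAC_BITS = 15      # Fix16_15: 16 bits total, 15 fractional


def to_fix16_15(x: float) -> int:
    """Raw 16-bit signed integer for x, rounded to nearest and saturated (assumption)."""
    raw = round(x * (1 << FRAC_BITS))
    return max(-(1 << 15), min((1 << 15) - 1, raw))


# Initial values: cos(2*pi*[0:1023]/1024) and sin(2*pi*[0:1023]/1024)
cos_rom = [to_fix16_15(math.cos(2 * math.pi * n / DEPTH)) for n in range(DEPTH)]
sin_rom = [to_fix16_15(math.sin(2 * math.pi * n / DEPTH)) for n in range(DEPTH)]


def run(step: int, n_samples: int) -> list:
    """Hypothetical wiring: a wrapping 10-bit accumulator addresses both ROMs."""
    phase = 0
    out = []
    for _ in range(n_samples):
        out.append((cos_rom[phase], sin_rom[phase]))
        phase = (phase + step) % DEPTH   # 10-bit add with wrap
    return out


samples = run(step=256, n_samples=5)     # a quarter turn per sample
print(samples)
```

Here the wrap on the accumulator is intentional: the phase is periodic, so rolling over from 1023 to 0 is exactly the desired behavior.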