Reconfigurable Computing. Reconfigurable Architectures. Chapter 3.2

Size: px
Start display at page:

Download "Reconfigurable Computing. Reconfigurable Architectures. Chapter 3.2"

Transcription

1 Reconfigurable Architectures Chapter 3.2 Prof. Dr.-Ing. Jürgen Teich Lehrstuhl für Hardware-Software-Co-Design

2 Coarse-Grained Reconfigurable Devices

3 Recall: 1. Brief Historically development (Estrin Fix-Plus and Rammig machine) 2. Programmable Logic 1. PALs and PLAs 2. CPLDs 3. FPGAs 1. Technology 2. Architecture by means of an example 1. Actel 2. Xilinx 3. Altera 3

4 Once again: General purpose vs Special purpose With LUTs as function generators, FPGA can be seen as general purpose devices. Like any general purpose device, they are flexible but often inefficient. Flexible because any n-variable Boolean function can be implemented using an n-input LUT. Inefficient since complex functions must be implemented in many LUTs at different locations. The connection among the LUTs is done using the routing matrix wich increases the signal delays. LUT implementation is usually slower than direct wiring. 4

5 Once again: General purpose vs Special purpose Example: Implement the function using 2-input LUTs. F = ABD + AC D + A BC LUTs are grouped in logic blocks (LB). 2 2-input LUT per LB Connection inside a LB is efficient (direct) Connection outside LBs are slow (Connection matrix) A B D Connection matrix A B C A C D F 5

6 Once again: General purpose vs Special purpose Idea: Implement frequently used blocks as hard-core module in the device A B D Connection matrix A B C A C D F A B C D F 6

7 Coarse grained reconfigurable devices Overcome the inefficiency of FPGAs by providing coarse grained functional units (adders, multipliers, integrators, etc.), efficiently implemented Advantage: Very efficient in terms of speed (no need for connections over connection matrices for basic operators) Advantage: Direct wiring instead of LUT implementation A coarse grained device is usually an array of programmable and identical processing elements (PE) capable of executing few operations like addition and multiplication. Depending on the manufacturer, the functional units communicate via buses or can be directly connected using programmable routing matrices. 7

8 Coarse grained reconfigurable devices Memory exists between and inside the PEs. Several other functional units according to the manufacturer. A PE is usually an 8-bit, 16-bit or 32-bit tiny ALU which can be configured to execute only one operation on a given period (until the next configuration). Communication among the PEs can be either packet oriented (on buses) or point-to-point (using crossbar switches). Since each vendor has its own implementation approach, study will be done by means of few examples. Considered are: PACT XPP, Quicksilver ACM, NEC DRP, TCPA. 8

9 The PACT XPP Overall structure XPP (Extreme Processing Platform) is a hierarchical structure consisting of: An array of Processing Array Elements (PAE) grouped in clusters called Processing Array (PA) PAC = Processing Array Cluster (PAC) + Configuration manager (CM) A hierarchical configuration tree Local CMs manage the configuration at the PA level The local CMs access the local configuration memory while the supervisor CM (SCM) accesses external memory and supervises the whole configuration process on the device 9

10 The PACT XPP The Processing Array Elements A Communication Network Memory elements aside the PACs A set of I/Os The PAE: Two types of PAE The ALU PAE The RAM PAE The ALU PAE: Contains an ALU which can be configured to perform basic operations Back-register (BREG) provides routing channels for data and events from bottom to top Forward Register (FREG) provides routing channels from top to bottom The ALU PAE 10

11 The PACT XPP - The Processing Array Elements DataFlow Register (DF-REG) can be used at the object outputs for buffering data Input register can be preloaded by configuration data. The RAM PAE: 1. Differs from the ALU-PAE only on the function. Instead of an ALU, a RAM-PAE contains a dual-ported RAM. 2. Useful for data storage 3. Data is written or read after the reading of an address at the RAM-inputs The RAM PAE 4. BREG, FREG, and DF-REG of the RAM- PAE have the same function as in the ALU-PAE 11

12 The PACT XPP - Routing Routing in PACT XPP: Two independent networks One for data transmission The other for event transmission A Configuration BUS exists besides the data and event networks (very little information exists about the configuration bus) All objects can be connected to horizontal routing channels using switch-objects Vertical routing channels are provided by the BREG and FREG BREGs route from bottom to top FREGs route from top to bottom Vertical routing channels Horizontal routing channels 12

13 The PACT XPP - Interface Interfaces are available inside the chip Number and type of interfaces vary from device to device On the XPP42-A1: 6 internal interfaces consisting of: 4 identical general purpose I/O on-chip interfaces (bottom left, upper left, upper right, and bottom right) One configuration manager One JTAG (Join Test Action Group, "IEEE Standard ") Boundary scan interface for testing purpose (not shown in the picture) Interfaces 13

14 The PACT XPP - Interface The I/O interfaces can operate independent from each other. Two operation modes The RAM mode The streaming mode RAM mode: Each port can access external Static RAM (SRAM). Control signals for the SRAM transactions are available. No additional logic required 14

15 The PACT XPP - Interface Streaming mode: 1. For high speed streaming of data to and from the device 2. Each I/O element provides two bidirectional ports for data streaming 3. Handshake signals are used for synchronization of data packets to external port 15

16 The Quicksilver ACM - Architecture Structure: Fractal-like structure Hierarchical group of four nodes with full communication among the nodes 4 lower level nodes are grouped in a higher level node The lowest level consists of 4 heterogeneous processing nodes The connection is done in a Matrix Interconnect Network (MIN) A system controller Various I/O 16

17 The Quicksilver ACM The processing node An ACM processing node consists of: An algorithmic engine. It is unique to each node type and defines the operation to perform by the node. The node memory for data storage at the node level. A node wrapper which is common to all nodes. It is used to hide the complexity of the heterogeneous architecture. 17

18 The Quicksilver ACM The processing node Four types of nodes exist: The Programmable Scalar Node (PSN) provides a standard 32-bit RISC architecture with 32-bit general purpose registers The Adaptive Execution Node (AXN) provides variable size MAC and ALU operations The Domain Bit Manipulation (DBM) node provides bit manipulation and byte oriented operation External Memory Controller node provides DDRRAM, SRAM, memory random access DMA control interface ACM PSN-Node 18

19 The Quicksilver ACM The processing node ACM AXN-Node ACM DBM-Node 19

20 The Quicksilver ACM The processing node The node wrapper envelopes the algorithmic engine and presents an identical interface to neighbouring nodes. It features: 1. A MIN interface to support the communication among nodes via the MIN-network 2. A hardware task manager for task management at the node level 3. A DMA engine 4. Dedicated I/O circuitry 5. Memory controllers The ACM Node Wrapper 6. Data distributors and aggregators 20

21 The Quicksilver ACM - The MIN Matrix Interconnect Network is the communication medium in an ACM chip 1. Hierarchically organized. The MIN at a given level connects many lower-level MINs 2. The MIN-Root is used for: 1. Off-chip communication 2. Configuration 3. Supports the communication among nodes 4. Provides service like Point to point dataflow streaming, Realtime broadcasting, DMA, etc. Example of ACM chip configuration 21

22 The Quicksilver ACM - The System Controller The system controller is in charge of the system management: Loads tasks into node ready-to-run queue for execution Statically or dynamically sets the communication channels between the processing nodes Carries out the reconfiguration of nodes on a clock cycle-by-clock cycle basis The ACM chip features a set of I/O interfaces controllers like: PCI PLL SDRAM and SRAM The system controller The interface controllers 22

23 The NEC DRP Architecture The NEC Dynamically Reconfigurable Processor (DRP) consists of: A set of byte oriented processing elements (PE) A programmable interconnection network for communication among the PEs. A sequencer. Can be programmed as finite state machine (FSM) to control the reconfiguration process Memory around the device for storing configuration and computation data Various Interfaces 23

24 The NEC DRP - The Processing Element ALU: ordinary byte arithmetic/logic operations DMU (data management unit): handles byte select, shift, mask, constant generation, etc., as well as bit manipulations An instruction dictates ALU/DMU operations and inter-pe connections Source/destination operands can either be from/to its own register file other PEs (i.e., flow through) Instruction pointer (IP) is provided from STC (state transition controller) 24

25 The NEC DRP Reconfiguration Process Instruction Pointer (IP) from STC identifies a datapath plane Spatial computation with using a customized datapath plane Data In AES 3DES MD5 SHA-1 Dynamic Reconfigura tion When IP changes, datapath plane switches instantaneously PE instructions as a collection behave like an extreme VLIW (task selection by descriptor) Control Compress Multiple Datapath Planes Data Out Sequencing through instructions => Dynamic reconfiguration 25

26 The NEC DRP Reconfiguration Process 1 IP = 1 PE PE Array ALU DMU Insts. 1 Identify the instruction to be executed 2 Decode the instruction in the ALU plane Configure the ALU Plane according to the instruction 2 IP = PE PE Ad d Ad d Ad d PE Array Se l C m p Ad d Se l C m p 4 ALU DMU Insts. 26

27 Tightly-Coupled Processor Arrays (TCPA) Processor elements (PEs) with VLIW (Very long instruction word)-architecture Weakly programmable Small local instruction memory Limited parametrizable instruction set focused on digital signal processing Data flow oriented control path, no global address space, data streaming over the processing field Regular interconnect network Application areas: Digital signal processing, e.g., mobile communication, HDTV, multimedia,... 30

28 Tightly-Coupled Processor Arrays (TCPA) 31

29 TCPA Interconnect Network Basic structure: Grid Dynamic reconfigurable By using a bypass, more than one hop is possible in a single clock cycle Interconnect wrapper is responsible for switching 32

30 TCPA Network Example 4D Hypercube 33

31 TCPA Network Example 2D Torus 34

32 TCPA Dynamic Reconfiguration Multicast-Scheme for partial dynamic reconfiguration Differential reconfiguration (program/connections) also possible 35

33 24 Core TCPA Lehrstuhl für Informatik 12 24x 16 Bit cores Technology CMOS 1.0 V 9 metal layers 90 nm standard cell layout FUs/PE 2xAdd, 2xMul, 1xShift, 1xDPU Register/PE: 15 Instruction memory 1024x32 = 4kB Clock frequency: 200 MHz Peak Performance: 24 GOPS Energy consumption MHz (Hybrid Clock Gating). Power efficiency: 180 MOPS/mW 36

7a. System-on-chip design and prototyping platforms

7a. System-on-chip design and prototyping platforms 7a. System-on-chip design and prototyping platforms Labros Bisdounis, Ph.D. Department of Computer and Communication Engineering 1 What is System-on-Chip (SoC)? System-on-chip is an integrated circuit

More information

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng Architectural Level Power Consumption of Network Presenter: YUAN Zheng Why Architectural Low Power Design? High-speed and large volume communication among different parts on a chip Problem: Power consumption

More information

Architectures and Platforms

Architectures and Platforms Hardware/Software Codesign Arch&Platf. - 1 Architectures and Platforms 1. Architecture Selection: The Basic Trade-Offs 2. General Purpose vs. Application-Specific Processors 3. Processor Specialisation

More information

Embedded System Hardware - Processing (Part II)

Embedded System Hardware - Processing (Part II) 12 Embedded System Hardware - Processing (Part II) Jian-Jia Chen (Slides are based on Peter Marwedel) Informatik 12 TU Dortmund Germany Springer, 2010 2014 年 11 月 11 日 These slides use Microsoft clip arts.

More information

Networking Virtualization Using FPGAs

Networking Virtualization Using FPGAs Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Massachusetts,

More information

Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV)

Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV) Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV) Sommer 2015 Frank Feinbube, M.Sc., Felix Eberhardt, M.Sc., Prof. Dr. Andreas Polze Interconnection Networks 2 SIMD systems

More information

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere! Interconnection Networks Interconnection Networks Interconnection networks are used everywhere! Supercomputers connecting the processors Routers connecting the ports can consider a router as a parallel

More information

Open Flow Controller and Switch Datasheet

Open Flow Controller and Switch Datasheet Open Flow Controller and Switch Datasheet California State University Chico Alan Braithwaite Spring 2013 Block Diagram Figure 1. High Level Block Diagram The project will consist of a network development

More information

Design of a High Speed Communications Link Using Field Programmable Gate Arrays

Design of a High Speed Communications Link Using Field Programmable Gate Arrays Customer-Authored Application Note AC103 Design of a High Speed Communications Link Using Field Programmable Gate Arrays Amy Lovelace, Technical Staff Engineer Alcatel Network Systems Introduction A communication

More information

System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1

System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1 System Interconnect Architectures CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.4 System Interconnect Architectures Direct networks for static connections Indirect

More information

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com Best Practises for LabVIEW FPGA Design Flow 1 Agenda Overall Application Design Flow Host, Real-Time and FPGA LabVIEW FPGA Architecture Development FPGA Design Flow Common FPGA Architectures Testing and

More information

Aims and Objectives. E 3.05 Digital System Design. Course Syllabus. Course Syllabus (1) Programmable Logic

Aims and Objectives. E 3.05 Digital System Design. Course Syllabus. Course Syllabus (1) Programmable Logic Aims and Objectives E 3.05 Digital System Design Peter Cheung Department of Electrical & Electronic Engineering Imperial College London URL: www.ee.ic.ac.uk/pcheung/ E-mail: p.cheung@ic.ac.uk How to go

More information

COMPUTER HARDWARE. Input- Output and Communication Memory Systems

COMPUTER HARDWARE. Input- Output and Communication Memory Systems COMPUTER HARDWARE Input- Output and Communication Memory Systems Computer I/O I/O devices commonly found in Computer systems Keyboards Displays Printers Magnetic Drives Compact disk read only memory (CD-ROM)

More information

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip Ms Lavanya Thunuguntla 1, Saritha Sapa 2 1 Associate Professor, Department of ECE, HITAM, Telangana

More information

Interconnection Networks

Interconnection Networks Advanced Computer Architecture (0630561) Lecture 15 Interconnection Networks Prof. Kasim M. Al-Aubidy Computer Eng. Dept. Interconnection Networks: Multiprocessors INs can be classified based on: 1. Mode

More information

Computer Organization & Architecture Lecture #19

Computer Organization & Architecture Lecture #19 Computer Organization & Architecture Lecture #19 Input/Output The computer system s I/O architecture is its interface to the outside world. This architecture is designed to provide a systematic means of

More information

What is a System on a Chip?

What is a System on a Chip? What is a System on a Chip? Integration of a complete system, that until recently consisted of multiple ICs, onto a single IC. CPU PCI DSP SRAM ROM MPEG SoC DRAM System Chips Why? Characteristics: Complex

More information

Header Parsing Logic in Network Switches Using Fine and Coarse-Grained Dynamic Reconfiguration Strategies

Header Parsing Logic in Network Switches Using Fine and Coarse-Grained Dynamic Reconfiguration Strategies Header Parsing Logic in Network Switches Using Fine and Coarse-Grained Dynamic Reconfiguration Strategies by Alexander Sonek A thesis presented to the University of Waterloo in fulfillment of the thesis

More information

FPGA. AT6000 FPGAs. Application Note AT6000 FPGAs. 3x3 Convolver with Run-Time Reconfigurable Vector Multiplier in Atmel AT6000 FPGAs.

FPGA. AT6000 FPGAs. Application Note AT6000 FPGAs. 3x3 Convolver with Run-Time Reconfigurable Vector Multiplier in Atmel AT6000 FPGAs. 3x3 Convolver with Run-Time Reconfigurable Vector Multiplier in Atmel AT6000 s Introduction Convolution is one of the basic and most common operations in both analog and digital domain signal processing.

More information

A Generic Network Interface Architecture for a Networked Processor Array (NePA)

A Generic Network Interface Architecture for a Networked Processor Array (NePA) A Generic Network Interface Architecture for a Networked Processor Array (NePA) Seung Eun Lee, Jun Ho Bahn, Yoon Seok Yang, and Nader Bagherzadeh EECS @ University of California, Irvine Outline Introduction

More information

Power Reduction Techniques in the SoC Clock Network. Clock Power

Power Reduction Techniques in the SoC Clock Network. Clock Power Power Reduction Techniques in the SoC Network Low Power Design for SoCs ASIC Tutorial SoC.1 Power Why clock power is important/large» Generally the signal with the highest frequency» Typically drives a

More information

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai 2007. Jens Onno Krah

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai 2007. Jens Onno Krah (DSF) Soft Core Prozessor NIOS II Stand Mai 2007 Jens Onno Krah Cologne University of Applied Sciences www.fh-koeln.de jens_onno.krah@fh-koeln.de NIOS II 1 1 What is Nios II? Altera s Second Generation

More information

Systolic Computing. Fundamentals

Systolic Computing. Fundamentals Systolic Computing Fundamentals Motivations for Systolic Processing PARALLEL ALGORITHMS WHICH MODEL OF COMPUTATION IS THE BETTER TO USE? HOW MUCH TIME WE EXPECT TO SAVE USING A PARALLEL ALGORITHM? HOW

More information

CHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE

CHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE CHAPTER 5 71 FINITE STATE MACHINE FOR LOOKUP ENGINE 5.1 INTRODUCTION Finite State Machines (FSMs) are important components of digital systems. Therefore, techniques for area efficiency and fast implementation

More information

Memory Systems. Static Random Access Memory (SRAM) Cell

Memory Systems. Static Random Access Memory (SRAM) Cell Memory Systems This chapter begins the discussion of memory systems from the implementation of a single bit. The architecture of memory chips is then constructed using arrays of bit implementations coupled

More information

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET)

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) International Journal of Electronics and Communication Engineering & Technology (IJECET), ISSN 0976 ISSN 0976 6464(Print)

More information

SoC IP Interfaces and Infrastructure A Hybrid Approach

SoC IP Interfaces and Infrastructure A Hybrid Approach SoC IP Interfaces and Infrastructure A Hybrid Approach Cary Robins, Shannon Hill ChipWrights, Inc. ABSTRACT System-On-Chip (SoC) designs incorporate more and more Intellectual Property (IP) with each year.

More information

All Programmable Logic. Hans-Joachim Gelke Institute of Embedded Systems. Zürcher Fachhochschule

All Programmable Logic. Hans-Joachim Gelke Institute of Embedded Systems. Zürcher Fachhochschule All Programmable Logic Hans-Joachim Gelke Institute of Embedded Systems Institute of Embedded Systems 31 Assistants 10 Professors 7 Technical Employees 2 Secretaries www.ines.zhaw.ch Research: Education:

More information

Chapter 7 Memory and Programmable Logic

Chapter 7 Memory and Programmable Logic NCNU_2013_DD_7_1 Chapter 7 Memory and Programmable Logic 71I 7.1 Introduction ti 7.2 Random Access Memory 7.3 Memory Decoding 7.5 Read Only Memory 7.6 Programmable Logic Array 77P 7.7 Programmable Array

More information

Operating System Support for Multiprocessor Systems-on-Chip

Operating System Support for Multiprocessor Systems-on-Chip Operating System Support for Multiprocessor Systems-on-Chip Dr. Gabriel marchesan almeida Agenda. Introduction. Adaptive System + Shop Architecture. Preliminary Results. Perspectives & Conclusions Dr.

More information

SOC architecture and design

SOC architecture and design SOC architecture and design system-on-chip (SOC) processors: become components in a system SOC covers many topics processor: pipelined, superscalar, VLIW, array, vector storage: cache, embedded and external

More information

Von der Hardware zur Software in FPGAs mit Embedded Prozessoren. Alexander Hahn Senior Field Application Engineer Lattice Semiconductor

Von der Hardware zur Software in FPGAs mit Embedded Prozessoren. Alexander Hahn Senior Field Application Engineer Lattice Semiconductor Von der Hardware zur Software in FPGAs mit Embedded Prozessoren Alexander Hahn Senior Field Application Engineer Lattice Semiconductor AGENDA Overview Mico32 Embedded Processor Development Tool Chain HW/SW

More information

Introduction to Programmable Logic Devices. John Coughlan RAL Technology Department Detector & Electronics Division

Introduction to Programmable Logic Devices. John Coughlan RAL Technology Department Detector & Electronics Division Introduction to Programmable Logic Devices John Coughlan RAL Technology Department Detector & Electronics Division PPD Lectures Programmable Logic is Key Underlying Technology. First-Level and High-Level

More information

Chapter 2 Heterogeneous Multicore Architecture

Chapter 2 Heterogeneous Multicore Architecture Chapter 2 Heterogeneous Multicore Architecture 2.1 Architecture Model In order to satisfy the high-performance and low-power requirements for advanced embedded systems with greater fl exibility, it is

More information

Lesson 7: SYSTEM-ON. SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY. Chapter-1L07: "Embedded Systems - ", Raj Kamal, Publs.: McGraw-Hill Education

Lesson 7: SYSTEM-ON. SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY. Chapter-1L07: Embedded Systems - , Raj Kamal, Publs.: McGraw-Hill Education Lesson 7: SYSTEM-ON ON-CHIP (SoC( SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY 1 VLSI chip Integration of high-level components Possess gate-level sophistication in circuits above that of the counter,

More information

Serial port interface for microcontroller embedded into integrated power meter

Serial port interface for microcontroller embedded into integrated power meter Serial port interface for microcontroller embedded into integrated power meter Mr. Borisav Jovanović, Prof. dr. Predrag Petković, Prof. dr. Milunka Damnjanović, Faculty of Electronic Engineering Nis, Serbia

More information

Switched Interconnect for System-on-a-Chip Designs

Switched Interconnect for System-on-a-Chip Designs witched Interconnect for ystem-on-a-chip Designs Abstract Daniel iklund and Dake Liu Dept. of Physics and Measurement Technology Linköping University -581 83 Linköping {danwi,dake}@ifm.liu.se ith the increased

More information

Chapter 12: Multiprocessor Architectures. Lesson 04: Interconnect Networks

Chapter 12: Multiprocessor Architectures. Lesson 04: Interconnect Networks Chapter 12: Multiprocessor Architectures Lesson 04: Interconnect Networks Objective To understand different interconnect networks To learn crossbar switch, hypercube, multistage and combining networks

More information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information

Computer Network. Interconnected collection of autonomous computers that are able to exchange information Introduction Computer Network. Interconnected collection of autonomous computers that are able to exchange information No master/slave relationship between the computers in the network Data Communications.

More information

Example-driven Interconnect Synthesis for Heterogeneous Coarse-Grain Reconfigurable Logic

Example-driven Interconnect Synthesis for Heterogeneous Coarse-Grain Reconfigurable Logic Example-driven Interconnect Synthesis for Heterogeneous Coarse-Grain Reconfigurable Logic Clifford Wolf, Johann Glaser, Florian Schupfer, Jan Haase, Christoph Grimm Computer Technology /99 Overview Ultra-Low-Power

More information

Qsys and IP Core Integration

Qsys and IP Core Integration Qsys and IP Core Integration Prof. David Lariviere Columbia University Spring 2014 Overview What are IP Cores? Altera Design Tools for using and integrating IP Cores Overview of various IP Core Interconnect

More information

OpenSPARC T1 Processor

OpenSPARC T1 Processor OpenSPARC T1 Processor The OpenSPARC T1 processor is the first chip multiprocessor that fully implements the Sun Throughput Computing Initiative. Each of the eight SPARC processor cores has full hardware

More information

SOFTWARE RADIO APPROACH FOR RE-CONFIGURABLE MULTI-STANDARD RADIOS

SOFTWARE RADIO APPROACH FOR RE-CONFIGURABLE MULTI-STANDARD RADIOS SOFTWARE RADIO APPROACH FOR RE-CONFIGURABLE MULTI-STANDARD RADIOS Jörg Brakensiek 1, Bernhard Oelkrug 1, Martin Bücker 1, Dirk Uffmann 1, A. Dröge 1, M. Darianian 1, Marius Otte 2 1 Nokia Research Center,

More information

Optimising the resource utilisation in high-speed network intrusion detection systems.

Optimising the resource utilisation in high-speed network intrusion detection systems. Optimising the resource utilisation in high-speed network intrusion detection systems. Gerald Tripp www.kent.ac.uk Network intrusion detection Network intrusion detection systems are provided to detect

More information

Architekturen und Einsatz von FPGAs mit integrierten Prozessor Kernen. Hans-Joachim Gelke Institute of Embedded Systems Professur für Mikroelektronik

Architekturen und Einsatz von FPGAs mit integrierten Prozessor Kernen. Hans-Joachim Gelke Institute of Embedded Systems Professur für Mikroelektronik Architekturen und Einsatz von FPGAs mit integrierten Prozessor Kernen Hans-Joachim Gelke Institute of Embedded Systems Professur für Mikroelektronik Contents Überblick: Aufbau moderner FPGA Einblick: Eigenschaften

More information

NORTHEASTERN UNIVERSITY Graduate School of Engineering

NORTHEASTERN UNIVERSITY Graduate School of Engineering NORTHEASTERN UNIVERSITY Graduate School of Engineering Thesis Title: Enabling Communications Between an FPGA s Embedded Processor and its Reconfigurable Resources Author: Joshua Noseworthy Department:

More information

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1 Module 2 Embedded Processors and Memory Version 2 EE IIT, Kharagpur 1 Lesson 5 Memory-I Version 2 EE IIT, Kharagpur 2 Instructional Objectives After going through this lesson the student would Pre-Requisite

More information

REC FPGA Seminar IAP 1998. Seminar Format

REC FPGA Seminar IAP 1998. Seminar Format REC FPGA Seminar IAP 1998 Session 1: Architecture, Economics, and Applications of the FPGA Robotics and Electronics Cooperative FPGA Seminar IAP 1998 1 Seminar Format Four 45 minute open sessions two on

More information

APPLICATION NOTE AN-409

APPLICATION NOTE AN-409 MEMORY SIMPLIFIES WIRELESS BASE STATION DESIGN APPLICATION NOTE AN-409 ABSTRACT Recent research has shown that the digital signal processor (DSP)/ Dual port/ field programmable gate array (FPGA) chain

More information

CHAPTER 7: The CPU and Memory

CHAPTER 7: The CPU and Memory CHAPTER 7: The CPU and Memory The Architecture of Computer Hardware, Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint slides

More information

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip Outline Modeling, simulation and optimization of Multi-Processor SoCs (MPSoCs) Università of Verona Dipartimento di Informatica MPSoCs: Multi-Processor Systems on Chip A simulation platform for a MPSoC

More information

Lecture 5: Gate Logic Logic Optimization

Lecture 5: Gate Logic Logic Optimization Lecture 5: Gate Logic Logic Optimization MAH, AEN EE271 Lecture 5 1 Overview Reading McCluskey, Logic Design Principles- or any text in boolean algebra Introduction We could design at the level of irsim

More information

Computer Performance. Topic 3. Contents. Prerequisite knowledge Before studying this topic you should be able to:

Computer Performance. Topic 3. Contents. Prerequisite knowledge Before studying this topic you should be able to: 55 Topic 3 Computer Performance Contents 3.1 Introduction...................................... 56 3.2 Measuring performance............................... 56 3.2.1 Clock Speed.................................

More information

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy Hardware Implementation of Improved Adaptive NoC Rer with Flit Flow History based Load Balancing Selection Strategy Parag Parandkar 1, Sumant Katiyal 2, Geetesh Kwatra 3 1,3 Research Scholar, School of

More information

150127-Microprocessor & Assembly Language

150127-Microprocessor & Assembly Language Chapter 3 Z80 Microprocessor Architecture The Z 80 is one of the most talented 8 bit microprocessors, and many microprocessor-based systems are designed around the Z80. The Z80 microprocessor needs an

More information

Testing of Digital System-on- Chip (SoC)

Testing of Digital System-on- Chip (SoC) Testing of Digital System-on- Chip (SoC) 1 Outline of the Talk Introduction to system-on-chip (SoC) design Approaches to SoC design SoC test requirements and challenges Core test wrapper P1500 core test

More information

Network Traffic Monitoring an architecture using associative processing.

Network Traffic Monitoring an architecture using associative processing. Network Traffic Monitoring an architecture using associative processing. Gerald Tripp Technical Report: 7-99 Computing Laboratory, University of Kent 1 st September 1999 Abstract This paper investigates

More information

1. Introduction to Embedded System Design

1. Introduction to Embedded System Design 1. Introduction to Embedded System Design Lothar Thiele ETH Zurich, Switzerland 1-1 Contents of Lectures (Lothar Thiele) 1. Introduction to Embedded System Design 2. Software for Embedded Systems 3. Real-Time

More information

Chapter 11 I/O Management and Disk Scheduling

Chapter 11 I/O Management and Disk Scheduling Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 11 I/O Management and Disk Scheduling Dave Bremer Otago Polytechnic, NZ 2008, Prentice Hall I/O Devices Roadmap Organization

More information

Low-Overhead Hard Real-time Aware Interconnect Network Router

Low-Overhead Hard Real-time Aware Interconnect Network Router Low-Overhead Hard Real-time Aware Interconnect Network Router Michel A. Kinsy! Department of Computer and Information Science University of Oregon Srinivas Devadas! Department of Electrical Engineering

More information

Central Processing Unit (CPU)

Central Processing Unit (CPU) Central Processing Unit (CPU) CPU is the heart and brain It interprets and executes machine level instructions Controls data transfer from/to Main Memory (MM) and CPU Detects any errors In the following

More information

A New, High-Performance, Low-Power, Floating-Point Embedded Processor for Scientific Computing and DSP Applications

A New, High-Performance, Low-Power, Floating-Point Embedded Processor for Scientific Computing and DSP Applications 1 A New, High-Performance, Low-Power, Floating-Point Embedded Processor for Scientific Computing and DSP Applications Simon McIntosh-Smith Director of Architecture 2 Multi-Threaded Array Processing Architecture

More information

Why the Network Matters

Why the Network Matters Week 2, Lecture 2 Copyright 2009 by W. Feng. Based on material from Matthew Sottile. So Far Overview of Multicore Systems Why Memory Matters Memory Architectures Emerging Chip Multiprocessors (CMP) Increasing

More information

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association Making Multicore Work and Measuring its Benefits Markus Levy, president EEMBC and Multicore Association Agenda Why Multicore? Standards and issues in the multicore community What is Multicore Association?

More information

Lecture 18: Interconnection Networks. CMU 15-418: Parallel Computer Architecture and Programming (Spring 2012)

Lecture 18: Interconnection Networks. CMU 15-418: Parallel Computer Architecture and Programming (Spring 2012) Lecture 18: Interconnection Networks CMU 15-418: Parallel Computer Architecture and Programming (Spring 2012) Announcements Project deadlines: - Mon, April 2: project proposal: 1-2 page writeup - Fri,

More information

RAPID PROTOTYPING OF DIGITAL SYSTEMS Second Edition

RAPID PROTOTYPING OF DIGITAL SYSTEMS Second Edition RAPID PROTOTYPING OF DIGITAL SYSTEMS Second Edition A Tutorial Approach James O. Hamblen Georgia Institute of Technology Michael D. Furman Georgia Institute of Technology KLUWER ACADEMIC PUBLISHERS Boston

More information

8-Bit Flash Microcontroller for Smart Cards. AT89SCXXXXA Summary. Features. Description. Complete datasheet available under NDA

8-Bit Flash Microcontroller for Smart Cards. AT89SCXXXXA Summary. Features. Description. Complete datasheet available under NDA Features Compatible with MCS-51 products On-chip Flash Program Memory Endurance: 1,000 Write/Erase Cycles On-chip EEPROM Data Memory Endurance: 100,000 Write/Erase Cycles 512 x 8-bit RAM ISO 7816 I/O Port

More information

Xeon+FPGA Platform for the Data Center

Xeon+FPGA Platform for the Data Center Xeon+FPGA Platform for the Data Center ISCA/CARL 2015 PK Gupta, Director of Cloud Platform Technology, DCG/CPG Overview Data Center and Workloads Xeon+FPGA Accelerator Platform Applications and Eco-system

More information

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-17: Memory organisation, and types of memory

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-17: Memory organisation, and types of memory ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-17: Memory organisation, and types of memory 1 1. Memory Organisation 2 Random access model A memory-, a data byte, or a word, or a double

More information

OPNET Network Simulator

OPNET Network Simulator Simulations and Tools for Telecommunications 521365S: OPNET Network Simulator Jarmo Prokkola Research team leader, M. Sc. (Tech.) VTT Technical Research Centre of Finland Kaitoväylä 1, Oulu P.O. Box 1100,

More information

Logical Operations. Control Unit. Contents. Arithmetic Operations. Objectives. The Central Processing Unit: Arithmetic / Logic Unit.

Logical Operations. Control Unit. Contents. Arithmetic Operations. Objectives. The Central Processing Unit: Arithmetic / Logic Unit. Objectives The Central Processing Unit: What Goes on Inside the Computer Chapter 4 Identify the components of the central processing unit and how they work together and interact with memory Describe how

More information

Seeking Opportunities for Hardware Acceleration in Big Data Analytics

Seeking Opportunities for Hardware Acceleration in Big Data Analytics Seeking Opportunities for Hardware Acceleration in Big Data Analytics Paul Chow High-Performance Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Toronto Who

More information

Lecture 2 Parallel Programming Platforms

Lecture 2 Parallel Programming Platforms Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple

More information

CMS Level 1 Track Trigger

CMS Level 1 Track Trigger Institut für Technik der Informationsverarbeitung CMS Level 1 Track Trigger An FPGA Approach Management Prof. Dr.-Ing. Dr. h.c. J. Becker Prof. Dr.-Ing. Eric Sax Prof. Dr. rer. nat. W. Stork KIT University

More information

Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip

Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip Cristina SILVANO silvano@elet.polimi.it Politecnico di Milano, Milano (Italy) Talk Outline

More information

Implementation Details

Implementation Details LEON3-FT Processor System Scan-I/F FT FT Add-on Add-on 2 2 kbyte kbyte I- I- Cache Cache Scan Scan Test Test UART UART 0 0 UART UART 1 1 Serial 0 Serial 1 EJTAG LEON_3FT LEON_3FT Core Core 8 Reg. Windows

More information

STM32 F-2 series High-performance Cortex-M3 MCUs

STM32 F-2 series High-performance Cortex-M3 MCUs STM32 F-2 series High-performance Cortex-M3 MCUs STMicroelectronics 32-bit microcontrollers, 120 MHz/150 DMIPS with ART Accelerator TM and advanced peripherals www.st.com/mcu STM32 F-2 series The STM32

More information

Computer Architecture

Computer Architecture Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 11 Memory Management Computer Architecture Part 11 page 1 of 44 Prof. Dr. Uwe Brinkschulte, M.Sc. Benjamin

More information

Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin

Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin BUS ARCHITECTURES Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin Keywords: Bus standards, PCI bus, ISA bus, Bus protocols, Serial Buses, USB, IEEE 1394

More information

FPGA Clocking. Clock related issues: distribution generation (frequency synthesis) multiplexing run time programming domain crossing

FPGA Clocking. Clock related issues: distribution generation (frequency synthesis) multiplexing run time programming domain crossing FPGA Clocking Clock related issues: distribution generation (frequency synthesis) Deskew multiplexing run time programming domain crossing Clock related constraints 100 Clock Distribution Device split

More information

ON SUITABILITY OF FPGA BASED EVOLVABLE HARDWARE SYSTEMS TO INTEGRATE RECONFIGURABLE CIRCUITS WITH HOST PROCESSING UNIT

ON SUITABILITY OF FPGA BASED EVOLVABLE HARDWARE SYSTEMS TO INTEGRATE RECONFIGURABLE CIRCUITS WITH HOST PROCESSING UNIT 216 ON SUITABILITY OF FPGA BASED EVOLVABLE HARDWARE SYSTEMS TO INTEGRATE RECONFIGURABLE CIRCUITS WITH HOST PROCESSING UNIT *P.Nirmalkumar, **J.Raja Paul Perinbam, @S.Ravi and #B.Rajan *Research Scholar,

More information

Chapter 2 Logic Gates and Introduction to Computer Architecture

Chapter 2 Logic Gates and Introduction to Computer Architecture Chapter 2 Logic Gates and Introduction to Computer Architecture 2.1 Introduction The basic components of an Integrated Circuit (IC) is logic gates which made of transistors, in digital system there are

More information

Switch Fabric Implementation Using Shared Memory

Switch Fabric Implementation Using Shared Memory Order this document by /D Switch Fabric Implementation Using Shared Memory Prepared by: Lakshmi Mandyam and B. Kinney INTRODUCTION Whether it be for the World Wide Web or for an intra office network, today

More information

UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS

UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS Structure Page Nos. 2.0 Introduction 27 2.1 Objectives 27 2.2 Types of Classification 28 2.3 Flynn s Classification 28 2.3.1 Instruction Cycle 2.3.2 Instruction

More information

KeyStone Architecture Security Accelerator (SA) User Guide

KeyStone Architecture Security Accelerator (SA) User Guide KeyStone Architecture Security Accelerator (SA) User Guide Literature Number: SPRUGY6B January 2013 Release History www.ti.com Release Date Description/Comments SPRUGY6B January 2013 Added addition engine

More information

4. H.323 Components. VOIP, Version 1.6e T.O.P. BusinessInteractive GmbH Page 1 of 19

4. H.323 Components. VOIP, Version 1.6e T.O.P. BusinessInteractive GmbH Page 1 of 19 4. H.323 Components VOIP, Version 1.6e T.O.P. BusinessInteractive GmbH Page 1 of 19 4.1 H.323 Terminals (1/2)...3 4.1 H.323 Terminals (2/2)...4 4.1.1 The software IP phone (1/2)...5 4.1.1 The software

More information

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001 Agenda Introduzione Il mercato Dal circuito integrato al System on a Chip (SoC) La progettazione di un SoC La tecnologia Una fabbrica di circuiti integrati 28 How to handle complexity G The engineering

More information

Contents. System Development Models and Methods. Design Abstraction and Views. Synthesis. Control/Data-Flow Models. System Synthesis Models

Contents. System Development Models and Methods. Design Abstraction and Views. Synthesis. Control/Data-Flow Models. System Synthesis Models System Development Models and Methods Dipl.-Inf. Mirko Caspar Version: 10.02.L.r-1.0-100929 Contents HW/SW Codesign Process Design Abstraction and Views Synthesis Control/Data-Flow Models System Synthesis

More information

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING NATIONAL INSTITUTE OF TECHNOLOGY ROURKELA EFFICIENT ROUTER DESIGN FOR NETWORK ON CHIP

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING NATIONAL INSTITUTE OF TECHNOLOGY ROURKELA EFFICIENT ROUTER DESIGN FOR NETWORK ON CHIP DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING NATIONAL INSTITUTE OF TECHNOLOGY ROURKELA EFFICIENT ROUTER DESIGN FOR NETWORK ON CHIP SWAPNA S 2013 EFFICIENT ROUTER DESIGN FOR NETWORK ON CHIP A

More information

Computer Systems Structure Input/Output

Computer Systems Structure Input/Output Computer Systems Structure Input/Output Peripherals Computer Central Processing Unit Main Memory Computer Systems Interconnection Communication lines Input Output Ward 1 Ward 2 Examples of I/O Devices

More information

Chapter 2. Multiprocessors Interconnection Networks

Chapter 2. Multiprocessors Interconnection Networks Chapter 2 Multiprocessors Interconnection Networks 2.1 Taxonomy Interconnection Network Static Dynamic 1-D 2-D HC Bus-based Switch-based Single Multiple SS MS Crossbar 2.2 Bus-Based Dynamic Single Bus

More information

Clocking. Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 Clocks 1

Clocking. Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 Clocks 1 ing Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 s 1 Why s and Storage Elements? Inputs Combinational Logic Outputs Want to reuse combinational logic from cycle to cycle 6.884 - Spring 2005 2/18/05

More information

PowerPC Microprocessor Clock Modes

PowerPC Microprocessor Clock Modes nc. Freescale Semiconductor AN1269 (Freescale Order Number) 1/96 Application Note PowerPC Microprocessor Clock Modes The PowerPC microprocessors offer customers numerous clocking options. An internal phase-lock

More information

路 論 Chapter 15 System-Level Physical Design

路 論 Chapter 15 System-Level Physical Design Introduction to VLSI Circuits and Systems 路 論 Chapter 15 System-Level Physical Design Dept. of Electronic Engineering National Chin-Yi University of Technology Fall 2007 Outline Clocked Flip-flops CMOS

More information

On-Chip Interconnection Networks Low-Power Interconnect

On-Chip Interconnection Networks Low-Power Interconnect On-Chip Interconnection Networks Low-Power Interconnect William J. Dally Computer Systems Laboratory Stanford University ISLPED August 27, 2007 ISLPED: 1 Aug 27, 2007 Outline Demand for On-Chip Networks

More information

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines Reconfigurable Architecture Requirements for Co-Designed Virtual Machines Kenneth B. Kent University of New Brunswick Faculty of Computer Science Fredericton, New Brunswick, Canada ken@unb.ca Micaela Serra

More information

Reconfigurable System-on-Chip Design

Reconfigurable System-on-Chip Design Reconfigurable System-on-Chip Design MITCHELL MYJAK Senior Research Engineer Pacific Northwest National Laboratory PNNL-SA-93202 31 January 2013 1 About Me Biography BSEE, University of Portland, 2002

More information

Exploiting Stateful Inspection of Network Security in Reconfigurable Hardware

Exploiting Stateful Inspection of Network Security in Reconfigurable Hardware Exploiting Stateful Inspection of Network Security in Reconfigurable Hardware Shaomeng Li, Jim Tørresen, Oddvar Søråsen Department of Informatics University of Oslo N-0316 Oslo, Norway {shaomenl, jimtoer,

More information

AMD Opteron Quad-Core

AMD Opteron Quad-Core AMD Opteron Quad-Core a brief overview Daniele Magliozzi Politecnico di Milano Opteron Memory Architecture native quad-core design (four cores on a single die for more efficient data sharing) enhanced

More information

Introduction to Digital System Design

Introduction to Digital System Design Introduction to Digital System Design Chapter 1 1 Outline 1. Why Digital? 2. Device Technologies 3. System Representation 4. Abstraction 5. Development Tasks 6. Development Flow Chapter 1 2 1. Why Digital

More information