Mentor Phillip Balister. Advisor Professor Miriam Leeser


Why FPGA Acceleration in GNU Radio?
- Faster performance for some algorithms
- Frees processor to perform other tasks
- Low latency, deterministic response time

Xilinx Zynq
- ARM + FPGA
- Dual Core Cortex A9
- Plentiful FPGA resources
- Tightly coupled via high speed buses
- Zedboard, inexpensive development kit

Project Goals:
- Run GNU Radio on the Zynq's ARM processors
- Create an FPGA acceleration infrastructure
- Demonstrate FPGA acceleration in GNU Radio
- Provide comprehensive documentation

Low Pass Filter: 17% of total runtime

FPGA Accelerated Low Pass Filter: 4% of total runtime, a reduction of 13 percentage points

- Isolate block performance from GNU Radio
- Quantify effect of filter length on performance
- Wrote a simple C++ program to measure each block's sample processing performance
- Timed the work() method with gettimeofday()
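The measurement approach described above can be sketched roughly as follows. This is a minimal illustration and not the author's benchmark program: `fir_work()` is a hypothetical stand-in for a block's work() method, and `measure_samples_per_sec()` and its parameters are names invented here. Only the use of gettimeofday() around the call comes from the slide.

```cpp
#include <sys/time.h>   // gettimeofday()
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a block's work() method: a direct-form FIR
// filter over real samples. This is NOT GNU Radio's implementation.
// Produces in.size() - taps.size() + 1 outputs and returns that count.
static std::size_t fir_work(const std::vector<float>& in,
                            const std::vector<float>& taps,
                            std::vector<float>& out)
{
    assert(!taps.empty() && in.size() >= taps.size());
    const std::size_t n_out = in.size() - taps.size() + 1;
    out.resize(n_out);
    for (std::size_t i = 0; i < n_out; ++i) {
        float acc = 0.0f;
        for (std::size_t k = 0; k < taps.size(); ++k)
            acc += taps[k] * in[i + k];
        out[i] = acc;
    }
    return n_out;
}

// Wall-clock time in seconds, read with gettimeofday().
static double now_seconds()
{
    timeval tv;
    gettimeofday(&tv, nullptr);
    return static_cast<double>(tv.tv_sec) +
           static_cast<double>(tv.tv_usec) * 1e-6;
}

// Times `repeats` calls to fir_work() and returns samples per second.
// Returns 0.0 if the elapsed time is too small to measure.
static double measure_samples_per_sec(std::size_t n_taps,
                                      std::size_t n_samples,
                                      int repeats)
{
    std::vector<float> in(n_samples + n_taps - 1, 1.0f);
    std::vector<float> taps(n_taps, 1.0f / static_cast<float>(n_taps));
    std::vector<float> out;
    std::size_t produced = 0;

    const double t0 = now_seconds();
    for (int r = 0; r < repeats; ++r)
        produced += fir_work(in, taps, out);
    const double elapsed = now_seconds() - t0;

    return elapsed > 0.0 ? static_cast<double>(produced) / elapsed : 0.0;
}
```

Calling `measure_samples_per_sec()` once per filter length (for example 31, 51, 71, 91, and 111 taps) gives one throughput figure per length, which is the kind of comparison the slides report.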

[Figure: Performance Comparison of FPGA Accelerated FIR Filter Block in GNU Radio. x-axis: Number of Filter Taps (31, 51, 71, 91, 111); y-axis: Millions of Samples / sec (0 to 40). Series: FIR Filter CCF (ARM) and FIR Filter IC (FPGA). Data labels: ARM 5.0, 3.9, 3.2, 2.8, 2.4; FPGA 35.5, 35.5, 35.1, 34.4, 35.5. Annotated speedups: 7x and 15x.]
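Assuming the chart's data labels pair up as ARM throughput of 5.0 Msps at 31 taps and 2.4 Msps at 111 taps, with the FPGA block near 35.5 Msps at both ends (a reading of the extracted labels, not something the slide states outright), the 7x and 15x annotations work out as:

```latex
\[
\frac{35.5\ \text{Msps (FPGA)}}{5.0\ \text{Msps (ARM, 31 taps)}} \approx 7.1,
\qquad
\frac{35.5\ \text{Msps (FPGA)}}{2.4\ \text{Msps (ARM, 111 taps)}} \approx 14.8
\]
```

The speedup grows with filter length because the ARM throughput falls as taps are added while the FPGA throughput stays roughly flat.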

Hardware: ZC706 Development Board
- Others have used Zedboard & ZC702

[Diagram labels: Xilinx Zynq; Dual Core ARM Cortex A9; GNU Radio FPGA Accelerated Block; Linux Kernel Device Driver; ARM to FPGA Interface; FPGA Accelerator; FPGA Fabric]

- Linux 3.9, Ubuntu 13.04
- GNU Radio 3.7.1
- Xilinx ISE Design Suite 14.6
- Setup information available on the GNU Radio Zynq Wiki: http://gnuradio.org/redmine/projects/gnuradio/wiki/zynq

Goal: Offload GNU Radio blocks to the FPGA
Requires: Moving GNU Radio sample & control data between ARM / FPGA
Implement:
- Shared memory between ARM & FPGA
- FPGA control interface
- Accessible by GNU Radio blocks

ARM & FPGA communicate over the AMBA AXI4 interconnect
- ARM standardized bus
- Connects ARM cores, RAM, & FPGA
- FPGA uses AXI ports to access the interconnect
- Diagram simplified to show only the AXI ports
Use two AXI ports
- Read / write sample & control data in RAM
- FPGA control interface

ARM & FPGA pass control & sample data through RAM
- Requires knowledge of physical memory addresses
Device driver
- Handles memory allocation & resolves physical addresses
- Provides an interface (mmap) to access shared memory & the AXI port for FPGA control
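From user space, the mmap-based access described above might look roughly like the sketch below. The slides do not name the driver's device node or its memory layout, so `map_shared()`, `unmap_shared()`, and the `create_backing` flag are hypothetical names. The flag lets an ordinary file stand in for the device, so the sketch can be tried without the hardware.

```cpp
#include <fcntl.h>      // open()
#include <sys/mman.h>   // mmap(), munmap()
#include <unistd.h>     // close(), ftruncate()
#include <cassert>
#include <cstddef>
#include <cstdint>

// Maps `length` bytes at `path` as a shared, read/write region.
// On the target, `path` would be the device node exposed by the kernel
// driver (its name is not given in the slides), and create_backing = false.
// With create_backing = true an ordinary file stands in for the device.
// Returns nullptr on failure.
static void* map_shared(const char* path, std::size_t length, bool create_backing)
{
    assert(path != nullptr && length > 0);
    const int flags = create_backing ? (O_RDWR | O_CREAT) : O_RDWR;
    const int fd = open(path, flags, 0600);
    if (fd < 0)
        return nullptr;
    if (create_backing && ftruncate(fd, static_cast<off_t>(length)) != 0) {
        close(fd);
        return nullptr;
    }
    void* p = mmap(nullptr, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : p;
}

// Releases a region obtained from map_shared().
static void unmap_shared(void* p, std::size_t length)
{
    if (p != nullptr)
        munmap(p, length);
}
```

MAP_SHARED matters here: writes made through the mapping go to the underlying object rather than to a private copy, which is what lets a second party (the FPGA, in the slides' design) observe them.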

ARM to FPGA interface
- Uses Xilinx IP to read / write sample & control data from the AXI port for RAM access
- Receives read / write commands from ARM via the AXI port for FPGA control
- Outputs sample & control data on a simple bus for the FPGA accelerator: the AXI4 Stream bus

GNU Radio FPGA Accelerated Block
- Interfaces with the device driver
- Code to copy GNU Radio sample & control data to shared memory (memcpy)
- Methods to control the custom FPGA accelerator
FPGA Accelerator
- Drop in custom FPGA accelerator(s)
- Compatible with the Xilinx IP library (an advantage of AXI4 Stream)
- Example: FIR filter
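The copy step in the accelerated block could be sketched as below, assuming the shared region is already mapped into the process. `copy_to_shared()` and `copy_from_shared()` are hypothetical helper names, and the sketch leaves out the control data and any synchronization with the FPGA, which the slides do not detail.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstring>

// GNU Radio's gr_complex is a typedef for std::complex<float>.
using sample_t = std::complex<float>;

// Copies up to n_items samples into the shared region, clamped to what
// fits in shared_bytes. Returns the number of samples actually copied.
static std::size_t copy_to_shared(void* shared, std::size_t shared_bytes,
                                  const sample_t* in, std::size_t n_items)
{
    assert(shared != nullptr && in != nullptr);
    const std::size_t fit = shared_bytes / sizeof(sample_t);
    const std::size_t n = n_items < fit ? n_items : fit;
    std::memcpy(shared, in, n * sizeof(sample_t));
    return n;
}

// Copies n_items processed samples back out of the shared region.
static void copy_from_shared(sample_t* out, const void* shared,
                             std::size_t n_items)
{
    assert(out != nullptr && shared != nullptr);
    std::memcpy(out, shared, n_items * sizeof(sample_t));
}
```

Clamping to the region size means the caller has to loop when a work() call hands over more samples than the shared buffer holds.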

GNU Radio FPGA FIR Filter
- Versions supporting integer, complex float, & complex short int
- Set coefficients with the set_taps() method
FPGA FIR Filter Accelerator
- Xilinx Coregen FIR Filter
- Reloadable coefficients
- 32-bit fixed point; floating point in future
- Dual channel for complex samples
- Tested up to 111 taps
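A software model may help illustrate what "32-bit fixed point" and "dual channel for complex samples" imply. The sketch below is an assumption-laden illustration, not the Coregen core or the GNU Radio block. The class name, the `frac_bits` parameter, and the rounding choice are all hypothetical; only the `set_taps()` name and the two-channel treatment of I and Q come from the slide.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// One complex sample held as two 32-bit fixed-point channels.
struct iq_sample {
    std::int32_t i;
    std::int32_t q;
};

// Hypothetical software model of a reloadable fixed-point FIR filter.
class fixed_point_fir_model {
public:
    explicit fixed_point_fir_model(int frac_bits) : d_frac_bits(frac_bits)
    {
        assert(frac_bits > 0 && frac_bits < 31);
    }

    // Reloads coefficients: each float tap is scaled by 2^frac_bits
    // and rounded to the nearest 32-bit integer.
    void set_taps(const std::vector<float>& taps)
    {
        d_taps.clear();
        const float scale = std::ldexp(1.0f, d_frac_bits);
        for (float t : taps)
            d_taps.push_back(static_cast<std::int32_t>(std::lround(t * scale)));
    }

    // Filters I and Q as two independent real channels sharing one tap set.
    // Taps are applied in the order given, oldest sample first.
    std::vector<iq_sample> filter(const std::vector<iq_sample>& in) const
    {
        std::vector<iq_sample> out;
        if (d_taps.empty() || in.size() < d_taps.size())
            return out;
        const std::size_t n_out = in.size() - d_taps.size() + 1;
        const std::int64_t one = std::int64_t{1} << d_frac_bits;
        out.reserve(n_out);
        for (std::size_t n = 0; n < n_out; ++n) {
            std::int64_t acc_i = 0;
            std::int64_t acc_q = 0;
            for (std::size_t k = 0; k < d_taps.size(); ++k) {
                acc_i += static_cast<std::int64_t>(d_taps[k]) * in[n + k].i;
                acc_q += static_cast<std::int64_t>(d_taps[k]) * in[n + k].q;
            }
            // Rescale from the 64-bit accumulator back to sample scale.
            out.push_back({static_cast<std::int32_t>(acc_i / one),
                           static_cast<std::int32_t>(acc_q / one)});
        }
        return out;
    }

private:
    int d_frac_bits;
    std::vector<std::int32_t> d_taps;
};
```

Because the taps are real, the complex filter reduces to the same real filter applied separately to I and Q, which is why two channels are enough.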


- Accelerate GNU Radio signal processing
  - Filters, FFT, Error Correction / Viterbi Decoder
- High sample rate processing in FPGA
  - Process USRP raw ADC / DAC data (100 Msps) with very low latency
  - Port previous project, CRUSH, to Zynq
- Implement agile algorithms
  - Spectrum sensing and channel occupancy
  - Split MAC architecture
- Heterogeneous software defined radio
  - ARM implements control
  - FPGA offloads heavy signal processing

Completed GSoC Goals:
✓ Ran GNU Radio on the Zynq's ARM processors
✓ Created an FPGA acceleration infrastructure
✓ Demonstrated FPGA acceleration in GNU Radio
  - Example FPGA accelerated FIR filter
  - 7x to 15x performance increase (FIR filter on ARM versus on FPGA)
✓ Wrote comprehensive documentation, available on the GNU Radio Wiki

Mentor: Phillip Balister
Advisor: Professor Miriam Leeser
Moritz Fischer, Tom Rondeau, Martin Braun
GNU Radio Community

Jonathon Pendlum (jon.pendlum@gmail.com)
GNU Radio Zynq Wiki Page (installation instructions): http://gnuradio.org/redmine/projects/gnuradio/wiki/zynq
Northeastern Reconfigurable Computing Laboratory: http://coe.neu.edu/research/rcl/

[Figure: Performance Comparison of FPGA Accelerated FIR Filter Block in GNU Radio. x-axis: Number of Filter Taps (31, 51, 71, 91, 111); y-axis: Millions of Samples / sec (0 to 40). Series: FIR Filter CCF (Intel Xeon) and FIR Filter IC (FPGA). Data labels: Intel Xeon 36.2, 28.8, 23.9, 19.7, 17.6; FPGA 35.5, 35.5, 35.1, 34.4, 35.5.]
