FPGA Implementation of Boolean Neural Networks using UML




Roman Kohut, Bernd Steinbach, Dominik Fröhlich
Freiberg University of Mining and Technology, Institute of Computer Science, Freiberg (Sachs), Germany

Outline
- Introduction
- Boolean Neural Networks
- UML Models
- Experiment results
- Conclusion

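Introduction: FPGAs
FPGA logic resources realize Boolean functions $y = f(\mathbf{x})$, $\mathbf{x} = \{x_1, x_2, \ldots, x_{N_x}\}$. The number of inputs $N_x$ that fits depends on the resource:
- LUT: $N_x \le 4$
- Slice: $N_x \le 5$
- CLB: $N_x \le 6$
[Figure: slice structure]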
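The input limits on this slide can be turned into a small sizing helper. The sketch below assumes the figures 4, 5 and 6 are upper bounds on the number of inputs for a LUT, a slice and a CLB respectively; the function names are hypothetical and not part of the presentation.

```python
# Input limits per resource, read from the slide as upper bounds (an assumption).
RESOURCE_LIMITS = [("LUT", 4), ("slice", 5), ("CLB", 6)]


def smallest_resource(n_inputs):
    """Return the smallest resource that fits a function of n_inputs, or None."""
    if n_inputs < 1:
        raise ValueError("a Boolean function needs at least one input")
    for name, max_inputs in RESOURCE_LIMITS:
        if n_inputs <= max_inputs:
            return name
    return None  # more than 6 inputs: needs several resources


def truth_table_bits(n_inputs):
    """Number of truth-table entries for a function of n_inputs variables."""
    return 2 ** n_inputs
```

For example, a 4-input function has 16 truth-table entries and fits a single LUT, while a 7-input function exceeds every limit listed.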

Introduction: The Problem
- Connectivity problems
- Structured problems: limited number of logic gates and interconnections
- On-chip learning problems (sequential computations)
- Large number of CLBs, driven by the type of data, the number of inputs and a complex transfer function
A large number of CLBs (tens to hundreds) is required for a single neuron:
- GANGLION: 640-784 CLBs
- Gschwind NN: 22 CLBs
- Xilinx-NN: 51 CLBs
- Hopfield NN: 26 CLBs

Boolean Neural Networks: Boolean Neuron
$y = f_B(\mathbf{x}, \mathbf{w})$
- $\mathbf{x} = \{x_1, x_2, \ldots, x_{N_x}\}$, $x_i \in \{0,1\}$: inputs
- $\mathbf{w} = \{w_1, w_2, \ldots, w_{N_x}\}$, $w_i \in \{0,1\}$: weights of synaptic connections
- $f_B$: Boolean transfer function
- $y \in \{0,1\}$: output signal
[Figure: general structure of a Boolean neuron]
Advantages of the BN:
- significantly faster calculation,
- reduced memory requirements,
- possibility to map the BN into a single CLB of an FPGA.
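One way to see why a Boolean neuron can fit into a lookup table: once the weights are fixed, the transfer function depends only on the inputs, so it can be tabulated. The sketch below illustrates this under stated assumptions. The transfer function `example_f_b` (parity of the weighted inputs) is purely hypothetical and is not the one used in the presentation; the helper names are illustrative as well.

```python
from itertools import product


def example_f_b(x, w):
    """Hypothetical Boolean transfer function: parity of (x_i AND w_i)."""
    acc = 0
    for xi, wi in zip(x, w):
        acc ^= xi & wi
    return acc


def lut_contents(f_b, w):
    """Tabulate f_b for fixed weights w: one entry per input combination."""
    return [f_b(x, w) for x in product((0, 1), repeat=len(w))]


def lut_eval(table, x):
    """Look up the output for input vector x (x[0] is the most significant bit)."""
    index = 0
    for bit in x:
        index = (index << 1) | bit
    return table[index]


w = (1, 0, 1, 1)                      # fixed weights, chosen arbitrarily
table = lut_contents(example_f_b, w)  # 16 entries for 4 inputs
```

With four inputs the table has 16 entries, which is the capacity of one 4-input LUT; evaluating the neuron then reduces to a single table lookup.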

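Boolean Neural Network: Structure
[Figure: network with inputs $x_1, x_2, \ldots, x_{N_x}$, hidden neurons $k_1, k_2, \ldots, k_{N_k}$ and outputs $y_1, y_2, \ldots, y_{N_y}$]
[Figure: Boolean neuron in a LUT of a CLB, with inputs $x_1 \ldots x_4$, weight coefficients $w_1 \ldots w_4$, transfer function $f_B$ and output $y = f_B(\mathbf{x}, \mathbf{w})$]
Limits on the number of neuron inputs $N$ that the training algorithm must respect, by target resource:
- LUT: $N \le 4$
- Slice: $N \le 5$
- CLB: $N \le 6$
Example: $N_x = 4$ fits one LUT.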

Boolean Neural Networks: Mapping of BNN to FPGA
Each Boolean neuron (BN) is mapped to one LUT: BN 1 to LUT 1, BN 2 to LUT 2, ..., BN 6 to LUT 6.

UML Models: Example
[Figure: structure of the BNN with inputs x1, x2, x3, hidden neurons k1 ... k4 and outputs y0 ... y9]

x1 x2 x3 | k1 k2 k3 k4
 0  0  0 |  0  0  1  0
 0  0  1 |  1  0  1  0
 0  1  0 |  0  0  0  1
 0  1  1 |  1  0  0  0
 1  0  0 |  0  1  0  0
 1  0  1 |  1  0  0  1
 1  1  0 |  1  0  0  0
 1  1  1 |  0  0  0  1

   | y0 y1 y2 y3 y4 y5 y6 y7 y8 y9
k1 |  1  1  0  1  0  1  1  0  0  0
k2 |  0  1  0  0  1  0  1  1  1  0
k3 |  0  0  1  1  0  0  1  1  1  0
k4 |  0  0  1  0  1  1  0  0  1  1
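The hidden layer of this example can be sketched as four lookup tables. The code below assumes the flattened truth table on this slide reads as eight rows of (x1, x2, x3, k1, k2, k3, k4), with x1 as the most significant input bit; the names `K_LUTS`, `hidden_layer` and `lut_init_value` are hypothetical.

```python
# Output column of each hidden neuron, indexed by the 3-bit number x1 x2 x3.
K_LUTS = {
    "k1": (0, 1, 0, 1, 0, 1, 1, 0),
    "k2": (0, 0, 0, 0, 1, 0, 0, 0),
    "k3": (1, 1, 0, 0, 0, 0, 0, 0),
    "k4": (0, 0, 1, 0, 0, 1, 0, 1),
}


def hidden_layer(x1, x2, x3):
    """Return (k1, k2, k3, k4) for one input vector via table lookup."""
    index = (x1 << 2) | (x2 << 1) | x3
    return tuple(K_LUTS[name][index] for name in ("k1", "k2", "k3", "k4"))


def lut_init_value(bits):
    """Pack a truth-table column into an integer (bit i = output for index i)."""
    value = 0
    for i, bit in enumerate(bits):
        value |= bit << i
    return value
```

Each column uses 8 of the 16 entries of a 4-input LUT, so every hidden neuron of this example fits into one LUT with an input to spare.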

UML Models: Design Model

Class Main <<focus>>
  +create() : Main
  +destroy() : void
  +main() : int

Association calculate: Main (role app, multiplicity 1) to Bnn (role bnn, multiplicity 1)

Class Bnn <<auxiliary>>
  +a : boolean
  +b : boolean
  +c : boolean
  +y00 : boolean
  +y01 : boolean
  +create() : Bnn
  +destroy() : void
  +k1() : boolean
  +k2() : boolean
  +...()
  +y0() : boolean
  +y1() : boolean
  +...()
  +calculate() : boolean
  +init_x( x : boolean[] ) : void
  +get_y() : boolean[]

Method bodies attached to the diagram:

Main::main()
  boolean[] x=new boolean[nx];
  boolean[] y=new boolean[ny];
  Bnn net = new Bnn();
  net.init_x(x);
  if(net.calculate()) { y=net.get_y(); }
  destroy net;
  destroy y;
  destroy x;
  return 0;

Bnn::init_x()
  a=inputs[0]; b=inputs[1]; c=inputs[2];

Bnn, a hidden-neuron (k) method
  return (!a&&!c || a&&!b&&c || a&&b&&!c);

Bnn, an output-neuron (y) method
  return k01 || k02;

Bnn::calculate()
  k01=k1(); k02=k2(); ...
  y00=y0(); y01=y1(); ... y09=y9();
  return true;

Bnn::get_y()
  y[0]=y00; y[1]=y01; ... y[9]=y09;
  return y;

UML Models: Deployment Model
Implementation platforms: C++, VHDL
Hardware platform:
- Pentium IV processor, 2.4 GHz
- Xilinx Virtex-II FPGA, 3 million gates, 100 MHz
Communication path: PCI bus, 33 MHz
Deployment diagram:
- Node h0 <<SystemMaster>> deploys the <<executable>> main.exe. It manifests the component Main, which realizes the <<focus>> class Main and is implemented on the <<ImplementationPlatform>> C++.
- Node h1 <<FPGA>> deploys the <<Configuration>> bnn.bit. It manifests the component BNN, which realizes the <<auxiliary>> class Bnn and is implemented on the <<ImplementationPlatform>> VHDL.
- h0 (role master, multiplicity 1) and h1 (role slave, multiplicity 1) are connected by the communication path.

Experiment results: Device Utilization Summary
Compilation/synthesis time: 3-5 minutes

Logic Utilization | Used (for Bnn)
# Slices          | 64 (49)
# Flip Flops      | 92 (79)
# LUTs            | 91 (56)
# IOBs            | 102

Bnn::calculate: 21 (14) LUTs, execution time: 0.200 µs

Method         | # Slices | # Flip Flops | # 4-input LUTs
Bnn::calculate |       18 |           27 |             21
Bnn::create    |        1 |            1 |              0
Bnn::destroy   |        1 |            1 |              0
Bnn::k1        |        4 |            5 |              7
Bnn::k2        |        3 |            4 |              5
Bnn::k3        |        2 |            4 |              3
Bnn::k4        |        3 |            5 |              4
Bnn::y0        |        2 |            3 |              2
Bnn::y1        |        2 |            3 |              3
Bnn::y2        |        2 |            3 |              3
Bnn::y3        |        2 |            3 |              3
Bnn::y4        |        2 |            3 |              3
Bnn::y5        |        2 |            3 |              3
Bnn::y6        |        2 |            4 |              3
Bnn::y7        |        2 |            3 |              3
Bnn::y8        |        2 |            4 |              3
Bnn::y9        |        2 |            3 |              2
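A quick tally of the per-method figures can help when reading this slide. The sketch below sums the three columns and converts the reported execution time into clock cycles, assuming the 100 MHz FPGA clock from the deployment model applies to Bnn::calculate. The slide does not state how the per-method rows relate to the summary figures, so the comparison is only indicative; all names are hypothetical.

```python
# (slices, flip-flops, 4-input LUTs) per method, copied from the slide.
PER_METHOD = {
    "Bnn::calculate": (18, 27, 21),
    "Bnn::create": (1, 1, 0),
    "Bnn::destroy": (1, 1, 0),
    "Bnn::k1": (4, 5, 7),
    "Bnn::k2": (3, 4, 5),
    "Bnn::k3": (2, 4, 3),
    "Bnn::k4": (3, 5, 4),
    "Bnn::y0": (2, 3, 2),
    "Bnn::y1": (2, 3, 3),
    "Bnn::y2": (2, 3, 3),
    "Bnn::y3": (2, 3, 3),
    "Bnn::y4": (2, 3, 3),
    "Bnn::y5": (2, 3, 3),
    "Bnn::y6": (2, 4, 3),
    "Bnn::y7": (2, 3, 3),
    "Bnn::y8": (2, 4, 3),
    "Bnn::y9": (2, 3, 2),
}

# Column-wise sums over all methods.
total_slices, total_ffs, total_luts = (sum(col) for col in zip(*PER_METHOD.values()))


def clock_cycles(time_ns, clock_mhz):
    """Convert a duration in nanoseconds to clock cycles at clock_mhz."""
    return time_ns * clock_mhz / 1000


cycles = clock_cycles(200, 100)  # 0.200 µs at an assumed 100 MHz clock
```

The sums come to 52 slices, 79 flip-flops and 68 LUTs. The flip-flop sum equals the parenthesized summary value (79), while the slice and LUT sums differ from 49 and 56. Under the 100 MHz assumption, 0.200 µs corresponds to 20 clock cycles.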

Experiment results
[Figure: technology schematic of Bnn::k4()]

Conclusion: Results
(1) UML-based hardware/software co-design of Boolean neural networks,
(2) reduction of the number of configurable logic blocks (CLBs) required to realize a Boolean neuron,
(3) a Boolean neuron can be mapped directly to a lookup table (LUT) and a configurable logic block (CLB) of an FPGA,
(4) efficient FPGA implementations of BNNs in terms of performance and gate count.

Conclusion: Future work
- optimal representation of Boolean functions by BNNs,
- automated hardware/software synthesis with MOCCA and UML,
- optimization of the FPGA implementation of Boolean neural networks,
- design and development of a mapping methodology for Boolean neural networks with on-chip learning.