Genetic Improvement and Approximation: From Hardware to Software

Similar documents
Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)

ON SUITABILITY OF FPGA BASED EVOLVABLE HARDWARE SYSTEMS TO INTEGRATE RECONFIGURABLE CIRCUITS WITH HOST PROCESSING UNIT

npsolver A SAT Based Solver for Optimization Problems

Testing & Verification of Digital Circuits ECE/CS 5745/6745. Hardware Verification using Symbolic Computation

Lecture 7: NP-Complete Problems

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

Gates, Circuits, and Boolean Algebra

1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1.

Evolutionary SAT Solver (ESS)

CSE140: Midterm 1 Solution and Rubric

Lecture 8: Synchronous Digital Systems

Fault Modeling. Why model faults? Some real defects in VLSI and PCB Common fault models Stuck-at faults. Transistor faults Summary

Introducing A Cross Platform Open Source Cartesian Genetic Programming Library

NP-Completeness I. Lecture Overview Introduction: Reduction and Expressiveness

Why? A central concept in Computer Science. Algorithms are ubiquitous.

Chapter 1. NP Completeness I Introduction. By Sariel Har-Peled, December 30, Version: 1.05

Lecture 5: Gate Logic Logic Optimization

Digital Logic Design. Basics Combinational Circuits Sequential Circuits. Pu-Jen Cheng

Sequential Logic. (Materials taken from: Principles of Computer Hardware by Alan Clements )

Bounded Treewidth in Knowledge Representation and Reasoning 1

Car Insurance. Havránek, Pokorný, Tomášek

Example-driven Interconnect Synthesis for Heterogeneous Coarse-Grain Reconfigurable Logic

A Parallel Processor for Distributed Genetic Algorithm with Redundant Binary Number

exclusive-or and Binary Adder R eouven Elbaz reouven@uwaterloo.ca Office room: DC3576

Discuss the size of the instance for the minimum spanning tree problem.

Xilinx ISE. <Release Version: 10.1i> Tutorial. Department of Electrical and Computer Engineering State University of New York New Paltz

Similarity Search in a Very Large Scale Using Hadoop and HBase

Product Development Flow Including Model- Based Design and System-Level Functional Verification

Design and Development of Virtual Instrument (VI) Modules for an Introductory Digital Logic Course

IMPLEMENTATION OF BACKEND SYNTHESIS AND STATIC TIMING ANALYSIS OF PROCESSOR LOCAL BUS(PLB) PERFORMANCE MONITOR

CSE140: Components and Design Techniques for Digital Systems

Digital Systems. Syllabus 8/18/2010 1

Digital Electronics Detailed Outline

Introduction To Genetic Algorithms

Chapter 2: Boolean Algebra and Logic Gates. Boolean Algebra

FORDHAM UNIVERSITY CISC Dept. of Computer and Info. Science Spring, Lab 2. The Full-Adder

Upon completion of unit 1.1, students will be able to

Approximation Algorithms

Product Geometric Crossover for the Sudoku Puzzle

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

Improving SAT-Based Weighted MaxSAT Solvers CP 2012 (Quebec)

Chapter 1. Computation theory

International Journal of Electronics and Computer Science Engineering 1482

Reductions & NP-completeness as part of Foundations of Computer Science undergraduate course

Optimised Realistic Test Input Generation

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

Karnaugh Maps & Combinational Logic Design. ECE 152A Winter 2012

Performance Test of Local Search Algorithms Using New Types of

United States Naval Academy Electrical and Computer Engineering Department. EC262 Exam 1

IE1204 Digital Design F12: Asynchronous Sequential Circuits (Part 1)

Chapter 2 Logic Gates and Introduction to Computer Architecture

DDS. 16-bit Direct Digital Synthesizer / Periodic waveform generator Rev Key Design Features. Block Diagram. Generic Parameters.

Circuits and Boolean Expressions

Fast Matching of Binary Features

EXPERIMENT 8. Flip-Flops and Sequential Circuits

Genetic Algorithms for the Variable Ordering Problem of Binary Decision Diagrams

DESIGN OF GATE NETWORKS

ENGI 241 Experiment 5 Basic Logic Gates

Verification & Design Techniques Used in a Graduate Level VHDL Course

University of St. Thomas ENGR Digital Design 4 Credit Course Monday, Wednesday, Friday from 1:35 p.m. to 2:40 p.m. Lecture: Room OWS LL54

Architecture bits. (Chromosome) (Evolved chromosome) Downloading. Downloading PLD. GA operation Architecture bits

Chapter 7 Memory and Programmable Logic

Hardware Implementations of RSA Using Fast Montgomery Multiplications. ECE 645 Prof. Gaj Mike Koontz and Ryon Sumner

Applied Algorithm Design Lecture 5

AUTOMATIC ADJUSTMENT FOR LASER SYSTEMS USING A STOCHASTIC BINARY SEARCH ALGORITHM TO COPE WITH NOISY SENSING DATA

6 Creating the Animation

Big Data Analytics CSCI 4030

Programmable Logic Controllers Definition. Programmable Logic Controllers History

NEW adder cells are useful for designing larger circuits despite increase in transistor count by four per cell.

The programme will be delivered by the faculty of respective NITs with additional support from practicing executives from the IT & ITES industry.

Systems I: Computer Organization and Architecture

Design Example: Counters. Design Example: Counters. 3-Bit Binary Counter. 3-Bit Binary Counter. Other useful counters:

Technical Aspects of Creating and Assessing a Learning Environment in Digital Electronics for High School Students

Binary Adders: Half Adders and Full Adders

LFSR BASED COUNTERS AVINASH AJANE, B.E. A technical report submitted to the Graduate School. in partial fulfillment of the requirements

BOOLEAN ALGEBRA & LOGIC GATES

Mathematics (Project Maths)

Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye MSRC

Lab 1: Full Adder 0.0

Efficient Load Balancing by Adaptive Bypasses for the Migration on the Internet

JUST THE MATHS UNIT NUMBER 8.5. VECTORS 5 (Vector equations of straight lines) A.J.Hobson

ISSN: ISO 9001:2008 Certified International Journal of Engineering Science and Innovative Technology (IJESIT) Volume 2, Issue 3, May 2013

Let s put together a Manual Processor

A bachelor of science degree in electrical engineering with a cumulative undergraduate GPA of at least 3.0 on a 4.0 scale

8-Bit Flash Microcontroller for Smart Cards. AT89SCXXXXA Summary. Features. Description. Complete datasheet available under NDA

MuACOsm A New Mutation-Based Ant Colony Optimization Algorithm for Learning Finite-State Machines

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

Guessing Game: NP-Complete?

Detection and Restoration of Vertical Non-linear Scratches in Digitized Film Sequences

UNSUPERVISED MACHINE LEARNING TECHNIQUES IN GENOMICS

MSME TOOL ROOM, HYDERABAD CENTRAL INSTITUTE OF TOOL DESIGN

Acommonvericationproblemforhardwaredesignsistodetermineifevery

Hardware and Software

Online Algorithms: Learning & Optimization with No Regret.

Contents. System Development Models and Methods. Design Abstraction and Views. Synthesis. Control/Data-Flow Models. System Synthesis Models

ABB i-bus EIB Logic Module LM/S 1.1

Cartesian Genetic Programming for Trading: A Preliminary Investigation

Life Cycle of a Memory Request. Ring Example: 2 requests for lock 17

MODIFIED RECONSTRUCTABILITY ANALYSIS FOR MANY-VALUED FUNCTIONS AND RELATIONS

Transcription:

Genetic Improvement and Approximation: From Hardware to Software Lukáš Sekanina Brno University of Technology, Faculty of Information Technology Brno, Czech Republic sekanina@fit.vutbr.cz CREST COW 45, London, January 25-26, 2016

Genetic improvement and genetic approximation error acceptable error increase genetic approximation initial solution power genetic improvement 2

Motivation for Approximate computing Variability of circuit parameters for technology nodes < 45 nm is very HIGH Low-power computing, but with unreliable components! High performance & low power computing is requested. Many applications are errorresilient - the error can be traded for energy savings or performance. Error as a design metric! Functional approximation by means of Genetic Improvement Search for "approximate computing" in articles by Google Scholar (Jan, 2016) 3

Outline Genetic improvement of complex digital circuits Genetic approximation of complex digital circuits Genetic approximation of elementary SW functions for microcontrollers: Median Conclusions 4

HDL Hardware Description Languages.i 14 alu4.pla.o 8.ilb i_0_ i_1_ i_2_ i_3_ i_4_ i_5_ i_6_ i_7_ i_8_ i_9_ i_10_ i_11_ i_12_ i_13_.ob o_0_ o_1_ o_2_ o_3_ o_4_ o_5_ o_6_ o_7_.p 1028 1----1---1---- 10000000 Truth table 1----0----1--- 10000000 1--------11--- 10000000-1----1--1---- 10000000-1----0---1--- 10000000-1-------11--- 10000000 --1----1-1---- 10000000 etc..model./pla/alu4 alu4.blif.inputs i_0_ i_10_ i_11_ i_12_ i_13_ i_1_ i_2_ i_3_ i_4_ i_5_ i_6_ i_7_ i_8_ i_9_.outputs o_0_ o_1_ o_2_ o_3_ o_4_ o_5_ o_6_ o_7_.gate NAND A=i_2_ B=_net203568 O=_net196167 Netlist.gate NAND A=i_11_ B=_net203428 O=_net196385.gate OR A=_net204803 B=_net200095 O=o_5_.gate NOR A=i_0_ B=i_12_ O=_net196891.gate NAND A=_net203823 B=_net196167 O=_net198561.gate NAND A=i_1_ B=_net198561 O=_net198562 etc. VHDL 5

Digital circuit design with Cartesian GP [Miller 1999] Example: CGP parameters n r =3 (#rows) n c = 3 (#columns) n i = 3 (#inputs) n o = 2 (#outputs) n a = 2 (max. arity) L = 3 (level-back parameter) = {NAND (0), NOR (1), XOR (2), AND (3), OR (4), NOT (5) } Mutation-based (1+ ) EA NETLIST = GENOTYPE Typical fitness function (circuit functionality): K Number of test vectors f = yi wi i=1 Desired response Circuit response K = 2 inputs for combinational circuits. Max: ~20 inputs Max: ~ tens of gates No scalable!!! 6

Functionality: Two types of specifications Complete specifications A correct output value is requested for every possible input (e.g. for arithmetic circuits) 2 n test cases used to evaluate an n-input circuit Impossible to improve the functionality of a correct solution, only non-functional parameters can be improved. Incomplete specifications It is difficult to define correct output values for all possible inputs, e.g. filters, classifiers, predictors, A circuit with an acceptable error is sought using a training set of k test cases, k << 2 n GI can improve functional and non-functional parameters. Error = 0 7

Genetic improvement for complete specifications Original circuit C (BLIF) Conventional synthesis (ABC, SIS ) Optimized circuit C1 (= a seed for the initial population; reference circuit) CGP Even more optimized C1 SAT solver is used to decide whether candidate circuit C i and reference circuit C1 are functionally equivalent. If so, then fitness(c i ) = the number of gates in C i ; Otherwise: discard C i. [Vašíček, Sekanina: Genetic Programming and Evolvable Machines 12(3), 2011] 8

Creating an auxiliary circuit G C1 (parent):? C2 (offspring): G: a b a b xor 0 0 0 0 1 1 1 0 1 1 1 0 If C1 and C2 are not functionally equivalent then there is at least one assignment to the inputs for which the output of G is 1. 9

Tseitin transform to create CNF for circuit G 1 2 3 4 5 8 9 6 10 7 Example: y = not (x) CNF formula g(x, y) = 1 if the predicate y = OP(x) holds true 11 x y g 0 0 0 0 1 1 12 1 0 1 13 1 1 0 g = (~x ~y)(x y) 10

SAT solver in action 1 2 3 4 5 6 7 11 13 10 12 8 9 SAT solver: MiniSAT variables: 13, clauses: 30, time elapsed: 0.03ms result: SATISFIABLE / NONEQUIVALENT model / counter example: 0011111101011 11

Experiment 1: Minimization of the number of gates CGP + SAT solver: ES(1+1), 1 mut/chrom, seed: SIS, Gate set: {AND, OR, NOT, NAND, NOR, XOR} 100 runs (12 hours each) Average area improvement: 25% ABC, SIS conventional open academic synthesis tools very fast (seconds, minutes) C1, C2, C3 commercial synthesis tools [Vašíček, Sekanina: DATE 2011] 12

Experiment 1: Convergence curves max mean min More time better results in the case of CGP Current circuit synthesis and optimization tools provide far from optimum circuits! 13

Experiment 2: SAT solving combined with simulation SAT solver is called only if the circuit simulation performed for a small subset of vectors has indicated no error in the candidate circuit. - the number of gates (optimized by ABC) 100 test circuits 100 combinational circuits ( 15 inputs) - IWLS2005, MCNC, QUIP benchmarks Heavily optimized by ABC 1: alcom (N G = 106 gates; N PI = 15 inputs; N PO = 38 outputs) 100: ac97ctrl (N G = 16,158; N PI = 2,176; N PO = 2,136) 14

Experiment 2: SAT solving combined with simulation CGP + SAT solver + circuit simulation Y-axis: Gate reduction w.r.t. ABC after 15 minutes, 34% on average Gate reduction w.r.t. ABC after 24 hours [Vašíček Z.: EuroGP 2015] 15

Genetic approximation Original circuit Quality metric GP Approximate circuit Relaxed equivalence checking is needed for approximate computing What is the distance between functionality of two circuits? How to calculate this distance for complex circuits when a simulation using a data set is not accurate? The Hamming distance can be obtained using Binary Decision Diagrams for (many useful) complex circuits in a short time! 16

Binary Decision Diagrams (BDD) f = ac + bc a b c f 0 0 0 0 0 0 1 0 0 1 0 0 0 1 1 1 1 0 0 0 1 0 1 1 1 1 0 0 1 1 1 1 Truth table c b f a 0 0 0 1 0 1 0 1 Decision tree 1 edge 0 edge b c c c f= (a+b)c b 0 a c 1 Reduced Ordered BDD (ROBDD) Operations over (RO)BDDs implemented by many libraries, e.g. Buddy. 17

Hamming distance using ROBDD SatCount(z 1 ) = 2 SatCount(z 2 ) = 0 Create ROBDD for the parent circuit C A, the offspring circuit C B and the XOR gates. The error is the average Hamming distance SatCount(z i ) 18

Circuit approximation: Example error/area only error/delay only single run global Pareto front Clmb (bus interface): 46 inputs, 33 outputs Original clmb: 641 gates, 19 logic levels, BDD = 6966, BDD opt = 627 (SIFT in 2.3 s) Optimized by CGP (no error allowed): Best: 410 gates, 12 logic levels -- in 29 minutes (2.9 x 10 6 generations) Median: 442 gates, 13 logic levels Properly optimize before doing approximations! 19

Detailed error analysis for itc_b10 circuit Z. Vašíček and L. Sekanina. Evolutionary Design of Complex Approximate Combinational Circuits. Genetic Programming & Evolvable Machines, 2016, in press. 20

The median function corrupted image (10% pixels, impulse noise) filtered image (9-input median filter) original 21

Median as a comparator network pixelvalue opt_med9 (pixelvalue * p) { } PIX_SORT(p[1], p[2]) ; PIX_SORT(p[4], p[5]) ; PIX_SORT(p[7], p[8]) ; PIX_SORT(p[0], p[1]) ; PIX_SORT(p[3], p[4]) ; PIX_SORT(p[6], p[7]) ; PIX_SORT(p[1], p[2]) ; PIX_SORT(p[4], p[5]) ; PIX_SORT(p[7], p[8]) ; PIX_SORT(p[0], p[3]) ; PIX_SORT(p[5], p[8]) ; PIX_SORT(p[4], p[7]) ; PIX_SORT(p[3], p[6]) ; PIX_SORT(p[1], p[4]) ; PIX_SORT(p[2], p[5]) ; PIX_SORT(p[4], p[7]) ; PIX_SORT(p[4], p[2]) ; PIX_SORT(p[6], p[4]) ; PIX_SORT(p[4], p[2]) ; return(p[4]) ; #define PIX_SORT(a,b) { if ((a)>(b)) PIX_SWAP((a),(b)); } Approximations conducted by means of CGP (and training images): 20% instructions 60% instructions 100 % instructions 22

Approximate 9-median as SW for microcontrollers 34.9% error prob., max. error dist. 2 52% power reduction 4.8% error prob., max. error dist. 1 21% power reduction fully-working median ops = operations in the source code. #define PIX_SORT(a,b) { if ((a)>(b)) PIX_SWAP((a),(b)); } V. Mrazek, Z. Vasicek and L. Sekanina. GECCO GI Workshop, 2015 23

Conclusions Genetic improvement and genetic approximation introduced in the context of circuits described as netlists. Complete and incomplete specifications considered. The notion of relaxed equivalence checking was introduced. Future work Efficient methods of relaxed equivalence checking SAT-based, BDD-based, pseudo-boolean polynomial representation-based etc. Efficient search methods exploiting properties of a particular relaxed equivalence checking method Real-world case studies 24

Acknowledgement EHW group at Brno University of Technology Zdeněk Vašíček, Michal Bidlo, Roland Dobai Michaela Šikulová, Radek Hrbáček, Vojtěch Mrázek, David Grochol, and other students Research projects IT4Innovations Centre of Excellence National supercomputing center Advanced Methods for Evolutionary Design of Complex Digital Circuits, 2014 2016 (Czech Science Foundation) Relaxed equivalence checking for approximate computing, 2016 2018 (Czech Science Foundation) 25

Thank you for your attention! Lukáš Sekanina Brno University of Technology, Faculty of Information Technology Brno, Czech Republic sekanina@fit.vutbr.cz