Reconfigurable FPGA Inter-Connect For Optimized High Speed DNA Sequencing

Similar documents
Knuth-Morris-Pratt Algorithm

DNA Mapping/Alignment. Team: I Thought You GNU? Lars Olsen, Venkata Aditya Kovuri, Nick Merowsky

CHAPTER 5 FINITE STATE MACHINE FOR LOOKUP ENGINE

Configurable String Matching Hardware for Speeding up Intrusion Detection. Monther Aldwairi*, Thomas Conte, Paul Franzon

DRAFT Gigabit network intrusion detection systems

Lecture 4: Exact string searching algorithms. Exact string search algorithms. Definitions. Exact string searching or matching

Fast string matching

Binary Coded Web Access Pattern Tree in Education Domain

Face Reorganization Method for Cyber Security System in Financial Sector Using FPGA Implementation

Image Compression through DCT and Huffman Coding Technique

BITWISE OPTIMISED CAM FOR NETWORK INTRUSION DETECTION SYSTEMS. Sherif Yusuf and Wayne Luk

How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm

ON SUITABILITY OF FPGA BASED EVOLVABLE HARDWARE SYSTEMS TO INTEGRATE RECONFIGURABLE CIRCUITS WITH HOST PROCESSING UNIT

Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding

Compiling PCRE to FPGA for Accelerating SNORT IDS

FPGA Implementation of an Extended Binary GCD Algorithm for Systolic Reduction of Rational Numbers

Implementation and Design of AES S-Box on FPGA

Floating Point Fused Add-Subtract and Fused Dot-Product Units

IP address lookup for Internet routers using cache routing table

Original-page small file oriented EXT3 file storage system

Systolic Computing. Fundamentals

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip

An efficient matching algorithm for encoded DNA sequences and binary strings

UNITE: Uniform hardware-based Network Intrusion detection Engine

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com

Improved Single and Multiple Approximate String Matching

Storage Optimization in Cloud Environment using Compression Algorithm

Attack graph analysis using parallel algorithm

Hadoop Technology for Flow Analysis of the Internet Traffic

Index Terms Domain name, Firewall, Packet, Phishing, URL.

So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

Reconfigurable Low Area Complexity Filter Bank Architecture for Software Defined Radio

Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices

A Comparison of Dictionary Implementations

CS 2112 Spring Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions

How To Use Neural Networks In Data Mining

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

FPGA Implementation of IP Packet Segmentation and Reassembly in Internet Router*

A Fast Pattern Matching Algorithm with Two Sliding Windows (TSW)

IMPLEMENTATION OF FPGA CARD IN CONTENT FILTERING SOLUTIONS FOR SECURING COMPUTER NETWORKS. Received May 2010; accepted July 2010

Memory Systems. Static Random Access Memory (SRAM) Cell

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

How To Fix A 3 Bit Error In Data From A Data Point To A Bit Code (Data Point) With A Power Source (Data Source) And A Power Cell (Power Source)

Topics. Introduction. Java History CS 146. Introduction to Programming and Algorithms Module 1. Module Objectives

FPGA-based MapReduce Framework for Machine Learning

Implementation of Modified Booth Algorithm (Radix 4) and its Comparison with Booth Algorithm (Radix-2)

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

respectively. Section IX concludes the paper.

Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: ext: Sequential Circuit

MAXIMIZING RESTORABLE THROUGHPUT IN MPLS NETWORKS

Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)

A FAST STRING MATCHING ALGORITHM

SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY

Wan Accelerators: Optimizing Network Traffic with Compression. Bartosz Agas, Marvin Germar & Christopher Tran

HY345 Operating Systems

Network Security Algorithms

Extending the Power of FPGAs. Salil Raje, Xilinx

A Novel Switch Mechanism for Load Balancing in Public Cloud

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

Experimental Comparison of Set Intersection Algorithms for Inverted Indexing

what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored?

COMPARISON OF ALGORITHMS FOR DETECTING FIREWALL POLICY ANOMALIES

Automata Designs for Data Encryption with AES using the Micron Automata Processor

Big Data & Scripting Part II Streaming Algorithms

Software-Defined Traffic Measurement with OpenSketch

ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU

Hardware Pattern Matching for Network Traffic Analysis in Gigabit Environments

Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students

Parallel Compression and Decompression of DNA Sequence Reads in FASTQ Format

Physical Data Organization

Xeon+FPGA Platform for the Data Center

An Open Architecture through Nanocomputing

Spam Detection Using Customized SimHash Function

Acceleration for Personalized Medicine Big Data Applications

Distance Degree Sequences for Network Analysis

Hypertable Architecture Overview

Fast Sequential Summation Algorithms Using Augmented Data Structures

Binary search tree with SIMD bandwidth optimization using SSE

On Demand Loading of Code in MMUless Embedded System

Accelerate Cloud Computing with the Xilinx Zynq SoC

ProTrack: A Simple Provenance-tracking Filesystem

FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm

Analysis of Compression Algorithms for Program Data

Exploiting Stateful Inspection of Network Security in Reconfigurable Hardware

Performance Oriented Management System for Reconfigurable Network Appliances

Open Flow Controller and Switch Datasheet

In-Memory Columnar Databases HyPer. Arto Kärki University of Helsinki

ENGI E1112 Departmental Project Report: Computer Science/Computer Engineering

Transcription:

Reconfigurable FPGA Inter-Connect For Optimized High Speed DNA Sequencing 1 A.Nandhini, 2 C.Ramalingam, 3 N.Maheswari, 4 N.Krishnakumar, 5 A.Surendar 1,2,3,4 UG Students, 5 Assistant Professor K.S.R College of Engineering, Tamil Nadu-637 215,India Abstract In the world of expanding set of biological species finding repetitive structures in genomes and proteins is important to understand their biological functions. If the number of maximal repeat increases then finding those structures becomes tedious. In existing method Burrows Wheeler Transform and Wavelet Coding the major disadvantage is time consumption. And it also needs huge computer space for processing the structures. In our method we are going to propose a time, memory and speed optimized algorithms for efficient repetitive finding in genomes and proteins. Keywords: Bioinformatics, pattern matching, sequence alignment, FPGA implementation, INTRODUCTION One of the well-known features of DNA is its repetitive structures.many existing methods proposed different data compression formats to reduce the space consumption. Even though the method saves memory, time and speed efficiency cannot be obtained. To obtain optimization the bioinformatics algorithms such as o Bloom Filter o Content-Addressable Memory o Knuth-Morris-Pratt Algorithm o Aho-Corasick Algorithm are used. BLOOM FILTER Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. This compact representation is the payoff for allowing a small rate of false positives in membership queries; that is, queries might incorrectly recognize an element as member of the set which can be made negligible by the intensive design effort. BUILD HASH FUNCTIONS AND HASH TABLE FILL THE HASH TABLE WITH ALL ZEROS TRAIN FILTER WITH MALICIOUS STRINGS ENTER DATA TO BE SEARCHED HASH TABLE IS FILLED BASED ON THE PRESENCE OF DATA STRING FOUND Flow of Bloom Filter Volume 02 No.4, Issue: 01 Page 36

It consists of number of hash tables and hash functions that easily store and handle the incoming strings. Each hash table entry stores only a single bit of data, thus a hash table of size M would be made up of M entries, each of size one bit. A bloom filter must be trained with a dictionary of malicious strings before it can be used in a system. All hash table entries are initialized to 0 before training s. During the training phase malicious strings are fed one at a time to the bloom filter. Each of the k hash functions then acts on every incoming dictionary string and computes an output in the range 0 to M 1.Entries corresponding to these k outputs are set to 1 in the hash table. The bloom filter reports the current string in its window as a member of the dictionary on which the bloom filter was trained. If a non dictionary input string is such that it hashes to k hash table entries, each of which was set to 1 by one or more dictionary elements during the training phase, the bloom filter will erroneously report this string to be a member of the dictionary on which the bloom filter was trained. The bloom filter thus reports a false positive in this case. Though a bloom filter may occasionally report false positives, it does not allow for false negatives. Even though a bloom filter may sometimes report a non member to be a part of the dictionary set, it will never happen that a true member goes unreported. Thus, the working of bloom filter can be explained in following steps 1. Training the bloom filter with various strings 2. Defining the hash functions and building hash tables 3. Testing the incoming strings for the finding of given string with the help of hash functions 4. Determination of result using the behavior of the filter The part to hash the values to the hash table to the filter is as hash1:=hash(present_state(1),d); hash2:=hash(present_state(2),d); a<=hash1; b<=hash2; y(hash1)<='1'; y(hash2)<='1'; if(hashd_bitvectr(hash1)='1' and hashd_bitvectr(hash2)='1' )then state<="positive"; shft_out :='0'; match<=input(pos-1 to pos+4); --elsif(hashd_bitvectr(hash1)='1' or hashd_bitvectr(hash2)='1' )then --- state<="fals_pos"; -- shft_out :='1'; else state<="negative"; shft_out :='1'; end if; Output wave form of bloom filter CONTENT ADDRESSABLE MEMORY Manuscript of CAM was received on July 17, 1987; revised October 5, 1987.A content-addressable memory (CAM) is a high speed matching unit because it has parallel matching capability. It speeds up the data searching and pattern matching.cams are storage devices that allow its contents to be accessible on the basis of a match between a specified key and the contents, a process called "content addressing". CAM architectures fall between two extremes: Volume 02 No.4, Issue: 01 Page 37

the bit serial CAM and the fully parallel CAM. In the bit serial CAM, the matching logic is associated with one bit position, and shared among all the bits in a word, in effect matching one bit at-a-time simultaneously in all the CAM words. In the fully parallel CAM, each word has its own bitparallel matching logic, allowing that match of all words to process. Here initially the device is trained with certain 8 bit binary database. And then the input binary parameter is given. the number of 1 s in the parameter is extracted by parameter extraction then it is stored in the parameter memory. Then 1 s in the input parameter is compare with the trained database and produce a required result else next input parameter is given. 4. It is stored in memory 5. Compare 1 s in the given input with database. 6. Finally it produced the required input if it is matched. The code to train CAM is given as process(clk,fail,sram_out,ss) if rising_edge(clk) then tcam_in.input<=input; tcam_in.current_state<=sram_out; end if; end process; x2:tcam port map(tcam_in,tcam_out); x3:sram port map(tcam_out,sram_out,fail); process(sram_out) if sram_out=2 then output<=" pattern he matched "; elsif sram_out=9 then output<="pattern hers matched"; elsif sram_out=7 then output<="pattern his matched "; elsif sram_out=5 then output<="pattern she matched "; end if end process; Pre-computation block of CAM Thus, the working of CAM filter can be expressed in following steps. 1. The device is trained with database. 2. Input is given. 3. The number of 1 s and 0 s is extracted by parameter extraction. Output of cam Volume 02 No.4, Issue: 01 Page 38

algorithm is efficient because it minimizes the total number of comparisons of the pattern against the input string. The running time of the KMP algorithm is optimal (O(m + n)), which is very fast. The algorithm never needs to move backwards in the input text T. It makes the algorithm good for processing very large files. Output waveform of sram String Matching Of Knuth-Morris-Pratt The critical part of CAM pre-computation can be coded as Output waveform of tcam KNUTH-MORRIS-PRATT ALGORITHM This algorithm was conceived by Donald Knuth and Vaughan Pratt and independently by James H.Morris in 1977. Knuth, Morris and Pratt discovered first linear time string-matching algorithm by analysis of the naïve algorithm. It keeps the information that naive approach wasted gathered during the scan of the text. By avoiding this waste of information, it achieves a running time of O(m + n). The implementation of Knuth-Morris-Pratt m length[p] a[1] 0 k 0 for q 2 to m do while k > 0 and p[k + 1], p[q] do k a[k] end while if p[k + 1] = p[q] then k k + 1 end if a[q] k end for return u Here a = u In this KMP algorithm the device is trained with certain character parameter. Compares from left to right. The first character of input parameter is matched with trained character if it is matched it shift to the next character else the first bit is compare with the second one. Volume 02 No.4, Issue: 01 Page 39

Thus, the working of KMP filter can be expressed in following steps. 1. Best known for linear time for exact matching. 2. Compares from left to right. 3. Shifts more than one position. 4. Preprocessing approach of Pattern to avoid trivial comparisions. 5. Avoids re-computing matches. AHO-CORASICK ALGORITHM Aho-Corasick algorithm is a dictionary matching algorithm that searches for elements of a finite set of strings in the input text, developed by Alfred V. Aho and Margaret J. Corasick. Since it locates all patterns in one time, the time complexity of the algorithm is proportional to sum of the length of the patterns, length of the input text and the number of matches. In this algorithm, a trie with suffix tree-like set of links from is established from each node representing a string to the node corresponding to the longest proper suffix. Since it also consists of links from each node to the longest suffix node that connect to a match string, all of the matches can be traversed by going along the resulting linked list. The trie is utilized at runtime to keep track of the longest match and the suffix links are used to make sure the computation is proportional to the length of the input. For every link along the dictionary suffix linked list and every node in the dictionary located, a match is found. Since most of the time, the pattern database is known ahead, program can be created to build the trie, compile it and save it for later use. In this case, the computational complexity in the runtime is proportional to the sum of the length of the inputs and the number of matched entries. Figure shows an example of data structure made up from a couple of strings. Each row represents a node in the trie while each column indicates the distinct order of characters from root to the node. Dictionary In every step, the current node will try to find its child recursively if the suffix child does not exist until it reach the root node. Steps taken when scanning abccab are shown below Simulation on an input text Since there may be two or more dictionary entries at a character location in the input text, more than one dictionary suffix link may need to be followed. Volume 02 No.4, Issue: 01 Page 40

process(clk,g,a) if rising_edge(clk) then case g is when state_0 => match_vector<="00"; pattern<='0'; if a(1)='h' then g<=state_1; elsif a(1)='t' then g<=state_3; else -- g<=state_0; end if; Flow of Aho-Corasick String Matching The working of Aho-Corasick can be explained as follows. 1. The data pattern to be analyzed is built as a dictionary. 2. The pattern to be find is given as input. 3. The node built based on suffix matching. 4. The input for next string is taken from previous node. 5. Hence all the pattern is matched at same time. The Critical part of code to train is given as, architecture behave of aho_corasick is RESULT output waveform of Aho-Corasick The below table shows the requirements of optimized filters. type state_node is (state_0,state_1,state_2,state_3,state_4,state_5,state_6,stat e_7,state_8,state_9); signal g:state_node; Volume 02 No.4, Issue: 01 Page 41

PERFORMANCE RESULTS PARAME TERS AHO BLO OM CAM TCA M SRA M IOs 12 321 169 65 72 CELL 111 9004 221 41 325 USAGE FF/LATC 13 104 20 - - HES IO 12 321 169 37 72 BUFFERS TOTAL 3s 10s 5s 3s 4s REAL TIME TOTAL 3.23s 10.2s 5.4s 2.93s 4.28s CPU TIME TOTAL MEMOR Y USED 23487 2Kb 26239 Kb 24408 Kb 23212 0Kb 23346 4Kb FUTURE WORK The optimized algorithms are implemented in reconfigurable FPGA platform. The FPGA platform is in Altium Nanoboard 3000- Xilinx Spartan. The above algorithms are analyzed to be efficient from their performance. When it is implemented in a reconfigurable platform it will work in more optimized way that produces accurate outputs. The NanoBoard 3000 is a programmable design environment so it will be efficient for analysis. REFERENCES 1. Grigorios Chrysos, Agathoklis Papadopoulos, Geore Petihakis, Opportunities from the Use of FPGAs as Platforms for Bioinformatics Algorithms, Twelfth IEEE International Conference on Bioinformatics and Bioengineering (BIBE), November 2012. 2. Timothy Oliver, Bertil Schmidt, Douglas Maskell, Hyper Customized Processors for Bio- Sequence Database Scanning on FPGAs, FPGA 05, Monterey, CA, 2005. 3.. P. H. Singer, Photomask And Reticle Defect Detection, Semiconductor Int., Py; 66-73, Apr. 1985. 4. P. Burggraaf, Wafer Inspection For Defect, Semiconductor Int., Pp.56-65, July 1985. 5. J. F. Jarvis, A Method For Automating The Visual Inspection Of Printed Wiring Boards, IEEE Trans. Pattern Anal. Machine Intell.,M. Ejiri, M. Mese, T. Uno, And S. Ikeda, A Process For Detecting Defects In Complicated Patterns, Computer Graphics And Image Processing, Pp. 326-339, 1913. 6. Wikipedia. [Online]. Www.Wikipedia.Org 7. Www.Asic-World.Com_Code_Hdl_Models 8. M. O_Guzhan Ku Lekci, Jeffrey Scott Vitter, And Bojian Xu Efficient Maximal Repeat Finding Using The Burrows-Wheeler Transform And Wavelet Tree, IEEE/ACM Transactions On Computational Biology And Bioinformatics, Vol. 9, No. 2, March/April 2012 9. T. Nandha Kumar, Senior Member, IEEE, And Fabrizio Lombardi, Fellow, IEEE A Novel Heuristic Method For Application-Dependent Testing Of SRAM-Based FPGA Interconnect IEEE Transactions On Computers, Vol. 62, No. 1, January 2013 10. S.A.M. Al Junid, M.A. Haron, Z. Abd Majid, A.K. Halim, F.N.Osman, H. Hashim, "Development of Novel Data Compression Technique for Accelerate DNA Sequence Alignment Based onsmith," ems, pp.181-186, 2009 Third UKSim European Symposium on Computer Modeling and Simulation, 2009 11. Souradip Sarkar, Student Member, IEEE, Gaurav Ramesh Kulkarni, Student Member, IEEE,Partha Pratim Pande, Member, IEEE, and Ananth Kalyanaraman, Member, IEEE Network-on- Chip Hardware Accelerators for Biological Sequence Alignment IEEE Transactions On Computers, Vol. 59, No. 1, January 2010 Volume 02 No.4, Issue: 01 Page 42