Compiling PCRE to FPGA for Accelerating SNORT IDS



Similar documents
Exploiting Stateful Inspection of Network Security in Reconfigurable Hardware

DRAFT Gigabit network intrusion detection systems

BITWISE OPTIMISED CAM FOR NETWORK INTRUSION DETECTION SYSTEMS. Sherif Yusuf and Wayne Luk

Configurable String Matching Hardware for Speeding up Intrusion Detection. Monther Aldwairi*, Thomas Conte, Paul Franzon

Bricata Next Generation Intrusion Prevention System A New, Evolved Breed of Threat Mitigation

UNITE: Uniform hardware-based Network Intrusion detection Engine

MIDeA: A Multi-Parallel Intrusion Detection Architecture

A Fast Pattern-Matching Algorithm for Network Intrusion Detection System

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines

FPGA-based Multithreading for In-Memory Hash Joins

10 Gbps Line Speed Programmable Hardware for Open Source Network Applications*

Open Flow Controller and Switch Datasheet

FPGA-based MapReduce Framework for Machine Learning

THE growth and success of the Internet has made it

ANNEX. to the. Commission Delegated Regulation

Networking Virtualization Using FPGAs

FLIX: Fast Relief for Performance-Hungry Embedded Applications

respectively. Section IX concludes the paper.

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com

Hardware and Software

How To Design An Image Processing System On A Chip

CFD Implementation with In-Socket FPGA Accelerators

DNA Mapping/Alignment. Team: I Thought You GNU? Lars Olsen, Venkata Aditya Kovuri, Nick Merowsky

Enhance Service Delivery and Accelerate Financial Applications with Consolidated Market Data

Systolic Computing. Fundamentals

Hardware Components in Cybersecurity Education

Delivering 160Gbps DPI Performance on the Intel Xeon Processor E Series using HyperScan

IMPLEMENTATION OF FPGA CARD IN CONTENT FILTERING SOLUTIONS FOR SECURING COMPUTER NETWORKS. Received May 2010; accepted July 2010

Extending the Power of FPGAs. Salil Raje, Xilinx

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai Jens Onno Krah

Firewalls and IDS. Sumitha Bhandarkar James Esslinger

RAPID PROTOTYPING OF DIGITAL SYSTEMS Second Edition

5 Things You Need to Know About Deep Packet Inspection (DPI)

Aho-Corasick FSM Implementation for

How to Build a Massively Scalable Next-Generation Firewall

Architectures and Platforms

Seeking Opportunities for Hardware Acceleration in Big Data Analytics

1000Mbps Ethernet Performance Test Report

All Programmable Logic. Hans-Joachim Gelke Institute of Embedded Systems. Zürcher Fachhochschule

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

Data Center and Cloud Computing Market Landscape and Challenges

Router Architectures

Denial of Service (DOS) Testing IxChariot

Improving the Database Logging Performance of the Snort Network Intrusion Detection Sensor

7a. System-on-chip design and prototyping platforms

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM

SCAMPI Programmable hardware for network monitoring. Masaryk University

Embedded System Hardware - Processing (Part II)

Design of Digital Circuits (SS16)

Software Pipelining. for (i=1, i<100, i++) { x := A[i]; x := x+1; A[i] := x

Virtualized Security: The Next Generation of Consolidation

Management Challenge. Managing Hardware Assets. Central Processing Unit. What is a Computer System?

Lesson 7: SYSTEM-ON. SoC) AND USE OF VLSI CIRCUIT DESIGN TECHNOLOGY. Chapter-1L07: "Embedded Systems - ", Raj Kamal, Publs.: McGraw-Hill Education

Multi Stage Filtering

FPGA Implementation of Network Security System Using Counting Bloom Filter

Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University

Securing the Intelligent Network

Lecture 3: Modern GPUs A Hardware Perspective Mohamed Zahran (aka Z) mzahran@cs.nyu.edu

Overview. CISC Developments. RISC Designs. CISC Designs. VAX: Addressing Modes. Digital VAX

Algorithms for Advanced Packet Classification with Ternary CAMs

Intrusion Detection Architecture Utilizing Graphics Processors

Instruction Set Architecture. or How to talk to computers if you aren t in Star Trek

Nutaq. PicoDigitizer 125-Series 16 or 32 Channels, 125 MSPS, FPGA-Based DAQ Solution PRODUCT SHEET. nutaq.com MONTREAL QUEBEC

LSN 2 Computer Processors

Aims and Objectives. E 3.05 Digital System Design. Course Syllabus. Course Syllabus (1) Programmable Logic

Bivio 7000 Series Network Appliance Platforms

Achieving Low-Latency Security

ON SUITABILITY OF FPGA BASED EVOLVABLE HARDWARE SYSTEMS TO INTEGRATE RECONFIGURABLE CIRCUITS WITH HOST PROCESSING UNIT

PCI Express Overview. And, by the way, they need to do it in less time.

picojava TM : A Hardware Implementation of the Java Virtual Machine

Spacecraft Computer Systems. Colonel John E. Keesee

Optimizing Pattern Matching for Intrusion Detection

How To Design An Intrusion Prevention System

Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin

9/14/ :38

The new frontier of the DATA acquisition using 1 and 10 Gb/s Ethernet links. Filippo Costa on behalf of the ALICE DAQ group

Infrastructure Matters: POWER8 vs. Xeon x86

Emerging storage and HPC technologies to accelerate big data analytics Jerome Gaysse JG Consulting

IA-64 Application Developer s Architecture Guide

if-then else : 2-1 mux mux: process (A, B, Select) begin if (select= 1 ) then Z <= A; else Z <= B; end if; end process;

Network Traffic Monitoring an architecture using associative processing.

CHAPTER 7: The CPU and Memory

Transcription:

Compiling PCRE to FPGA for Accelerating SNORT IDS Abhishek Mitra Walid Najjar Laxmi N Bhuyan QuickTime and a QuickTime and a decompressor decompressor are needed to see this picture. are needed to see this picture.

Outline FPGA Introduction NIDS and SNORT Regular Expressions and PCRE PCRE to FPGA via OPCODES Hardware Details Performance Conclusion

Field Programmable Gate Array Silicon devices, Millions of Logic Gates Connected by Programmable interconnects Can efficiently Abstract a Data path by eliminating load-stores and branch instructions Emerging platforms include SGI RASC, XD 1000, Intel QuickAssist Hardware, etc.

a PROCESSOR vs an FPGA (the Highway analogy) Processor (Dual core) Field Programmable Gate Array SPEED LIMIT 100 SPEED LIMIT 10 2 parallel Highways 200 parallel Highways Throughput = 2 x 100 = 200. Throughput = 200 X 10 = 2000! What an FPGA loses in speed, it more than makes it up with parallelism Obtaining two orders of speedup is easily obtainable on an FPGA FPGAs are programmed with Hardware Description Languages

Network Intrusion Detection System (IDS) QuickTime and a decompressor are needed to see this picture. QuickTime and a decompressor are needed to see this picture. Detects and filters unwanted network packets (worms, spam, etc) Inspects payload as it enters or leaves a network, with set of rules Highly processor intensive with increasing number of rules

Network Intrusion Detection System (IDS) The IDS may become a bottleneck 10Mbps delivered by a Pentium 4 CPU Compare that to 10Gbps throughput of typical networks

SNORT IDS rules and REGULAR EXPRESSIONS SNORT IDS rules are used to capture signatures of malicious activity on the network (e.g. WORM activity) A rule written as a regular expression is compact, powerful and highly expressible (one rule matches multiple possible strings) SNORT IDS uses the PCRE (PERL compatible regular expressions) as the language in which the rules are written

SNORT and PCRE SNORT uses PCRE to match network packets on regular expression based signatures PCRE is an open source software that compiles and matches PERL regular expressions

Regular Expressions and Finite Automata A Regular Expression can be implemented as an Finite Automata A processor can execute only one finite automata per core An FPGA can execute hundreds of finite automata in parallel! Possible (Speedup!) over a Processor

Regular Expression and Finite automata on FPGA A regular expression viz. 1*(01*01*)* Can Identify Even number of zeros in a string composed of alphabets 0 and 1 The equivalent Finite Automata of this regular expression can be implemented in hardware Each state of the automata is encoded using one bit The transitions are encoded using two bits

Compiling Regular Expression to OPCODES We utilize the PCRE complier to obtain OP Codes corresponding to regular expression operators in the SNORT rules

Generating Hardware from PCRE OPCODES We compile the OPCODES obtained to VHDL Each OPCODE corresponds to a VHDL template The template is filled, based on additional parameters accompanying each OPCODE

Generating Hardware from OPCODES The VHDL blocks are tied together as an NFA (Non deterministic Finite Automata) Additional hardware connects the NFA to the memory controller Memory controller obtains network payload form the host CPU and transfers it to the NFAs

SNORT Rule to PCRE OPCODES v7.0 Example Rule snippet /^NetBus\s+\d+/ After Compilation 80 0 20 19 21 78 21 101 21 116 21 66 Start ^ -> N -> e -> t -> b 21 117 21 115 44 8 44 6 0 20 0 -> u -> s + \s + \d END OPCODES are the common intermediate representation for software or hardware execution

An example VHDL Block generated by compiling the opcodes if (clk'event and clk = '1') then if (start = '1') then char1_1 <= mem(conv_integer(nfa1));--n char2_1 <= mem(conv_integer(nfa1)+1);--e char3_1 <= mem(conv_integer(nfa1)+2);--t char4_1 <= mem(conv_integer(nfa1)+3);--b char5_1 <= mem(conv_integer(nfa1)+4);--u char6_1 <= mem(conv_integer(nfa1)+5);--s if ((char1_1 = conv_std_logic_vector(78, 8)) and (char2_1 = conv_std_logic_vector(101, 8)) and (char3_1 = conv_std_logic_vector(116, 8)) and (char4_1 = conv_std_logic_vector(98, 8)) and (char5_1 = conv_std_logic_vector(117, 8)) and (char6_1 = conv_std_logic_vector(115, 8)) ) then match_1 <= '1'; else match_1 <= '0'; end if; end if; end if;

CDF of Opcodes from regexes in SNORT DB 2.6 compiled with PCRE v7.1 40000 35000 30000 25000 20000 15000 10000 5000 0 22 79 89 78 46 72 38 51 64 47 60 88 53 39 48 75 20 8 61 70 66 73 59 97 52 29 11 6 68 82 21 80 83 65 0 4 5 9 10 30 33 44 57 The most frequently occurring OP Code corresponding to a single character match (OPCODE 22)

The Finite Automata on hardware The NFA Multiple NFA engines are implemented to match multiple SNORT rules on a network payload

The SGI RASC RC 100 BLADE QuickTime and a decompressor are needed to see this picture. The RASC Blade contains two XILINX Virtex-4 LX 200 FPGA and provides upto 7.2 GByte/s throughput

Architecture of 214 NFA Engines on a Virtex 4 LX 200 FPGA QuickTime and a decompressor are needed to see this picture. QuickTime and a decompressor are needed to see this picture.

Throughput and Speedup It is possible to obtain 12.9 Gbps throughput on the RASC RC-100 hardware

Comparison with related systems The Compiled PCRE Op Codes on SGI RASC RC-100 provides the highest throughput among other related FPGA platforms used for NIDS

Conclusion Compiled PCRE OPCODES to VHDL Implemented Fast NFA based Regular Expression engines on FPGA platform Regex engines Operate at 12.9 Gbps for efficient 10GbE IDS

Future Work Utilize multiple FPGAs to encompass all the SNORT rule-sets Use streaming data flow for transferring payload to the FPGA Implement all the OPCODES in hardware

Q&A THANK YOU!

result = pcre_exec(pcre_data->re, /* result of pcre_compile() */ pcre_data->pe, /* result of pcre_study() */ buf, /* the subject string */ len, /* the length of the subject string */ start_offset, /* start at offset 0 in the subject */ 0, /* options(handled at compile time */ ovector, /* vector for substring information */ SNORT_PCRE_OVECTOR_SIZE); /* number of elements in the vector */ if(result >= 0) { matched = 1; } else if(result == PCRE_ERROR_NOMATCH) { matched = 0; } Potential point for Speedup (i.e. implement pcre_exec in hardware)