Learning SQL for Database Intrusion Detection using Context-sensitive Modelling



Similar documents
Detection and Prevention of SQL Injection Attacks

Web Application Security

Web Forensic Evidence of SQL Injection Analysis

KEITH LEHNERT AND ERIC FRIEDRICH

Security Assessment of Waratek AppSecurity for Java. Executive Summary

A clustering Approach for Web Vulnerabilities Detection

Put a Firewall in Your JVM Securing Java Applications!

Securing Your Web Application against security vulnerabilities. Ong Khai Wei, IT Specialist, Development Tools (Rational) IBM Software Group

1. Introduction. 2. Web Application. 3. Components. 4. Common Vulnerabilities. 5. Improving security in Web applications

Toward A Taxonomy of Techniques to Detect Cross-site Scripting and SQL Injection Vulnerabilities

Magento Security and Vulnerabilities. Roman Stepanov

SQL INJECTION ATTACKS By Zelinski Radu, Technical University of Moldova

The Top Web Application Attacks: Are you vulnerable?

Data Mining For Intrusion Detection Systems. Monique Wooten. Professor Robila

The Cyber Threat Profiler

Adobe Systems Incorporated

CSE598i - Web 2.0 Security OWASP Top 10: The Ten Most Critical Web Application Security Vulnerabilities

VIDEO intypedia007en LESSON 7: WEB APPLICATION SECURITY - INTRODUCTION TO SQL INJECTION TECHNIQUES. AUTHOR: Chema Alonso

A Novel Approach to detect SQL injection in web applications

Web Application Security. Vulnerabilities, Weakness and Countermeasures. Massimo Cotelli CISSP. Secure

Table of Contents. Page 2/13

Expert Services Group (Security Testing) Nilesh Dasharathi Sadaf Kazi Aztecsoft Limited

ICTN Enterprise Database Security Issues and Solutions

A B S T R A C T. Index Terms: DoubleGuard; database server; intruder; web server I INTRODUCTION

Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering

Preprocessing Web Logs for Web Intrusion Detection

Speedy Signature Based Intrusion Detection System Using Finite State Machine and Hashing Techniques

Intrusion detection for web applications

8070.S000 Application Security

Application and Database Security with F5 BIG-IP ASM and IBM InfoSphere Guardium

Web Application Security

Web application security

DFW INTERNATIONAL AIRPORT STANDARD OPERATING PROCEDURE (SOP)

Guidelines for Website Security and Security Counter Measures for e-e Governance Project

Check list for web developers

SQL Injection Attack. David Jong hoon An

CHAPTER 5 INTELLIGENT TECHNIQUES TO PREVENT SQL INJECTION ATTACKS

Sania: Syntactic and Semantic Analysis for Automated Testing against SQL Injection

An analysis on Blocking of SQL Injection Attacks by Comparing Static and Dynamic Queries

Analysis of SQL injection prevention using a proxy server

Cross-Site Scripting

External Network & Web Application Assessment. For The XXX Group LLC October 2012

FINAL DoIT v.4 PAYMENT CARD INDUSTRY DATA SECURITY STANDARDS APPLICATION DEVELOPMENT AND MAINTENANCE PROCEDURES

Passing PCI Compliance How to Address the Application Security Mandates

SQL INJECTION MONITORING SECURITY VULNERABILITIES IN WEB APPLICATIONS

Model Deployment. Dr. Saed Sayad. University of Toronto

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015

SQL Injection January 23, 2013

ETHICAL HACKING APPLICATIO WIRELESS110 00NETWORK APPLICATION MOBILE MOBILE0001

Creating Stronger, Safer, Web Facing Code. JPL IT Security Mary Rivera June 17, 2011

Countering SQL Injection Attacks with a Database Driver 1,2

Thick Client Application Security

Application Code Development Standards

Contemporary Web Application Attacks. Ivan Pang Senior Consultant Edvance Limited

THE SMARTEST WAY TO PROTECT WEBSITES AND WEB APPS FROM ATTACKS

Cross Site Scripting Prevention

Is Drupal secure? A high-level perspective on web vulnerabilities, Drupal s solutions, and how to maintain site security

Pentests more than just using the proper tools

05.0 Application Development

Pentests more than just using the proper tools

Log Mining Based on Hadoop s Map and Reduce Technique

NETASQ & PCI DSS. Is NETASQ compatible with PCI DSS? NG Firewall version 9

Advanced Web Security, Lab

Efficient Integration of Data Mining Techniques in Database Management Systems

Hardening Moodle. Concept and Realization of a Security Component in Moodle. a project by

INTRUSION PROTECTION AGAINST SQL INJECTION ATTACKS USING REVERSE PROXY

Web Application Security

Early Vulnerability Detection for Supporting Secure Programming

Hacking de aplicaciones Web

How I hacked PacketStorm ( )

BIG DATA What it is and how to use?

Clustering Connectionist and Statistical Language Processing

Web Application Gateway

We protect you applications! No, you don t. Digicomp Hacking Day 2013 May 16 th 2013

Securing Web Applications...at the Network Layer

Reducing Application Vulnerabilities by Security Engineering

Intrusion Detection via Machine Learning for SCADA System Protection

WEB APPLICATION FIREWALLS: DO WE NEED THEM?

Getting started with OWASP WebGoat 4.0 and SOAPUI.

IJMIE Volume 2, Issue 9 ISSN:

Liberty Alliance. CSRF Review. .NET Passport Review. Kerberos Review. CPSC 328 Spring 2009

How To Fix A Web Application Security Vulnerability

REAL-TIME WEB APPLICATION PROTECTION. AWF SERIES DATASHEET WEB APPLICATION FIREWALL

Developing the Corporate Security Architecture. Alex Woda July 22, 2009

Security Assessment through Google Tools -Focusing on the Korea University Website

Detection of SQL Injection Attacks by Combining Static Analysis and Runtime Validation

Towards More Security in Data Exchange

Webapps Vulnerability Report

Web application testing

Active Learning SVM for Blogs recommendation

WEB SECURITY. Oriana Kondakciu Software Engineering 4C03 Project

Rational AppScan & Ounce Products

Bayesian Classification for SQL Injection Detection

A Call for Drastic Action. A Survey of Web Application Firewalls

Adobe ColdFusion. Secure Profile Web Application Penetration Test. July 31, Neohapsis 217 North Jefferson Street, Suite 200 Chicago, IL 60661

Transcription:

Learning SQL for Database Intrusion Detection using Context-sensitive Modelling Martin Apel, Christian Bockermann, Michael Meier

The joke that should not be...

The joke that should not be... $name = $_POST[ name ]; // $name = Robert ); DROP TABLE Students; -- $insert = INSERT INTO STUDENTS VALUES ( $name ); ;

The joke that should not be... $name = $_POST[ name ]; // $name = Robert ); DROP TABLE Students; -- $insert = INSERT INTO STUDENTS VALUES ( $name ); ;

The joke that should not be... $name = $_POST[ name ]; // $name = Robert ); DROP TABLE Students; -- $insert = INSERT INTO STUDENTS VALUES ( $name ); ;

The joke that should not be... $name = $_POST[ name ]; // $name = Robert ); DROP TABLE Students; -- $insert = INSERT INTO STUDENTS VALUES ( $name ); ; INSERT INTO STUDENTS VALUES ( Robert ); DROP TABLE Students; -- );

Isn t SQL Injection rather old? Most attacks are focusing on XSS/client side script evaluation

Isn t SQL Injection rather old? Most attacks are focusing on XSS/client side script evaluation OWASP Top- list, still ranks SQL-Injection one of the major vulnerabilities:. Cross Site Scripting (XSS) 2. Injections Flaws (SQL-Injection,...) 3. Malicious File Execution (RFI) 4. Insecure Direct Object Reference... 5....

Isn t SQL Injection rather old? Most attacks are focusing on XSS/client side script evaluation OWASP Top- list, still ranks SQL-Injection one of the major vulnerabilities:. Cross Site Scripting (XSS) 2. Injections Flaws (SQL-Injection,...) 3. Malicious File Execution (RFI) 4. Insecure Direct Object Reference... 5.... Web Hacking Incident Database Ofer Shezaf et.al., whid.xiom.com SQL injection Hits Sensitive US Army servers WHID 29-4 245, records stolen from Orange France using SQL injection WHID 29-39 Kaspersky site breached using SQL injection, sensitive data exposed WHID 29-9 WHID 29-2: BitDefender joins Kasperski on the Breached side WHID 29-2

Related Work - ID in Databases Parse Tree Validation to prevent SQL-Injections Injected snippets do change overall structure of the query due to user-input Comparing query structures BEFORE and Using Parse Tree Validation to Prevent SQL Injection Attacks Gregory T. Buehrer, Bruce W. Weide, Paolo A.G. Sivilotti SEM '5: Proceedings of the 5th international workshop on Software engineering and middleware, ACM, 25 AFTER inserting user-data Comparison based on SQL parse tree DIWeDa - Detecting Intrusions in WebDatabases Analysing SQL-Sessions as provided by DBMS Queries generalized by keywords, replacing constants DIWeDa - Detecting Intrusions in Web-Databases Alex Roichman, Ehud Gudes Proceeedings of the 22nd annual IFIP WG.3 working conference on Data and Applications Security, 28

Intrusion Detection in DB Idea: Using machine-learning methods for anomaly detection in SQL database systems Monitor SQL query log Create a query set model for a specific (web-) application in training phase Provide a detector for online-discovery of malicious SQL queries in DBMS after training phase In this work: Focusing on different SQL representations for learning process

Good ways to learn on SQL? Parse Tree validation approach did not use machine learning techniques Previous Data Mining approaches do discard query structure in pre-processing phases What are good ways to model SQL for machine learning? Term-Vectors? n-grams? Using parse tree to create SQL-Vectors Using Tree-Kernel Methods on parse trees

Good ways to learn on SQL? Parse Tree validation approach did not use machine learning techniques Previous Data Mining approaches do discard query structure in pre-processing phases What are good ways to model SQL for machine learning? Term-Vectors? n-grams? Using parse tree to create SQL-Vectors Using Tree-Kernel Methods on parse trees SELECT * FROM USERS WHERE login = cb AND pass = secret

Good ways to learn on SQL? Parse Tree validation approach did not use machine learning techniques Previous Data Mining approaches do discard query structure in pre-processing phases What are good ways to model SQL for machine learning? Term-Vectors? n-grams? Using parse tree to create SQL-Vectors Using Tree-Kernel Methods on parse trees SELECT * FROM USERS WHERE login = cb AND pass = secret

Good ways to learn on SQL? Parse Tree validation approach did not use SELECT * FROM USERS WHERE login = cb AND pass = secret machine learning techniques Previous Data Mining approaches do discard query structure in pre-processing phases What are good ways to model SQL for machine learning? Term-Vectors? n-grams? Using parse tree to create SQL-Vectors Using Tree-Kernel Methods on parse trees... SELECT FROM USERS WHERE login cb AND pass secret... =

Good ways to learn on SQL? Parse Tree validation approach did not use SELECT * FROM USERS WHERE login = cb AND pass = secret machine learning techniques Previous Data Mining approaches do discard query structure in pre-processing phases What are good ways to model SQL for machine learning? Term-Vectors? n-grams? Using parse tree to create SQL-Vectors Using Tree-Kernel Methods on parse trees... SELECT FROM USERS WHERE login cb AND pass secret... =

Good ways to learn on SQL? Parse Tree validation approach did not use SELECT * FROM USERS WHERE login = cb AND pass = secret machine learning techniques Previous Data Mining approaches do discard query structure in pre-processing phases What are good ways to model SQL for machine learning? Term-Vectors? n-grams? Using parse tree to create SQL-Vectors Using Tree-Kernel Methods on parse trees... SEL ELE LEC ECT CT T * * * F FR... =

SQL Parse Trees SQL is a highly structured language Allows easy and fast parsing DBMS will perform optimization on parse tree Every DBMS contains specific SQL parser for its dialect we chose Apache Derby s parser (Java) (easy to use in embedded mode) modified Derby parser to obtain parse tree Other ways to obtain parsers by generating them based on grammars (antlr.org, javacc)

SQL Parse Tree SELECT name,sum(credits) FROM STUDENTS WHERE name = 'Marcin' AND lvid = '4259' Start SelectNode ResultColList FromList WhereClause ResultColRef ResultColRef TableRef BinaryOp AND name SUM(credits) STUDENTS BinaryOp = BinaryOp = ColumnRef CharConst ColumnRef CharConst name Marcin lvid 4259

SQL Parse Tree of an SQL-Injection SELECT name,sum(credits) FROM STUDENTS WHERE name = 'Marcin' AND lvid = '4259 OR > --' Start SelectNode ResultColList FromList WhereClause ResultColRef ResultColRef TableRef BinaryOp OR name SUM(credits) STUDENTS BinaryOp AND BinaryOp = BinaryOp = BinaryOp = ColumnRef Const ColumnRef CharConst ColumnRef CharConst lvid 4259 name Marcin lvid 4259

Vectorizing SQL Parse Trees We derived the production rules from SQL parse trees to create a histogram-vector of a query The i-th component of a vector denotes the number of applications of the i-th grammar rule SELECT name,sum(punkte) FROM STUDENTS WHERE name = 'Marcin' AND lvid = '4259' Start --> SelectNode SelectNode --> ResultCols FromList WhereClause ResultCols --> ResultColumn ResultColumn ResultColumn --> ColumnReference ColumnReference --> 'NAME' ResultColumn --> ColumnReference AggregateNode --> SUM ColumnReference --> 'PUNKTE'.. 2..

Vectorizing SQL Parse Trees Term vectors only consider leaves of parse trees Term Vectorization

Vectorizing SQL Parse Trees Term vectors only consider leaves of parse trees SQL rule vectorizations includes small context of each node (i.e. predecessor) SQL (Rule) Vectors

Vectorizing SQL Parse Trees Term vectors only consider leaves of parse trees SQL rule vectorizations includes small context of each node (i.e. predecessor) Integer values vectors can be used in a variety of Data Mining methods Clustering, Outlier detection, Classification SQL (Rule) Vectors

How does context help? We investigated the separability of n-grams, term-vectors and SQL-rule vectors Each query was represented by a vector and an SVM was trained to distinguish between queries labeled as attacks/normal true pos false pos time (s) 3-gram,667, 7,667,2 643 4-gram,333, 49,733,2.55 Term-vectors,667, 2,733,2 283 SQL-vectors,867, 3,867, 67 2 legal queries, 5 attacks true pos false pos time (s) legal queries, 5 attacks SVM results on different models (-fold Xval, optimized C,kernel-type)

What about more context? Rule based vectors consider ancestor context Tree/Graph Kernel functions approach in Machine Learning Used in Natural Language Processing Applied for IDS on Protocol Layer kernel function on trees: basically measuring similarity of two trees

What about more context? Rule based vectors consider ancestor context Tree/Graph Kernel functions approach in Machine Learning Used in Natural Language Processing Applied for IDS on Protocol Layer kernel function on trees: basically measuring similarity of two trees

What about more context? Rule based vectors consider ancestor context Tree/Graph Kernel functions approach in Machine Learning Used in Natural Language Processing Applied for IDS on Protocol Layer Convolution Kernels on Discrete Structures D. Haussler, Technical Report, University of Santa Cruz, 999 Kernels and Distances for Structured Data Thomas Gärtner, John W. Lloyd, Peter A. Flach, Machine Learning, 24 Incorporation of Application Layer Protocol Syntax into Anomal. Detc. Düssel, Gehl, Laskov, Rieck, in Proc. of Int. Conf on Information Systems Security, 28

What about more context? Rule based vectors consider ancestor context Tree/Graph Kernel functions approach in Machine Learning Used in Natural Language Processing Applied for IDS on Protocol Layer kernel function on trees: basically measuring similarity of two trees Convolution Kernels on Discrete Structures D. Haussler, Technical Report, University of Santa Cruz, 999 Kernels and Distances for Structured Data Thomas Gärtner, John W. Lloyd, Peter A. Flach, Machine Learning, 24 Incorporation of Application Layer Protocol Syntax into Anomal. Detc. Düssel, Gehl, Laskov, Rieck, in Proc. of Int. Conf on Information Systems Security, 28

Clustering of SQL No SVM with the tree-kernel, yet, therefore we used the tree kernel as similarity measure for: Clustering of SQL queries Outlier detection (work in progress) Distance based on kernel k can be defined as

Clustering of SQL We created a similarity matrix (pairwise similarity of queries) and visualized this in an ISOM the typo3 Big Picture :-)

Clustering of SQL We inspected the clusters manually, showing reasonable results similarity measure follows intuition clusters revealed groups of page-fetches, userlogins, session invalidation, etc. Inserted (synthetical) attacks remained isolated, even though only marginally changed

Summary / Future Work The work addresses the use of syntactical context for anomaly detection in SQL queries First results show precision gains by incorporating syntax into SQL log-file analysis Future work will focus on additional evaluation (more applications) parser/grammar extensions (other dialects) unsupervised detection methods (outlier search) Derivation of application specific SQL policies (SQL-firewall rules)