2.0. Specification of HSN 2.0 JavaScript Static Analyzer



Paweł Jacewicz
Version 0.3
Last edit by: Lukasz Siewierski, 2012-11-08
Relevant issues: #4925
Sprint: 11

Summary

This document specifies the operation and configuration of the JavaScript Static Analyzer (js sta).

Version  Date        Author             Changes
0.1      2011-11-23  Paweł Jacewicz     Initial version
0.2      2011-11-24  Paweł Pawliński    Many fixes throughout the document: more specific descriptions, improved local parameters
0.3      2012-11-08  Lukasz Siewierski  Added information about context whitelisting

Contents

A. JS Static analyzer service description
B. JavaScript Static Analyzer processing
C. Local service configuration

A. JS Static analyzer service description

The service should be implemented on the basis of research on the capabilities of the HSN 1.5 low-interaction client honeypot [1] and a proof-of-concept implementation of the LIMv2 tool. In terms of JavaScript classification, the functionality should be no less than that of LIC [1] (taking into account only ngrams, the naive Bayes classifier and keyword confirmation).

The service will receive a URL object containing a list of JavaScript contexts, retrieve the source code, process each context individually and save the results in attributes of the current object (see the data contract [2] for detailed information). The service will not create new objects.

It should be possible to run multiple classification threads in parallel. The number of threads will be constrained by the service configuration. Keyword lists will be provided as service parameters in a workflow definition.

Internally, the Weka Toolkit [3] is used for classification; training data for model creation is supplied separately. The training dataset is contained in a single ARFF file, which is specified in the service configuration. Once the training data has been read and a classifier model created, this file should not be accessed any more (all the processing threads should share the model). There should be a simple way of reloading the training data and creating a new model at runtime, without disrupting classification tasks that are already running.

Whitelisting is performed against a supplied file that contains hashes (one per line). If the context hash matches a hash in the file, the context is said to be whitelisted. The hashing algorithm strips the context down to alphanumeric characters only (A-Z, a-z and 0-9) and then calculates the MD5 hash of the resulting character string.
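The whitelisting rule described above can be illustrated with a minimal sketch. It is not taken from the HSN 2.0 implementation: the function names are hypothetical, and the sketch assumes that the whitelist file stores hashes as hexadecimal strings and that the comparison ignores letter case, neither of which is stated in this specification.

```python
import hashlib
import re

# Everything outside A-Z, a-z and 0-9 is removed before hashing.
_NON_ALNUM = re.compile(r"[^A-Za-z0-9]")


def context_hash(source: str) -> str:
    """Return the MD5 hex digest of the context reduced to alphanumerics."""
    stripped = _NON_ALNUM.sub("", source)
    # After stripping, only ASCII letters and digits remain.
    return hashlib.md5(stripped.encode("ascii")).hexdigest()


def load_whitelist(lines) -> set:
    """Build a set of hashes from an iterable of lines (one hash per line).

    Assumption: hashes are hex strings; blank lines are skipped and
    case is normalised to lower case.
    """
    return {line.strip().lower() for line in lines if line.strip()}


def is_whitelisted(source: str, whitelist: set) -> bool:
    """A context is whitelisted when its hash appears in the whitelist."""
    return context_hash(source) in whitelist
```

Because whitespace and punctuation are stripped before hashing, two contexts that differ only in formatting produce the same hash. In practice `load_whitelist` would be fed an open file object for the configured whitelist file.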

B. JavaScript Static Analyzer processing

The JavaScript Static Analyzer service analyzes JavaScript source code without executing it. The analysis is performed on chunks (contexts) of JavaScript code extracted by a web client (see the data contract for the web client [2] for details). The method of analysis is similar to the one implemented in HSN 1.5 [1]: the service detects malicious and obfuscated JavaScript code. For this purpose, the Weka Toolkit [3], ngrams and pattern matching mechanisms are used. The processing path of JavaScript contexts is presented in Figure 1.

Figure 1: Classification model (flowchart: the JS code passes through ngram generation, the suspicious and malicious keyword checks, and hash calculation; if the minimum ngram quantity is reached, the WEKA classifier assigns Benign, Obfuscated or Malicious, otherwise the context is Unclassified; the context hash check sets Whitelisted to true or false).

Processing steps are as follows:

1. Retrieve a JavaScript context together with its identifier (used to distinguish between different contexts in a single web page).
2. Perform in parallel:
   - check whether the JavaScript contains any predefined malicious keywords
   - check whether the JavaScript contains any predefined suspicious keywords
   - generate a list of the most common ngrams
   - calculate the context hash and compare it against the list of predefined hashes
3. Perform Weka analysis on the generated ngrams.
   a) if the number of generated ngrams is insufficient, skip the Weka analysis and assign the classification unclassified
   b) otherwise, assign the classification according to the Weka result: malicious, obfuscated or benign
4. Associate all detected keywords with the context they were found in.
5. Save all information gathered about all contexts (classification, keywords, whitelisting status) as a structured list in an attribute of the current object.
6. Save an overall classification for the URL object based on the results for all contexts. Assuming the ordering unclassified < benign < obfuscated < malicious, the overall result is the maximum of all context classifications.
7. Add to the URL object the information whether any keyword from each of the two lists was found during analysis and whether the script is whitelisted. This can be done via relevant attributes containing boolean values.

C. Local service configuration

Local configuration is expressed through a set of parameters describing the initial state of the running service. This document does not specify the format of the configuration file.

thread number
    mandatory: yes
    type: integer
    default value: 10
    The number of classification threads the service is able to spawn at the same time. It corresponds to the maximum number of JavaScript code chunks the service is able to process simultaneously.

training set
    mandatory: yes
    type: string
    default value: platform-specific
    Path to the ARFF file that contains the labelled training data to be used by Weka for training a classifier.

classifier name
    mandatory: yes
    type: string
    default value: weka.classifiers.bayes.NaiveBayes
    The name of the classifier to be used by the Weka Toolkit.

ngram length
    mandatory: yes
    type: integer
    default value: 4
    The length of a single ngram generated from the JavaScript source code. The length must be the same as the one used when generating the classifier model file.

ngrams quantity
    mandatory: yes
    type: integer
    default value: 50
    The number of most frequent ngrams appearing in a context (top n) that should be used in the classification process. It must be consistent with the contents of the training dataset.

HONEYSPIDER NETWORK 2.0, NASK & GOVCERT.nl
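How the ngram length and ngrams quantity parameters interact with processing steps 3 and 6 can be sketched as follows. This is an illustration under stated assumptions, not the service's implementation: it assumes character-level ngrams, treats "insufficient" as fewer distinct ngrams than ngrams quantity, and stands in for the Weka model with a hypothetical `classifier` callable. All function and class names are invented for the example.

```python
from collections import Counter
from enum import IntEnum


class Classification(IntEnum):
    # Integer values encode the ordering given in step 6.
    UNCLASSIFIED = 0
    BENIGN = 1
    OBFUSCATED = 2
    MALICIOUS = 3


def top_ngrams(source, ngram_length=4, ngrams_quantity=50):
    """Return up to ngrams_quantity most frequent character ngrams.

    Assumption: ngrams are overlapping character sequences.
    """
    counts = Counter(
        source[i:i + ngram_length]
        for i in range(len(source) - ngram_length + 1)
    )
    return [gram for gram, _ in counts.most_common(ngrams_quantity)]


def classify_context(source, classifier, ngram_length=4, ngrams_quantity=50):
    """Step 3: skip the classifier when too few ngrams were generated.

    `classifier` is a hypothetical stand-in for the Weka model; it takes
    the ngram list and returns a Classification.
    """
    ngrams = top_ngrams(source, ngram_length, ngrams_quantity)
    if len(ngrams) < ngrams_quantity:  # assumed threshold
        return Classification.UNCLASSIFIED
    return classifier(ngrams)


def overall_classification(context_results):
    """Step 6: the URL-level result is the maximum over all contexts.

    Assumption: a URL object with no contexts is reported as unclassified.
    """
    return max(context_results, default=Classification.UNCLASSIFIED)
```

Using an `IntEnum` lets the ordering from step 6 be expressed directly with `max`, so a single malicious context is enough to mark the whole URL object as malicious.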

References

[1] HSN 1.5 Low-Interaction Component capabilities
[2] Data Contract Specification for HSN 2.0 Services
[3] Weka 3 Toolkit, http://www.cs.waikato.ac.nz/ml/weka/