SIMPLE MACHINE HEURISTIC INTELLIGENT AGENT FRAMEWORK




Simple Machine Heuristic (SMH) Intelligent Agent (IA) Framework
Tuesday, November 20, 2011
Randall Mora, David Harris, Wyn Hack
Avum, Inc.

Outline

- Solution Objective
- Problem Definition
- SMH IA Framework Solution to the Problem
- SMH IA Framework Core Functionality
- Description of Use Case Scenarios
  - Heuristic Analysis Across Domains
  - Distributed Analysis of High Volume Network Traffic
  - Predictive Analytics of Aggregated Data
- Demonstration
- Metrics and Performance Discussion
- Applied Solutions With SMH IA Framework
- Future Applications

Solution Objective

Develop a prototype framework operating within a diverse real-time and Big Data environment that enables rapid extraction of intrusion detection (or any) tradecraft into interacting communities of intelligent agents.

The IA Framework Prototype Will Demonstrate:
- Extended IAs that perform intrusion detection through continuous monitoring.
- Distributed IAs threaded to handle a large volume of low-latency data for extended time periods.
- A Big Data operating environment and rapid design-to-test IA base functionality for analysts/developers to perform intrusion analysis and the migration of tradecraft into the machine.
- IAs collaborating in a team environment for the identification, generation, publication, and storing of meaning and relevance.
- IBM InfoSphere Streams streaming data to the IA Framework message broker and aggregated data being streamed back to InfoSphere Streams for scoring with a PMML model.

Problem Definition

Drowning In Data
- Innovative architectures are needed to allow the analyst to interact with data in new ways.
- Technologies are needed to leverage Big Data, including initial analysis, design-to-test, and deployment of non-brittle, scalable solutions.

Methods are Needed to Combine Multiple Streams or Sources of Disparate Data
- Technologies to build a comprehensive picture of data across domains
- Flexible tools, technologies and frameworks to allow the analyst/developer to incorporate new domains of data and build a unified view of data

Scalable Analytics and Data Processing

Continuous Monitoring Architectures

Today's [or Current] Architectures

Flat Architecture
Today's agent frameworks operate within a single core Operating System (OS) and require multiple interfaces for receiving and communicating their knowledge base. Each agent is initialized based on its defined functionality.

Single Domain Functionality
- Interfaces required for multiple domains
- Single view of data

Brittle Architecture
- Extensive design-to-test and deployment processes for new developments
- Short useful lifespan in today's rapidly changing environment

Container-Based Scalability
- Application Server or Application Operating Environment

SMH IA Framework

The SMH IA Framework supplies an open architecture that enables the programmer/analyst to build an IA suite for mining, fusing, examining and evaluating heterogeneous data for semantic representations.

- Extensible and Portable Framework
- Game Changing, Yet Proven Components for Interoperability and Big Data
- Communities of Distributed and Collaborating Agents
- Analyst/Developer rapid design-to-test intrusion analysis and tradecraft extraction
- Real-time, near-real-time, and Big Data Analysis
- Highly Scalable
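The idea of communities of collaborating agents can be illustrated with a minimal sketch. It is written in Python for brevity (the framework itself is described as Java/JEE), and it assumes a simple in-memory, topic-based broker standing in for the real message broker. The class names, topic names, and message format are all hypothetical and are not taken from the SMH IA Framework.

```python
from collections import defaultdict


class Broker:
    """In-memory stand-in for a topic-based message broker (hypothetical)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver the message to every handler registered for the topic.
        for handler in list(self._subscribers[topic]):
            handler(message)


class ParserAgent:
    """Hypothetical agent: turns raw 'user,action' lines into structured events."""

    def __init__(self, broker):
        self.broker = broker
        broker.subscribe("raw", self.on_raw)

    def on_raw(self, line):
        user, action = line.split(",")
        self.broker.publish("events", {"user": user, "action": action})


class AlertAgent:
    """Hypothetical agent: records the users behind download events."""

    def __init__(self, broker):
        self.alerts = []
        broker.subscribe("events", self.on_event)

    def on_event(self, event):
        if event["action"] == "download":
            self.alerts.append(event["user"])


broker = Broker()
parser = ParserAgent(broker)
alerter = AlertAgent(broker)
broker.publish("raw", "alice,login")
broker.publish("raw", "bob,download")
print(alerter.alerts)
```

Neither agent refers to the other directly. Each one only knows the topics it reads and writes, which is what lets a new agent be added to a community without changing the existing ones.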

Components for Interoperability

- Java/JEE
- Spring Framework
- Open Services Gateway initiative (OSGi)
- Apache Kafka
- Apache ZooKeeper
- Hadoop
- Hypertable
- IBM InfoSphere Streams
- Predictive Model Markup Language (PMML)

Users
- Analyst
- Programmer/Analyst

SMH IA Framework Value

- Independent and Distributed Development and Execution of Domain Solutions
- Unified Framework that a Programmer/Analyst Can Plug Into for Cyber Analytics
- Enables the Creation of Independent Agents that can Be Shared Interagency
- Facilitates Intercommunicating Agent Communities Working Independently or Together with a Common Goal
- Extremely Convenient Environment for Design-Test and Deployment of Domain Data Fusion Solutions
- Built-in Hadoop Archival and Playback of Both Raw and Aggregated Data
- Post-Operation Analysis
- Plug In Visualization Capabilities
- Easily Plug-In and Integrate with Other Solutions/Technologies/Architectures Solving Unique Problems
  - IBM InfoSphere Streams

Heuristic Analysis Across Domains

Scenario to demonstrate the use of SMH IAs in the context of illicit download detection
- Communities/teams of interacting IAs collaborating to detect illicit downloads
- Fusing information across domains to identify perpetrators
- Rapid design-to-test and deployment of IAs
- Pushing new IA to perform deep analysis and reporting for the possible perpetrator
- Training IAs for acceptable threshold recognition using vetted data
- Training IAs with user responses
- Dynamic interaction with live users and IAs to identify illicit activities
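The threshold training and cross-domain fusion in this scenario can be sketched as follows. This is an illustration under stated assumptions, not the prototype's logic. It assumes the threshold is the mean of the vetted samples plus k sample standard deviations, and that the two domains are per-IP traffic totals and an IP-to-user login dictionary. The function names, the value of k, and all sample data are hypothetical.

```python
from statistics import mean, stdev


def train_threshold(vetted_bytes, k=3.0):
    """Derive a download threshold from vetted (known-acceptable) samples.

    Assumption: threshold = mean + k * sample standard deviation.
    """
    return mean(vetted_bytes) + k * stdev(vetted_bytes)


def flag_downloads(flows, logins, threshold):
    """Fuse traffic totals per IP with the login dictionary and flag users."""
    totals = {}
    for ip, nbytes in flows:
        totals[ip] = totals.get(ip, 0) + nbytes
    return sorted(
        (logins.get(ip, "unknown"), total)
        for ip, total in totals.items()
        if total > threshold
    )


# Hypothetical vetted download volumes and observed traffic.
vetted = [100, 120, 110, 90, 130]
threshold = train_threshold(vetted)
flows = [("10.0.0.5", 80), ("10.0.0.7", 400), ("10.0.0.5", 30), ("10.0.0.7", 250)]
logins = {"10.0.0.5": "alice", "10.0.0.7": "bob"}
alerts = flag_downloads(flows, logins, threshold)
print(alerts)
```

User responses to an alert, as described in the scenario, could feed back into the vetted sample set so that the threshold adapts over time. That feedback loop is left out of the sketch.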

Distributed Analysis of High Volume Network Traffic

Scenario to demonstrate the distributed analysis of continuous high volume network traffic within the SMH IA Framework
- Distributed Nature of OSGi
- Visualization of data fused from multiple Kafka queues
- Archiving data that can be replayed from Hadoop for later analysis
- Utilizing IBM InfoSphere Streams as an IA
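One step in this kind of analysis is grouping packets into conversation flows. The sketch below shows one common way to do that, assuming a direction-independent key built from the protocol and the two endpoints. A plain Python list stands in for packets that would arrive through Kafka queues, and the packet field names are hypothetical.

```python
from collections import defaultdict


def flow_key(pkt):
    """Direction-independent key: both directions of a conversation match."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return (pkt["proto"],) + tuple(sorted([a, b]))


def aggregate(packets):
    """Accumulate packet and byte counts per conversation flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        flow = flows[flow_key(pkt)]
        flow["packets"] += 1
        flow["bytes"] += pkt["length"]
    return dict(flows)


# Hypothetical packets: a request, its reply, and an unrelated DNS query.
packets = [
    {"proto": "tcp", "src": "10.0.0.1", "sport": 5000,
     "dst": "10.0.0.2", "dport": 80, "length": 60},
    {"proto": "tcp", "src": "10.0.0.2", "sport": 80,
     "dst": "10.0.0.1", "dport": 5000, "length": 1500},
    {"proto": "udp", "src": "10.0.0.1", "sport": 5353,
     "dst": "10.0.0.9", "dport": 53, "length": 70},
]
flows = aggregate(packets)
for key, stats in sorted(flows.items()):
    print(key, stats)
```

Because each flow is identified by its key alone, packets could be partitioned across several aggregator instances by that key, which fits the distributed setup the scenario describes.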


Predictive Analytics of Aggregated Data

Scenario to demonstrate the use of PMML to perform predictive analytics against aggregated data from the IA Framework
- Aggregated data being pushed into IBM InfoSphere Streams
- Scoring of IA streams with a simple PMML model in IBM InfoSphere Streams
- Aggregated/scored stream being returned to the IA Framework
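To make "scoring with a simple PMML model" concrete, the sketch below parses a small hand-written PMML regression fragment with the Python standard library and applies it to one aggregated record. It does not use InfoSphere Streams or its PMML toolkit. The model, its field names, and its coefficients are hypothetical, and the fragment has not been validated against the PMML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical linear regression model in PMML 4.1 form.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_1" version="4.1">
  <Header/>
  <DataDictionary numberOfFields="3">
    <DataField name="bytes_in" optype="continuous" dataType="double"/>
    <DataField name="bytes_out" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="bytes_in"/>
      <MiningField name="bytes_out"/>
      <MiningField name="risk" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.0">
      <NumericPredictor name="bytes_in" coefficient="0.5"/>
      <NumericPredictor name="bytes_out" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"p": "http://www.dmg.org/PMML-4_1"}


def load_regression(pmml_text):
    """Extract the intercept and numeric predictors from the first table."""
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    terms = [
        (n.get("name"), float(n.get("coefficient")), int(n.get("exponent", "1")))
        for n in table.findall("p:NumericPredictor", NS)
    ]
    return intercept, terms


def score(model, record):
    """Score = intercept + sum of coefficient * value ** exponent."""
    intercept, terms = model
    return intercept + sum(c * record[name] ** e for name, c, e in terms)


model = load_regression(PMML_DOC)
result = score(model, {"bytes_in": 4.0, "bytes_out": 3.0})
print(result)
```

Since the model is plain XML, it can be trained in one tool and scored in another, which is what allows the scoring step in this scenario to run inside a separate streaming engine.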

IAs Created for Prototype

Base (Fundamental Servers) IAs
- Kafka, Hypertable, Database, Email services, IBM InfoSphere Integration

Heuristic Analysis
- PCAP File Reader
- Reduced PCAP Archiver
- User Activity Log Reader (with PCAP synchronization)
- User Activity Parser
- Current Login Status Dictionary
- Traffic Flow Accumulator (monitors traffic in and out of servers)
- User Download Threshold Analyst
- Training Alert Response Agent
- User Email Alert/Response Agent
- Data-At-Rest and Illicit Activity Analyst/Reporter

Distributed Analysis
- IBM InfoSphere Streams PCAP Splitter
- Conversation Flow Packet Aggregator
- Conversation Flow Archiver
- Intrusion Detection Analyst
- Flow Grapher

Predictive Analytics (PMML Scoring)
- PCAP File Reader
- Traffic Flow Accumulator (monitors traffic in and out of various servers)
- IBM InfoSphere Streams (Kafka Consumer IA)
- PMML Data Mining Toolkit
- IBM InfoSphere Streams (Kafka Producer IA)
- PMML Results Analyst/Logger

Results, Scalability, Throughput, Flexibility

2 ½ Months Building the Framework, ½ Month to Build the Demo Scenarios
- The prototype demonstrates design-to-test rapid development of IAs
- Prototype architecture allows an analyst/developer to interact with data in new ways
- The prototype demonstrates rapid design-to-test of intrusion detection and continuous monitoring
- The prototype demonstrates flexible tools, technologies and frameworks that allow the analyst/developer to incorporate new data domains and build a unified view of data

Every Component and IA can be Distributed Across any Number of Machines

For a Single Producer, Single Consumer and Single Kafka Machine, Performance Tests Demonstrated Throughput of:
- 50 MB/sec writing to the queues
- 100 MB/sec reading from the queues

Standard Java/JEE Environment
- Huge Collection of Affordable Talent Familiar With the Underlying Architecture
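For a sense of scale, the reported rates can be turned into daily volumes with simple arithmetic. The sketch below assumes the rates are sustained around the clock and uses decimal units (1 TB = 1,000,000 MB). Both are assumptions made for illustration, not claims from the performance tests.

```python
WRITE_MB_PER_SEC = 50    # reported write rate to the queues
READ_MB_PER_SEC = 100    # reported read rate from the queues
SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_TB = 1_000_000    # decimal units, an assumption


def daily_volume_tb(mb_per_sec):
    """Volume moved in one day at a sustained rate, in decimal terabytes."""
    return mb_per_sec * SECONDS_PER_DAY / MB_PER_TB


def replay_hours(volume_tb, read_mb_per_sec=READ_MB_PER_SEC):
    """Hours needed to read a volume back at the given read rate."""
    return volume_tb * MB_PER_TB / read_mb_per_sec / 3600


written = daily_volume_tb(WRITE_MB_PER_SEC)
print(f"{written:.2f} TB written per day")
print(f"{replay_hours(written):.1f} hours to read it back")
```

Under these assumptions a single producer would write about 4.32 TB per day, and a single consumer could read that volume back in about half the time it took to write.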

SMH IA Applied To Relevant Problems

Design-To-Test Cycles
- Current As-Is Intelligent Agent Solutions vs SMH IA Framework
- Unique Wired Operating Environment for Data Analysis

Flexible and Adaptable Framework
- Integration With Existing Architectures, Operating Environments, Industrial Machinery, and Mobile Devices

Big Data Operating Environment For Machine Learning
- Quickly and effectively integrate structured, semi-structured, and unstructured data for rapid design-to-test of existing, new and improved algorithms

Create New Views of Data Across Domains For Analysis

Interactively Simulate Solutions for Cyber Intelligence
- Interactively reaching new levels in the tradecraft extraction

Areas of Future Work

Implement Predictive Components
- Add predictive modeling into the IAs' functionality
- Integrate Machine Learning into the base IA functionality (algorithms, predictive modeling, etc.)

Fuse Additional Data Domains
- Implement algorithms to assign value to and classify structured and unstructured data from additional sources
- Transform/fuse additional domains to feed into algorithms to mine intelligence and make predictions

Improve the System to be Self-Administrated
- Tailor additional Administrator IA to control management subsystem for IA deployment

Collaborate to Innovate
- Work with Subject Matter Experts (SMEs) to incorporate their ideas

Investigate and Implement IA Clusters Running Other Domains
