SGT Technology Innovation Center Dasvis Project


1 SGT Technology Innovation Center Dasvis Project
12 March
SGT Inc.
Rohit Mital, Jay Ellis, Ashton Webster, Grant Orndorff

2 Introduction
- About SGT Technology Innovation Center
- Genesis for Dasvis project
2015 SGT Inc.

3 Purpose
Project Goals
- Develop a real-time distributed processing framework for big data
- Determine how tools like Dasvis (built upon this framework) can fit in with other tools in the marketplace
- Design and develop a tool suite that complements SGT's Cyber Security capabilities to ensure the security of SGT and our infrastructure
Dasvis is designed to be a customizable network monitoring tool
- Will mirror the capabilities of standard SIEM, Network IDS/IPS, and other tools
- Can accept a variety of inputs

4 Data Exfiltration in the News
Sony Pictures Entertainment
- Attack on Sony by Guardians of Peace (with suspected nation-state involvement) in retaliation for the release of the movie The Interview
- Exfiltration of PII of Sony employees and family members, e-mails, executive salaries, and previously unreleased Sony movies
- Cancellation of the movie's wide-scale theatrical release
Edward Snowden
- Former NSA contractor and CIA and DIA employee who released thousands of classified documents about the NSA's global surveillance programs
- Charged by the US DOJ with espionage and theft of government property (charges carrying up to 30 years in prison); currently living in Russia
WikiLeaks (to present)
- 1.2 million documents published in the first year after website launch
- Initial communication to WikiLeaks founder by PFC Manning (currently serving a 35-year prison term), considered to be the largest leak of classified information in history, including:
  - 500,000+ US Army reports (Afghan and Iraq War Logs)
  - 250,000+ unredacted US State Department cables

5 Real-World Applications
- Large-scale data exfiltration from both the government and commercial sectors is becoming all too common
- Loss of sensitive and classified data is occurring for and by corporations and nation-states
- Indicates a need for companies to monitor network and/or user activity to protect against these types of threats
- Tools and frameworks are needed to process the amount of information necessary to thwart these types of attacks

6 System Architecture
- Cloud-based
- Real-time distributed processing framework
- Developed using standard, open-source tools with an available labor pool to support future maintenance and expansion
- Designed with flexibility and portability in mind

7 Dasvis Architecture / Tools
Configuration Processing
- Apache 2.4 Web Server
Capture and Processing
- Packet Captures: Pcap4j
- Data Transfer: Apache Kafka Queue
- Distributed/Real Time Processing: Apache Storm/Trident
Data Storage (NoSQL Databases)
- Primary Packet Store: MongoDB
- Aggregate/Time Series DB: Cube DB
Reporting/Graphing
- Apache 2.4 Web Server
- PHP Web Framework: Laravel
- Graphing/Visualizations: Google Visualizations
Post Processing (Future)
- Integration with HDFS/Hadoop with queries using HQL

8 Apache Kafka Queue
- Kafka is a distributed messaging system that is used to transfer large amounts of data between processes
- It is a queue and has producers and consumers
  - Producers push data to a Kafka Queue
  - Consumers pull data from a Kafka Queue
- Basically a reliable way to send big data from one place to another in virtually any format
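
The producer/consumer relationship on this slide can be sketched with a small in-memory stand-in. This is not the Kafka client API: the ToyQueue class, its push and pull methods, and the consumer names are hypothetical, chosen only to show that producers append to a shared log while each consumer pulls from its own position.

```python
from collections import defaultdict


class ToyQueue:
    """In-memory stand-in for one Kafka-style queue (hypothetical, not the Kafka API)."""

    def __init__(self):
        self._log = []                      # append-only list of messages
        self._offsets = defaultdict(int)    # next position to read, per consumer

    def push(self, message: bytes) -> int:
        """Producer side: append a message and return its position in the log."""
        self._log.append(message)
        return len(self._log) - 1

    def pull(self, consumer: str, max_messages: int = 10) -> list:
        """Consumer side: return unread messages and advance this consumer's position."""
        start = self._offsets[consumer]
        batch = self._log[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch


queue = ToyQueue()
for payload in (b"packet-1", b"packet-2", b"packet-3"):
    queue.push(payload)                     # a producer pushes raw bytes in any format

first_batch = queue.pull("tracking", max_messages=2)
second_batch = queue.pull("tracking", max_messages=2)
other_consumer = queue.pull("archiver")     # a second consumer reads independently
```

Messages are kept as bytes here to reflect the "virtually any format" point: in this sketch the queue never inspects the payload.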

9 MongoDB and CubeDB
MongoDB is a NoSQL database
- Has collections (analogous to tables in SQL) that can accept documents of varying structures
- Stores documents (analogous to rows in SQL) in a flexible JavaScript Object Notation (JSON) format
- Unlike relational databases (e.g. MySQL), which require every row inserted into a table to have exactly the same structure/schema
CubeDB is a time series database that sits on top of MongoDB
- A time series database is a database that is highly optimized for queries based on time of insertion
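
As a rough illustration of the two ideas on this slide, the sketch below keeps documents of differing structure in one collection and then answers a time-bucketed query over them. It uses plain Python lists and dictionaries rather than MongoDB or CubeDB; the field names (ts, proto, dst_port and so on) and the count_per_bucket helper are assumptions made for the example.

```python
import json
from collections import Counter

# One "collection" holding documents with different fields (no fixed schema).
documents = [
    {"ts": 1000, "proto": "TCP", "src": "10.0.0.5", "dst_port": 443},
    {"ts": 1030, "proto": "UDP", "src": "10.0.0.7", "dns_query": "example.com"},
    {"ts": 1075, "proto": "TCP", "src": "10.0.0.5", "dst_port": 22, "flags": ["SYN"]},
]

# Each document serializes to JSON regardless of which fields it carries.
serialized = [json.dumps(doc) for doc in documents]


def count_per_bucket(docs, bucket_seconds):
    """Count documents per time bucket, keyed by the bucket's start timestamp."""
    counts = Counter()
    for doc in docs:
        bucket_start = doc["ts"] - doc["ts"] % bucket_seconds
        counts[bucket_start] += 1
    return dict(counts)


per_minute = count_per_bucket(documents, 60)
```

A time series store keeps data organized so that queries like per_minute do not need to scan every document; the linear scan here only shows the shape of the question being asked.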

10 Apache Storm/Trident
Storm allows one to process large amounts of data in real time by providing an abstraction for writing distributed processing programs
- Spout: A unit that creates a stream of data to be processed
- Bolt: A unit that accepts a stream of data, performs an operation on it, and optionally passes on more data
- Topology: A collection of spouts and bolts connected by the streams of data passed between them
Storm bolts and spouts can be run as multiple tasks (threads), and even on different machines in parallel
Trident is a further abstraction on top of Storm that handles the creation of spouts and bolts in what it deems the most efficient topology
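
The spout, bolt, and topology vocabulary can be mimicked with Python generators. This is a single-process sketch of the abstraction only, not Storm's Java API; the function names (number_spout, square_bolt, even_filter_bolt, run_topology) are hypothetical.

```python
def number_spout():
    """Spout: creates the stream of data to be processed."""
    for value in range(1, 6):
        yield value


def square_bolt(stream):
    """Bolt: performs an operation on each item and passes the result on."""
    for value in stream:
        yield value * value


def even_filter_bolt(stream):
    """Bolt: passes on data only optionally (here, only even values)."""
    for value in stream:
        if value % 2 == 0:
            yield value


def run_topology(spout, bolts):
    """Topology: a spout and bolts connected by the streams between them."""
    stream = spout()
    for bolt in bolts:
        stream = bolt(stream)
    return list(stream)


result = run_topology(number_spout, [square_bolt, even_filter_bolt])
```

In Storm the same wiring is declared once and the framework decides where each spout and bolt runs; the sketch chains everything in one thread to keep the data flow visible.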

11 How it All Fits Together

12 Dasvis Storm Topologies: Tracking and Comparing
- The Tracking Topology looks at incoming data and aggregates data that we want to track
- Aggregated data is stored in the Time Series database, and sent to the Comparing Topology
- The Comparing Topology compares the incoming data to the Baseline Data to look for anomalies
Diagram: Raw Data -> Tracking Topology (Do we want to track this data? Yes: aggregate incoming data; otherwise: discard data) -> Aggregated Data -> Comparing Topology (compares incoming data to baseline data) -> Comparison information
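
The flow between the two topologies can be sketched as two plain functions. Everything concrete here is an assumption for illustration: tracking by destination port, the BASELINE counts, and the TOLERANCE rule are hypothetical and are not taken from Dasvis itself.

```python
from collections import Counter

TRACKED_PORTS = {22, 443}        # hypothetical "do we want to track this?" rule
BASELINE = {22: 2, 443: 10}      # hypothetical expected packet counts per window
TOLERANCE = 2.0                  # flag when observed > TOLERANCE * expected


def tracking_topology(packets):
    """Aggregate tracked packets per port; count the ones that are discarded."""
    aggregated = Counter()
    discarded = 0
    for packet in packets:
        if packet["dst_port"] in TRACKED_PORTS:
            aggregated[packet["dst_port"]] += 1
        else:
            discarded += 1
    return dict(aggregated), discarded


def comparing_topology(aggregated):
    """Compare aggregated counts to the baseline and label each port."""
    report = {}
    for port, observed in aggregated.items():
        expected = BASELINE.get(port, 0)
        report[port] = "anomaly" if observed > TOLERANCE * expected else "normal"
    return report


packets = [{"dst_port": 22}] * 5 + [{"dst_port": 443}] * 8 + [{"dst_port": 8080}] * 3
aggregated, discarded = tracking_topology(packets)
comparison = comparing_topology(aggregated)
```

In the deck, the hand-off between the two stages goes through a Kafka Queue and the aggregates are also written to the Time Series database; the direct function call above stands in for both.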

13 A Closer Look at the Tracking Topology
- Packet Spout: packet is retrieved from Kafka Queue
- Packet Parse: packet parsed to JSON
- Packet Match: packet matched with configurations
- Packet Aggregation: packet aggregated over time with other packets
- Single Insertion: packet inserted to MongoDB
- Aggregate Forward: aggregated packets sent to Comparing Topology via Kafka Queue
- Aggregate Insertion: packet aggregate data stored in Time Series Database
Spouts and bolts make for simple programming abstractions
- Spouts start the data processing
- Bolts are operations on those packets
Diagram legend: Bolts, Data Flow
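
The bolt chain listed above can be sketched as a sequence of small functions applied to each packet. The comma-separated input format, the CONFIGURATIONS entries, the 10-second window, and the decision to skip unmatched packets are all assumptions for this example; a list and a dictionary stand in for MongoDB and the Time Series database.

```python
import json
from collections import defaultdict

CONFIGURATIONS = [                       # hypothetical tracking configurations
    {"name": "ssh", "dst_port": 22},
    {"name": "https", "dst_port": 443},
]
WINDOW_SECONDS = 10                      # hypothetical aggregation window

packet_store = []                        # stands in for the MongoDB packet store
time_series = defaultdict(lambda: {"packets": 0, "bytes": 0})  # stands in for the time series DB


def packet_parse(raw):
    """Packet Parse: turn a raw 'ts,src,dst_port,bytes' line into a JSON-like dict."""
    ts, src, dst_port, size = raw.split(",")
    return {"ts": int(ts), "src": src, "dst_port": int(dst_port), "bytes": int(size)}


def packet_match(packet):
    """Packet Match: return the name of the first matching configuration, if any."""
    for config in CONFIGURATIONS:
        if packet["dst_port"] == config["dst_port"]:
            return config["name"]
    return None


def process(raw):
    """Run one packet through parse, match, single insertion, and aggregation."""
    packet = packet_parse(raw)
    name = packet_match(packet)
    if name is None:
        return                                        # assumption: unmatched packets are skipped
    packet_store.append(json.dumps(packet))           # Single Insertion
    window = packet["ts"] - packet["ts"] % WINDOW_SECONDS
    bucket = time_series[(name, window)]              # Packet Aggregation
    bucket["packets"] += 1
    bucket["bytes"] += packet["bytes"]


for line in ["100,10.0.0.5,22,60", "104,10.0.0.5,22,40",
             "112,10.0.0.9,443,500", "113,10.0.0.9,53,80"]:
    process(line)
```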

14 A Closer Look at the Tracking Topology
- Bolts can run as multiple tasks
- Tasks can be thought of as threads
Diagram: Packet Spout, Packet Parse, Packet Match, Single Insertion, Aggregate Forward, Packet Aggregation, Aggregate Insertion (legend: Bolts, Task)

15 A Closer Look at the Tracking Topology
- Bolts can run on multiple nodes in a cluster
- Each bolt can still run as multiple tasks
- This greatly improves performance
Diagram: the Tracking Topology spouts and bolts (Packet Spout, Packet Parse, Packet Match, Single Insertion, Aggregate Forward, Packet Aggregation, Aggregate Insertion) distributed across cluster nodes (legend: Bolts, Tasks, Nodes)
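
Running one bolt as several tasks can be sketched with a thread pool. The example assumes packets are routed to tasks by destination port so that all packets for a given port reach the same task; that routing rule, NUM_TASKS, and the function names are hypothetical, and Python threads only stand in for Storm tasks.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

NUM_TASKS = 3   # hypothetical number of parallel tasks for the aggregation bolt


def partition(packets, num_tasks):
    """Route each packet to a task by key, so one port is always handled by one task."""
    shards = [[] for _ in range(num_tasks)]
    for packet in packets:
        shards[packet["dst_port"] % num_tasks].append(packet)
    return shards


def aggregation_task(shard):
    """One task of the aggregation bolt: count packets per port in its shard."""
    return Counter(packet["dst_port"] for packet in shard)


packets = [{"dst_port": port} for port in (22, 443, 22, 80, 443, 22)]

with ThreadPoolExecutor(max_workers=NUM_TASKS) as pool:
    partials = list(pool.map(aggregation_task, partition(packets, NUM_TASKS)))

# Because routing is by key, partial counts never overlap and can simply be added.
totals = sum(partials, Counter())
```

Spreading the same tasks over several nodes follows the same pattern, with the shards sent over the network instead of handed to local threads.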

16 Episodes and Baseline Data
- Baseline Data is the data that represents what the incoming data to Dasvis should look like
- If the incoming data is significantly different from the Baseline Data, then we have an anomaly
- An Episode is a set of Baseline Data associated with a set of Conditions
- This allows the user to have different sets of Baseline Data for different times
Diagram labels: Episodes of Baseline Data, Normal Baseline Data
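
One way to picture Episodes is as baselines guarded by time conditions, with the normal baseline as the fallback. The sketch below is an assumption-laden illustration: the metric (packets per minute), the two example episodes, the 50% threshold, and the first-match rule are hypothetical and are not described in the deck.

```python
from datetime import datetime

NORMAL_BASELINE = {"packets_per_minute": 100}   # hypothetical normal baseline

EPISODES = [                                    # hypothetical episodes, checked in order
    {"name": "monday",
     "condition": lambda t: t.weekday() == 0,
     "baseline": {"packets_per_minute": 400}},
    {"name": "business hours",
     "condition": lambda t: t.weekday() < 5 and 9 <= t.hour < 17,
     "baseline": {"packets_per_minute": 250}},
]
THRESHOLD = 0.5   # relative deviation treated as "significantly different"


def select_baseline(when):
    """Return the baseline of the first episode whose condition holds, else the normal one."""
    for episode in EPISODES:
        if episode["condition"](when):
            return episode["baseline"]
    return NORMAL_BASELINE


def is_anomaly(observed, when):
    """Flag an observation that deviates from the applicable baseline by more than THRESHOLD."""
    expected = select_baseline(when)["packets_per_minute"]
    return abs(observed - expected) / expected > THRESHOLD


monday_night = datetime(2015, 3, 9, 3, 0)       # a Monday
tuesday_morning = datetime(2015, 3, 10, 10, 0)  # a weekday during business hours
saturday_night = datetime(2015, 3, 14, 23, 0)   # no episode applies
```

The same traffic level can therefore be normal under one episode and anomalous under another: 380 packets per minute is close to the Monday baseline but far above the normal one.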

17 Review of Dasvis-Specific Concepts
Tracking vs Comparing Topologies
- Tracking topology records and aggregates the incoming data we want to track
- Comparing topology decides if there are anomalies in incoming data by comparing against baseline data
Baseline Data
- Past data aggregated by Dasvis that represents the normal distribution of data
Episode
- A set of Baseline Data that is only used at specific times (e.g. only on Mondays, or only during business hours)

18 Demo
Mini Tutorial
- Creating a Baseline
- Setting Baseline Data
Example Scenario and expected output
- Normal data that matches baseline well
- Potentially malicious activity

19 Summary
Challenges / Issues
- Need to clarify current use of open source tools and potential costs for deploying Dasvis as a COTS product
Future Plans
- Adding new inputs such as Netflow, application logs, etc. in addition to packet capture
- Adherence to NIST Cyber Security Situational Awareness specification

20 Comments/Questions?
Your Feedback is Appreciated!


More information

Detecting Anomalous Behavior with the Business Data Lake. Reference Architecture and Enterprise Approaches.

Detecting Anomalous Behavior with the Business Data Lake. Reference Architecture and Enterprise Approaches. Detecting Anomalous Behavior with the Business Data Lake Reference Architecture and Enterprise Approaches. 2 Detecting Anomalous Behavior with the Business Data Lake Pivotal the way we see it Reference

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Designing a Data Solution with Microsoft SQL Server 2014

Designing a Data Solution with Microsoft SQL Server 2014 20465C - Version: 1 22 June 2016 Designing a Data Solution with Microsoft SQL Server 2014 Designing a Data Solution with Microsoft SQL Server 2014 20465C - Version: 1 5 days Course Description: The focus

More information

[Hadoop, Storm and Couchbase: Faster Big Data]

[Hadoop, Storm and Couchbase: Faster Big Data] [Hadoop, Storm and Couchbase: Faster Big Data] With over 8,500 clients, LivePerson is the global leader in intelligent online customer engagement. With an increasing amount of agent/customer engagements,

More information

The Cyber Threat Profiler

The Cyber Threat Profiler Whitepaper The Cyber Threat Profiler Good Intelligence is essential to efficient system protection INTRODUCTION As the world becomes more dependent on cyber connectivity, the volume of cyber attacks are

More information

Innovative, High-Density, Massively Scalable Packet Capture and Cyber Analytics Cluster for Enterprise Customers

Innovative, High-Density, Massively Scalable Packet Capture and Cyber Analytics Cluster for Enterprise Customers Innovative, High-Density, Massively Scalable Packet Capture and Cyber Analytics Cluster for Enterprise Customers The Enterprise Packet Capture Cluster Platform is a complete solution based on a unique

More information

Automating Big Data Benchmarking for Different Architectures with ALOJA

Automating Big Data Benchmarking for Different Architectures with ALOJA www.bsc.es Jan 2016 Automating Big Data Benchmarking for Different Architectures with ALOJA Nicolas Poggi, Postdoc Researcher Agenda 1. Intro on Hadoop performance 1. Current scenario and problematic 2.

More information

Self-organized Collaboration of Distributed IDS Sensors

Self-organized Collaboration of Distributed IDS Sensors Self-organized Collaboration of Distributed IDS Sensors KarelBartos 1 and Martin Rehak 1,2 and Michal Svoboda 2 1 Faculty of Electrical Engineering Czech Technical University in Prague 2 Cognitive Security,

More information

Request for Resume (RFR) CATS+ Master Contract All Master Contract Provisions Apply. Section 1 General Information

Request for Resume (RFR) CATS+ Master Contract All Master Contract Provisions Apply. Section 1 General Information Section 1 General Information RFR Number: (Reference BPO Number) Functional Area (Enter One Only) F50B5400042 FA 2- Web and Internet Systems Labor Category/s A single support staff or support groups of

More information

Ganzheitliches Datenmanagement

Ganzheitliches Datenmanagement Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist

More information

Keyword: Cloud computing, service model, deployment model, network layer security.

Keyword: Cloud computing, service model, deployment model, network layer security. Volume 4, Issue 2, February 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com An Emerging

More information

Big Data Visualization with JReport

Big Data Visualization with JReport Big Data Visualization with JReport Dean Yao Director of Marketing Greg Harris Systems Engineer Next Generation BI Visualization JReport is an advanced BI visualization platform: Faster, scalable reports,

More information

International Journal of Enterprise Computing and Business Systems ISSN (Online) : 2230-8849

International Journal of Enterprise Computing and Business Systems ISSN (Online) : 2230-8849 WINDOWS-BASED APPLICATION AWARE NETWORK INTERCEPTOR Ms. Shalvi Dave [1], Mr. Jimit Mahadevia [2], Prof. Bhushan Trivedi [3] [1] Asst.Prof., MCA Department, IITE, Ahmedabad, INDIA [2] Chief Architect, Elitecore

More information

CMS Query Suite. CS4440 Project Proposal. Chris Baker Michael Cook Soumo Gorai

CMS Query Suite. CS4440 Project Proposal. Chris Baker Michael Cook Soumo Gorai CMS Query Suite CS4440 Project Proposal Chris Baker Michael Cook Soumo Gorai I) Motivation Relational databases are great places to efficiently store large amounts of information. However, information

More information

Architectures for massive data management

Architectures for massive data management Architectures for massive data management Apache Kafka, Samza, Storm Albert Bifet albert.bifet@telecom-paristech.fr October 20, 2015 Stream Engine Motivation Digital Universe EMC Digital Universe with

More information

NitroView Enterprise Security Manager (ESM), Enterprise Log Manager (ELM), & Receivers

NitroView Enterprise Security Manager (ESM), Enterprise Log Manager (ELM), & Receivers NitroView Enterprise Security Manager (ESM), Enterprise Log Manager (ELM), & Receivers The World's Fastest and Most Scalable SIEM Finally an enterprise-class security information and event management system

More information

The syslog-ng Premium Edition 5F2

The syslog-ng Premium Edition 5F2 The syslog-ng Premium Edition 5F2 PRODUCT DESCRIPTION Copyright 2000-2014 BalaBit IT Security All rights reserved. www.balabit.com Introduction The syslog-ng Premium Edition enables enterprises to collect,

More information

SQL + NOSQL + NEWSQL + REALTIME FOR INVESTMENT BANKS

SQL + NOSQL + NEWSQL + REALTIME FOR INVESTMENT BANKS Enterprise Data Problems in Investment Banks BigData History and Trend Driven by Google CAP Theorem for Distributed Computer System Open Source Building Blocks: Hadoop, Solr, Storm.. 3548 Hypothetical

More information

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...

More information

PROJECT BOEING SGS. Interim Technology Performance Report 1. Company Name: The Boeing Company. Contract ID: DE-OE0000191

PROJECT BOEING SGS. Interim Technology Performance Report 1. Company Name: The Boeing Company. Contract ID: DE-OE0000191 Interim Techlogy Performance Report 1 PROJECT BOEING SGS Contract ID: DE-OE0000191 Project Type: Revision: V2 Company Name: The Boeing Company December 10, 2012 1 Interim Techlogy Performance Report 1

More information

Create and Drive Big Data Success Don t Get Left Behind

Create and Drive Big Data Success Don t Get Left Behind Create and Drive Big Data Success Don t Get Left Behind The performance boost from MapR not only means we have lower hardware requirements, but also enables us to deliver faster analytics for our users.

More information

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Prerita Gupta Research Scholar, DAV College, Chandigarh Dr. Harmunish Taneja Department of Computer Science and

More information

Presented by: Aaron Bossert, Cray Inc. Network Security Analytics, HPC Platforms, Hadoop, and Graphs Oh, My

Presented by: Aaron Bossert, Cray Inc. Network Security Analytics, HPC Platforms, Hadoop, and Graphs Oh, My Presented by: Aaron Bossert, Cray Inc. Network Security Analytics, HPC Platforms, Hadoop, and Graphs Oh, My The Proverbial Needle In A Haystack Problem The Nuclear Option Problem Statement and Proposed

More information

ForeScout CounterACT. Device Host and Detection Methods. Technology Brief

ForeScout CounterACT. Device Host and Detection Methods. Technology Brief ForeScout CounterACT Device Host and Detection Methods Technology Brief Contents Introduction... 3 The ForeScout Approach... 3 Discovery Methodologies... 4 Passive Monitoring... 4 Passive Authentication...

More information

A Performance Analysis of Distributed Indexing using Terrier

A Performance Analysis of Distributed Indexing using Terrier A Performance Analysis of Distributed Indexing using Terrier Amaury Couste Jakub Kozłowski William Martin Indexing Indexing Used by search

More information

Security strategies to stay off the Børsen front page

Security strategies to stay off the Børsen front page Security strategies to stay off the Børsen front page Steve Durkin, Channel Director for Europe, Q1 Labs, an IBM Company 1 2012 IBM Corporation Given the dynamic nature of the challenge, measuring the

More information

Big Data and Privacy. Fritz Henglein Dept. of Computer Science, University of Copenhagen. Finance IT Day Riga, 2015-03-26

Big Data and Privacy. Fritz Henglein Dept. of Computer Science, University of Copenhagen. Finance IT Day Riga, 2015-03-26 Big Data and Privacy Fritz Henglein Dept. of Computer Science, University of Copenhagen Finance IT Day Riga, 2015-03-26 About me Professor, Programming Languages and Systems, University of Copenhagen Director,

More information

AccelOps NOC and SOC Analytics in a Single Pane of Glass Date: March 2016 Author: Tony Palmer, Senior ESG Lab Analyst

AccelOps NOC and SOC Analytics in a Single Pane of Glass Date: March 2016 Author: Tony Palmer, Senior ESG Lab Analyst ESG Lab Spotlight AccelOps NOC and SOC Analytics in a Single Pane of Glass Date: March 2016 Author: Tony Palmer, Senior ESG Lab Analyst Abstract: This ESG Lab Spotlight details ESG s hands-on testing of

More information