Skyminer Big Data Solution Features Highlights. Company Proprietary Sensitive Information



Similar documents
Big Data for Satellite Business Intelligence

Real-time Data Analytics mit Elasticsearch. Bernhard Pflugfelder inovex GmbH

[Hadoop, Storm and Couchbase: Faster Big Data]

Unified Batch & Stream Processing Platform

Highlights of CAMS. Service Health Manager

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

Reference Model for Cloud Applications CONSIDERATIONS FOR SW VENDORS BUILDING A SAAS SOLUTION

Partner Camp Leistungsstarkes Log-Management für physische, virtuelle und cloud-basierte Umgebungen. Tomas Baublys

Using distributed technologies to analyze Big Data

XpoLog Center Suite Log Management & Analysis platform

TRAVERSE: VIRTUALIZATION AND PRIVATE CLOUD MONITORING

How To Use Big Data For Telco (For A Telco)

Axibase Time Series Database

Processing millions of logs with Logstash

7/15/2011. Monitoring and Managing VDI. Monitoring a VDI Deployment. Veeam Monitor. Veeam Monitor

Big Data? Definition # 1: Big Data Definition Forrester Research

End- to- End Monitoring Unified Performance Dashboard (UPD)

Big Data and Market Surveillance. April 28, 2014

XpoLog Center Suite Data Sheet

Open PostgreSQL Monitoring

Towards Smart and Intelligent SDN Controller

ScienceLogic vs. Open Source IT Monitoring

the missing log collector Treasure Data, Inc. Muga Nishizawa

Topology Aware Analytics for Elastic Cloud Services

White Paper November Technical Comparison of Perspectium Replicator vs Traditional Enterprise Service Buses

INTRODUCING DRUID: FAST AD-HOC QUERIES ON BIG DATA MICHAEL DRISCOLL - CEO ERIC TSCHETTER - LEAD METAMARKETS

XpoLog Competitive Comparison Sheet

BIG DATA TOOLS. Top 10 open source technologies for Big Data

Jitterbit Technical Overview : Salesforce

Real Time Performance Dashboard for SOA Web Services ORION SOA

BASHO DATA PLATFORM SIMPLIFIES BIG DATA, IOT, AND HYBRID CLOUD APPS

Real Time Big Data Processing

Architectures for Big Data Analytics A database perspective

Big Data Pipeline and Analytics Platform

QuickSpecs. HP Data Protector Reporter Software Overview. Powerful enterprise reporting with point and click simplicity

TORNADO Solution for Telecom Vertical

Scalable Architecture on Amazon AWS Cloud

Improve performance and availability of Banking Portal with HADOOP

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January Website:

Technology Enablement

April 2016 JPoint Moscow, Russia. How to Apply Big Data Analytics and Machine Learning to Real Time Processing. Kai Wähner.

Liferay Portal s Document Library: Architectural Overview, Performance and Scalability

Benchmarking Couchbase Server for Interactive Applications. By Alexey Diomin and Kirill Grigorchuk

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

Understanding Neo4j Scalability

DISTRIBUTED internet systems monitoring is already

SMART SCALE YOUR STORAGE - Object "Forever Live" Storage - Roberto Castelli EVP Sales & Marketing BCLOUD

Driving business agility with. Smart Solutions

Simplifying Database Management with DataStax OpsCenter

CA Virtual Assurance/ Systems Performance for IM r12 DACHSUG 2011

Technical Overview Simple, Scalable, Object Storage Software

EMC Data Protection Advisor 6.0

Bernd Ahlers Michael Friedrich. Log Monitoring Simplified Get the best out of Graylog2 & Icinga 2

Getting Started with SandStorm NoSQL Benchmark

Moving From Hadoop to Spark

Comprehensive Monitoring of VMware vsphere ESX & ESXi Environments

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

IFS-8000 V2.0 INFORMATION FUSION SYSTEM

GESTIONE DI APPLICAZIONI DI CALCOLO ETEROGENEE CON STRUMENTI DI CLOUD COMPUTING: ESPERIENZA DI INFN-TORINO

Server & Application Monitor

NetVue Integrated Management System

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013

THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES

Querying MongoDB without programming using FUNQL

itrademonitor - DMA - Monitoring and Reporting Solution

Liferay Portal Performance. Benchmark Study of Liferay Portal Enterprise Edition

CitusDB Architecture for Real-Time Big Data

Web to Print Knowledge Experience. A Case Study of the Government of Hessen, Germany s Half-Time Report

What s New in Security Analytics Be the Hunter.. Not the Hunted

Frequently Asked Questions Plus What s New for CA Application Performance Management 9.7

BIG DATA FOR MEDIA SIGMA DATA SCIENCE GROUP MARCH 2ND, OSLO

How Comcast Built An Open Source Content Delivery Network National Engineering & Technical Operations

Decoding DNS data. Using DNS traffic analysis to identify cyber security threats, server misconfigurations and software bugs

Search and Real-Time Analytics on Big Data

BIRT ihub Actuate Customer Days. Wow that looks good! Jeff Morris & Mark Gamble

JFlooder - Application performance testing with QoS assurance

Oracle WebLogic Server 11g Administration

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack

The Rise of Industrial Big Data

A Vision for Operational Analytics as the Enabler for Business Focused Hybrid Cloud Operations

Sage Integration Cloud Technology Whitepaper

An Approach to Implement Map Reduce with NoSQL Databases

Modern IT Operations Management. Why a New Approach is Required, and How Boundary Delivers

PTC System Monitor Solution Training

A Service-oriented Architecture for Business Intelligence

Is backhaul the weak link in your LTE network? Network assurance strategies for LTE backhaul infrastructure

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

Learning Management Redefined. Acadox Infrastructure & Architecture

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!

OWB Users, Enter The New ODI World

Wisdom from Crowds of Machines

Ikasan ESB Reference Architecture Review

Parallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

From Spark to Ignition:

Leveraging the Power of SOLR with SPARK. Johannes Weigend QAware GmbH Germany pache Big Data Europe September 2015

ACEYUS REPORTING. Aceyus Intelligence Executive Summary

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop

Big Data Visualization with JReport

Transcription:

Skyminer Big Data Solution Features Highlights

Purpose 1. Add value to the data by providing a unified archive and efficient analysis tools 2. Help using monitoring data to improve operations, detect and anticipate failures, and optimize systems 2

Big Data Problem? Systems generate data amount Store all data for any duration? Real-Time processing Correlation between data sources? Legacy Storage is everything archived? AND THEN Further analysis to detect unknown information? Learning model to anticipate failures? How efficiently is data stored and used? Company Company Proprietary Proprietary Sensitive Sensitive Information Information 3

Conclusion Systems generate data amount Usual Real-Time Processing Virtually keep all data forever Scale to any size! Big Data Storage AND THEN Correlation between data sources! Automated analysis for new information? Learning model to anticipate failures? Company Company Proprietary Proprietary Sensitive Sensitive Information Information 4

Primary Goals 1. Time series monitoring & analytics 2. Low cost system 3. Fast 4. Flexible & Scalable 5. Fault tolerant 6. Incorporates useful analysis features 7. Open to other systems 5

Architecture How does it work? Time Series Database (KairosDB) NoSQL Database as storage backend (Apache Cassandra) Domain expertise and deep integration How does it work? Company Company Proprietary Sensitive Sensitive Information 6

How does it work? ONE single database with a Time Series Web Service frontend A Time Series Database frontend (based on KairosDB) A NoSQL Database as storage backend (Apache Cassandra) We never query from Cassandra directly. 7

Large data sets: volume variety and throughput Monics Compass Epoch IPS Neuralstar GB TB GB TB GB TB GB TB MB PB MB PB MB PB MB PB Number of metrics: Low Number of metrics: High Number of metrics: High Number of metrics: High Throughput: High Throughput: Low Throughput: High Throughput: High Storage needs: high Storage needs: Medium Storage needs: High Storage needs: High 8

Solution EPOCH IPS Compass Neuralstar Monics Satellite C2 M&C NMS CSM Data Collector agent Data Collector agent Data Collector agent Data Collector agent Other Data Sources External Analytics systems Web UI Query, Reporting & Analytics Frontends Storage (pluggable) Data Integration Frontends 9

A Typical Skyminer System(s) Fault-Tolerant Small Cluster Using Apache Cassandra DB Start Small Scale out at lower cost when needed Low cost Quick start Fault management Data replication Easy administration Scale to any size Best performances 10

Interoperability Features 1. All features are provided as web services (HTTP / REST + JSON) 2. Open APIs 3. Interoperable data format based on JSON 4. Intuitive Web UI for starting using the system 5. APIs include: Data acquisition Data querying Analysis features (prediction, correlations) 11

Existing acquisition connectors 1. Spectrum and Signal Monitoring - Monics data 2. Equipment M&C - Compass data 3. Satellite Control - EPOCH telemetry 4. Enterprise Network Management - Neuralstar 5. Other connectors exist (or can be built) G B MB T B P B 12

Toolbox for leveraging Data value 13

Skyminer Exclusive Features 1. Analytics API via ad-hoc queries Time Series Query engine Horizontal aggregations (Down Sampling) statistical features calculated over time Series combining (vertical aggregations) Predictors Correlations analysis 2. User Interface Web UI with Correlations User Interface Integrated Dashboard 14

Skyminer Exclusive Features 3. Numerical Analysis Fully integrated with R o The most popular data analysis environment 4. Reporting Plugin for BIRT (or its commercial alternative OpenText Analytics) 15

Query Engine & aggregations Ad-hoc queries and statistics calculation Business Intelligence features already implemented (aggregate, drill & pivot) Data aggregates: Min, Max, Sum, Average, Count, Rate, Std Deviation etc Multi-level Group-by feature using tags, value, or time Filter by tags values 16

Aggregations: Available aggregators Min Max Avg Sum Std Dev Scale Rate (Derivative) Least Square Count Percentile KairosDB MinMax Filter (with predicate) First Last Interpolation Gaps marker Alias Untag/Retag Time Shift Formula Scripted aggregator Recorder Skyminer 17

Time Series Prediction analysis 1. Generic predictive analysis (in the query engine) is being implemented Actual Data Prediction Below this threshold there is a noticeable impact (failure, degradation) Predictive analysis indicates when and how-long the system or service will be affected by the degradation 2. Several predictors : linear (exponential Smoothing, holt, least squares), or dymanic with Dynamic Linear Model (DLM) 18

Correlations Analysis Explore and Find correlations within large data sets Correlations may indicate relationship in behaviours Several analysis methods Linear DTW (Dynamic Time Warping) Other methods can be added as new modules Several correlation API and user interfaces SEARCH Find data correlated to a reference MATRIX - Explore one-to-one correlations between large number of series 19

Web Interactive Dashboard 1. Integrated Dashboard 2. Most Skyminer analytic features available: aggregators (downsampling, vertical), group-by, filters, and predictors 3. Build a dashboard from various queries in a few mouse clicks from a rich Web user interface 20

Reporting Using BIRT reporting tool plugin 21

Specialized Analytics Using R Numerical Analysis Environment 22

Datastore Module Data Storage back-end can be changed in one line of configuration file => Pick-up the best of the moment for the use case core Pluggable DataStore HDF 5 23

Key elements 24

Features System exists and is operational in production since 2014 System is: Fast Scalable (1 to N nodes) Fault tolerant (1 to N replicas) Easy to backup (e.g. Cassandra snapshots files) Modular and evolutive Open (to change and to other systems) 25

Conclusion Skyminer is a simple system, but featuring a rich data processing toolbox for time series Unlimited aggregation capabilities Analytic features Predictions Correlations Time shifted queries Integration with R 26