Real-time distributed Complex Event Processing for Big Data scenarios

Ruben Mayer
Institute of Parallel and Distributed Systems, Universitätsstraße 38, D-70569 Stuttgart

Motivation: New Applications in IT
Modern IT systems need to react to real-world situations.
Example: Enable Demand Response with Smart Grids (figure labels: energy consumption, energy production, manage appliances).
High-rate event streams must be processed in real time.
There is a gap between low-level sensor readings (consumption and production rates) and the high-level situation (manage appliances).

Complex Event Processing
Distributed Complex Event Processing (CEP) can be used to solve this problem.
An operator network processes the event streams.
Figure: consumption rates and production rates each feed an Aggregate operator; the aggregated rates feed an Analyze operator, which emits switch on / off decisions.
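The operator network on this slide can be sketched in a few lines of Python. This is only an illustration: the tumbling-window averaging, the window size, the sample readings and the decision rule (switch appliances off when aggregated consumption exceeds aggregated production) are assumptions, not details given in the talk.

```python
from statistics import mean
from typing import Iterable, List


def aggregate(rates: Iterable[float], window: int) -> List[float]:
    """Aggregate operator: average each full tumbling window of raw readings."""
    readings = list(rates)
    return [mean(readings[i:i + window])
            for i in range(0, len(readings) - window + 1, window)]


def analyze(consumption: List[float], production: List[float]) -> List[str]:
    """Analyze operator: compare aggregated rates pairwise.

    Assumed rule: switch appliances off when consumption exceeds production.
    """
    return ["switch off" if c > p else "switch on"
            for c, p in zip(consumption, production)]


# Hypothetical low-level sensor readings.
consumption_rates = [4.0, 6.0, 9.0, 11.0]
production_rates = [7.0, 7.0, 8.0, 8.0]

decisions = analyze(aggregate(consumption_rates, 2),
                    aggregate(production_rates, 2))
print(decisions)  # ['switch on', 'switch off']
```

Each function stands for one operator; in a distributed deployment the two Aggregate operators and the Analyze operator could run on different nodes and exchange events over the network.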

IT Infrastructure Changes
New, heterogeneous infrastructures:
  Multi-core / many-core systems
  Cloud computing
  Computing resources at the edge of the Internet
Goal: make CEP fit for such infrastructures
  Make use of multiple cores
  Elastic scaling in the cloud
  Push computing towards the edge of the network
This development can make new applications possible:
  Highly scalable
  Reliable
  Elastic

Research Problem: Reliability
This talk focuses on reliability.
Node and communication failures:
  Loss of operator state
  Events arrive late
Event streams must still be reliable:
  No false-negatives
  No false-positives
Figure: source events (Manufacturer, Billing, Customer Information) are correlated. The correct output is "Delivery of a package of 3 artifacts for 300 $". After a failure, this event is missing (false-negative) and "Delivery of a package of 2 artifacts for 250 $" is emitted instead (false-positive).
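The two error classes can be stated as set differences between the events a failure-free run would produce and the events actually produced. The sketch below uses the delivery example from the slide; the tuple encoding of an event and the function name `reliability_errors` are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical event encoding: (type, number of artifacts, price in $).
Event = Tuple[str, int, int]


def reliability_errors(expected: Set[Event], produced: Set[Event]):
    """Return (false negatives, false positives) of a produced event stream."""
    false_negatives = expected - produced   # correct events that are missing
    false_positives = produced - expected   # events that should not exist
    return false_negatives, false_positives


expected = {("delivery", 3, 300)}   # output of a failure-free run
produced = {("delivery", 2, 250)}   # output after a source event was lost

fn, fp = reliability_errors(expected, produced)
print(fn)  # {('delivery', 3, 300)}
print(fp)  # {('delivery', 2, 250)}
```

A reliable event stream in the sense of the slide is one for which both sets are empty.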

Research Problem: Reliability
State of the art:
  Active/passive replication
  Rollback-recovery with checkpoints
Problem: find methods with low run-time overhead that offer
  real-time processing
  better scalability than existing approaches
Approach: develop a processing model for CEP operators
  It shows inherent operator properties
  Better recovery methods can be developed

Operator Model
For all operators ω, the correlation of events is performed in steps:
  A selection of events σ from the incoming streams is correlated
  A set of events (e_1, ..., e_n) is derived from that selection
  The correlation function f_ω : σ → (e_1, ..., e_n) describes a correlation step
Figure: incoming events → ω → outgoing events, with σ as the input and (e_1, ..., e_n) as the output of f_ω.
General observation:
  Processing of a selection is independent from the processing of other selections
  The correlation function is stateless
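A minimal sketch of this operator model, assuming events are (type, value) pairs and using a summing correlation function as a stand-in for f_ω. The names `run_operator` and `total_rate` are illustrative only.

```python
from typing import Callable, List, Sequence, Tuple

Event = Tuple[str, float]                    # hypothetical (type, value) event
Selection = Sequence[Event]                  # sigma: events picked from the incoming streams
CorrelationFn = Callable[[Selection], List[Event]]   # f_omega


def run_operator(selections: Sequence[Selection], f: CorrelationFn) -> List[Event]:
    """Perform one correlation step per selection.

    f receives only the selection itself, so no state is carried
    from one step to the next.
    """
    outgoing: List[Event] = []
    for sigma in selections:
        outgoing.extend(f(sigma))
    return outgoing


def total_rate(sigma: Selection) -> List[Event]:
    """Example correlation function: one event holding the sum of the selection."""
    return [("total", sum(value for _, value in sigma))]


selections = [
    [("meter", 1.0), ("meter", 2.0)],
    [("meter", 3.0), ("meter", 4.0)],
]
outputs = run_operator(selections, total_rate)
print(outputs)  # [('total', 3.0), ('total', 7.0)]
```

Because `total_rate` depends only on its argument, the selections could be processed in any order, or in parallel, and each would still yield the same events. This independence is the property the recovery method builds on.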

Savepoint Recovery
A rollback-recovery method that induces less run-time overhead.
Ensures strong reliability: no false-positives, no false-negatives.
Works without persistent checkpoints.
Recovery of:
  Incoming streams
  Current selection on them
Incoming streams can be re-streamed from predecessors.
Information on the current selection needs to be captured.
Execution model: the operator reveals selection information.
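The idea can be illustrated with a toy simulation. It is a sketch under several assumptions that the slides do not confirm: selections are fixed-size, non-overlapping windows, the savepoint is the stream position where the current selection starts, and that position is kept somewhere that survives the operator's failure. The classes `Predecessor` and `WindowSum` and the function `run` are hypothetical.

```python
from typing import List, Optional


class Predecessor:
    """Upstream node that keeps the events it sent so they can be re-streamed."""

    def __init__(self, events: List[int]) -> None:
        self.events = events

    def restream(self, start: int) -> List[int]:
        return self.events[start:]           # a copy, beginning at position `start`


class WindowSum:
    """Operator that sums fixed-size selections and reveals where the current one starts."""

    def __init__(self, size: int, savepoint: int = 0) -> None:
        self.size = size
        self.savepoint = savepoint           # position of the current selection's first event
        self.selection: List[int] = []       # volatile state, lost when the operator fails

    def process(self, event: int) -> Optional[int]:
        self.selection.append(event)
        if len(self.selection) < self.size:
            return None
        result = sum(self.selection)
        self.savepoint += self.size          # the next selection starts after this one
        self.selection = []
        return result


def run(predecessor: Predecessor, size: int, fail_at: Optional[int] = None) -> List[int]:
    """Run the operator; simulate one crash just before position `fail_at`."""
    outputs: List[int] = []
    savepoint = 0                            # captured outside the operator
    op = WindowSum(size, savepoint)
    pending = predecessor.restream(savepoint)
    position = savepoint
    while pending:
        if position == fail_at:
            fail_at = None                   # crash only once
            op = WindowSum(size, savepoint)  # fresh instance, old state is gone
            pending = predecessor.restream(savepoint)
            position = savepoint
            continue
        result = op.process(pending.pop(0))
        position += 1
        if result is not None:
            outputs.append(result)
            savepoint = op.savepoint         # capture the new selection start
    return outputs


pred = Predecessor([1, 2, 3, 4, 5, 6])
print(run(pred, 2))             # [3, 7, 11]
print(run(pred, 2, fail_at=3))  # [3, 7, 11]
```

The run with a crash produces the same output as the failure-free run: nothing is lost and nothing is emitted twice, because only the interrupted selection is rebuilt from the re-streamed events. No checkpoint of the operator state is ever written.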

Future Work
Real-time recovery guarantees for savepoint recovery
Modelling of different classes of reliability requirements
  Apply the optimal reliability method
Find new, reliable parallelization methods
  Easy integration of operators
  Elastic parallelization degree
  Combine with reliability methods

End of Presentation
Questions, Comments and Discussions