Tracking a Soccer Game with Big Data

Similar documents
Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015

Fast Innovation requires Fast IT

COMMUNITY OF MACHINES AND THE AGE OF NEXT INDUSTRIAL AUTOMATION

How To Handle Big Data With A Data Scientist

Building the Internet of Things Jim Green - CTO, Data & Analytics Business Group, Cisco Systems

Cloud Big Data Architectures

Streaming Analytics and the Internet of Things: Transportation and Logistics

Internet of Things. Opportunity Challenges Solutions

Azure Data Lake Analytics

Complex Event Processing (CEP) Why and How. Richard Hallgren BUGS

Solving big data problems in real-time with CEP and Dashboards - patterns and tips

Real Time Big Data Processing

How to leverage SAP HANA for fast ROI and business advantage 5 STEPS. to success. with SAP HANA. Unleashing the value of HANA

GigaSpaces Real-Time Analytics for Big Data

Monitor Your Key Performance Indicators using WSO2 Business Activity Monitor

Unified Big Data Processing with Apache Spark. Matei

Enabling Manufacturing Transformation in a Connected World. John Shewchuk Technical Fellow DX

WSO2 Message Broker. Scalable persistent Messaging System

SAP and Hortonworks Reference Architecture

SQLstream Blaze and Apache Storm A BENCHMARK COMPARISON

Decoding the Big Data Deluge a Virtual Approach. Dan Luongo, Global Lead, Field Solution Engineering Data Virtualization Business Unit, Cisco

Towards Smart and Intelligent SDN Controller

IoT / practical usage

An Open-Source Streaming Machine Learning and Real-Time Analytics Architecture

Big Data Analytics Nokia

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop

Horizontal IoT Application Development using Semantic Web Technologies

How telcos can benefit from streaming big data analytics

Spark in Action. Fast Big Data Analytics using Scala. Matei Zaharia. project.org. University of California, Berkeley UC BERKELEY

Training for Big Data

How To Make Data Streaming A Real Time Intelligence

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research &

Moving From Hadoop to Spark

FRAUD DETECTION AND PREVENTION: A DATA ANALYTICS APPROACH BY SESHIKA FERNANDO TECHNICAL LEAD, WSO2

I. TODAY S UTILITY INFRASTRUCTURE vs. FUTURE USE CASES...1 II. MARKET & PLATFORM REQUIREMENTS...2

Systems of Discovery The Perfect Storm of Big Data, Cloud and Internet-of-Things

Testing & Assuring Mobile End User Experience Before Production. Neotys

Ali Ghodsi Head of PM and Engineering Databricks

Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming. by Dibyendu Bhattacharya

Next-Gen Contact Center Technology Trends. Sheila McGee-Smith McGee-Smith Analytics

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Architecting for the Internet of Things & Big Data

iservdb The database closest to you IDEAS Institute

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

Exploring the Synergistic Relationships Between BPC, BW and HANA

Big Data and Complex Networks Analytics. Timos Sellis, CSIT Kathy Horadam, MGS

Hybrid Cloud Architectures for Operational Performance Management

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

EVERYTHING THAT MATTERS IN ADVANCED ANALYTICS

The Next Wave of Big Data Analytics: Internet of Things and Sensor Data. November 6, 2014 Hannah Smalltree, Director

A Modern Process Automation System Offers More than Process Control. Dick Hill Vice President ARC Advisory Group

The Synergy of SOA, Event-Driven Architecture (EDA), and Complex Event Processing (CEP)

The Information Revolution for the Enterprise

Integrating Mobile Internet of Things and Cloud Computing towards Scalability: Lessons Learned from Existing Fog Computing Architectures

Embedded inside the database. No need for Hadoop or customcode. True real-time analytics done per transaction and in aggregate. On-the-fly linking IP

Unified Batch & Stream Processing Platform

Parallel Computing: Strategies and Implications. Dori Exterman CTO IncrediBuild.

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

INTRODUCTION. IoT AND IP STRATEGIES

Talend Real-Time Big Data Sandbox. Big Data Insights Cookbook

The Internet of Things

Case Study: Real-time Analytics With Druid. Salil Kalia, Tech Lead, TO THE NEW Digital

Trafodion Operational SQL-on-Hadoop

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

White Paper. How Streaming Data Analytics Enables Real-Time Decisions

Analyzing Big Data with AWS

Real-time Data Analytics mit Elasticsearch. Bernhard Pflugfelder inovex GmbH

SQLstream 4 Product Brief. CHANGING THE ECONOMICS OF BIG DATA SQLstream 4.0 product brief

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, Viswa Sharma Solutions Architect Tata Consultancy Services

So What s the Big Deal?

Industry 4.0 and Big Data

CAPTURING & PROCESSING REAL-TIME DATA ON AWS

Big Data and Advanced Analytics Technologies for the Smart Grid

Real-Time Data Access Using Restful Framework for Multi-Platform Data Warehouse Environment

Strategic Decisions Supported by SAP Big Data Solutions. Angélica Bedoya / Strategic Solutions GTM Mar /2014

The Next Big Thing in the Internet of Things: Real-Time Big Data Analytics"

Unlocking the Intelligence in. Big Data. Ron Kasabian General Manager Big Data Solutions Intel Corporation

SAP Business Suite powered by SAP HANA

IoT in Logistics. An assessment of today s and tomorrows opportunities

Data-intensive HPC: opportunities and challenges. Patrick Valduriez

The full setup includes the server itself, the server control panel, Firebird Database Server, and three sample applications with source code.

Big Data Analytics in LinkedIn. Danielle Aring & William Merritt

Time-Series Databases and Machine Learning

From Relational to Hadoop Part 1: Introduction to Hadoop. Gwen Shapira, Cloudera and Danil Zburivsky, Pythian

Dynamic M2M Event Processing Complex Event Processing and OSGi on Java Embedded

Internet of Things From Idea to Scale

Intelligent Business Operations and Big Data Software AG. All rights reserved.

Why Big Data in the Cloud?

The Future of M2M Application Enablement Platforms

Apache Hadoop. Alexandru Costan

Transcription:

Tracking a Soccer Game with Big Data QCon Sao Paulo - 2015 Asanka Abeysinghe Vice President, Solutions Architecture - WSO2,Inc

2 Story about soccer

3 and Big Data

Outline Big Data and CEP Tracking a Soccer Game Making it real Conclusion 4 Photo credit John Trainoron Flickr http://www.flickr.com/photos/trainor/2902023575/, Licensed under CC

A day in your life Think about a day in your life? o What is the best road to take? o Would there be any bad weather? o How to invest my money? o How is my health? You can make better decisions if only you can access data and process them. 5 Photo credit http://www.flickr.com/photos/kcolwell/5512461652/ CC license

Real-time analytics 6 Most of the first use-cases were of batch processing, which take minutes if not hours to run Many insights you d rather have sooner o Tomorrows weather o Heart condition o Traffic congestion o Natural disaster o Stockmarket Crash

Realizing real-time analytics Processing Data on the fly, while storing a minimal amount of information and responding fast (from <1 ms to few seconds) Idea of Event streams o A series of events in time Enabling technologies o Stream Processing (Storm, S4) o Complex Event Processing 7

Internet of Things (IoT) 8 Currently the physical world and software worlds are detached Internet of things promises to bridge this o It is about sensors and actuators everywhere o In your fridge, in your blanket, in your chair, in your carpet.. Yes even in your socks o Google IO pressure mats

Vision of the future Sensors everywhere Data collected from everywhere, analyzing, optimizing, and helping (and hopefully not taking over) Analytics and Internet of things.. Immersive world Big data and real-time analytics will be crucial. How far are we from realizing that? 9

What required to build such world Sensors and actuators (Motes?) Fast interoperable event systems (MQTT?) Powerful query languages (CEP?) Powerful control systems and decision systems 10

11 Data processing landscape

12 Data processing landscape

13 Complex Event Processing

14 Complex Event Processing

CEP = SQL for real-time analytics Easy to follow from SQL Expressive, short, and sweet. Define core operations that covers 90% of problems Lets experts dig in when they like! 15

16 CEP operators Filters or transformations (process a single event) from Ball[v>10] select.. insert into.. Windows + aggregation (track window of events: time, length) from Ball#window.time(30s) select avg(v).. Joins (join two event streams to one) from Ball#window.time(30s) as b join Players as p on p.v < b.v Patterns (state machine implementation) from Ball[v>10], Ball[v<10]*,Ball[v>10] select.. Event tables (map a database as an event stream) Define table HitV (v double) using.. db info..

Soccer use-cases Dashboard on game status Alarms about critical events in the game Real-time game analysis and predictions about the next move Updates/ stats etc., on mobile phone with customized offers Study of game and players effectiveness Monitor players health and body functions 17

DEBS challenge Soccer game, players and ball has sensors sid, ts, x,y,z, v,a Use cases: Running analysis, Ball Possession and Shots on Goal, Heatmap of Activity WSO2 CEP (Siddhi) did 100K+ throughput 18

Use-case 1 : running analysis Detect when speed thresholds have passed define partition player by Players.id; from s = Players [v <= 1 or v > 11], t = Players [v > 1 and v <= 11]+, e = Players [v <= 1 or v > 11] select s.ts as tsstart, e.ts as tsstop,s.id as playerid, trot" as intensity, t [0].v as instantspeed, (e.ts - s.ts )/1000000000 as unitperiod insert into RunningStats partition by player; 19

Use-case 2 : ball possession You possess the ball from time you hit it until someone else hit it or ball leaves the ground. 20

Use-case 3 : heatmap of activity Show where actions happened (via cells defined by a grid of 64X100 etc.), need updates once every second Can resolve via cell change boundaries, but does not work if one player stays more than 1 sec in the same cell. Therefore, need to join with a timer. 21

Use-case 4 : detect kick on the goal Detect kicks on the ball, calculate direction after 1m, and keep giving updates as long as it is in right direction 22

23 Results from scenarios

24 Demo

25 Making it real

26 CEP high-availability

Compare CEP vs SP Complex Event Processing SQL like language Supports powerful temporal operators (e.g. windows, event patterns) Focus on speed Harder to scale e.g. WSO2 CEP, Streambase, Esper Stream Processing Operators connected in a network, but you have to write the logic Distributed by design Focus on reliability (do not loose messages), has transactions e.g. Storm, S4 27

28 Scaling CEP : pipeline

29 Scaling CEP : operators

Siddhi Storm bolt We have written a Siddhi bolt that would let users run distributed Siddhi Queries using Storm SiddhiBolt siddhibolt1 = new SiddhiBolt(.. siddhi queries.. ); SiddhiBolt siddhibolt2 = new SiddhiBolt(.. siddhi queries.. ); TopologyBuilder builder = new TopologyBuilder(); builder.setspout("source", new PlayStream(), 1); builder.setbolt("node1", siddhibolt1, 1).shuffleGrouping("source", "PlayStream1");.. builder.setbolt("leafeacho", new EchoBolt(), 1).shuffleGrouping("node1", "LongAdvanceStream");.. cluster.submittopology("word-count", conf, builder.createtopology()); 30

31 Siddhi Storm bolt

Data collection Agent agent = new Agent(agentConfiguration); publisher = new AsyncDataPublisher( "tcp://localhost:7612",.. ); StreamDefinition definition = new StreamDefinition(STREAM_NAME, VERSION); definition.addpayloaddata("sid", STRING);... publisher.addstreamdefinition(definition);... Event event = new Event(); event.setpayloaddata(eventdata); publisher.publish(stream_name, VERSION, event); Can receive events via SOAP, HTTP, JMS,.. WSO2 Events is blazing fast (400K events TPS) Default Agents and you can write custom agents. 32

33 Business Activity Monitor (BAM)

Think CEP AND Hive, not OR Real power comes when you can join the batch (NRT) and real-time processing E.g. if velocity of the ball after a kick is different from season average by 3 times of season s standard deviation, trigger event with player ID and speed o Hard to detect velocity after the kick with Hive/ MapReduce, but much easier with CEP o Complicated to calculate season average with CEP, as we end up with huge windows. 34

35

Other real-world use-cases System/ Device Management Fleet/ Logistic Management Fraud Detection Targeted/ Location Sensitive Marketing Smart Grid Control Geo Fencing 36

Summary WSO2 CEP WSO2 BAM **WSO2 ML 37

10:50-11:40 11:55-12:45 14:15-15:05 PATTERN DRIVEN ARCHITECTURE SECURING THE INSECURE CREATING AN API CENTRIC ENTERPRISE 38 15:35-16:25 16:40-17:30 NEXT-GEN APPS WITH IOT AND CLOUD PANEL: BUILDING TOMORROW SENTERPRISE: REPORTS FORM THE GROUND WARS

39

Obrigado.! Connect : @asankama asankaa AT wso2.com http://asanka.abeysinghe.org 40