Big Data Storage Challenges for the Industrial Internet of Things

Big Data Storage Challenges for the Industrial Internet of Things. Shyam V Nath, Diwakar Kasibhotla. SDC, September 2014

Agenda: Introduction to IoT and Industrial Internet; Industrial & Sensor Data; Big Data Storage Challenges (Ingestion / Storage, Retrieval / Consumption); Use Cases; Wrap up

About Shyam: Principal Architect, Analytics. Board of Directors (SIGs), 30K+ member User Group (IOUG). Started the IoT / Industrial Internet Meetup in East Bay in June 2014; started other BI/Analytics-related user groups. Worked at IBM, Deloitte, Oracle and Halliburton prior to GE. Undergraduate degree from IIT (India), MS (Computer Science) and MBA (FAU). Regular speaker at large events such as Oracle OpenWorld, Collaborate and BIWA Summit on IoT, Business Analytics and Data Warehousing / Engineered Systems topics.

About Diwakar: Principal Architect, GE Aviation. Worked at Oracle, EMC and Pivotal prior to GE. Regular speaker at large events such as Exa SIG and Oracle OpenWorld. Expertise: Data Integration, Database Appliances, Big Data, IoT.

The Hype Cycle (Gartner, July 2013): The 2013 Hype Cycle features Internet of Things, machine-to-machine communication services, mesh networks: sensor and activity streams. Source: http://www.gartner.com/newsroom/id/2575515

What are the Things?

Big Data and IoT (ERP/CRM)

Different Views of Aircraft: as a collection of sensors. Source: http://www.flightglobal.com/cutaways/civil/phenom-300/

Data from Jet Engine. General Electric Company, 2013. All Rights Reserved.

Making Sense of the Sensors. EGT = Exhaust Gas Temperature: the temperature of the exhaust gases as they enter the tail pipe, after passing through the turbine. It is a good indicator of engine health (just like human body temperature). Recording and interpreting the EGT can help detect several jet engine problems.
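As an illustration of what "recording and interpreting the EGT" could look like in code, here is a minimal Python sketch. It is not the presenters' method: the function name `egt_alerts`, the 900 °C limit, the 25 °C rise threshold, and the sample values are all hypothetical and chosen only to show the idea of flagging over-limit readings and sudden rises against a recent average.

```python
from statistics import mean


def egt_alerts(readings_c, window=5, limit_c=900.0, max_rise_c=25.0):
    """Flag EGT samples that exceed a limit or jump above the recent average.

    readings_c: EGT samples in degrees Celsius, in time order.
    window, limit_c and max_rise_c are illustrative values, not real engine limits.
    """
    alerts = []
    for i, value in enumerate(readings_c):
        # Hard limit check: the sample itself is too hot.
        if value > limit_c:
            alerts.append((i, "over_limit"))
            continue
        # Trend check: compare against the mean of the previous `window` samples.
        recent = readings_c[max(0, i - window):i]
        if len(recent) == window and value - mean(recent) > max_rise_c:
            alerts.append((i, "rapid_rise"))
    return alerts


# Hypothetical samples: a sudden jump at index 6 and an over-limit value at index 8.
samples = [640, 642, 641, 643, 644, 645, 690, 646, 910]
print(egt_alerts(samples))
```

A real monitoring pipeline would use engine-specific limits and far richer models; the point here is only that a simple rule over a time-ordered sensor stream can already surface candidate problems.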

Industrial Internet: Big Data Analytics. Delivering sharper insights to users: ingest massive volumes of data with parallelization; bring analytics to data and vice versa; elastically execute on large-scale requirements; innovative analytics models. Various data sources: enterprise (operational and business) data, industrial data and external data.

Wind Farms Explained Via Visuals!

Industrial Internet computing requirements: cloud for efficiency and agility; going mobile (anytime/anywhere access); end-to-end security; predictive insights from Big Data; cloud-based integrated asset management; consistent and meaningful user experience; transition to brilliant machines.

Apply Batch or Real-Time Analytics to the Machine-Generated Data

Industrial Big Data: fast and vast. 50B machines will be connected to the internet by 2020. 2X industrial data growth within the next 10 years. 35GB of data per day from each smart meter. 50X data growth in healthcare (2012 to 2020). 1TB of data per flight. 9MM data points per hour for each locomotive. 500GB of data per blade by gas turbines. In practice, only 3% of potentially useful data is tagged, and even less is analyzed* (*Source: IDC). Data sources: sensor; historian; CRM, ERP, etc.; geo-location; content (images, videos, manuals, etc.); machine; logs; social network.
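To put two of these figures into ingestion terms, the short Python sketch below converts them into per-second rates. It assumes that "9MM" means 9 million and that "GB" is a decimal gigabyte (10^9 bytes); the fleet size of 100 locomotives is a hypothetical number used only for illustration.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

# Figures from the slide (interpretation of the units is an assumption).
locomotive_points_per_hour = 9_000_000   # "9MM" read as 9 million
smart_meter_bytes_per_day = 35 * 10**9   # 35 GB, decimal gigabytes assumed

# Sustained rates a storage tier would have to absorb per device.
points_per_second = locomotive_points_per_hour / SECONDS_PER_HOUR
meter_bytes_per_second = smart_meter_bytes_per_day / SECONDS_PER_DAY

# Hypothetical fleet size, for illustration only.
fleet_size = 100
fleet_points_per_second = points_per_second * fleet_size

print(f"One locomotive: {points_per_second:.0f} data points per second")
print(f"One smart meter: {meter_bytes_per_second / 1000:.0f} kB per second")
print(f"{fleet_size} locomotives: {fleet_points_per_second:.0f} data points per second")
```

Under these assumptions a single locomotive produces 2,500 data points every second, which is why sustained ingestion rate, and not only total volume, drives the storage design.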

Today's approaches are not prepared for the onslaught of Industrial Big Data: too slow, too expensive, too rigid. 80% of an analytics project typically involves gathering and then preparing the data for analysis* (*Source: IDC).

Yesterday's warehouse architecture. Questions it answers: What is it telling me? How is it doing? How does it look? Limitations: (1) All over the place: data across multiple locations. (2) Limited types: mostly structured and semi-structured data types. (3) Snapshot: limited to narrow snapshots of data and time. Users: data scientist, field operations, business analyst. One static data model. Sources feeding the traditional data warehouse: CRM, ERP, etc.; logs; social network; geo-location.

New approach: Industrial Data Lake architecture. Questions it answers: How long will it last without failures or maintenance? Is my asset performing optimally? How to configure for best operational results? Is my asset ready when there is market opportunity? Principles: (1) One place: access to all data in one place to quickly respond to the speed of business change. (2) Any data: handling of all data types, including documents, images, machine and sensor data. (3) All data: access to real-time and historical data, not limited to a snapshot of data. Users: data scientist, field operations, business analyst. Flexible data models. Sources feeding the industrial data lake: sensor; content (images, videos, manuals, etc.); machine; historian; CRM, ERP, etc.; logs, click streams; social network; geo-location. Rapid access to all data for analytics.
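The contrast between one static data model and flexible data models can be sketched in a few lines of Python. This is a toy, in-memory illustration of the general "schema-on-read" idea, not GE's or Pivotal's implementation: the `lake` list, the `ingest` and `read` functions, and every record below are hypothetical.

```python
import json

# One "lake": raw records of any shape are appended as-is (schema-on-read).
lake = []


def ingest(source, payload):
    """Store the record unchanged, tagged only with the source it came from."""
    lake.append(json.dumps({"source": source, "payload": payload}))


def read(source, fields):
    """Apply structure at query time: pick only the fields this analysis needs."""
    rows = []
    for line in lake:
        record = json.loads(line)
        if record["source"] == source:
            # Fields a record does not carry come back as None instead of failing.
            rows.append({f: record["payload"].get(f) for f in fields})
    return rows


# Hypothetical records with different shapes landing in the same place.
ingest("sensor", {"asset": "turbine-7", "egt_c": 642, "ts": "2014-09-01T10:00:00Z"})
ingest("erp", {"asset": "turbine-7", "work_order": "WO-1", "status": "open"})
ingest("sensor", {"asset": "turbine-7", "vibration_mm_s": 2.1, "ts": "2014-09-01T10:00:01Z"})

print(read("sensor", ["asset", "egt_c"]))
```

Nothing is rejected at ingestion time because no schema is enforced there; each consumer decides at read time which fields matter, which is the property the "any data" and "all data" principles rely on.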

Industrial Data Lake. Sources: sensor; content (images, videos, manuals, etc.); historian; machine; CRM, ERP, etc.; logs; geo-location; social network. Users: business analyst, data scientist, field operations. Capabilities: fast ingestion, storage and compute; high performance analysis; optimized for mission-critical workloads; data governance and federation.

Aviation and Big Data: GE expects the collection to grow to 10 million flights and 1,500 terabytes of full flight operational data by 2015.
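For a sense of scale, the projected collection can be turned into an average per-flight figure. The sketch below is simple arithmetic on the two numbers from the slide and assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes).

```python
# Figures from the slide; decimal units are an assumption.
total_bytes = 1500 * 10**12   # 1,500 terabytes
flights = 10_000_000          # 10 million flights

avg_bytes_per_flight = total_bytes / flights
avg_mb_per_flight = avg_bytes_per_flight / 10**6

print(f"Average stored per flight: {avg_mb_per_flight:.0f} MB")
```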

Pivotal Architecture. Apache components: HDFS; MapReduce; Pig, Hive, Mahout; HBase; Sqoop; Flume; Resource Management & Workflow (YARN, ZooKeeper). Pivotal HD Enterprise: Command Center (Deploy, Configure, Monitor, Manage); Data Loader; Hadoop Virtualization (HVE). HAWQ Advanced Database Services: ANSI SQL + Analytics; Xtension Framework; Catalog Services; Query Optimizer; Dynamic Pipelining.

Q&A. Shyam Nath (Shyam.Nath@GE.com), Diwakar Kasibhotla (Diwakar.kasibhotla@GE.com). Thank You!