How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning



Similar documents
BIG DATA What it is and how to use?

HDP Hadoop From concept to deployment.

Hadoop IST 734 SS CHUNG

Large scale processing using Hadoop. Ján Vaňo

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

Apache Hadoop: The Big Data Refinery

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

BIG DATA TECHNOLOGY. Hadoop Ecosystem

This Symposium brought to you by

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

HDP Enabling the Modern Data Architecture

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

ANALYTICS IN BIG DATA ERA

Comprehensive Analytics on the Hortonworks Data Platform

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

Challenges for Data Driven Systems

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

Journée Thématique Big Data 13/03/2015

Big Data and Data Science: Behind the Buzz Words

VIEWPOINT. High Performance Analytics. Industry Context and Trends

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel

Agile Business Intelligence Data Lake Architecture

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

ANALYTICS CENTER LEARNING PROGRAM

Hadoop implementation of MapReduce computational model. Ján Vaňo

Big Data. Value, use cases and architectures. Petar Torre Lead Architect Service Provider Group. Dubrovnik, Croatia, South East Europe May, 2013

locuz.com Big Data Services

Chapter 7. Using Hadoop Cluster and MapReduce

Workshop on Hadoop with Big Data

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

Advanced Big Data Analytics with R and Hadoop

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Big Data Analytics. Lucas Rego Drumond

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Big Data Big Data/Data Analytics & Software Development

Oracle Big Data SQL Technical Update

Data Refinery with Big Data Aspects

SQL + NOSQL + NEWSQL + REALTIME FOR INVESTMENT BANKS

Understanding the Value of In-Memory in the IT Landscape

Information Management course

Testing 3Vs (Volume, Variety and Velocity) of Big Data

Investor Presentation. Second Quarter 2015

Big Data With Hadoop

Traditional BI vs. Business Data Lake A comparison

Hadoop MapReduce and Spark. Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015

International Journal of Innovative Research in Information Security (IJIRIS) ISSN: (O) Volume 1 Issue 3 (September 2014)

Big Data Analytics Nokia

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

Open source Google-style large scale data analysis with Hadoop

Big Data: What You Should Know. Mark Child Research Manager - Software IDC CEMA

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April

BIG DATA TRENDS AND TECHNOLOGIES

Cost-Effective Business Intelligence with Red Hat and Open Source

Hadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science

How To Turn Big Data Into An Insight

Hadoop Ecosystem B Y R A H I M A.

Transforming the Telecoms Business using Big Data and Analytics

Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

The 4 Pillars of Technosoft s Big Data Practice

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Big Data Introduction

Advanced In-Database Analytics

How Companies are! Using Spark

Data Warehouse design

CSE-E5430 Scalable Cloud Computing Lecture 2

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

BIG DATA AND THE ENTERPRISE DATA WAREHOUSE WORKSHOP

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

BIG DATA IN BUSINESS ENVIRONMENT

Big Data on Microsoft Platform

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics

Modernizing Your Data Warehouse for Hadoop

Apache Hadoop in the Enterprise. Dr. Amr Awadallah,

Majed Al-Ghandour, PhD, PE, CPM Division of Planning and Programming NCDOT 2016 NCAMPO Conference- Greensboro, NC May 12, 2016

Blueprints for Big Data Success

The Big Picture on Big Data. Princeton Section 307 Dinner Meeting December 11, 2013 Richard Herczeg

International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May ISSN BIG DATA: A New Technology

Please give me your feedback

International Journal of Innovative Research in Computer and Communication Engineering

Integrating a Big Data Platform into Government:

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

Data Science & Big Data Practice

The Internet of Things and Big Data: Intro

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Hadoop and its Usage at Facebook. Dhruba Borthakur June 22 rd, 2009

The 3 questions to ask yourself about BIG DATA

Hadoop s Advantages for! Machine! Learning and. Predictive! Analytics. Webinar will begin shortly. Presented by Hortonworks & Zementis

Microsoft Big Data. Solution Brief

Transcription:

How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning

Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume of data in 100000 Terabytes Volume Variety Velocity No single and clear definition for BIG DATA

Big Data in Industry 4.0 Sensor Factory Value chain Constant data flow Learning, Optimizing, Monitoring, New business models Collect, Analyze Visualize, Report Pattern detection

In 2003, 2004, 2005 Google released three academic papers describing Google s technology for massive data processing 1. Google File System (GFS) Google storing all web content 2. Map-Reduce Google calculating PageRank and web search index 3. BigTable Google storing Crawling data Analytics, Earth and Personalized Search in columnar database

HADOOP In 2004/5 Doug Cutting developed Nutch open source web search engine struggling with huge data processing issues Doug implemented Google File System analog and named it HADOOP From 2006 HADOOP is an Apache Foundation project

Google Whitepaper HADOOP has been adopted! 2003 2004 2005 2006 2008 2009 2010 2007 Google file system reimpleme ntation

Big Data technical stack Business analytics tools Data Mining /Modeling tools Data integration Document databases Business analytics Data Mining / Modeling Columnar databases Key value stores SQL Real-time Machinelearning On-line Database Data Sources Batch/Map- Reduce Script Search In-Memory Output.... Batch data integration Streaming data flow Metadata management (HCatalog) Hadoop cluster of hosts Cluster management / monitoring (Ambari) HDFS..

Relational Data vs BIG DATA Relational data management Apply data schema (ETL) DATA Store data BIG DATA management Store in Relational database Apply analytics Apply data schema Apply analytics Schema on read Structure first Structure later

How to find the patterns? Machine Learning Supervised learning We have previous knowledge (previous feedback) about the sample cases that are basis for learning Algorithms Unsupervised learning We do not have any previous knowledge (previous feedback) about the sample cases that are basis for learning Algorithms Classification Regression Decision Trees Neural Networks Deep Learning Clustering Hidden Markov Chains Dimensionality reduction

Pattern recognition required! Examples have same Mean x = 9 Variation x = 11 Mean y = 7.5 Variance y = 4.12 Correlation = 0.816 Same linear regression We need algorithms that look into the data

Example: Event failure scoring TASK: Find the probability of event failure 1. Logistic function f (x) = 1 1+ e -x Historical events data 100 factors (parameters) Target Normal event = 0 Failed event = 1 2. Splitting the learning dataset randomly into 80% Training data 20% Test data 100 000 samples P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P100 T 3 5 6 7 8 9 4 2 3 4 5 2 4 6 2 0 2 4 5 6 3 2 5 3 2 5 7 3 6 3 2 1 3 7 5 4 2 4 7 6 2 5 2 6 5 4 2 0 3 5 6 7 8 9 4 2 3 4 5 2 4 6 2 1 2 4 5 6 3 2 5 3 2 5 7 3 6 3 2 1 3 7 5 4 2 4 7 6 2 5 2 6 5 4 2 1 3 5 6 7 8 Training 9 4 2 data 3 4 (80%) 5 2 4 6 2 1 2 4 5 6 3 2 5 3 2 5 7 3 6 3 2 0 3 7 5 4 2 4 7 6 2 5 2 6 5 4 2 1 3 5 6 7 8 9 4 2 3 4 5 2 4 6 2 0 2 4 5 6 3 2 5 3 2 5 7 3 6 3 2 1 3 7 5 4 2 4 7 6 2 5 2 6 5 4 2 1 3 5 6 7 8 9 4 2 3 4 5 2 4 6 2 1 2 4 5 6 3 2 5 3 2 5 7 3 6 3 2 1 3 7 5 4 2 4 7 6 2 5 2 6 5 4 2 0 Test data (20%).. 3 7 5 4 2 4 7 6 2 5 2 6 5 4 2 0 Model Prediction 3. Creating model based on Training data 4. Validating model based on Test data Test Data Actual Test Data Predicted 0 True positive 1 False positive 0 1 False Negative True Negative

Example: Heavy industry manufacturer Problem: Unhide the manufacturing information for products faults discovery and quality control About the case Sophisticated manufacturing processes Data is generated in all steps by machines Data usage for quality and error discovery Historical data should be used for detecting errors and failures Nortal Solution: Streaming data collection, analytics Historical data access for trend and pattern discovery Business impact: Improved manufacturing quality as data is fully used Predictive maintenance will decrease production line stops

Setup architecture Speed capacity: 15000 events per sec INPUT BIG DATA CLUSTER OUTPUT REAL-TIME Machine Learning PREDICTION Predictive material planning STREAM DATA Cluster REAL-TIME Analytics ALERTS Predictive Quality management Predictive production line maintenance DATA STORAGE Analytics 360 view on production Production Quality Assurance

Example: Telecom Problem: Improve data warehouses performance and capabilities to store and analyze all telecom data About the case Telecom has existing data warehouse that consists only part of the original data Costs are linearly increasing (1TB = 20 000 EUR) Nortal solution: Big Data technology based data warehouse All historical raw data could be stored and accessed Business impact: 95% Cost improvement per TB of data (1TB = 1 000 EUR) All raw and historical data accessible Complex data analytics available

Offloading existing Data Warehouse Big Data Warehouse and Traditional Data Warehouse working back-to-back Benefiting From Big Data Common Business Operations Hadoop based data warehouse Data Warehouse File copy / Raw inserts Extract, Transform, Load CDR data GPRS / internet sessions data Customer behavioural data CRM, SCM, ERP Legacy data sources 3 rd party data sources

Big Data values for Industry 4.0 Efficient technology Data capture and analysis New opportunities Decreased costs for IT Linear cost increase Small scale POC-s available Commodity hardware Fast time to market Data stored in one place Data fully accessible Data fully analyzed Pattern detection on all data Improved manufacturing processes Improved customer services and experience Cost optimization New services based on data

Thank you! Nortal Big Data and Machine Learning Solutions Lauri Ilison, PhD Head of Big Data and Machine Learning Tel: + 372 5111 003 Email: lauri.ilison@nortal.com