Big Data, Physics, and the Industrial Internet: How Modeling & Analytics are Making the World Work Better
Matt Denesuk, Chief Data Science Officer, GE Software
October 2014
Imagination at work.
Contact: matthew.denesuk@ge.com
General Electric Company, 2014. All Rights Reserved.
What's this all about?
- Industries that are all about data & IT see outsized productivity & performance gains
  - Telecom, financial services, ...
- Making industrials all about data & IT will transform how the world works
  - Power, water, aviation, rail, mining, oil & gas, manufacturing, ...
- And Big Data + Physics is the enabler
What happened when 1B people became connected?
Consumer Internet:
- Social marketing emerged
- Communications mobilized
- IT architecture virtualized
- Retail & ad transformed
- Entertainment digitized
Now what happens when 50B machines become connected?
Industrial Internet:
- OT is virtualized
- Analytics become predictive
- Employees increase productivity
- Machines are self-healing & automated
- Monitoring and maintenance is mobilized
Diagram labels: Connected Machines, Real-time Network Planning, Shipment Visibility, Hospital Optimization, Intelligent Medical Devices, Smart Grid, Factory Optimization, Logistics Optimization, Brilliant Rail Yard, Brilliant Factory, Brilliant Hospital, Brilliant Power
Cornerstone of IoT Transformation is Software-Defined Machines (SDMs)
Consumer | Commercial & Industrial
- Easily connect machines to the Internet
- Embed apps and analytics into machines and cloud, making them intelligent and self-aware
- Change and update capabilities of machines and devices without changing hardware
- Deliver intelligence to users, providing continuously better outcomes
- Extend the Industrial Internet platform via API and ecosystem
Example: Wind Farm in the Analytics Age (40 TB/yr per 500 MW farm)
The Value to Customers is Huge
Efficiency and cost savings, new customer services, risk avoidance
1% improvements cut $150B in waste across industries

Industry      Segment                       Type of savings                         Estimated value over 15 years
Aviation      Commercial                    1% fuel savings                         $30B
Power         Gas-fired generation          1% fuel savings                         $66B
Healthcare    System-wide                   1% reduction in system inefficiency     $63B
Rail          Freight                       1% reduction in system inefficiency     $27B
Oil and Gas   Exploration and development   1% reduction in capital expenditures    $90B

Note: Illustrative examples based on potential one percent savings applied across specific global industry sectors.
Source: GE estimates
GESoftware.com @GESoftware #IndustrialInternet
Forces shaping the Industrial Internet
- Internet of Things: a living network of machines, data, and people
- Big Data & Analytics: transforming massive amounts of data into intelligence, generating data-driven insights, and enhancing asset performance
- Intelligent, SW-defined machines: increasing system intelligence through embedded software
- Physics + Big Data: employing deep physics & engineering models to leapfrog what's possible with data-driven techniques
Reference Architecture
Platform for the Industrial Internet must bridge OT & IT
Diagram labels: Mobility and Collaboration; Insight to Action; Maintenance; SW Upgrades; Machine Control; Device Management; M2M, M2H, M2C; Any Machine; Any Device; Single Record of Asset; Business Process Management; Industrial Big Data Management; Event Processing; Industrial Data Lake; Analytics & Modeling; PaaS; SaaS; Integration with ERP / CRM; Cyber-Security & Operational Reliability
What do we need from Data Science?
Two ways of seeing a data set* (and the world)
Computer Scientist: get the knowledge locked in the data
- The data set is a record of everything that happened, e.g.,
  - All customer transactions last month
  - All friendship links between members of a social networking site
- Goal is to find interesting patterns, rules, and/or associations.
Physical Scientist: get the knowledge
- The data set is a partial, and often very noisy, reflection of some underlying phenomenon, e.g.,
  - Emission spectra from stars
  - Battery voltage varying with current, time, and temperature
- Goal is better understanding or ability to predict aspects of that phenomenon, often through a mathematical model
For certain kinds of problems, there is immense power in the combination.
(*See D. Lambert, or R. Mahoney, e.g.)
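A minimal sketch of the physical scientist's view, using the battery example above: noisy voltage readings are treated as a reflection of an assumed linear model V = V0 - R*I, and the hidden parameters V0 and R are estimated by ordinary least squares. The model form, the function name fit_line, and the sample readings are illustrative assumptions, not taken from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noisy readings from one battery (made up for illustration).
currents_a = [0.0, 5.0, 10.0, 15.0, 20.0]
voltages_v = [12.62, 12.32, 12.11, 11.83, 11.62]

slope, intercept = fit_line(currents_a, voltages_v)
open_circuit_voltage = intercept   # estimate of V0
internal_resistance = -slope       # estimate of R, since V = V0 - R*I

print(f"V0 ~ {open_circuit_voltage:.3f} V, R ~ {internal_resistance:.4f} ohm")
```

The fitted parameters describe the underlying phenomenon, not just the five recorded points, so under the assumed model they can be used to predict voltage at currents that were never measured.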
Example: Statistical Translation
Regular Science approach
- Use of language is infinitely complex, but you can teach a computer all the rules and content.
- Employ language experts to codify rules, exceptions, vocabulary mappings, etc.
- Apply transformation to user's query.
- Costly, hard to scale
- Can translate nearly any statement (but accuracy variable)
- In theory, could be better than human.
- Too expensive and difficult to deploy comprehensively
Statistical (data-driven) approach
- People say the same kind of things over and over. And somebody has already translated it.
- Gather and classify lots of translated docs (websites, UN, books, ...)
- Identify & match patterns
- Map to user's translation query.
- Incrementally low cost, highly scalable.
- Limited in scope to digitized docs that have been translated before
- Limited by skill of human translators
- Will flop with innovative use of language (new poetry, ...)
Three basic components of Industrial Data Science
Physics/engineering-based models
- Need much less data
- Powerful, but difficult to maintain and scale
Empirical, heuristic rules & insights
- Straightforward to understand
- Captures accumulated knowledge of your experts
Data-driven techniques (machine learning, statistics, optimization, advanced visualization, ...)
- Often not enough data in the industrial domain
- Bias: limited to regions of parameter space traversed in normal operation
- But easiest to maintain and scale
Some Patterns
Industrial Example: improving rule-based systems
Many equipment operators have a system something like this, with rules derived from experience and intuition.
Diagram elements: Low-latency operational data; Rule sets implemented in Analytics Engine; Produce alerts; Alerts
Industrial Example: improving rule-based systems
Diagram elements: Low-latency operational data; Rule sets implemented in Analytics Engine; Produce alerts; Outcome data; Pattern, sequence, association mining, etc.; More actionable alerts
Combine ML plus rule-based alerts with outcome data to produce better alerts.
Industrial Example: improving rule-based systems
Diagram elements: Low-latency operational data; Rule sets implemented in Analytics Engine; Outcome data; Recommendation engine; Actionable recommendations
Tune parameters of existing rules, and create new rules.
Use ML and outcome data to refine and extend the rule base, making alerts more actionable and substantially improving operational outcomes.
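One way the "tune parameters of existing rules" step could look, as a rough sketch: an expert rule fires when a temperature exceeds a threshold, and outcome data (whether a failure actually followed) is used to score candidate thresholds and keep the best one. The Reading fields, the F1 scoring choice, and all sample values are hypothetical; the slides do not specify a particular method.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    temperature_c: float      # hypothetical low-latency sensor value
    failed_within_week: bool  # hypothetical outcome label from maintenance records


def alert(reading, threshold):
    """Expert-style rule: alert when temperature exceeds the threshold."""
    return reading.temperature_c > threshold


def precision_recall(readings, threshold):
    """Score the rule against recorded outcomes."""
    tp = sum(1 for r in readings if alert(r, threshold) and r.failed_within_week)
    fp = sum(1 for r in readings if alert(r, threshold) and not r.failed_within_week)
    fn = sum(1 for r in readings if not alert(r, threshold) and r.failed_within_week)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def tune_threshold(readings, candidates):
    """Pick the candidate threshold with the highest F1 score on the outcome data."""
    def f1(threshold):
        p, r = precision_recall(readings, threshold)
        return 2 * p * r / (p + r) if p + r else 0.0
    return max(candidates, key=f1)


# Made-up history for illustration only.
history = [
    Reading(70, False), Reading(78, False), Reading(82, False), Reading(84, False),
    Reading(86, True), Reading(88, True), Reading(91, True), Reading(95, True),
]

expert_threshold = 80
tuned_threshold = tune_threshold(history, [75, 80, 85, 90])
print("expert rule:", precision_recall(history, expert_threshold))
print("tuned rule: ", tuned_threshold, precision_recall(history, tuned_threshold))
```

On this toy history the tuned rule raises fewer false alarms than the expert threshold while still catching every recorded failure, which is the sense in which outcome data makes alerts more actionable.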
Another Industrial Example: use advanced physical models to create new features for ML approaches
Diagram elements: Sensor data; Predicted values and Δs; Variety of machine learning techniques; Outcome data
Using as ML features the:
1. Deviations from expected physics, and
2. Inferred or hidden parameter estimates
provides much richer and effectively less noisy data, resulting in much stronger predictions and models.
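A minimal sketch of the two feature types named above, assuming a deliberately simple physical model of a battery, V = V_oc - I*R. The residual is the deviation from expected physics; the resistance estimate is an inferred hidden parameter. The model, the nominal constants, and the function names are illustrative assumptions, and a real equipment model would be far more detailed.

```python
def expected_voltage(current_a, v_open_circuit=12.6, r_nominal_ohm=0.05):
    """Voltage predicted by the assumed physical model for a healthy unit."""
    return v_open_circuit - current_a * r_nominal_ohm


def physics_features(current_a, measured_v, v_open_circuit=12.6, r_nominal_ohm=0.05):
    """Turn a raw sensor reading into physics-derived ML features."""
    # 1. Deviation from expected physics (the delta between measured and predicted).
    residual = measured_v - expected_voltage(current_a, v_open_circuit, r_nominal_ohm)
    # 2. Inferred hidden parameter: the resistance that would explain the reading.
    #    It is undefined at zero current, so fall back to the nominal value there.
    if current_a:
        r_estimate = (v_open_circuit - measured_v) / current_a
    else:
        r_estimate = r_nominal_ohm
    return {"voltage_residual": residual, "resistance_estimate": r_estimate}


# Hypothetical sensor rows: (current in A, measured voltage in V).
sensor_rows = [(10.0, 12.1), (10.0, 11.6), (20.0, 11.0)]
feature_rows = [physics_features(i, v) for i, v in sensor_rows]

for raw, features in zip(sensor_rows, feature_rows):
    print(raw, features)
```

The resulting feature rows, joined with outcome data, could then be passed to whichever machine learning technique is in use, in place of or alongside the raw sensor values.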
Climbing up the value chain toward Condition-based Performance Management and Business Optimization
- New levers for optimization across the operation or business
- Fleet/operation-wide optimization levels. Trade-offs to optimize business performance
- Equipment health is not a given, but a variable
Need:
- Earlier detection
- Root cause
- Scaling to more equipment types & instances
- Prescriptive recommendations (multichannel)
Diagram labels (in slide order): Predictive Maintenance ("future"); Model-driven; Work-driven; Time-driven; Condition-based Maintenance ("now"); Fix it when it breaks
Capability / Impact Ramp
Axes: Data Science Complexity; Data completeness, breadth, quality
Capabilities: Basic Reporting; Advanced Reporting; Alerts; Rules augmentation; Anomaly Detection; Predictive analytics; Prescriptive analytics; Operational optimization
Impact: Highly actionable management info; High-value guidance; Sophisticated, optimized management of business operations
Industrial Data Science
Broad range of deep Data Science capabilities needed
- Physics & expert-based Modeling: Employing deep physical and engineering understanding of equipment and processes to generate normative models.
- Applied Statistics: Innovates new ways of performing reliability analysis, statistical modeling of large data, biomarker discovery, and financial risk management.
- Computer Vision: Focuses on developing algorithms and systems for real-time video analysis.
- Image Analytics: Research in algorithms and software systems that analyze & understand images to produce actionable insights.
- Knowledge Discovery: Delivering data and knowledge-driven decision support via semantic technologies and big data systems research.
- Machine Learning: Develop scalable and cross-disciplinary machine learning & predictive capabilities to derive actionable insights from big data.
- Optimization & Management Science: Optimizes the design & operations of complex business and physical systems, extracting more value at lower risk.
- Sensor & Signal Analytics: Modeling complex system and noise processes to detect subtle deviations and estimate critical system parameters.
Industrial Data Science: What is it?
Industrial Data Science
- Outcome-oriented application of mathematical & physics-based analysis & models to real-world problems in industrial operations.
- Tools & processes needed to do that continually & at scale.
Why do we do it
- Improve the performance of industrial operations, e.g.,
  - Higher equipment uptime, utilization, ...
  - Lower maintenance/shop costs, longer component life
  - Fleet-level optimization & trade-offs
  - Business optimization (linking to financial & customer data)
  - Service / contract management
What's needed
- Combination of:
  - Physical & expert modeling experience & depth
  - Installed base of industrial equipment and data.
  - Big Data, Machine Learning, and statistical capabilities