Optimized for the Industrial Internet: GE's Industrial Data Lake Platform

Agenda
The Opportunity
The Solution
The Challenges
The Results
Solutions for Industrial Internet, deep domain expertise
GESoftware.com @GESoftware #IndustrialInternet

Big opportunities with Industrial Big Data
The power of 1%: driving outcomes that matter
Rail: increasing freight utilization. $27B industry value by reducing system inefficiency.
Healthcare: predictive maintenance. $63B industry value by reducing process inefficiency.
Power: predictive diagnostics. $66B industry value with efficiency improvements in gas-fired power plant fleets.
Note: Illustrative examples based on potential one percent savings applied across specific global industry sectors over 15 years. Source: GE estimates.

Industrial Big Data: fast and vast
50B machines will be connected on the internet by 2020
2X industrial data growth within the next 10 years
50X data growth in healthcare (2012-2020)
35GB per day from each smart meter
1TB per flight
9MM data points per hour for each locomotive
500GB per blade by gas turbines
In practice only 3% of potentially useful data is tagged, and even less is analyzed*
Data sources: sensor, historian, CRM, ERP, etc., geo-location, content (images, videos, manuals, etc.), machine, logs, social network
*Sources: IDC, Ericsson, Wikibon, Fast Company, ComputerWeekly

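To make the volumes on this slide easier to compare, the per-hour and per-day figures can be converted to average per-second rates. The sketch below is only arithmetic on the slide's numbers; it assumes 1 GB = 10^9 bytes and a perfectly even arrival rate, neither of which the slide states, and the function name is illustrative.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def per_second(amount: float, period_seconds: int) -> float:
    """Average rate per second for an amount accumulated over a period."""
    return amount / period_seconds


# 9MM data points per hour for each locomotive (figure from the slide).
locomotive_points_per_sec = per_second(9_000_000, SECONDS_PER_HOUR)

# 35GB per day from each smart meter (figure from the slide).
# Assumption: decimal gigabytes, 1 GB = 10**9 bytes.
smart_meter_bytes_per_sec = per_second(35 * 10**9, SECONDS_PER_DAY)

print(f"Locomotive: {locomotive_points_per_sec:.0f} data points/s")
print(f"Smart meter: {smart_meter_bytes_per_sec / 1000:.0f} kB/s")
```

Real sensor traffic is bursty, so peak ingestion rates would be higher than these averages.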
Today's approaches are not prepared for the onslaught of Industrial Big Data
Too slow
Too expensive
Too rigid
80% of an analytics project typically involves gathering and then preparing the data for analysis*
*Source: IDC

Yesterday's data warehouse architecture
Questions it answers: What is it telling me? How is it doing? How does it look?
1. All over the place: data across multiple locations
2. Limited data types: mostly structured and semi-structured data types
3. Snapshot: limited to narrow snapshots and time
Users: data scientist, field operations, business analyst
ONE STATIC DATA MODEL
TRADITIONAL DATA WAREHOUSE
Sources: CRM, ERP, etc., logs, social network, geo-location

Industrial Data Lake architecture
Underpinned by data governance appropriate to business and location
Questions it answers: How long will it last without failures or maintenance? Is my asset performing optimally? How to configure for best operational results? Is my asset ready when there is market opportunity?
1. One place: access to all data in one place to quickly respond to the speed of business change
2. Any data: handling of all data types, including documents, images, machine, and sensor data
3. All data: access to real-time and historical data, not limited to a snapshot of data
Users: data scientist, field operations, business analyst
FLEXIBLE DATA MODELS
INDUSTRIAL DATA LAKE
Sources: sensor, content (images, videos, manuals, etc.), machine, historian, CRM, ERP, etc., logs, click streams, social network, geolocation
Rapid access to all data for analytics

A day in the life
[Diagram: the current situation compared with the new way for the data scientist, field operations, and business analyst. Stage labels: collection, loading, ingestion, real-time ingestion, governance, analytics and operations. The INDUSTRIAL DATA LAKE flow is annotated "Replica of source" and "Add semantic meta". Sources shown: CRM, ERP, etc., logs, click streams, social network, geolocation, sensor, content (images, videos, manuals, etc.), machine, historian. Comparison charts: time to analyze, cost, and agility, ranging from rigid to agile.]

Industrial Data Lake: customer focus
Industrial Data Lake Appliance: pre-integrated with data management, compute, and storage
Consume: data monetization and outcomes
Analyze: predictive / prescriptive analytics and visualization
Manage: management of all, any data in one place
Process: high performance computing
Security

Industrial Data Lake: optimized for industrial workloads
Optimized for mission-critical workloads, addressing key SLAs such as security and resiliency for Industrial Internet applications
Fast data ingestion, storage, and compute, including machine data, to support multiple schemas and data types
High-performance analysis using a massively parallel processing architecture supporting Apache Hadoop
Data governance and federation, with geographically dispersed deployment options

Big Data without Governance
Dumping data into a Big Data lake without repeatable processes and data governance will create a messy, uncontrollable environment.
Insights harvested from an ungoverned data lake are not reliable or trustworthy.
If the insights cannot be fully trusted, it's difficult to make business decisions confidently.

GE as a Custodian of Customer-Owned Data & Services
Custodian: a person who has responsibility for or looks after something. Synonyms: keeper, guardian, steward, protector. Example: "the custodian of the relic".
Custodian roles: Infrastructure, Management, Enforcement & Measurement, Protection
Customer owned: Access Controls, Visibility Metrics, Privacy

Governance Disciplines: Protect, Manage and Improve Information
Quality: accuracy, completeness, consistency
Lifecycle: provenance, lineage, retention
Compliance: regulatory, corporate
Metadata: dictionary, directory of all assets, classification and tagging
Auditing: monitoring, logging, log analysis

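As one way to picture the Quality discipline, completeness can be measured as the share of required fields that are actually populated. The sketch below is a minimal illustration, not a method from this deck; the record layout, field names, and function name are all hypothetical.

```python
from typing import Any, Iterable, Mapping, Sequence


def completeness(records: Iterable[Mapping[str, Any]],
                 required_fields: Sequence[str]) -> float:
    """Fraction of required field slots that hold a non-None value."""
    records = list(records)
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0  # nothing required, so nothing is missing
    filled = sum(
        1
        for record in records
        for name in required_fields
        if record.get(name) is not None
    )
    return filled / total


# Hypothetical sensor readings; the second one lacks a temperature value.
readings = [
    {"asset_id": "T-100", "temperature_c": 412.5},
    {"asset_id": "T-101", "temperature_c": None},
    {"asset_id": "T-102", "temperature_c": 398.0},
]

score = completeness(readings, ["asset_id", "temperature_c"])
print(f"Completeness: {score:.1%}")
```

A score like this could be tracked per data set over time, so that a drop in completeness is noticed before the data reaches analytics.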
Evolving Hadoop Governance
Apache Falcon: uses Oozie and Ambari
Entities: Cluster, Data Set, Process
Define pipelines
Monitor pipelines
Trace pipelines for dependency and lineage

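Tracing a pipeline for lineage amounts to walking the graph of processes and the data sets they read and write. The sketch below shows that idea in plain Python; it does not use the Apache Falcon API, and the process and data set names are made up for illustration.

```python
from collections import deque

# Hypothetical pipeline definition: each process reads inputs and writes outputs.
PROCESSES = {
    "clean": {"inputs": ["raw_sensor"], "outputs": ["clean_sensor"]},
    "join": {"inputs": ["clean_sensor", "asset_master"], "outputs": ["asset_readings"]},
    "score": {"inputs": ["asset_readings"], "outputs": ["health_scores"]},
}


def upstream_lineage(dataset: str, processes: dict) -> set:
    """Return every data set that the given data set depends on, directly or not."""
    # Map each data set to the process that produces it.
    producers = {
        out: spec for spec in processes.values() for out in spec["outputs"]
    }
    seen = set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        producer = producers.get(current)
        if producer is None:
            continue  # a source data set with no upstream process
        for source in producer["inputs"]:
            if source not in seen:
                seen.add(source)
                queue.append(source)
    return seen


print(sorted(upstream_lineage("health_scores", PROCESSES)))
```

The same walk, run in the opposite direction from inputs to outputs, would answer the dependency question: which downstream data sets are affected if a source changes.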
Industrial Data Lake: supports SLAs for industrial workloads
Optimized for mission-critical workloads for Industrial Internet applications
Solution classes compared: Industrial solutions, OT focus (ex: M&D, CBM, ALM, etc.) and Enterprise solutions, IT focus (ex: CRM, SCM, ERP, etc.)
KPIs and the values shown:
Availability: >99.99%; 99.95%
Capacity: Elastic; On-demand
Resiliency: Continuous operations, active-active; Planned downtime, active disaster recovery
Performance / latency: <30ms; 30-40ms
Security: High; Medium/High

Security Risk for Big Data
More data implies higher risk of exposure.
New data types may give rise to new security breach scenarios.
Evolving and experimental analysis implies security policies are less likely to be in place.
Linkage to other data already under compliance may create scenarios where compliance could be violated.

Security Requirements
Perimeter security
Access control
Data protection
Visibility
Challenge: a complete security solution does not exist for any of the popular big data products

Top Opportunity Areas for Security
Areas: Perimeter: Infrastructure; Data Protection: Encryption; Access Control: Privacy; Visibility: Data Management
Opportunities listed:
Communication protocols
Access-policy-based encryption
Secure dissemination
Integrity/provenance
Key management
Searching / filtering encrypted data
Secure collection / aggregation
Proof of storage
Secure outsourcing of computation
Secure collaboration

Data Lake Security Solutions
Protecting the cluster(s): physical security, network security, authentication. Solutions: data center deployments, Kerberos authentication, LDAP integration, segregation of duties.
Data at rest and in motion: security, obfuscation. Solutions: encryption and masking solutions.
Access: file permissions, group authorizations, RBAC. Solutions: filesystem groups, LDAP groups, identity management.
Configuration management: provenance, lineage, change management. Solutions: tagging, ETL tools, MapReduce.

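The group authorization and RBAC items can be read as a two-step lookup: a user's groups map to roles, and roles map to permitted actions on data sets. The sketch below illustrates that lookup with a deny-by-default rule. It is a toy model, not the deck's implementation or any product's API, and every group, role, and data set name is hypothetical.

```python
# Hypothetical role -> set of permitted (data set, action) pairs.
ROLE_PERMISSIONS = {
    "analyst": {("sensor_readings", "read")},
    "steward": {("sensor_readings", "read"), ("sensor_readings", "write")},
}

# Hypothetical group -> roles mapping, for example mirrored from LDAP groups.
GROUP_ROLES = {
    "field-operations": {"analyst"},
    "data-governance": {"steward"},
}


def is_allowed(user_groups, dataset: str, action: str) -> bool:
    """Deny by default; allow only if some role of some group grants the action."""
    for group in user_groups:
        for role in GROUP_ROLES.get(group, ()):
            if (dataset, action) in ROLE_PERMISSIONS.get(role, ()):
                return True
    return False


print(is_allowed(["field-operations"], "sensor_readings", "read"))
print(is_allowed(["field-operations"], "sensor_readings", "write"))
```

Keeping permissions on roles rather than on individual users is what makes segregation of duties auditable: reviewing access means reviewing two small tables.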
Evolving Hadoop Security
Apache Knox: perimeter / network security
Apache Ranger: authorization, data protection, audit tracking
Apache Sentry: authorization

Availability Excellence Framework
HA for NIC, ISP, servers, disk
Data backup
DR strategy
Monitoring / alerting
NameNode HA
CCB process
Continuous tool improvements
Quick response to alerts
JVM instrumentation
Auditing changes
Config files committed to Git
Restricted access to PROD
Only pre-tested, pre-approved changes to be deployed

Target Availability SLA: cost comparison
Columns: SLA; cost associated; typical industry use case; feature list required
<=99%; $; Batch update systems, retail web sites, social media sites, Big Data clusters; NameNode HA, higher replication than 3, hardware redundancy, monitoring and alerting, data centre redundancy, 2X projected capacity implementation
99.9%; $$; Retail web sites, social media sites, relational databases; All of the above + full data centre redundancy, automatic failover, 3X projected capacity implementation
99.99%; $$$; High-frequency trading, medical support systems; All of the above + full data centre redundancy including near-real-time replication, 4X projected capacity implementation
99.999%; $$$$$; High-frequency trading, medical support systems, stock exchanges (ex. Nasdaq, NYSE), air-traffic controllers; All of the above + auto-recovering components, 5X projected capacity implementation
100%; $$$$$$; Real-time trading systems, stock exchanges (ex. Nasdaq, NYSE), on-board flight computer, air-traffic controllers; All of the above + 10X projected capacity implementation

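Each availability target above translates into a yearly downtime budget, which helps explain why cost climbs so steeply with each added nine. The sketch below does that conversion; it assumes a 365-day year and counts all downtime, planned or not, against the budget. Those assumptions and the function name are mine, not the slide's.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Maximum minutes of downtime per year that still meet the availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


# SLA tiers taken from the slide.
for sla in (99.0, 99.9, 99.99, 99.999, 100.0):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% availability allows about {minutes:.2f} minutes of downtime per year")
```

Going from 99.9% to 99.99% shrinks the budget from several hours a year to under an hour, which is roughly the point where manual recovery stops being realistic and automatic failover becomes necessary.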
Case study: GE Aviation
Asset productivity, minimize disruptions, improved forecasting
25 airlines
3.4M flights
340TB
2000X performance improvement
10X cost reduction
7 days time-to-market for new analytic app
Goals: isolate root causes, identify sub-optimal performance parts, minimize disruptions
Note: Illustrative Aviation example based on Predix solution currently in development. Estimates based on exploration, simulation and asset utilization models.

Thank you
General Electric reserves the right to make changes in specifications and features, or discontinue the product or service described at any time, without notice or obligation. These materials do not constitute a representation, warranty or documentation regarding the product or service featured. Illustrations are provided for informational purposes, and your configuration may differ. This information does not constitute legal, financial, coding, or regulatory advice in connection with your use of the product or service. Please consult your professional advisors for any such advice.
GE, the GE Monogram, Predix, Predictivity are trademarks of General Electric Company.
2014 General Electric Company. All rights reserved.