Industry 4.0 and Big Data
Marek Obitko, mobitko@ra.rockwell.com
Senior Research Engineer
03/25/2015
PUBLIC - 5058-CO900H

Background
- Joint work with the Czech Institute of Informatics, Robotics and Cybernetics (CIIRC)
- Big Data related topics investigated in the RA-DIC laboratory within CIIRC
- Goal of the effort: Semantic Big Data Historian

Agenda
- Overview of related trends
  - Industry 4.0
  - Big Data
  - Semantics
- Semantic Big Data Historian
  - Architecture
  - Use Case
- Outlook
- Conclusion

Industry 4.0
- Fourth Industrial Revolution
  - Predicted a priori, not observed ex post
- Economic impact predicted to be huge
  - Operational effectiveness, new business models, services and products
- Clear definition not provided
  - Usually: vision, basic technologies, selected scenarios
- Design principles
  - Interoperability, virtualization, decentralization, real-time capability, service orientation, and modularity

Industry 4.0 Components
- Primary components
  - Cyber-physical systems
    - Fusion of the physical and virtual worlds: integration of computation and physical processes
    - Features: unique identification (RFID tags), centralized storage and analytics, multiple sensors and actuators, network compatibility
    - Example: virtual battery. A battery in an electric car has a virtual counterpart updated in real time, which allows diagnostics, simulation, prediction etc. for a better customer experience
  - Internet of Things
    - Network of physical systems that are uniquely identified and can interact to reach common goals
    - Example: Smart Homes with connected devices (temperature sensor, heating, mobile phone)
  - Internet of Services
    - Offering services via the Internet so that they can be offered and combined into value-added services by various suppliers
    - Example: forming virtual production technologies and capabilities
  - Smart Factory
    - Often mentioned as a key feature of Industry 4.0
    - Information coming from the physical and virtual worlds is used to provide context and assistance for people and machines to execute their tasks in a better way
    - Example: demand-driven production, intelligent workpiece carriers
- Other related components: Smart product, Machine to machine (M2M), Big Data, Cloud

Big Data Motivation
- A CPG (consumer packaged goods) company generates 5,000 data samples every 33 milliseconds
  - This corresponds to 70 TB per year
  - Can we meaningfully use such an amount of data?
- Big Data: a dataset that is growing so that it becomes difficult to manage using existing database management concepts and tools
- 3Vs: Volume, Velocity, Variety
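
As a rough plausibility check of the figures above, the sample rate and the yearly volume can be related by a short calculation. The slide does not state the size of one stored sample, so the sketch below derives the implied bytes per sample; it assumes decimal terabytes and a 365-day year of continuous operation.

```python
# Figures quoted on the slide.
SAMPLES_PER_SCAN = 5_000
SCAN_PERIOD_S = 0.033          # 33 milliseconds
YEARLY_VOLUME_BYTES = 70e12    # 70 TB, assuming decimal terabytes

# Assumption: the plant runs continuously for a 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 3600

samples_per_second = SAMPLES_PER_SCAN / SCAN_PERIOD_S
samples_per_year = samples_per_second * SECONDS_PER_YEAR
bytes_per_sample = YEARLY_VOLUME_BYTES / samples_per_year

print(f"{samples_per_second:,.0f} samples per second")
print(f"{samples_per_year:.2e} samples per year")
print(f"{bytes_per_sample:.1f} bytes per sample implied by 70 TB per year")
```

Under these assumptions the implied size is on the order of 15 bytes per sample, which is consistent with a value stored together with a timestamp or tag reference.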

Big Data
- Volume: data will grow 50 times by 2020; FB 50 PB
- Velocity: storing and getting data; fraud detection
- Variety: unstructured, 90% of new data; videos
- Applications
  - Online marketing: targeting products based on user clickstream (Google, Amazon, Netflix)
  - Medicine, biology, chemistry: data analysis
- Technologies
  - Map-Reduce framework, introduced by Google
  - Running on cheap machines in parallel in clusters (splitting data)
  - Implemented in e.g. Apache Hadoop
- It's about variety, not volume
  - The "Big" is not the main problem; focus on heterogeneous data integration
  - New analytic applications based on data that were not tracked so far
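
The Map-Reduce idea mentioned above can be illustrated in a few lines of plain Python. This is a single-process sketch of the programming model only, not Hadoop code: the clickstream records, the function names and the two "splits" are invented for illustration, and a real cluster would run the map and reduce phases on different machines.

```python
from collections import defaultdict
from itertools import chain

def map_phase(record):
    """Emit one (product, 1) pair for each clickstream record."""
    _user, product = record
    yield product, 1

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values of one key into a single result."""
    return key, sum(values)

def map_reduce(splits):
    """Run map over every record of every split, then shuffle and reduce."""
    mapped = chain.from_iterable(
        map_phase(record) for split in splits for record in split
    )
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical clickstream, already divided into two data splits.
splits = [
    [("u1", "book"), ("u2", "film")],
    [("u1", "film"), ("u3", "film")],
]
counts = map_reduce(splits)
print(counts)
```

Because each split is mapped independently and each key is reduced independently, both phases can be spread over many inexpensive machines.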

Semantics
- Linked Data / Semantic Web (machine-processable data)
  - Tens of RDF Gtriples on the web
- Resource Description Framework (RDF)
  - Resources uniquely identified by URI
  - Triples: subject, property, object
    - In fact relations between objects, values of properties
  - Together forming RDF graph(s)
- Web Ontology Language (OWL)
  - Ontology specifies the conceptualization
  - In fact a description of vocabulary and constraints; attaches meaning to identifiers
- Designed for the internet and web
  - And so also usable for the Internet of Things, Internet of Services etc.
  - Inherently distributed approach, integration of data from heterogeneous and unreliable data sources
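
To make the triple model concrete, the sketch below stores a handful of subject, property, object triples as Python tuples and looks them up by pattern. The `http://example.org/plant#` namespace and all resource names are hypothetical, and a real system would use an RDF library or triple store rather than a Python set.

```python
# Hypothetical namespace; every resource is identified by a full URI.
EX = "http://example.org/plant#"

# An RDF graph is a set of (subject, property, object) triples.
graph = {
    (EX + "sensor1", EX + "measures", EX + "temperature"),
    (EX + "sensor1", EX + "locatedIn", EX + "room1"),
    (EX + "sensor2", EX + "measures", EX + "co2"),
    (EX + "sensor2", EX + "locatedIn", EX + "room1"),
}

def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        triple for triple in graph
        if (s is None or triple[0] == s)
        and (p is None or triple[1] == p)
        and (o is None or triple[2] == o)
    )

# Which resources are located in room1?
in_room1 = [s for s, _, _ in match(graph, p=EX + "locatedIn", o=EX + "room1")]
print(in_room1)
```

Since every triple is self-describing, graphs from different sources can be merged by a plain set union, which is what makes the approach attractive for integrating heterogeneous data.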

Plant Data Processing
- Traditional Historian
  - Time series data collection, focus on fast scan rate
- Analyzing data
  - What was the pH at 2:34:56 PM on March 15, 2015?
    - Not a problem, single retrieval, unless there is a problem with volume
  - What was the pH trend from 1 to 7 PM on March 15, 2013, and how does it compare to previous similar weekdays, holidays, periods after it rained, when different suppliers were used etc.?
    - Not easily possible in historians available today, especially for large-scale data
- Samples of needed data processing
  - Pattern recognition, pattern matching
  - Predictive maintenance
  - Benchmarking of KPIs
  - Clustering similar machines
  - Real-time statistics / analytics / reporting
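
The second kind of question, comparing a time window with the same window on earlier similar weekdays, can be sketched as follows. The readings are synthetic, the function names are hypothetical, and reducing the "trend" to a window mean is a simplifying assumption; the point is only to show why such a query needs more than a single retrieval.

```python
from datetime import date, datetime
from statistics import mean

def window_mean(samples, day, start_hour=13, end_hour=19):
    """Mean of the values recorded on `day` with start_hour <= hour < end_hour."""
    values = [
        value for stamp, value in samples
        if stamp.date() == day and start_hour <= stamp.hour < end_hour
    ]
    return mean(values) if values else None

def compare_with_same_weekdays(samples, day):
    """Return (window mean on `day`, mean over earlier days sharing its weekday)."""
    earlier_days = sorted({
        stamp.date() for stamp, _ in samples
        if stamp.date() < day and stamp.date().weekday() == day.weekday()
    })
    baseline = [window_mean(samples, d) for d in earlier_days]
    baseline = [m for m in baseline if m is not None]
    return window_mean(samples, day), (mean(baseline) if baseline else None)

# Synthetic pH readings, invented purely for illustration.
samples = [
    (datetime(2015, 3, 1, 14, 0), 7.0),
    (datetime(2015, 3, 8, 14, 0), 7.2),
    (datetime(2015, 3, 14, 14, 0), 6.0),  # different weekday, ignored
    (datetime(2015, 3, 15, 9, 0), 9.0),   # outside the 1 to 7 PM window
    (datetime(2015, 3, 15, 14, 0), 7.4),
    (datetime(2015, 3, 15, 18, 0), 7.6),
]
day_mean, baseline_mean = compare_with_same_weekdays(samples, date(2015, 3, 15))
print(day_mean, baseline_mean)
```

Conditions such as holidays, rain or supplier changes would require joining the time series with other data sources, which is where linked data becomes useful.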

Semantic Big Data Historian
- Vision, currently being implemented to verify the technologies
- Collecting data from sensors
  - Architecture based on OPC UA
  - Sensors semantically described
  - All data processed using Semantic Web languages and technologies, which allows linking data together
  - Data stored in Hadoop
- Analyzing data
  - Querying using SPARQL (RDF query language)
  - More complex queries implemented directly in the Map-Reduce framework
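
A SPARQL query is built from triple patterns with shared variables. The sketch below emulates that matching in plain Python to show what such a query does; it is not a SPARQL engine, and the `ex:` terms, the sample graph and the helper names are all hypothetical. The query it mimics is shown in the leading comment.

```python
# Emulates this SPARQL query (the ex: vocabulary is hypothetical):
#   SELECT ?sensor ?value WHERE {
#     ?obs ex:observedBy ?sensor .
#     ?obs ex:hasValue   ?value .
#   }

graph = {
    ("ex:obs1", "ex:observedBy", "ex:co2Sensor"),
    ("ex:obs1", "ex:hasValue", "612"),
    ("ex:obs2", "ex:observedBy", "ex:tempSensor"),
    ("ex:obs2", "ex:hasValue", "21.5"),
}

def unify(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    extended = dict(binding)
    for pattern_term, triple_term in zip(pattern, triple):
        if pattern_term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(pattern_term, triple_term) != triple_term:
                return None
        elif pattern_term != triple_term:
            return None
    return extended

def query(graph, patterns):
    """Match all patterns against the graph, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := unify(pattern, triple, binding)) is not None
        ]
    return bindings

patterns = [
    ("?obs", "ex:observedBy", "?sensor"),
    ("?obs", "ex:hasValue", "?value"),
]
rows = sorted((b["?sensor"], b["?value"]) for b in query(graph, patterns))
print(rows)
```

The shared variable `?obs` is what links a sensor to its value; the same join expressed over very large graphs is the kind of work that can be pushed down into Map-Reduce jobs.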

Description of sensors and data
- Ontology building on top of SSN
  - Semantic Sensor Network Ontology (W3C effort)
- Ontology describes
  - Sensors
  - Observations, including physical units, time, data quality etc.
- Data expressed using the ontology
  - Particular observations
  - All data linked together
  - Directly stored as RDF triples

Case study: data from passive house
- Our goal: evaluate the suitability of the proposed technologies, scalability etc.
- Data focus: indoor air quality
  - Environmental parameters: temperature, carbon dioxide concentration, relative humidity, air pressure
- Sample analysis tasks
  - Relaxation time of the house
  - Impact of sunlight on indoor temperature
  - Detection of people inside

Case study: data from passive house
- Raw data conversion to RDF, to be stored in a triple store
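
The conversion step could look roughly like the sketch below, which turns rows of raw sensor readings into N-Triples lines ready for loading into a triple store. The CSV layout, the `http://example.org/house#` namespace and the property names are assumptions made for illustration; the actual project builds on the SSN ontology, whose terms are not reproduced here.

```python
import csv
import io

# Hypothetical raw export; the real file layout is not given in the slides.
RAW = """timestamp,sensor,value
2015-03-15T14:00:00,co2,612
2015-03-15T14:00:00,temperature,21.5
"""

BASE = "http://example.org/house#"          # hypothetical namespace
XSD = "http://www.w3.org/2001/XMLSchema#"   # standard XML Schema datatypes

def row_to_ntriples(index, row):
    """Describe one reading as an observation resource with three properties."""
    obs = f"<{BASE}observation{index}>"
    return [
        f"{obs} <{BASE}observedBy> <{BASE}{row['sensor']}> .",
        f'{obs} <{BASE}hasValue> "{row["value"]}"^^<{XSD}double> .',
        f'{obs} <{BASE}atTime> "{row["timestamp"]}"^^<{XSD}dateTime> .',
    ]

reader = csv.DictReader(io.StringIO(RAW))
lines = [
    triple
    for index, row in enumerate(reader)
    for triple in row_to_ntriples(index, row)
]
print("\n".join(lines))
```

Each reading becomes its own resource, so further facts such as units or data quality can be attached later without changing the existing triples.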

Case study: data from passive house
- Sample task: detection of people inside
  - Time series processing of CO2 data
  - Values in a sliding window, compared with a threshold
  - Results verified by comparing with the people occupancy list
- Main result
  - Data not really very big, but reaching the limits of the MATLAB package
  - Map-Reduce implementation in Hadoop (both pre-processing and detection) much faster than in MATLAB
  - The task proved the advantage of the Hadoop implementation: scalability
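
A minimal single-machine sketch of the sliding-window detection is given below. The slides state only that values in a sliding window are compared with a threshold, so the window length, the 600 ppm threshold, the use of the window mean and the sample readings are all assumptions chosen for illustration, not the parameters used in the study.

```python
from collections import deque

def detect_occupancy(co2_ppm, window=3, threshold=600.0):
    """Flag each sample whose trailing-window mean CO2 exceeds the threshold."""
    recent = deque(maxlen=window)   # holds at most `window` latest readings
    flags = []
    for value in co2_ppm:
        recent.append(value)
        flags.append(sum(recent) / len(recent) > threshold)
    return flags

# Synthetic CO2 readings in ppm, invented for illustration.
readings = [420, 430, 450, 700, 820, 900, 640, 480, 430]
flags = detect_occupancy(readings)

for value, occupied in zip(readings, flags):
    print(f"{value:4d} ppm  {'occupied' if occupied else 'empty'}")
```

Averaging over the window makes the detector react a little later than a per-sample threshold but suppresses isolated spikes; the resulting flags are what would be compared against the occupancy list.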

Outlook
- Semantic Big Data Historian overall goal:
  - Semantic: connect data together
    - Provide semantic description in the endpoints, connect to OPC UA and let the Historian connect the data appropriately
  - Big Data: be able to work with a larger volume of data
  - Historian
    - Using Map-Reduce and similar frameworks to store, retrieve and analyze a larger volume of heterogeneous data
    - Focus on time-series data, but also be able to include other types of data
      - E.g., information about suppliers, orders, shifts, various annotations etc.
- Achieve analytics that were not possible without current technologies
  - Also connect to actions in the physical world, not only ad-hoc analysis

Conclusion
- Industry 4.0: fusion of the physical and virtual worlds, network of physical systems that interact to reach common goals, integration of services, smart devices, homes, factories
  - Big Data and Semantics: prerequisite for processing large volumes of heterogeneous data
- Semantic Big Data Historian
  - The goal is to provide advanced analytics on heterogeneous plant data, at a scale that was not possible until now
  - Demonstrated the Hadoop scalability
  - Demonstrated Semantic Web suitability for data integration
  - Next steps include advanced data analysis
- Industry 4.0: both distributed and centralized approaches needed
  - Small scale (M2M) versus large scale (cloud) data processing

Thank you! Questions?
Contact: mobitko@ra.rockwell.com
www.rockwellautomation.com