PICASSO Big Data Expert Group
  Sören Auer, Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
The Three Big Data Vs (Volume, Velocity, Variety)
  - Variety is often neglected
  Source: Gesellschaft für Informatik
Semantic Web Layer Cake 2001
  - Monolithic, based on XML
  - Focus on heavyweight semantics (ontologies, logic, reasoning)
  Source: http://www.w3.org/2001/10/03-sww-1/slide7-0.html
The Semantic Web Layer Cake 2015: A Little Semantics Goes a Long Way
  - Lingua franca of data integration with many technology interfaces (XML, HTML, JSON, CSV, RDB, ...)
  - Focus on lightweight vocabularies, rules, thesauri etc.
  - Less invasive
  Diagram labels:
  - (Access control), signature, encryption (HTTPS/CERT/DANE)
  - Vocabularies: SWRL rules, SKOS thesauri, logic, ontologies
  - RDF, SPARQL, RDF Data Shapes, RDF Schema
  - RDF/XML, JSON-LD, CSV2RDF, R2RML, RDFa
  - XML, JSON, CSV, RDB, HTML
  - Unicode, URIs
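The idea of RDF as a common layer over formats such as CSV can be illustrated with a small conversion. The following is a minimal sketch only: it is not the W3C CSV2RDF algorithm, and the `http://example.org/` namespace, the function names and the sample data are assumptions made for illustration. It turns every CSV row into one subject and emits one N-Triples statement per non-empty cell.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not taken from the slides


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(text: str, id_column: str) -> list[str]:
    """Map each CSV row to a subject IRI and each non-empty cell to a triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}resource/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the key column, surplus cells without a header, and empty cells.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{BASE}vocab/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


SAMPLE = "id,name,city\n1,Alice,Bonn\n2,Bob,\n"
for line in csv_to_ntriples(SAMPLE, "id"):
    print(line)
```

In practice the predicates would come from a shared vocabulary rather than being derived from the column headers; the header-based naming here only keeps the sketch short.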
INTEGRATING BIG DATA & LINKED DATA
Blueprint of the Data Aggregator Platform
  - Follows the typical Lambda Architecture
  - Input data: stream, spatial, social, statistical, temporal, transactional, imagery
  - Real-time data and transactions reach the data storage via message passing
  - Batch layer: batch view, Big Data analytics
  - Speed layer: in-stream mining, real-time view
  - Applications and showcases: real-time dashboards, domain-specific BDE apps
  - BDE Platform & Intelligence: integrated on top of an existing Big Data distribution, plus a semantic layer (retaining semantics using the Linked Data approach)
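The interplay of batch layer, speed layer and the merged views in a Lambda Architecture can be sketched in a few lines. This is a toy, single-process illustration under stated assumptions: the class `LambdaPipeline`, its methods and the event-counting use case are hypothetical and do not describe the BDE platform's actual components.

```python
from collections import Counter


class LambdaPipeline:
    """Toy Lambda Architecture that counts events per source."""

    def __init__(self) -> None:
        self.master_log: list[str] = []   # append-only raw data (input of the batch layer)
        self.batch_view: Counter = Counter()      # recomputed from scratch on each batch run
        self.realtime_view: Counter = Counter()   # incremental counts since the last batch run

    def ingest(self, source_id: str) -> None:
        """Store the raw event and update the speed layer immediately."""
        self.master_log.append(source_id)
        self.realtime_view[source_id] += 1

    def run_batch(self) -> None:
        """Recompute the batch view from all raw data, then reset the speed layer."""
        self.batch_view = Counter(self.master_log)
        self.realtime_view.clear()

    def query(self, source_id: str) -> int:
        """Serve a query by merging the batch view with the real-time view."""
        return self.batch_view[source_id] + self.realtime_view[source_id]


pipeline = LambdaPipeline()
for event in ["sensor-a", "sensor-a", "sensor-b"]:
    pipeline.ingest(event)
pipeline.run_batch()           # batch view now covers all three events
pipeline.ingest("sensor-a")    # arrives after the batch run, only in the real-time view
print(pipeline.query("sensor-a"))
```

The batch view can always be rebuilt from the raw log, so the speed layer only has to cover the data that arrived since the last batch run.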
Adding a Semantic Layer to Data Lakes
  - Business functions: Accounting, Management Accounting, Regulatory Reporting, Risk, Treasury
  - Outbound and consumption: frontend to access relationship and KPI definitions and documentation; frontend to access (ad hoc) reports; outbound data delivery to target systems
  - Semantic Data Lake: central place for model, schema and data historization; combines scale-out (cost reduction) with semantics (increased control and flexibility); grows incrementally (pay-as-you-go)
  - Knowledge Graph for relationship definition and metadata
  - Mapping technologies: XML2RDF, JSON-LD, CSVW, R2RML
  - Data Lake: scalable data store, an order of magnitude cheaper
  - Inbound: inbound raw data store, data sources
  References:
  [1] Wrobel, Voss, Köhler, Beyer, Auer: Big Data, Big Opportunities - Anwendungssituation und Forschungsbedarf. Informa...
  [2] Debattista, Lange, Scerri, Auer: Linked 'Big' Data. IEEE/ACM Big Data Computing BDC 2015: 92-98
INDUSTRIAL DATA SPACE
Vocabulary-based Integration Facilitates Data-driven Businesses
The work on the Industrial Data Space complements and is closely interlinked with Plattform Industrie 4.0
  - Domains: Insurance 4.0, Industrie 4.0, Retail 4.0, Banking 4.0
  - Plattform Industrie 4.0: focus on the manufacturing industry
  - Industrial Data Space: focus on data
  - Layers: smart services; data; transmission and networks; real-time systems
The Industrial Data Space Initiative
  - Community of more than 30 large German and European companies
  - Pre-competitive, publicly funded innovation project involving 11 Fraunhofer institutes to develop the IDS reference architecture
  - Current signatories of the MoU to support the Industrial Data Space Association
Semantic Data Linking for Enterprise Data Value Chains
  Aspect | Data Lake | Industrial Data Space | Pure Internet
  Characteristics | centralized, monopolistic | federated, secure, trusted, standards-based | completely decentralized, open, insecure
  Data management | central repository | decentralized | decentralized
  Data ownership | central | decentralized | decentralized
  Data linking | single provider | federated, on demand | missing
  Data security | bilateral | certified system | bilateral
  Market structure | central provider | role system | unstructured
  Transport infrastructure | Internet | Internet | Internet
Basic Principles of the Industrial Data Space
  - On-demand interlinking
  - Linked Light Semantics
  - Security with the Industrial Data Container
  - Certified roles
Industrial Data Space: On-Demand Interlinking
  - All data stays with its owners, who control and secure it.
  - Data is shared only on request for a service.
  - There is no central platform.
  Diagram: Enterprises 1 to 6 and Services A to G
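The on-demand principle (data stays with the owner and is released only for an approved service request) can be sketched as follows. This is an illustrative toy under stated assumptions, not the IDS connector interface: the `Enterprise` class, its `grant` and `request` methods and the sample data are all hypothetical.

```python
class Enterprise:
    """Data owner that keeps its data local and releases it only per request."""

    def __init__(self, name: str, data: dict[str, object]) -> None:
        self.name = name
        self._data = dict(data)                    # never copied to a central platform
        self._grants: set[tuple[str, str]] = set() # approved (service, data key) pairs
        self.audit_log: list[tuple[str, str, bool]] = []

    def grant(self, service: str, key: str) -> None:
        """The owner decides which service may read which data item."""
        self._grants.add((service, key))

    def request(self, service: str, key: str) -> object:
        """Answer a service request; every attempt is logged by the owner."""
        allowed = (service, key) in self._grants
        self.audit_log.append((service, key, allowed))
        if not allowed:
            raise PermissionError(f"{service} may not read {key!r} from {self.name}")
        return self._data[key]


enterprise_1 = Enterprise("Enterprise 1", {"stock_level": 120, "price_list": [9.5, 12.0]})
enterprise_1.grant("Service A", "stock_level")

print(enterprise_1.request("Service A", "stock_level"))
try:
    enterprise_1.request("Service B", "price_list")
except PermissionError as error:
    print(error)
```

Each owner evaluates requests locally, so no component in the sketch ever holds the data of more than one enterprise.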
Linked Light Semantics: A Lightweight Approach to Data Interlinking
  - Classical enterprise systems: fixed data schema; global enforcement; closed; manual transformation; high cost
  - Linked Light Semantics: reference vocabularies; bridge between local representations; intelligently and structurally interlinked; automatic translation/mapping; lightweight
  - Internet / WWW: web pages; only links; completely open; lack of standardization; no structure
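A reference vocabulary acting as a bridge between local representations can be sketched with two mapping tables. Everything named here is an assumption for illustration: the field names, the `ex:` vocabulary terms and the helper functions are hypothetical. Each system maps its own fields to shared terms once, and translation between any two systems goes through those terms.

```python
# Hypothetical local-field to reference-vocabulary mappings for two systems.
ERP_TO_VOCAB = {"cust_no": "ex:customerId", "plz": "ex:postalCode"}
CRM_TO_VOCAB = {"customerId": "ex:customerId", "zip": "ex:postalCode"}


def lift(record: dict, mapping: dict) -> dict:
    """Rename local fields to reference vocabulary terms; unmapped fields are dropped."""
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def lower(shared: dict, mapping: dict) -> dict:
    """Rename reference vocabulary terms to the fields of a target system."""
    inverse = {term: field for field, term in mapping.items()}
    return {inverse[term]: value for term, value in shared.items() if term in inverse}


def translate(record: dict, source_mapping: dict, target_mapping: dict) -> dict:
    """Translate between two local representations via the reference vocabulary."""
    return lower(lift(record, source_mapping), target_mapping)


erp_record = {"cust_no": "4711", "plz": "53757", "internal_flag": "x"}
print(translate(erp_record, ERP_TO_VOCAB, CRM_TO_VOCAB))
```

With n systems this needs n mappings to the vocabulary instead of up to n(n-1) pairwise transformations, which is one way to read the slide's contrast with manual transformation.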
IDS Architecture Overview
  Diagram labels:
  - Clearing, Vocabulary, Apps
  - Industrial Data Space App Store, Industrial Data Space Index, Registry, Industrial Data Space Broker
  - Participants: Company A, Company B, Third Party, Cloud Provider
  - Connectors: Internal IDS Connector, External IDS Connector
  - Interactions over the Internet: Upload, Download, Search
Industry 4.0: Semantic Models as Bridge between Shop Floor and Office Floor
Semantic Administration Shell & Reference Architecture for Industry 4.0 (RAMI4.0)
  - The Administration Shell (Verwaltungsschale) provides a digital identity for arbitrary Industry 4.0 components (e.g. sensors, actuators, robots) and exposes data covering their whole life cycle
  - The Reference Architecture for Industry 4.0 (RAMI4.0) provides a conceptual framework for implementing comprehensive Industry 4.0 scenarios
  - We have implemented both concepts, along with a number of IEC and ISO standards, in a comprehensive information model that is ready for use in production environments
Summary
  - Challenges and opportunities: interoperability and standardization
  - Adding a semantic layer to Big Data technology
  - Integrating Linked Data and Big Data technology
  - Towards Enterprise Knowledge Graphs and Data Spaces
  - Applications e.g. in manufacturing, cultural heritage, finance