IOT & Big Data: The Future Information Processing Architecture Dr. Michael Faden Dirk Weise BASEL BERN BRUGG GENF LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN 1
Oder: Was mach ich nur mit den ganzen Daten? 2
Agenda 1. Introduction 2. Terms and Definitions 3. Reference Architecture 4. Architecture Building Blocks 5. Architecture Patterns 6. Solution Building Blocks 3
Agenda 1. Introduction 2. Terms and Definitions 3. Reference Architecture 4. Architecture Building Blocks 5. Architecture Patterns 6. Solution Building Blocks 4
Discussions about Big Data Is our use case Big Data or not? What shall we do with all that Sensor Data Do we have to use Hadoop or NoSQL for Big Data? Do we miss out if we don t do Big Data? 5
Understand Drivers for Big Data Undertakings Opportunity Business Scenario Business Demand Capability A business scenario bridges three drivers end to end: Opportunity There is something one could do. Capability There is something we can do. Business demand There is something we need to do. 6
BIG Respelled Business Value: Investment in Big Data, be it in terms of personnel or technology, has to be rectified by convincing business demand. Integration: Big Data sources have to be integrated with the enterprise architecture semantically, technologically and procedurally. Governance: Big Data must not dilute the enterpise information model. Business processes have to be aligned to data source properties (e.g. topicality, reliability). 7
BIG Respelled Business Value: Investment in Big Data, be it in terms of personnel or technology, has to be rectified by convincing business demand. Integration: Big Data sources have to be integrated with the enterprise architecture semantically, technologically and procedurally. Governance: Big Data must not dilute the enterpise information model. Business processes have to be aligned to data source properties (e.g. topicality, reliability). 8
..how to avoid this? 9
Separate Concepts from Implementation Required Capability What? Provided Capability How? Required capability Conceptual requirements capture and high-level solution Independent from technology, organisation and processes Provided capability Concrete detailed solution and implementation Specific to technology, organisation and processes 10
Agenda 1. Introduction 2. Terms and Definitions 3. Reference Architecture 4. Architecture Building Blocks 5. Architecture Patterns 6. Solution Building Blocks 11
Terminology Architecture Fundamentals of a system Context, structure, abstraction, process Architecture View System seen from the perspective of specific concerns Comprehensible, comprehensive Building Block Potentially reusable component of an architecture Relevant, interrelated, fractal ISO/IEC 42010 Systems and software engineering Architecture description TOGAF Version 9.1 Definitions 12
Reference Architecture Architecture Building Blocks Solution Building Blocks Our Approach Method & Patterns Requirements view Implementation view Operation view Reference architecture. The canvas for our architecture ABBs. Commonly found requirements SBBs. Commonly applied solutions Views. Tailored to stakeholders Patterns. Commonly found requirement/solution profiles Method. Guideline to populate the canvas including consistency checks 13
Agenda 1. Introduction 2. Terms and Definitions 3. Reference Architecture 4. Architecture Building Blocks 5. Architecture Patterns 6. Solution Building Blocks 14
Reference Architecture Governance Data Sources Data Ingestion Data Factory & Managed Enterprise Information Information Provisioning Information Query & Visualisation Discovery Lab Organisation 15
Agenda 1. Introduction 2. Terms and Definitions 3. Reference Architecture 4. Architecture Building Blocks 5. Architecture Patterns 6. Solution Building Blocks 16
Architecture Building Blocks Governance Metadata Management Master Data Management Information Quality & Accountability Security Legal Compliance Data Sources Data Ingestion Data Factory & Managed Enterprise Information Information Provisioning Information Query & Visualisation Data Engines & Poly-structured Sources Structured Data Sources Data Source Connectivity & Capturing Raw Data in Motion Managed Information Layer Access & Performance Layer Virtualisation & Query Federation Export (push) Query (pull) Prebuilt & Adhoc BI Assets Information Services Master & Reference Data Sources Data Factory Layer Discovery Lab Content Raw Data at Rest Discovery Lab Sandboxes Advanced Analysis & Data Science Tools Organisation BI Competence Centre IT Operations Business Stakeholders 17
Agenda 1. Introduction 2. Terms and Definitions 3. Reference Architecture 4. Architecture Building Blocks 5. Architecture Patterns 6. Solution Building Blocks 18
Architecture Patterns Schema on Write Governance Data Sources Data Ingestion Data Factory & Managed Enterprise Information Information Provisioning Information Query & Visualisation Data Source Connectivity & Capturing Managed Information Raw Data in Motion Factory Discovery Lab Raw Data at Rest Organisation 19
Architecture Patterns Classic BI Governance Metadata Management Master Data Management Information Quality & Accountability Security Legal Compliance Data Sources Data Ingestion Data Factory & Managed Enterprise Information Information Provisioning Information Query & Visualisation Structured Data Sources Data Source Connectivity & Capturing Managed Information Access & Performance Layer Query (pull) Prebuilt & Adhoc BI Assets Master & Reference Data Sources Factory Discovery Lab Organisation BI Competence Centre IT Operations Business Stakeholders 20
Architecture Patterns Classic BI (in its own words) Governance Metadata Management Master Data Management Information Quality & Accountability Security Legal Compliance Data Sources Data Ingestion Data Factory & Managed Enterprise Information Information Provisioning Information Query & Visualisation Structured Data Sources Data Source Connectivity & Capturing Core DWH Data Marts Query (pull) Reporting & Ad-hoc Queries BI Portal etc. Master & Reference Data Sources ETL incl. Staging & Cleansing Advanced Analytics ODS incl. Historisation Data Mining Organisation BI Competence Centre IT Operations Business Stakeholders 21
Architecture Patterns Schema on Read Governance Data Sources Data Ingestion Data Factory & Managed Enterprise Information Information Provisioning Information Query & Visualisation Data Source Connectivity & Capturing Raw Data in Motion Discovery Lab Raw Data at Rest Organisation 22
Architecture Patterns Lambda Architecture Governance Data Sources Data Ingestion Data Factory & Managed Enterprise Information Information Provisioning Information Query & Visualisation Data Source Connectivity & Capturing Raw Data in Motion Factory Managed Information Access & Performance Layer Managed Informations Discovery Lab Raw Data at Rest Factory Organisation 23
Architecture Patterns Discovery Lab Governance Data Sources Data Ingestion Data Factory & Managed Enterprise Information Information Provisioning Information Query & Visualisation Data Source Connectivity & Capturing Raw Data in Motion Managed Information Layer Discovery Lab Raw Data at Rest Discovery Lab Sandboxes Advanced Analysis & Data Science Tools Organisation BI Competence Centre IT Operations Business Stakeholders 24
Agenda 1. Introduction 2. Terms and Definitions 3. Reference Architecture 4. Architecture Building Blocks 5. Architecture Patterns 6. Solution Building Blocks 25
StreamInsight Solutions Data Ingestion Architecture Solutions An RDBMS Way (synchronous) An MS Way (asynchronous) Data Source Connectivity & Capturing DB Upsert EventHub Raw Data in Motion Data Factory DB Triggers Raw Data at Rest DB Staging Tables SQLServer Similar: CDC Similar: ESB Kafka & Storm 27
Solutions Data Factory (Integration) Architecture Solutions To RDBMS To Hadoop To NoSQL Data Store (RDBMS) DB Link JDBC SQOOP, JDBC SPARQL Data Store (Hadoop) Data Store SQOOP, JDBC Hive, Pig, Drill Data Store (RDBMS) Hive, Pig, Drill Data Store (Hadoop) SPARQL Data Store (NoSQL) Data Store (NoSQL) SPARQL SPARQL SPARQL 28
Solutions Data Factory (Interpretation) Architecture Solutions Schema Application Identity Resolution Data Cleansing Data Capture Raw Data Raw Data Raw Data Raw Data Raw Data in Raw Motion Data MEI Parsing NLP OCR Normalisation MEI Matching Algorithms MEI Cleansing Algorithms MEI Raw Data at Raw Rest Data Reference Data Operational Data Reference Data 29
Solution Building Block Governance Data Sources Data Ingestion Azure HDInsight Data Factory & Managed Enterprise Information Power BI Information Provisioning Information Query & Visualisation ML Discovery Studio Lab Organisation 30
Questions & Answers Dr. Michael Faden Senior Consultant Tel.: +49 162 291 47 74 michael.faden@trivadis.com Dirk Weise Senior Consultant Tel.: +41 79 909 72 48 dirk.weise@trivadis.com BASEL BERN BRUGG GENF LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN