8. Business Intelligence
Reference Architectures and Patterns
Winter Semester 2008 / 2009
Prof. Dr. Bernhard Humm
Darmstadt University of Applied Sciences, Department of Computer Science
Prof. Dr. Bernhard Humm, Darmstadt University of Applied Sciences, WS 2008 / 2009. 1.12.2008
The lecture in the context of the entire course
1. Introduction
2. A reference architecture for business information systems
3. Application kernel
4. Persistence and transaction
5. Authorization
6. Client architecture
7. Exception handling
8. Business Intelligence
9. Systems integration
10. Service-oriented architecture
11. Selected design patterns
12. Design for testability
Agenda Definitions Reference Architecture ETL Aggregation Products Literature
Business Intelligence (BI) is the process of transforming data into information and, furthermore, into knowledge
Example: customer segmentation
  Data: sales history, age, ...
  Information: purchasing behaviour with respect to product groups etc.
  Knowledge: selection of customers that are most likely to purchase on-line
  Decision: mailing to selected customers
  Added value: increased sales
Business Intelligence: a buzzword amongst many
Business Intelligence subsumes applications and technologies such as Data Warehousing (DW), Data Mining, Online Analytical Processing (OLAP), and Analytical Applications.
Other related buzzwords / synonyms: Analytical Customer Relationship Management (aCRM), Corporate Performance Management (CPM), Extraction - Transformation - Load (ETL), Right Time Analytics, Management Information System (MIS), Decision Support System (DSS)
Example of a BI application
[Figure: screenshot of a BI application (source: MSDN), annotated with: filters by dimensions; description; grouping according to dimensions; facts and measures (possibly aggregated); graphic representation.]
Facts and Measures
Measure
  The smallest unit of information in a DW
  Always numerical
  Can be aggregated (sum, average, etc.)
  Distinguish between measure types (e.g., sales) and measure values (e.g., $42.00)
Fact Description
  Provides additional information concerning measures
  Will not be aggregated; need not be numerical
Fact
  An entity consisting of measures and fact descriptions as attributes
  Associated with dimensions
Example
  Fact (type): measures Sales, #Orders; fact description OrderNumber
  Fact (values): measures Sales = $42.00, #Orders = 5; fact description OrderNumber = 4711; dimensions day = 2007-12-17, Business Unit = FRA
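The distinction between measures, fact descriptions, and dimension coordinates can be sketched in code. The following is a minimal illustration, not part of the lecture: the class name OrderFact, the function total_sales, and the second order (4712, $8.00) are assumptions made for the example; only the first fact uses the values from the slide.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class OrderFact:
    # Measures: numerical, can be aggregated.
    sales: Decimal
    num_orders: int
    # Fact description: additional information, never aggregated.
    order_number: str
    # Dimension coordinates this fact is associated with.
    day: date
    business_unit: str


def total_sales(facts):
    """Aggregate the 'sales' measure over a collection of facts."""
    return sum((f.sales for f in facts), Decimal("0"))


facts = [
    OrderFact(Decimal("42.00"), 5, "4711", date(2007, 12, 17), "FRA"),
    # Hypothetical second fact, added only to make the aggregation visible.
    OrderFact(Decimal("8.00"), 1, "4712", date(2007, 12, 18), "FRA"),
]
print(total_sales(facts))  # 50.00
```

Summing order_number would be meaningless, which is the practical reason for keeping fact descriptions apart from measures.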
Dimensions
Dimension
  Filter and aggregation criterion for measures
  Dimensions span a multi-dimensional space
  They provide a coordinate system for navigating through measures
Dimension Element
  A dimension can be hierarchically structured into several dimension elements (dimension hierarchy)
  1..n relationship between dimension elements
  Dimension elements form a list or, rarely, a tree or a directed acyclic graph (DAG)
  Distinguish between dimension element types (e.g., day) and values (e.g., 2007-12-17)
Dimension Basis
  Is a particular dimension element
  The most concrete dimension element (innermost)
Example: dimension Time with the dimension elements Year, Month, Week, and Day; Day is the dimension basis, to which the fact (Sales, #Orders, OrderNumber) is attached
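A dimension hierarchy can be read as a set of mappings from the dimension basis to coarser dimension elements. The sketch below assumes a Time dimension with Day as its basis and ISO week numbering; the function name time_coordinate and the string formats are illustrative choices, not taken from the lecture. Day rolls up both to Week and to Month, which is the DAG case rather than a plain list.

```python
from datetime import date


def time_coordinate(day: date, element: str) -> str:
    """Map a dimension-basis value (a day) to the value of a dimension element."""
    if element == "day":
        return day.isoformat()
    if element == "week":
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        return f"{iso[0]:04d}-W{iso[1]:02d}"
    if element == "month":
        return f"{day.year:04d}-{day.month:02d}"
    if element == "year":
        return f"{day.year:04d}"
    raise ValueError(f"unknown dimension element: {element}")


d = date(2007, 12, 17)
print(time_coordinate(d, "week"))   # 2007-W51
print(time_coordinate(d, "month"))  # 2007-12
print(time_coordinate(d, "year"))   # 2007
```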
Star (Cube) = Facts + Dimensions
Galaxy = Stars with common dimensions
[Figure: star "Orders". Dimension Time with the dimension elements Year, Month, Week, Day; dimension Region with the dimension elements Country, Region, Business Unit. The dimension bases are attached to the fact, which has the measures Sales and #Orders and the fact description OrderNumber.]
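One common way to store a star relationally is a fact table that references one table per dimension. The sketch below uses Python's built-in sqlite3 module; the table and column names are assumptions, and the country and region values for business unit FRA are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (
    day   TEXT PRIMARY KEY,          -- dimension basis
    week  TEXT NOT NULL,
    month TEXT NOT NULL,
    year  TEXT NOT NULL
);
CREATE TABLE dim_region (
    business_unit TEXT PRIMARY KEY,  -- dimension basis
    country       TEXT NOT NULL,
    region        TEXT NOT NULL
);
CREATE TABLE fact_orders (
    order_number  TEXT PRIMARY KEY,  -- fact description
    sales         REAL NOT NULL,     -- measure
    num_orders    INTEGER NOT NULL,  -- measure
    day           TEXT NOT NULL REFERENCES dim_time(day),
    business_unit TEXT NOT NULL REFERENCES dim_region(business_unit)
);
""")
conn.execute("INSERT INTO dim_time VALUES ('2007-12-17', '2007-W51', '2007-12', '2007')")
# 'Germany' and 'Europe' are assumed values for illustration only.
conn.execute("INSERT INTO dim_region VALUES ('FRA', 'Germany', 'Europe')")
conn.execute("INSERT INTO fact_orders VALUES ('4711', 42.0, 5, '2007-12-17', 'FRA')")

# Group a measure by dimension elements above the dimension basis.
rows = conn.execute("""
    SELECT t.year, r.country, SUM(f.sales)
    FROM fact_orders f
    JOIN dim_time t ON t.day = f.day
    JOIN dim_region r ON r.business_unit = f.business_unit
    GROUP BY t.year, r.country
""").fetchall()
print(rows)  # [('2007', 'Germany', 42.0)]
```

A second fact table that also references dim_time and dim_region would turn this star into a galaxy in the sense defined above.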
Modelling Reports and Stars (Cubes)
A report is modelled with respect to a star (cube) in the DW (multi-dimensional); the star is modelled with respect to the operational system (relational).
[Figure: star "Orders" (dimension Time: Year, Month, Week, Day; dimension Region: Country, Region, Business Unit; fact: Sales, #Orders, OrderNumber) above a relational model with the entities Salesman (BusinessUnit), Customer (CustomerId), Order (Date, OrderId), Order Position (Number), and Product (Price), linked by 1:n relationships.]
Navigating in a cube: slicing & dicing, drill down & roll up
The star is represented as a multi-dimensional cube
Slicing, dicing: take into account or omit dimensions
Drill down, roll up: go up and down dimension hierarchies
[Figure: cube with the dimensions Plan / Actual (Plan, Actual), Regions (US, West Europe, Asia / Pacific), Time (Jan, Feb, Mar), and Product groups (Car, Truck, Bus) with Products (C 200, S 320, Smart).]
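The navigation operations can be sketched on a cube held as a plain dictionary from coordinates to a measure value. This is only an illustration under assumptions: the cell values are made up, the assignment of products to the product group Car is assumed, and the function names follow common usage (slice fixes one dimension to a value, dice restricts dimensions to subsets of values, roll up moves to a coarser dimension element).

```python
from collections import defaultdict

DIMENSIONS = ("scenario", "region", "month", "product")

# Cells of a small cube: coordinates -> measure value (made-up numbers).
CUBE = {
    ("Actual", "US", "Jan", "C 200"): 5,
    ("Actual", "US", "Jan", "Smart"): 3,
    ("Actual", "US", "Feb", "C 200"): 4,
    ("Actual", "West Europe", "Jan", "S 320"): 2,
    ("Plan", "US", "Jan", "C 200"): 6,
}

# Dimension hierarchy product -> product group (assumed assignment).
PRODUCT_GROUP = {"C 200": "Car", "S 320": "Car", "Smart": "Car"}


def slice_cube(cube, dimension, value):
    """Fix one dimension to a value and omit it from the coordinates."""
    i = DIMENSIONS.index(dimension)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}


def dice_cube(cube, **allowed):
    """Keep only cells whose coordinates lie in the allowed value sets."""
    return {
        k: v
        for k, v in cube.items()
        if all(k[DIMENSIONS.index(d)] in values for d, values in allowed.items())
    }


def roll_up_products(cube):
    """Aggregate from the product level to the product-group level."""
    result = defaultdict(int)
    for (scenario, region, month, product), value in cube.items():
        result[(scenario, region, month, PRODUCT_GROUP[product])] += value
    return dict(result)


print(slice_cube(CUBE, "scenario", "Plan"))  # {('US', 'Jan', 'C 200'): 6}
print(roll_up_products(CUBE)[("Actual", "US", "Jan", "Car")])  # 8
```

Drill down is the inverse of roll up: it returns from the product-group level to the product-level cells, which requires that those cells are still available.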
Agenda Definitions Reference Architecture ETL Aggregation Products Literature
The reference architecture for Business Intelligence / Data Warehousing
[Figure: layered reference architecture. Recoverable content, from top to bottom:]
Users: Analyst, Controller; Manager; Employee; Partner; Administrator
Data Targets: System
Data Warehouse / Business Intelligence
  3. Analysis
    Information Delivery: Predefined Reporting; Online Analytical Processing; Data Mining; Collaboration, Commenting
    Analytic Applications: Performance Management; Budgeting, Planning; Forecasting, Simulation
  SQL, ODBC, JDBC, BAPI, XQuery, ODBO, MDX, XML/A, PMML
  2. Aggregation
    Data Management: Operational Data Store; Core DWH (Stars, Aggregates); Data Marts (relational, multidimensional)
  1. ETL
    Data Staging: Extraction; Transformation, Harmonization, Integration; Quality Management; Loading
  Warehouse Management: Meta Data Management; Security; Scheduling
Enterprise Application Integration (EAI), Enterprise Information Integration (EII)
Operational Systems / Data Sources: System (COTS or custom); Static Data Hub; External Data (e.g., information provider); Informal Data (e.g., spreadsheet)
Legend: Service, Data Flow, Control Flow, User
Agenda Definitions Reference Architecture ETL Aggregation Products Literature
Extraction / Transformation / Loading in the context of the reference architecture
[Figure: the reference architecture for Business Intelligence / Data Warehousing, repeated. This section covers layer 1, ETL: Data Staging with Extraction; Transformation, Harmonization, Integration; Quality Management; Loading.]
Extraction: How to extract data from operational systems?
[Figure: an operational system with the layers Dialog, Application Kernel, and Data Base. A business transaction runs between Dialog and Application Kernel, a technical transaction between Application Kernel and Data Base. Extraction has to tap into this system.]
4 ways of extracting data from operational systems
                        Via Application Kernel              Via Data Base
Export                  Export by the Application Kernel    DB Export
Logging (incremental)   Logging of business transactions    DB Logging of technical transactions
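The difference between export and incremental logging can be sketched with a toy operational system. Everything here is hypothetical (class OrderSystem, its methods, and the order values); the point is only that an export delivers a full snapshot, while logging delivers the changes since the last extraction.

```python
class OrderSystem:
    """Toy operational system with an application kernel and a change log."""

    def __init__(self):
        self._orders = {}      # stands in for the data base
        self._change_log = []  # written on every business transaction

    def place_order(self, order_id, amount):
        """Business transaction: store the order and log the change."""
        self._orders[order_id] = amount
        self._change_log.append((order_id, amount))

    def export_all(self):
        """Export: a full snapshot of the current data."""
        return dict(self._orders)

    def changes_since(self, position):
        """Logging (incremental): changes after a log position, plus the new position."""
        return self._change_log[position:], len(self._change_log)


system = OrderSystem()
system.place_order("4711", 42.0)
snapshot = system.export_all()                 # full extraction
changes, position = system.changes_since(0)    # first incremental run

system.place_order("4712", 8.0)
new_changes, position = system.changes_since(position)
print(new_changes)  # [('4712', 8.0)]
```

The same contrast applies at the data base level, where the log records technical transactions instead of business transactions.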
Transformation and Loading
[Figure: several operational systems (each with Dialog, Application Kernel, and Data Base) feed the Staging Area via Extraction. From the Staging Area, data passes through Transformation, Harmonization, Integration, and Quality Mgmt. and is then loaded (Loading) into the Data Warehouse.]
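The path from the staging area into the warehouse can be sketched as a small pipeline. The two source systems A and B, their field names and formats, and the rule for rejecting records are assumptions made for this example; real ETL tools cover far more cases.

```python
from datetime import datetime

# Records as extracted into the staging area from two hypothetical
# source systems that use different field names and formats.
STAGING = [
    {"source": "A", "order_no": "4711", "date": "17.12.2007", "amount": "42,00"},
    {"source": "B", "OrderId": "4712", "day": "2007-12-18", "sales": "8.00"},
    {"source": "B", "OrderId": "4713", "day": "2007-12-18", "sales": ""},
]


def harmonize(record):
    """Transform a source-specific record into one common format."""
    if record["source"] == "A":
        day = datetime.strptime(record["date"], "%d.%m.%Y").date()
        sales = float(record["amount"].replace(",", "."))
        order_number = record["order_no"]
    else:
        day = datetime.strptime(record["day"], "%Y-%m-%d").date()
        sales = float(record["sales"])
        order_number = record["OrderId"]
    return {"order_number": order_number, "day": day.isoformat(), "sales": sales}


def transform_and_load(staging, warehouse, rejected):
    """Harmonize each record; load good ones, set aside bad ones."""
    for record in staging:
        try:
            warehouse.append(harmonize(record))
        except (KeyError, ValueError):
            rejected.append(record)  # quality management: keep for review


warehouse, rejected = [], []
transform_and_load(STAGING, warehouse, rejected)
print(len(warehouse), len(rejected))  # 2 1
```

Keeping rejected records instead of dropping them silently is one simple way to make data quality problems visible to the people who maintain the source systems.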
Agenda Definitions Reference Architecture ETL Aggregation Products Literature
Aggregation in the context of the reference architecture
[Figure: the reference architecture for Business Intelligence / Data Warehousing, repeated. This section covers layer 2, Aggregation: Operational Data Store; Core DWH (Stars, Aggregates); Data Marts (relational, multidimensional).]
Aggregation
[Figure: numbered data flows (1. to 5.) between Data Staging, the data stores, and Information Delivery / Analytic Applications. Data stores shown: Operational Data Store (ODS), Core Data Warehouse, Data Marts. Attributes shown: relational, multi-dimensional, aggregated.]
Agenda Definitions Reference Architecture ETL Aggregation Products Literature
A product map assigns products to clusters of services in the reference architecture
[Figure: the reference architecture, repeated, with product names placed on clusters of services. Products named: BusinessObjects, CrystalReports, ...; SAP-BW, Oracle, IBM DB2, MS SQL-Server, MicroStrategy, ...; Informatica, ...]
Agenda Definitions Reference Architecture ETL Aggregation Products Literature
Literature
Bernhard Humm, Frank Wietek: Architektur von Data Warehouses und Business Intelligence Systemen. Informatik Spektrum 3/05, pp. 3-14, Springer Verlag, 2005 (download from my home page)