Data Integration and ETL Process
|
|
- Leo Hodge
- 8 years ago
- Views:
Transcription
1 Data Integration and ETL Process Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, second semester Academic year 2014/15 (winter course) 1 / 33
2 Review of the previous lecture Mining of massive datasets Evolution of database systems: operational vs. analytical systems. Dimensional modeling: Three goals of the logical design of data warehouse: simplicity, expressiveness and performance. The most popular conceptual schema: star schema. Designing data warehouses is not an easy task. 2 / 33
3 Outline 1 Motivation 2 Data Extraction 3 Transformation and Integration of Data 4 Load of Data 5 Data Warehouse Refreshment 6 Summary 3 / 33
4 Outline 1 Motivation 2 Data Extraction 3 Transformation and Integration of Data 4 Load of Data 5 Data Warehouse Refreshment 6 Summary 4 / 33
5 Motivation OLAP queries are usually performed in a separate system, i.e., in a data warehouse. Data warehouses combine data from multiple sources. Data must be translated into a consistent format. Data integration represents 80% of effort for a typical data warehouse project! 5 / 33
6 ETL process ETL = Extraction, Transformation, and Load Extraction of data from source systems, Transformation and integration of data into a useful format for analysis, Load of data into the warehouse and build of additional structures. Refreshment of data warehouse is closely related to ETL process. The ETL process is described by metadata stored in data warehouse. Architecture of data warehousing: Data sources Data staging area Data warehouse 6 / 33
7 ETL process 7 / 33
8 ETL tools Data extraction from heterogeneous data sources. Data transformation, integration, and cleansing. Data quality analysis and control. Data loading. High-speed data transfer. Data refreshment. Managing and analyzing metadata. Examples of ETL tools: MS SQL Server Integration Services(SSIS), IBM Infosphere DataStage, SAS ETL Studio, Oracle Warehouse Builder, Oracle Data Integrator, Business Objects Data Integrator, Pentaho Data Integration. 8 / 33
9 ETL tools MS SQL Server Integration Services(SSIS) 9 / 33
10 ETL tools MS SQL Server Integration Services(SSIS) 10 / 33
11 Outline 1 Motivation 2 Data Extraction 3 Transformation and Integration of Data 4 Load of Data 5 Data Warehouse Refreshment 6 Summary 11 / 33
12 Data extraction Data warehouse needs extraction of data from different external data sources: operational databases (relational, hierarchical, network, itp.), files of standard applications (Excel, COBOL applications), additional databases (direct marketing databases) and data services (stock data), and other documents (.doc, XML, WWW). 12 / 33
13 Data sources Data sources are often operational systems, providing the lowest level of data. 13 / 33
14 Data sources Data sources are often operational systems, providing the lowest level of data. Data sources are designed for operational use, not for decision support, and the data reflect this fact. 13 / 33
15 Data sources Data sources are often operational systems, providing the lowest level of data. Data sources are designed for operational use, not for decision support, and the data reflect this fact. Multiple data sources are often from different systems, run on a wide range of hardware and much of the software is built in-house or highly customized. 13 / 33
16 Data sources Data sources are often operational systems, providing the lowest level of data. Data sources are designed for operational use, not for decision support, and the data reflect this fact. Multiple data sources are often from different systems, run on a wide range of hardware and much of the software is built in-house or highly customized. Data sources can be designed using different logical structures. 13 / 33
17 Data extraction Identification of concepts and objects does not have to be easy. Example: Extract information about sales from the source system. What is meant by the term sale? 14 / 33
18 Data extraction Identification of concepts and objects does not have to be easy. Example: Extract information about sales from the source system. What is meant by the term sale? A sale has occurred when the order has been received by a customer, the order is sent to the customer, the invoice has been raised against the order. 14 / 33
19 Data extraction Identification of concepts and objects does not have to be easy. Example: Extract information about sales from the source system. What is meant by the term sale? A sale has occurred when the order has been received by a customer, the order is sent to the customer, the invoice has been raised against the order. It is a common problem that there is no table SALES in the operational databases; some other tables can exist like ORDER with an attribute ORDER STATUS. 14 / 33
20 Change monitoring Change monitoring is directly connected with data warehouse refreshment. Detect changes to an information source. Different monitoring techniques: external and intrusive techniques. 15 / 33
21 Change monitoring Change monitoring is directly connected with data warehouse refreshment. Detect changes to an information source. Different monitoring techniques: external and intrusive techniques. Monitoring Techniques Snapshot vs. timestamped sources Queryable, logged, and replicated sources Callback and internal action sources 15 / 33
22 Outline 1 Motivation 2 Data Extraction 3 Transformation and Integration of Data 4 Load of Data 5 Data Warehouse Refreshment 6 Summary 16 / 33
23 Transformation and integration of data Transformation and integration of data is the most important part of data warehousing. 17 / 33
24 Transformation and integration of data Transformation and integration of data is the most important part of data warehousing. This phase consists in removing all inconsistencies and redundancies of data coming to the data warehouse from operational data sources. 17 / 33
25 Transformation and integration of data Transformation and integration of data is the most important part of data warehousing. This phase consists in removing all inconsistencies and redundancies of data coming to the data warehouse from operational data sources. The data are conform to the conceptual schema used by the warehouse. 17 / 33
26 Transformation and integration of data Transformation and integration of data is the most important part of data warehousing. This phase consists in removing all inconsistencies and redundancies of data coming to the data warehouse from operational data sources. The data are conform to the conceptual schema used by the warehouse. Integration concerns data and data schemas. 17 / 33
27 Transformation and integration of data Transformation and integration of data is the most important part of data warehousing. This phase consists in removing all inconsistencies and redundancies of data coming to the data warehouse from operational data sources. The data are conform to the conceptual schema used by the warehouse. Integration concerns data and data schemas. Different levels of integration: schema, table, tuple, attribute values. 17 / 33
28 Conflicts and dirty data Different logical models of operational sources, 18 / 33
29 Conflicts and dirty data Different logical models of operational sources, Different data types (account number stored as String or Numeric), 18 / 33
30 Conflicts and dirty data Different logical models of operational sources, Different data types (account number stored as String or Numeric), Different data domains (gender: M, F, male, female, 1, 0), 18 / 33
31 Conflicts and dirty data Different logical models of operational sources, Different data types (account number stored as String or Numeric), Different data domains (gender: M, F, male, female, 1, 0), Different date formats (dd-mm-yyyy or mm-dd-yyyy), 18 / 33
32 Conflicts and dirty data Different logical models of operational sources, Different data types (account number stored as String or Numeric), Different data domains (gender: M, F, male, female, 1, 0), Different date formats (dd-mm-yyyy or mm-dd-yyyy), Different field lengths (address stored by using 20 or 50 chars), 18 / 33
33 Conflicts and dirty data Different logical models of operational sources, Different data types (account number stored as String or Numeric), Different data domains (gender: M, F, male, female, 1, 0), Different date formats (dd-mm-yyyy or mm-dd-yyyy), Different field lengths (address stored by using 20 or 50 chars), Different naming conventions: homonyms and synonyms, 18 / 33
34 Conflicts and dirty data Different logical models of operational sources, Different data types (account number stored as String or Numeric), Different data domains (gender: M, F, male, female, 1, 0), Different date formats (dd-mm-yyyy or mm-dd-yyyy), Different field lengths (address stored by using 20 or 50 chars), Different naming conventions: homonyms and synonyms, Semantic conflicts, when the same objects are modeled on different logical levels, 18 / 33
35 Conflicts and dirty data Different logical models of operational sources, Different data types (account number stored as String or Numeric), Different data domains (gender: M, F, male, female, 1, 0), Different date formats (dd-mm-yyyy or mm-dd-yyyy), Different field lengths (address stored by using 20 or 50 chars), Different naming conventions: homonyms and synonyms, Semantic conflicts, when the same objects are modeled on different logical levels, Structural conflicts, when the same concepts are modeled using different structures. 18 / 33
36 Conflicts and dirty data Different entries for the same attribute (state name or abbreviation), 19 / 33
37 Conflicts and dirty data Different entries for the same attribute (state name or abbreviation), Text fields can possess hidden information (contact person name can be given in the company name field), 19 / 33
38 Conflicts and dirty data Different entries for the same attribute (state name or abbreviation), Text fields can possess hidden information (contact person name can be given in the company name field), Wrong attribute entries (attribute name contains company name or contact person name), 19 / 33
39 Conflicts and dirty data Different entries for the same attribute (state name or abbreviation), Text fields can possess hidden information (contact person name can be given in the company name field), Wrong attribute entries (attribute name contains company name or contact person name), Inconsistent information concerning the same object, 19 / 33
40 Conflicts and dirty data Different entries for the same attribute (state name or abbreviation), Text fields can possess hidden information (contact person name can be given in the company name field), Wrong attribute entries (attribute name contains company name or contact person name), Inconsistent information concerning the same object, Information concerning the same object, but indicated by different keys, 19 / 33
41 Conflicts and dirty data Different entries for the same attribute (state name or abbreviation), Text fields can possess hidden information (contact person name can be given in the company name field), Wrong attribute entries (attribute name contains company name or contact person name), Inconsistent information concerning the same object, Information concerning the same object, but indicated by different keys, Missing values, 19 / 33
42 Conflicts and dirty data Different entries for the same attribute (state name or abbreviation), Text fields can possess hidden information (contact person name can be given in the company name field), Wrong attribute entries (attribute name contains company name or contact person name), Inconsistent information concerning the same object, Information concerning the same object, but indicated by different keys, Missing values, / 33
43 Data cleansing We would like to analyze high-quality data, since our goal is to support decision making. 20 / 33
44 Data cleansing techniques Conversion and normalization methods (date formats dd/mm/rrrr, names conventions: Jan Kowalski), 21 / 33
45 Data cleansing techniques Conversion and normalization methods (date formats dd/mm/rrrr, names conventions: Jan Kowalski), Parsing text fields in order to identify and isolate data elements: 21 / 33
46 Data cleansing techniques Conversion and normalization methods (date formats dd/mm/rrrr, names conventions: Jan Kowalski), Parsing text fields in order to identify and isolate data elements: Transformation (splitting the text into records {title = mgr, first name = Jan, last name = Kowalski}), 21 / 33
47 Data cleansing techniques Conversion and normalization methods (date formats dd/mm/rrrr, names conventions: Jan Kowalski), Parsing text fields in order to identify and isolate data elements: Transformation (splitting the text into records {title = mgr, first name = Jan, last name = Kowalski}), Standardization (Jan Kowalski, magister mgr Jan Kowalski), 21 / 33
48 Data cleansing techniques Conversion and normalization methods (date formats dd/mm/rrrr, names conventions: Jan Kowalski), Parsing text fields in order to identify and isolate data elements: Transformation (splitting the text into records {title = mgr, first name = Jan, last name = Kowalski}), Standardization (Jan Kowalski, magister mgr Jan Kowalski), Dictionary-based methods (database of names, geographical places, pharmaceutical data), 21 / 33
49 Data cleansing techniques Conversion and normalization methods (date formats dd/mm/rrrr, names conventions: Jan Kowalski), Parsing text fields in order to identify and isolate data elements: Transformation (splitting the text into records {title = mgr, first name = Jan, last name = Kowalski}), Standardization (Jan Kowalski, magister mgr Jan Kowalski), Dictionary-based methods (database of names, geographical places, pharmaceutical data), Domain-specific knowledge methods to complete data (postal codes), 21 / 33
50 Data cleansing techniques Conversion and normalization methods (date formats dd/mm/rrrr, names conventions: Jan Kowalski), Parsing text fields in order to identify and isolate data elements: Transformation (splitting the text into records {title = mgr, first name = Jan, last name = Kowalski}), Standardization (Jan Kowalski, magister mgr Jan Kowalski), Dictionary-based methods (database of names, geographical places, pharmaceutical data), Domain-specific knowledge methods to complete data (postal codes), Rationalization of data (PHX323RFD110A4 Print paper, format A4), 21 / 33
51 Data cleansing techniques Conversion and normalization methods (date formats dd/mm/rrrr, names conventions: Jan Kowalski), Parsing text fields in order to identify and isolate data elements: Transformation (splitting the text into records {title = mgr, first name = Jan, last name = Kowalski}), Standardization (Jan Kowalski, magister mgr Jan Kowalski), Dictionary-based methods (database of names, geographical places, pharmaceutical data), Domain-specific knowledge methods to complete data (postal codes), Rationalization of data (PHX323RFD110A4 Print paper, format A4), Rule-based cleansing (replace gender by sex) 21 / 33
52 Data cleansing techniques Conversion and normalization methods (date formats dd/mm/rrrr, names conventions: Jan Kowalski), Parsing text fields in order to identify and isolate data elements: Transformation (splitting the text into records {title = mgr, first name = Jan, last name = Kowalski}), Standardization (Jan Kowalski, magister mgr Jan Kowalski), Dictionary-based methods (database of names, geographical places, pharmaceutical data), Domain-specific knowledge methods to complete data (postal codes), Rationalization of data (PHX323RFD110A4 Print paper, format A4), Rule-based cleansing (replace gender by sex) Cleansing by using data mining. 21 / 33
53 Data cleansing techniques Deduplication ensures that one accurate record exists for each business entity represented in a database, 22 / 33
54 Data cleansing techniques Deduplication ensures that one accurate record exists for each business entity represented in a database, Householding is the technique of grouping individual customers by the household or organization of which they are a member; this technique has some interesting marketing implications, and can also support cost-saving measures of direct advertising. 22 / 33
55 Data cleansing techniques Deduplication ensures that one accurate record exists for each business entity represented in a database, Householding is the technique of grouping individual customers by the household or organization of which they are a member; this technique has some interesting marketing implications, and can also support cost-saving measures of direct advertising. Example: 22 / 33
56 Data cleansing techniques Deduplication ensures that one accurate record exists for each business entity represented in a database, Householding is the technique of grouping individual customers by the household or organization of which they are a member; this technique has some interesting marketing implications, and can also support cost-saving measures of direct advertising. Example: Consider the following rows in a database: Tim Jones 123 Main Street Marlboro MA T. Jones 123 Main St. Marlborogh MA Timothy Jones 321 Maine Street Marlborog AM Jones, Timothy 123 Maine Ave Marlborough MA / 33
57 Data cleansing techniques Deduplication ensures that one accurate record exists for each business entity represented in a database, Householding is the technique of grouping individual customers by the household or organization of which they are a member; this technique has some interesting marketing implications, and can also support cost-saving measures of direct advertising. Example: Consider the following rows in a database: Tim Jones 123 Main Street Marlboro MA T. Jones 123 Main St. Marlborogh MA Timothy Jones 321 Maine Street Marlborog AM Jones, Timothy 123 Maine Ave Marlborough MA The sales for around $500 are counted for each tuple. 22 / 33
58 Data cleansing techniques Deduplication ensures that one accurate record exists for each business entity represented in a database, Householding is the technique of grouping individual customers by the household or organization of which they are a member; this technique has some interesting marketing implications, and can also support cost-saving measures of direct advertising. Example: Consider the following rows in a database: Tim Jones 123 Main Street Marlboro MA T. Jones 123 Main St. Marlborogh MA Timothy Jones 321 Maine Street Marlborog AM Jones, Timothy 123 Maine Ave Marlborough MA The sales for around $500 are counted for each tuple. Is it the same person? 22 / 33
59 Outline 1 Motivation 2 Data Extraction 3 Transformation and Integration of Data 4 Load of Data 5 Data Warehouse Refreshment 6 Summary 23 / 33
60 Load of data After extracting, cleaning and transforming, data must be loaded into the warehouse. 24 / 33
61 Load of data After extracting, cleaning and transforming, data must be loaded into the warehouse. Loading the warehouse includes some other processing tasks: checking integrity constraints, sorting, summarizing, creating indexes, etc. 24 / 33
62 Load of data After extracting, cleaning and transforming, data must be loaded into the warehouse. Loading the warehouse includes some other processing tasks: checking integrity constraints, sorting, summarizing, creating indexes, etc. Batch (bulk) load utilities are used for loading. 24 / 33
63 Load of data After extracting, cleaning and transforming, data must be loaded into the warehouse. Loading the warehouse includes some other processing tasks: checking integrity constraints, sorting, summarizing, creating indexes, etc. Batch (bulk) load utilities are used for loading. A load utility must allow the administrator to monitor status, to cancel, suspend, and resume a load, and to restart after failure with no loss of data integrity. 24 / 33
64 Load of data Concern very large data volumes. Sequential loads can take a very long time. Full load can be treated as a single long batch transaction that builds up a new database. Using checkpoints ensures that if a failure occurs during the load, the process can restart from the last checkpoint. 25 / 33
65 Outline 1 Motivation 2 Data Extraction 3 Transformation and Integration of Data 4 Load of Data 5 Data Warehouse Refreshment 6 Summary 26 / 33
66 Data warehouse refreshment Refreshing a warehouse means propagating updates on source data to the data stored in the warehouse. 27 / 33
67 Data warehouse refreshment Refreshing a warehouse means propagating updates on source data to the data stored in the warehouse. Follows the same structure as ETL process. 27 / 33
68 Data warehouse refreshment Refreshing a warehouse means propagating updates on source data to the data stored in the warehouse. Follows the same structure as ETL process. Several constraints: accessibility of data sources, size of data, size of data warehouse, frequency of data refreshing, degradation of performance of operational systems. 27 / 33
69 Data warehouse refreshment Detect changes in external data sources. 28 / 33
70 Data warehouse refreshment Detect changes in external data sources. Extract the changes and integrate into the warehouse. 28 / 33
71 Data warehouse refreshment Detect changes in external data sources. Extract the changes and integrate into the warehouse. Update indexes, subaggregates and materialized views. 28 / 33
72 Data warehouse refreshment Periodical refreshment (daily or weekly). Immediate refreshment. Determined by usage, types of data source, etc. 29 / 33
73 Refreshment vs. load Refreshment can be asynchronous. Load of data requires a long-term access to source data. Refreshment concerns less data. Refreshment is faster. 30 / 33
74 Outline 1 Motivation 2 Data Extraction 3 Transformation and Integration of Data 4 Load of Data 5 Data Warehouse Refreshment 6 Summary 31 / 33
75 Summary ETL process is a strategic element of data warehousing. Main concepts: extraction, transformation and integration, load, data warehouse refreshment and metadata. New emerging technology / 33
76 Bibliography C. Todman. Designing a Data Warehouse: Supporting Customer Relationship Management. Prentice Hall PTR, 2001 Matthias Jarke, Y. Vassiliou, P. Vassiliadis, and M. Lenzerini. Fundamentals of Data Warehouses. Springer-Verlag New York, second edition, 1999 R. Kimball and M. Ross. The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling. John Wiley & Sons, third edition, / 33
Data Integration and ETL Process
Data Integration and ETL Process Krzysztof Dembczyński Institute of Computing Science Laboratory of Intelligent Decision Support Systems Politechnika Poznańska (Poznań University of Technology) Software
More informationChapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya
Chapter 6 Basics of Data Integration Fundamentals of Business Analytics Learning Objectives and Learning Outcomes Learning Objectives 1. Concepts of data integration 2. Needs and advantages of using data
More informationExtraction Transformation Loading ETL Get data out of sources and load into the DW
Lection 5 ETL Definition Extraction Transformation Loading ETL Get data out of sources and load into the DW Data is extracted from OLTP database, transformed to match the DW schema and loaded into the
More informationCOURSE 20463C: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER
Page 1 of 8 ABOUT THIS COURSE This 5 day course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse with Microsoft SQL Server
More informationImplementing a Data Warehouse with Microsoft SQL Server
Page 1 of 7 Overview This course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse with Microsoft SQL 2014, implement ETL
More informationLITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES
LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES MUHAMMAD KHALEEL (0912125) SZABIST KARACHI CAMPUS Abstract. Data warehouse and online analytical processing (OLAP) both are core component for decision
More informationMETA DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 asistithod@gmail.com
More informationImplementing a Data Warehouse with Microsoft SQL Server
This course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse 2014, implement ETL with SQL Server Integration Services, and
More informationImplement a Data Warehouse with Microsoft SQL Server 20463C; 5 days
Lincoln Land Community College Capital City Training Center 130 West Mason Springfield, IL 62702 217-782-7436 www.llcc.edu/cctc Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days Course
More informationImplementing a Data Warehouse with Microsoft SQL Server MOC 20463
Implementing a Data Warehouse with Microsoft SQL Server MOC 20463 Course Outline Module 1: Introduction to Data Warehousing This module provides an introduction to the key components of a data warehousing
More informationCOURSE OUTLINE MOC 20463: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER
COURSE OUTLINE MOC 20463: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER MODULE 1: INTRODUCTION TO DATA WAREHOUSING This module provides an introduction to the key components of a data warehousing
More informationImplementing a Data Warehouse with Microsoft SQL Server 2014
Implementing a Data Warehouse with Microsoft SQL Server 2014 MOC 20463 Duración: 25 horas Introducción This course describes how to implement a data warehouse platform to support a BI solution. Students
More informationSQL Server 2012 Business Intelligence Boot Camp
SQL Server 2012 Business Intelligence Boot Camp Length: 5 Days Technology: Microsoft SQL Server 2012 Delivery Method: Instructor-led (classroom) About this Course Data warehousing is a solution organizations
More informationOLAP Systems and Multidimensional Expressions I
OLAP Systems and Multidimensional Expressions I Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master
More informationData Warehousing Systems: Foundations and Architectures
Data Warehousing Systems: Foundations and Architectures Il-Yeol Song Drexel University, http://www.ischool.drexel.edu/faculty/song/ SYNONYMS None DEFINITION A data warehouse (DW) is an integrated repository
More informationIMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
More informationA Design and implementation of a data warehouse for research administration universities
A Design and implementation of a data warehouse for research administration universities André Flory 1, Pierre Soupirot 2, and Anne Tchounikine 3 1 CRI : Centre de Ressources Informatiques INSA de Lyon
More informationData Warehousing. Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de. Winter 2015/16. Jens Teubner Data Warehousing Winter 2015/16 1
Jens Teubner Data Warehousing Winter 2015/16 1 Data Warehousing Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de Winter 2015/16 Jens Teubner Data Warehousing Winter 2015/16 13 Part II Overview
More informationDATA WAREHOUSING AND OLAP TECHNOLOGY
DATA WAREHOUSING AND OLAP TECHNOLOGY Manya Sethi MCA Final Year Amity University, Uttar Pradesh Under Guidance of Ms. Shruti Nagpal Abstract DATA WAREHOUSING and Online Analytical Processing (OLAP) are
More informationA Simplified Framework for Data Cleaning and Information Retrieval in Multiple Data Source Problems
A Simplified Framework for Data Cleaning and Information Retrieval in Multiple Data Source Problems Agusthiyar.R, 1, Dr. K. Narashiman 2 Assistant Professor (Sr.G), Department of Computer Applications,
More informationImplementing a Data Warehouse with Microsoft SQL Server 2012
Implementing a Data Warehouse with Microsoft SQL Server 2012 Module 1: Introduction to Data Warehousing Describe data warehouse concepts and architecture considerations Considerations for a Data Warehouse
More informationIBM WebSphere DataStage Online training from Yes-M Systems
Yes-M Systems offers the unique opportunity to aspiring fresher s and experienced professionals to get real time experience in ETL Data warehouse tool IBM DataStage. Course Description With this training
More informationImplementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777
Implementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777 Course Outline Module 1: Introduction to Data Warehousing This module provides an introduction to the key components of a data warehousing
More informationImplementing a Data Warehouse with Microsoft SQL Server
Course Code: M20463 Vendor: Microsoft Course Overview Duration: 5 RRP: 2,025 Implementing a Data Warehouse with Microsoft SQL Server Overview This course describes how to implement a data warehouse platform
More informationData Warehousing and Data Mining
Data Warehousing and Data Mining Part I: Data Warehousing Gao Cong gaocong@cs.aau.dk Slides adapted from Man Lung Yiu and Torben Bach Pedersen Course Structure Business intelligence: Extract knowledge
More informationCourse Outline: Course: Implementing a Data Warehouse with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning
Course Outline: Course: Implementing a Data with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning Duration: 5.00 Day(s)/ 40 hrs Overview: This 5-day instructor-led course describes
More informationCourse 20463:Implementing a Data Warehouse with Microsoft SQL Server
Course 20463:Implementing a Data Warehouse with Microsoft SQL Server Type:Course Audience(s):IT Professionals Technology:Microsoft SQL Server Level:300 This Revision:C Delivery method: Instructor-led (classroom)
More informationCourse Outline. Module 1: Introduction to Data Warehousing
Course Outline Module 1: Introduction to Data Warehousing This module provides an introduction to the key components of a data warehousing solution and the highlevel considerations you must take into account
More informationMicrosoft. Course 20463C: Implementing a Data Warehouse with Microsoft SQL Server
Course 20463C: Implementing a Data Warehouse with Microsoft SQL Server Length : 5 Days Audience(s) : IT Professionals Level : 300 Technology : Microsoft SQL Server 2014 Delivery Method : Instructor-led
More informationImplementing a Data Warehouse with Microsoft SQL Server
CÔNG TY CỔ PHẦN TRƯỜNG CNTT TÂN ĐỨC TAN DUC INFORMATION TECHNOLOGY SCHOOL JSC LEARN MORE WITH LESS! Course 20463 Implementing a Data Warehouse with Microsoft SQL Server Length: 5 Days Audience: IT Professionals
More informationImplementing a SQL Data Warehouse 2016
Implementing a SQL Data Warehouse 2016 http://www.homnick.com marketing@homnick.com +1.561.988.0567 Boca Raton, Fl USA About this course This 4-day instructor led course describes how to implement a data
More informationEnterprise Solutions. Data Warehouse & Business Intelligence Chapter-8
Enterprise Solutions Data Warehouse & Business Intelligence Chapter-8 Learning Objectives Concepts of Data Warehouse Business Intelligence, Analytics & Big Data Tools for DWH & BI Concepts of Data Warehouse
More informationWould-be system and database administrators. PREREQUISITES: At least 6 months experience with a Windows operating system.
DBA Fundamentals COURSE CODE: COURSE TITLE: AUDIENCE: SQSDBA SQL Server 2008/2008 R2 DBA Fundamentals Would-be system and database administrators. PREREQUISITES: At least 6 months experience with a Windows
More informationCourse 10777A: Implementing a Data Warehouse with Microsoft SQL Server 2012
Course 10777A: Implementing a Data Warehouse with Microsoft SQL Server 2012 OVERVIEW About this Course Data warehousing is a solution organizations use to centralize business data for reporting and analysis.
More informationImplementing a Data Warehouse with Microsoft SQL Server 2012
Course 10777A: Implementing a Data Warehouse with Microsoft SQL Server 2012 Length: Audience(s): 5 Days Level: 200 IT Professionals Technology: Microsoft SQL Server 2012 Type: Delivery Method: Course Instructor-led
More informationMDM and Data Warehousing Complement Each Other
Master Management MDM and Warehousing Complement Each Other Greater business value from both 2011 IBM Corporation Executive Summary Master Management (MDM) and Warehousing (DW) complement each other There
More informationData Warehousing. Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de. Winter 2014/15. Jens Teubner Data Warehousing Winter 2014/15 1
Jens Teubner Data Warehousing Winter 2014/15 1 Data Warehousing Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de Winter 2014/15 Jens Teubner Data Warehousing Winter 2014/15 152 Part VI ETL Process
More informationImplementing a Data Warehouse with Microsoft SQL Server 2012 (70-463)
Implementing a Data Warehouse with Microsoft SQL Server 2012 (70-463) Course Description Data warehousing is a solution organizations use to centralize business data for reporting and analysis. This five-day
More informationSAS BI Course Content; Introduction to DWH / BI Concepts
SAS BI Course Content; Introduction to DWH / BI Concepts SAS Web Report Studio 4.2 SAS EG 4.2 SAS Information Delivery Portal 4.2 SAS Data Integration Studio 4.2 SAS BI Dashboard 4.2 SAS Management Console
More informationData Warehouse Design
Data Warehouse Design Modern Principles and Methodologies Matteo Golfarelli Stefano Rizzi Translated by Claudio Pagliarani Mc Grauu Hill New York Chicago San Francisco Lisbon London Madrid Mexico City
More informationImplementing a Data Warehouse with Microsoft SQL Server 2012
Course 10777 : Implementing a Data Warehouse with Microsoft SQL Server 2012 Page 1 of 8 Implementing a Data Warehouse with Microsoft SQL Server 2012 Course 10777: 4 days; Instructor-Led Introduction Data
More informationImplementing a Data Warehouse with Microsoft SQL Server 2012
Implementing a Data Warehouse with Microsoft SQL Server 2012 Course ID MSS300 Course Description Ace your preparation for Microsoft Certification Exam 70-463 with this course Maximize your performance
More informationData Integration with Talend Open Studio Robert A. Nisbet, Ph.D.
Data Integration with Talend Open Studio Robert A. Nisbet, Ph.D. Most college courses in statistical analysis and data mining are focus on the mathematical techniques for analyzing data structures, rather
More informationCSPP 53017: Data Warehousing Winter 2013" Lecture 6" Svetlozar Nestorov" " Class News
CSPP 53017: Data Warehousing Winter 2013 Lecture 6 Svetlozar Nestorov Class News Homework 4 is online Due by Tuesday, Feb 26. Second 15 minute in-class quiz today at 6:30pm Open book/notes Last 15 minute
More informationData Warehouse: Introduction
Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of base and data mining group,
More informationCourse: SAS BI(business intelligence) and DI(Data integration)training - Training Duration: 30 + Days. Take Away:
Course: SAS BI(business intelligence) and DI(Data integration)training - Training Duration: 30 + Days Take Away: Class notes and Books, Data warehousing concept Assignments for practice Interview questions,
More informationCourse 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
More informationTHE DATA WAREHOUSE ETL TOOLKIT CDT803 Three Days
Three Days Prerequisites Students should have at least some experience with any relational database management system. Who Should Attend This course is targeted at technical staff, team leaders and project
More informationEast Asia Network Sdn Bhd
Course: Analyzing, Designing, and Implementing a Data Warehouse with Microsoft SQL Server 2014 Elements of this syllabus may be change to cater to the participants background & knowledge. This course describes
More informationData warehouses. Data Mining. Abraham Otero. Data Mining. Agenda
Data warehouses 1/36 Agenda Why do I need a data warehouse? ETL systems Real-Time Data Warehousing Open problems 2/36 1 Why do I need a data warehouse? Why do I need a data warehouse? Maybe you do not
More informationCopyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1
Slide 29-1 Chapter 29 Overview of Data Warehousing and OLAP Chapter 29 Outline Purpose of Data Warehousing Introduction, Definitions, and Terminology Comparison with Traditional Databases Characteristics
More informationThe Data Warehouse ETL Toolkit
2008 AGI-Information Management Consultants May be used for personal purporses only or by libraries associated to dandelon.com network. The Data Warehouse ETL Toolkit Practical Techniques for Extracting,
More informationNear Real-time Data Warehousing with Multi-stage Trickle & Flip
Near Real-time Data Warehousing with Multi-stage Trickle & Flip Janis Zuters University of Latvia, 19 Raina blvd., LV-1586 Riga, Latvia janis.zuters@lu.lv Abstract. A data warehouse typically is a collection
More informationChapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
More informationOverview. DW Source Integration, Tools, and Architecture. End User Applications (EUA) EUA Concepts. DW Front End Tools. Source Integration
DW Source Integration, Tools, and Architecture Overview DW Front End Tools Source Integration DW architecture Original slides were written by Torben Bach Pedersen Aalborg University 2007 - DWML course
More informationETL Tools. L. Libkin 1 Data Integration and Exchange
ETL Tools ETL = Extract Transform Load Typically: data integration software for building data warehouse Pull large volumes of data from different sources, in different formats, restructure them and load
More informationIndexing Techniques for Data Warehouses Queries. Abstract
Indexing Techniques for Data Warehouses Queries Sirirut Vanichayobon Le Gruenwald The University of Oklahoma School of Computer Science Norman, OK, 739 sirirut@cs.ou.edu gruenwal@cs.ou.edu Abstract Recently,
More informationETL-EXTRACT, TRANSFORM & LOAD TESTING
ETL-EXTRACT, TRANSFORM & LOAD TESTING Rajesh Popli Manager (Quality), Nagarro Software Pvt. Ltd., Gurgaon, INDIA rajesh.popli@nagarro.com ABSTRACT Data is most important part in any organization. Data
More informationETL Overview. Extract, Transform, Load (ETL) Refreshment Workflow. The ETL Process. General ETL issues. MS Integration Services
ETL Overview Extract, Transform, Load (ETL) General ETL issues ETL/DW refreshment process Building dimensions Building fact tables Extract Transformations/cleansing Load MS Integration Services Original
More informationAn Introduction to Data Warehousing. An organization manages information in two dominant forms: operational systems of
An Introduction to Data Warehousing An organization manages information in two dominant forms: operational systems of record and data warehouses. Operational systems are designed to support online transaction
More informationBUILDING OLAP TOOLS OVER LARGE DATABASES
BUILDING OLAP TOOLS OVER LARGE DATABASES Rui Oliveira, Jorge Bernardino ISEC Instituto Superior de Engenharia de Coimbra, Polytechnic Institute of Coimbra Quinta da Nora, Rua Pedro Nunes, P-3030-199 Coimbra,
More informationSSIS Training: Introduction to SQL Server Integration Services Duration: 3 days
SSIS Training: Introduction to SQL Server Integration Services Duration: 3 days SSIS Training Prerequisites All SSIS training attendees should have prior experience working with SQL Server. Hands-on/Lecture
More informationSQL Server Administrator Introduction - 3 Days Objectives
SQL Server Administrator Introduction - 3 Days INTRODUCTION TO MICROSOFT SQL SERVER Exploring the components of SQL Server Identifying SQL Server administration tasks INSTALLING SQL SERVER Identifying
More informationMicrosoft Data Warehouse in Depth
Microsoft Data Warehouse in Depth 1 P a g e Duration What s new Why attend Who should attend Course format and prerequisites 4 days The course materials have been refreshed to align with the second edition
More informationData-Warehouse-, Data-Mining- und OLAP-Technologien
Data-Warehouse-, Data-Mining- und OLAP-Technologien Chapter 2: Data Warehouse Architecture Bernhard Mitschang Universität Stuttgart Winter Term 2014/2015 Overview Data Warehouse Architecture Data Sources
More informationwww.ijreat.org Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 28
Data Warehousing - Essential Element To Support Decision- Making Process In Industries Ashima Bhasin 1, Mr Manoj Kumar 2 1 Computer Science Engineering Department, 2 Associate Professor, CSE Abstract SGT
More informationOptimizing the Performance of the Oracle BI Applications using Oracle Datawarehousing Features and Oracle DAC 10.1.3.4.1
Optimizing the Performance of the Oracle BI Applications using Oracle Datawarehousing Features and Oracle DAC 10.1.3.4.1 Mark Rittman, Director, Rittman Mead Consulting for Collaborate 09, Florida, USA,
More informationBeta: Implementing a Data Warehouse with Microsoft SQL Server 2012
CÔNG TY CỔ PHẦN TRƯỜNG CNTT TÂN ĐỨC TAN DUC INFORMATION TECHNOLOGY SCHOOL JSC LEARN MORE WITH LESS! Course 10777: Beta: Implementing a Data Warehouse with Microsoft SQL Server 2012 Length: 5 Days Audience:
More informationSubject Description Form
Subject Description Form Subject Code Subject Title COMP417 Data Warehousing and Data Mining Techniques in Business and Commerce Credit Value 3 Level 4 Pre-requisite / Co-requisite/ Exclusion Objectives
More informationCustomer-Centric Data Warehouse, design issues. Announcements
CRM Data Warehouse Announcements Assignment 2 is on the subject web site Students must form assignment groups ASAP: refer to the assignment for details 2 -Centric Data Warehouse, design issues Data modelling
More informationFor Sales Kathy Hall 402-963-4466 khall@it4e.com
IT4E Schedule 13939 Gold Circle Omaha NE 68144 402-431-5432 Course Number 10777 For Sales Chris Reynolds 402-963-4465 creynolds@it4e.com www.it4e.com For Sales Kathy Hall 402-963-4466 khall@it4e.com Course
More informationBusiness Intelligence: Effective Decision Making
Business Intelligence: Effective Decision Making Bellevue College Linda Rumans IT Instructor, Business Division Bellevue College lrumans@bellevuecollege.edu Current Status What do I do??? How do I increase
More informationData warehouse Architectures and processes
Database and data mining group, Data warehouse Architectures and processes DATA WAREHOUSE: ARCHITECTURES AND PROCESSES - 1 Database and data mining group, Data warehouse architectures Separation between
More informationChapter 5. Warehousing, Data Acquisition, Data. Visualization
Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization 5-1 Learning Objectives
More informationSpeeding ETL Processing in Data Warehouses White Paper
Speeding ETL Processing in Data Warehouses White Paper 020607dmxwpADM High-Performance Aggregations and Joins for Faster Data Warehouse Processing Data Processing Challenges... 1 Joins and Aggregates are
More informationChapter 3 - Data Replication and Materialized Integration
Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 3 - Data Replication and Materialized Integration Motivation Replication:
More informationDimensional Modeling for Data Warehouse
Modeling for Data Warehouse Umashanker Sharma, Anjana Gosain GGS, Indraprastha University, Delhi Abstract Many surveys indicate that a significant percentage of DWs fail to meet business objectives or
More informationSENG 520, Experience with a high-level programming language. (304) 579-7726, Jeff.Edgell@comcast.net
Course : Semester : Course Format And Credit hours : Prerequisites : Data Warehousing and Business Intelligence Summer (Odd Years) online 3 hr Credit SENG 520, Experience with a high-level programming
More informationWhat is Data Virtualization? Rick F. van der Lans, R20/Consultancy
What is Data Virtualization? by Rick F. van der Lans, R20/Consultancy August 2011 Introduction Data virtualization is receiving more and more attention in the IT industry, especially from those interested
More informationBuilding a Data Warehouse
Building a Data Warehouse With Examples in SQL Server EiD Vincent Rainardi BROCHSCHULE LIECHTENSTEIN Bibliothek Apress Contents About the Author. ; xiij Preface xv ^CHAPTER 1 Introduction to Data Warehousing
More informationData Warehouses and Business Intelligence ITP 487 (3 Units) Fall 2013. Objective
Data Warehouses and Business Intelligence ITP 487 (3 Units) Objective Fall 2013 While the increased capacity and availability of data gathering and storage systems have allowed enterprises to store more
More informationData Warehousing. Overview, Terminology, and Research Issues. Joachim Hammer. Joachim Hammer
Data Warehousing Overview, Terminology, and Research Issues 1 Heterogeneous Database Integration Integration System World Wide Web Digital Libraries Scientific Databases Personal Databases Collects and
More information1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing
1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing 2. What is a Data warehouse a. A database application
More informationData Warehousing with Oracle
Data Warehousing with Oracle Comprehensive Concepts Overview, Insight, Recommendations, Best Practices and a whole lot more. By Tariq Farooq A BrainSurface Presentation What is a Data Warehouse? Designed
More informationDatawarehousing and Analytics. Data-Warehouse-, Data-Mining- und OLAP-Technologien. Advanced Information Management
Anwendersoftware a Datawarehousing and Analytics Data-Warehouse-, Data-Mining- und OLAP-Technologien Advanced Information Management Bernhard Mitschang, Holger Schwarz Universität Stuttgart Winter Term
More informationEstablish and maintain Center of Excellence (CoE) around Data Architecture
Senior BI Data Architect - Bensenville, IL The Company s Information Management Team is comprised of highly technical resources with diverse backgrounds in data warehouse development & support, business
More informationMIS636 AWS Data Warehousing and Business Intelligence Course Syllabus
MIS636 AWS Data Warehousing and Business Intelligence Course Syllabus I. Contact Information Professor: Joseph Morabito, Ph.D. Office: Babbio 419 Office Hours: By Appt. Phone: 201-216-5304 Email: jmorabit@stevens.edu
More informationA Critical Review of Data Warehouse
Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 95-103 Research India Publications http://www.ripublication.com A Critical Review of Data Warehouse Sachin
More informationHigh-Volume Data Warehousing in Centerprise. Product Datasheet
High-Volume Data Warehousing in Centerprise Product Datasheet Table of Contents Overview 3 Data Complexity 3 Data Quality 3 Speed and Scalability 3 Centerprise Data Warehouse Features 4 ETL in a Unified
More informationSQL Server 2005. Introduction to SQL Server 2005. SQL Server 2005 basic tools. SQL Server Configuration Manager. SQL Server services management
Database and data mining group, SQL Server 2005 Introduction to SQL Server 2005 Introduction to SQL Server 2005-1 Database and data mining group, SQL Server 2005 basic tools SQL Server Configuration Manager
More informationWhat's New in SAS Data Management
Paper SAS034-2014 What's New in SAS Data Management Nancy Rausch, SAS Institute Inc., Cary, NC; Mike Frost, SAS Institute Inc., Cary, NC, Mike Ames, SAS Institute Inc., Cary ABSTRACT The latest releases
More information5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2
Class Announcements TIM 50 - Business Information Systems Lecture 15 Database Assignment 2 posted Due Tuesday 5/26 UC Santa Cruz May 19, 2015 Database: Collection of related files containing records on
More informationAn Overview of Data Warehousing, Data mining, OLAP and OLTP Technologies
An Overview of Data Warehousing, Data mining, OLAP and OLTP Technologies Ashish Gahlot, Manoj Yadav Dronacharya college of engineering Farrukhnagar, Gurgaon,Haryana Abstract- Data warehousing, Data Mining,
More informationDeductive Data Warehouses and Aggregate (Derived) Tables
Deductive Data Warehouses and Aggregate (Derived) Tables Kornelije Rabuzin, Mirko Malekovic, Mirko Cubrilo Faculty of Organization and Informatics University of Zagreb Varazdin, Croatia {kornelije.rabuzin,
More informationCollege of Engineering, Technology, and Computer Science
College of Engineering, Technology, and Computer Science Design and Implementation of Cloud-based Data Warehousing In partial fulfillment of the requirements for the Degree of Master of Science in Technology
More informationBussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University
Bussiness Intelligence and Data Warehouse Schedule Bussiness Intelligence (BI) BI tools Oracle vs. Microsoft Data warehouse History Tools Oracle vs. Others Discussion Business Intelligence (BI) Products
More informationIntroduction to Datawarehousing
DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI Master of Science in Engineering in Computer Science (MSE-CS) Seminars in Software and Services for the Information Society
More informationMS 20467: Designing Business Intelligence Solutions with Microsoft SQL Server 2012
MS 20467: Designing Business Intelligence Solutions with Microsoft SQL Server 2012 Description: This five-day instructor-led course teaches students how to design and implement a BI infrastructure. The
More informationCúram Business Intelligence Reporting Developer Guide
IBM Cúram Social Program Management Cúram Business Intelligence Reporting Developer Guide Version 6.0.5 IBM Cúram Social Program Management Cúram Business Intelligence Reporting Developer Guide Version
More informationSizing Logical Data in a Data Warehouse A Consistent and Auditable Approach
2006 ISMA Conference 1 Sizing Logical Data in a Data Warehouse A Consistent and Auditable Approach Priya Lobo CFPS Satyam Computer Services Ltd. 69, Railway Parallel Road, Kumarapark West, Bangalore 560020,
More information