Data Vault and The Truth about the Enterprise Data Warehouse
Roelant Vos, Brisbane, Australia
Introduction

More often than not, when discussions about data modeling and information architecture move towards the Enterprise Data Warehouse (EDW), heated debates break out and alternative solutions are proposed, supported by claims that these alternatives are quicker and easier to develop than the cumbersome EDW while also delivering value to the business in a more direct way. Apparently, the EDW has a bad image: one associated with long development times, high complexity, difficult maintenance and a general falling short on promises.

It is true that in the past many EDW projects have suffered from stagnation during the data integration (development) phase. These projects stalled in a cycle of changes to the ETL and data model design followed by renewed unit and acceptance testing. While stuck in this vicious cycle no usable information is presented to the business users, and as a result the Data Warehouse team paints an image of inability (often at high cost).

Why are things this way? One of the main reasons is that the Data Warehouse / Business Intelligence industry defines and works with architectures that do not (and cannot) live up to their promises. To date there are still many Data Warehouse specialists who argue that an EDW is always expensive, complex and monolithic, whereas the real message should be that an EDW is in fact driven by business cases, adaptive and quick to deliver. It is a disconcerting thought that the EDW is apparently not understood by the Data Warehouse industry itself.

The Enterprise Data Warehouse has a bad image, while in reality it is driven by business cases, adaptive and quick to deliver when designed properly and supported by an ETL Framework.
This white paper is the first in a series intended to convince the industry that this statement regarding the EDW is true, and that the EDW is a concept that is effective in any situation, negating the need for custom architectures which often turn information management into an unmanageable mess. This series of white papers will attempt to give the EDW the positive image it deserves and will lay out the approach and techniques to achieve the best results. The key element of the solution is the adoption of an architecture that differs fundamentally from the common approach to Data Warehousing and from what has been used to design Data Warehouses over the past 15 years. This architecture delivers, out of the box, flexibility in both design and implementation, true resilience and future-proofing (changes are a fact of life, especially in the central spot a Data Warehouse occupies), and it offers complete traceability of data. As a conceptual architecture the focus is not predominantly on technology, although technology will be addressed to some extent in the specifications of the ETL Framework.
The true Enterprise Data Warehouse

The complete EDW architecture is composed of various functions, techniques and concepts that support these ambitions. Together they define an EDW that is truly consistent, adaptable and future-proof, that delivers on traceability and, most importantly, works the same way in every scenario. When looking into the details of the architecture we will find that many common Data Warehouse concepts, such as normalisation or Type 2 stacking of historical changes, are still present, and for very good reasons. But it is the careful (re)positioning and application of these concepts in the (reference) architecture that makes the solution much more robust. It is here that the introduction of the Data Vault modeling technique, as defined by Dan Linstedt, makes a big difference. By positioning Data Vault as the core EDW layer, supported by a structured ETL Framework that defines the loading templates and control processes, the following results are achieved (some new, some already well established):

- Business rules (including managing data quality) become the responsibility of the business. One of the ambitions of the EDW is to enable the business to change these definitions without compromising the data or the design of the Data Warehouse. This means that the EDW will initially focus on managing data and applying Data Warehouse housekeeping properly before applying the business rules. The Data Vault technique distinguishes hard and soft business rules, which define when certain types of transformation logic must be implemented.
- ETL is structured, consistent and directly related to Data Warehouse concepts.
- Data Warehouse concepts are clearly separated (decoupled). This includes key distribution and management, history management, deriving data into information, and providing structure and context (i.e. hierarchies, dimensions).
- True decoupling of the applications that provide data (operational systems) and the Data Warehouse.
- Changes are a fact of life for the Data Warehouse, but proper design can significantly reduce the impact of these changes (and thus reduce maintenance).
- A complete audit trail, always. It must be possible to trace all data back to the source and to see how it was processed in the Data Warehouse. The purpose ranges from data validation (how did I get this number?) to strict compliance rules (i.e. Basel).
- Full scalability, which manifests itself in many ways. Due to the modular design the Data Vault is extremely flexible (yet easy and consistent) when it comes to adding or changing data sources. It also enables the Data Warehouse team to change the table structure when database size becomes an issue, introducing a degree of normalisation without impacting the Data Vault concepts. A properly managed Data Vault is indefinitely scalable.
- Flexibility in ETL scheduling. The ETL templates are designed to handle the most granular of intervals (near real-time), and it is ensured that ETL processes are defined as atomic steps that serve one designated purpose in the overall architecture. By introducing these requirements to ETL design the Data Warehouse team can structure workflows to suit every need, including parallelism in loading, changing loading intervals and even mixed workloads. This is all done without compromising the core model or losing information or flexibility.
Ultimately the goal of the EDW is to provide a meaningful reporting, analysis or data mining structure that Business Intelligence software can access. The main difference with previous EDW architectures is that this solution can accommodate all types of structures (i.e. star schemas, snowflakes, 3NF) and can even change the approach without losing data or incurring excessive maintenance overhead. Does this sound like too much to achieve in one go? It's not! All this can be achieved by combining the Data Vault with a fresh look at the role ETL processes should play in the Data Warehouse reference architecture. As stated before, it is all about redefining where well-known concepts should be applied, and letting go of some cherished ones.

Letting go of the Truth

One of the most persistent concepts that haunt Data Warehouses to date is the 'single version of the truth'. As the subtitle of this paper hints, it is worthwhile to rebrand this concept into the 'single version of the fact'. Deriving or calculating a truth is subjective and typically only really true for a certain recipient of the information, whereas facts can be used to create any truth depending on the recipient's point of view. The truth is subjective and, to make things worse, the perception of truth can change over time, rendering existing implementations difficult to maintain or even unusable. Data Warehouses which aim to deliver on this single version of the truth will find that, at some point in time, they cannot provide the information the users really need in a timely and cost-effective manner. Additionally, interpreting data early in the process effectively ruins the audit trail, which is a serious concern for the Data Warehouse team. There are many reasons that this view on Data Warehouse design is so firmly established in the industry. For a long period of time the Data Warehouse was deemed the best solution to bring together information from different silos of data (applications).
But as the business changed, the (IT) Data Warehouse could not keep up with the changing requirements, and it turned out that there are always valid reasons to have different views on the meaning of the information. That is fine, as long as you can trace the interpreted information back to the same base values. This idea alone is one of the fundamental changes compared to existing architectures. It essentially means that you need to maintain a pure, or raw, repository of the data as it was received, and re-roll the truth (the current definition) from time to time when required. The Data Vault has the perfect qualifications to be this repository. Letting go of the single version of the truth means that the Data Warehouse is no longer the integrated and pure collection of information for the enterprise; it is closer to being a large storage facility for the company data (as an asset!) so that the various truths (views of the business) can be created at any time and in any format (typically Data Marts). For once, the Data Warehouse can truly match the users' requirements.
Data Vault

Data Vault as a modeling technique is the perfect match for the new EDW architecture and provides the modular components and built-in audit trail to meet these ambitions. Due to the way Data Vault defines its entity types, the EDW becomes a fully standardised and asynchronous environment where data can be added and loaded at any time. If you want 100% of the data 100% of the time, Data Vault is the matching technique. Dan Linstedt defines the Data Vault as a detailed, historically oriented, uniquely linked set of normalised tables that support one or more areas of the business. In other words, it is a semi-normalised model where key distribution, relationships and descriptive data are separated. The historical data is denormalised to some extent, making Data Vault a hybrid approach to data modeling. However, it is not the difference in data modeling that makes Data Vault unique (similarities with the IBM industry models come to mind). It is the way Data Vault uses its archetypal entities to allow for a natural, almost organic, yet controlled expansion of the model (in a way the IBM industry models do not) that makes it such a powerful technique. The Data Vault defines three main types of entities: the Hub, Satellite and Link tables.

- Hub: contains the unique list of business keys and maintains the EDW key distribution.
- Satellite: historical descriptive data (about a Hub or Link). Much like a Type 2 dimension, this information is subject to change over time.
- Link: unique list of relationships between Hub keys. A Link can be relational or transactional (with a limited number of variations). As relationships, Link tables are always many-to-many; you have to be prepared for future changes.

An example of a simple Data Vault model is shown in the original paper (diagram not preserved in this transcription). This modeling approach has proven to be very flexible.
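The three entity types above can be sketched as physical tables. This is a minimal, illustrative sketch using sqlite3; all table and column names (the surrogate key, load timestamp and record source columns) are assumptions for the example and are not prescribed by the paper:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hub: unique list of business keys plus the generated EDW key
    CREATE TABLE hub_employee (
        employee_sk   INTEGER PRIMARY KEY,   -- Data Warehouse key
        employee_bk   TEXT NOT NULL UNIQUE,  -- business key from the source
        load_dts      TEXT NOT NULL,
        record_source TEXT NOT NULL
    );

    -- Satellite: descriptive attributes, historised like a Type 2 dimension
    CREATE TABLE sat_employee (
        employee_sk   INTEGER NOT NULL REFERENCES hub_employee (employee_sk),
        effective_dts TEXT NOT NULL,
        expiry_dts    TEXT,
        first_name    TEXT,
        last_name     TEXT,
        address       TEXT,
        PRIMARY KEY (employee_sk, effective_dts)
    );

    -- Link: many-to-many list of relationships between Hub keys
    CREATE TABLE lnk_employee_department (
        link_sk       INTEGER PRIMARY KEY,
        employee_sk   INTEGER NOT NULL REFERENCES hub_employee (employee_sk),
        department_sk INTEGER NOT NULL,
        load_dts      TEXT NOT NULL
    );
""")
```

Note how the descriptive, changing attributes live only in the Satellite, so adding a new source of descriptive data later means adding a new Satellite rather than altering the Hub.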
For instance, if different information about a product (defined as a Hub entity) is added, a separate Satellite table containing the new history can be created without impacting existing tables. Similarly, if an existing Satellite contains a single attribute with a high change rate, the Satellite can be split into separate, smaller Satellites (and merged if the situation is the other way around). This is very easy to implement, since the keys are maintained in the Hub table and not in the Satellite. Another example of the flexibility is in the Link table. You can define the Link as a single point of integration (a unique list of Hub keys) and have separate Link-Satellites for maintaining transaction or relationship information. Or you can add a new Link table for a different type of relationship.
A third example focuses on moving towards an interpretation of the data. Take the example where you want to create an integrated view of the employee data (Employee Hub) following specific business rules. This results in the definition of a subset of employees with certain qualities: a 'special employees' Hub. The relationship of which employee is a special employee is maintained in a new Link table.

Employee Hub:

  ID | Logical Key | Source Row ID

Employee Satellite (with raw address data):

  ID | First Name | Last Name | Address          | Effective Date | Expiry Date
  1  | John       | Doe       | 40 George St     | ...            | ...
  2  | John       | Doe       | 40 George Street | ...            | ...
  3  | Peter      | Smith     | 70 York St       | ...            | ...

After applying a 'match' business rule the following tables will be populated. In this case two records are merged into one using a specific match-and-survive algorithm. The result is a set of two employees instead of the original three.

Special Employee Hub (selected by business rules):

  ID | Logical Key | Source Row ID

The relationship between the original data and the cleansed data has to be maintained for the audit trail. This Link table shows the relationship between the original and the cleansed data:

Employees and Special Employees Link:

  ID | original employee ID | special employee ID

It is clear to see how these concepts deliver on the EDW promises made at the start of the document.
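The match-and-survive step above can be sketched in a few lines. This is a deliberately simplistic, assumed matching rule (exact name match, first record survives) purely to illustrate how two of the three raw records merge into one special employee while the Link rows preserve the audit trail back to the originals:

```python
# Raw employee records: (ID, first name, last name, address)
raw_employees = [
    (1, "John",  "Doe",   "40 George St"),
    (2, "John",  "Doe",   "40 George Street"),
    (3, "Peter", "Smith", "70 York St"),
]

def match_key(rec):
    # Assumed match rule: same first and last name.
    return (rec[1].lower(), rec[2].lower())

special_hub = {}  # match key -> special employee ID
link = []         # (original employee ID, special employee ID)

for rec in raw_employees:
    key = match_key(rec)
    if key not in special_hub:
        # Survive rule: the first record encountered defines the entry.
        special_hub[key] = len(special_hub) + 1
    link.append((rec[0], special_hub[key]))

print(len(special_hub))  # 2 special employees from 3 originals
print(link)              # [(1, 1), (2, 1), (3, 2)]
```

Because the interpretation lives in a new Hub and Link rather than overwriting the raw data, the business rule can be changed and re-run at any time without loss of the original records.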
Positioning and misconceptions

Data Vault is often mistaken for a complete EDW methodology and as such discarded for the wrong reasons. It does not replace existing techniques to prepare data for reporting (i.e. a star schema), nor does it solve issues regarding data staging and interfacing. Even Dan Linstedt himself does not claim that the Data Vault is the single solution for everything; a proper EDW typically requires some form of Presentation Layer. In other words: Data Vault is a modeling technique and not a methodology. The insights which Data Vault has delivered do, however, render elements of existing methodologies obsolete. This is one of the reasons that Bill Inmon (the father of the Data Warehouse) has stated that the Data Vault is the ideal technique for modeling the EDW as a component of the DW2.0 framework. A comparison must also be made with another well-known expert: Ralph Kimball, and dimensional modeling as a complete EDW solution as defined in the Lifecycle Toolkit. This extensive methodology defines concepts that are still very valuable today, and these can be used in conjunction with Data Vault in a very effective way. Implementing Data Vault does, however, replace certain concepts that would otherwise be implemented prior to creating a Star or Snowflake schema, with the advantage of making this a very easy and generic step in ETL. When positioning Data Vault in the Kimball architecture, the best way to think about it is to define the Data Vault as the Data Staging area, although it does more than that. In conclusion, all techniques and methodologies still have their place, and it is the way we organise, structure and apply them which defines the truly generic EDW architecture we aim for.
Patterns

Thanks to the modular approach, the Data Vault is full of deterministic patterns which can be used to shorten the time to market for the EDW. These patterns, when applied within a structured ETL Framework, cover the entire scope of ETL, leaving only the specific business rules to be defined manually, and even this can be automated to an extent. In this fact lies one of the truly powerful applications of the concept: consistent, structured and efficient ETL. The only requirement is that you have the target Data Vault model ready and work within a structured ETL Framework. This is true model-driven development for the Data Warehouse. Examples of identified ETL patterns using the Data Vault are:

- Hub: select a distinct set of only the attribute designated as business key (usually, but not always, the primary key of the source table). Do a lookup against the target Hub table; if the key does not exist for that source system, insert it and generate a new Data Warehouse key.
- Satellite: select everything from the source table and use the designated business key for a key lookup against the Hub table. The key value is then used to look up the history values in the Satellite. All attributes except the business keys are inserted if there is a change in values.
- Dimension (from Data Vault to Star Schema): join the Hub and Satellites and calculate the overlap in timelines, then join this to the Link and through to the other set of Hubs and Satellites. Since the keys are already distributed in the Hub table, the structure of the dimension can be changed at any time without losing history or requiring complex ETL.

These ETL patterns will be referred to in the following articles, where the required steps in the ETL Framework to deliver the EDW are explained in more detail. There are many fully functioning examples, demos and implementations available to show how this can be achieved.
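The Hub pattern in the list above is simple enough to sketch end to end. The following is an illustrative implementation using sqlite3; the source and Hub table names are assumptions made for the example. Note that the pattern is deliberately idempotent, which is exactly what makes it a safe, generatable template:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE src_product (product_code TEXT, name TEXT)")
con.execute("""CREATE TABLE hub_product (
                   product_sk   INTEGER PRIMARY KEY,  -- generated EDW key
                   product_code TEXT NOT NULL UNIQUE  -- business key
               )""")
con.executemany("INSERT INTO src_product VALUES (?, ?)",
                [("P1", "Widget"), ("P1", "Widget v2"), ("P2", "Gadget")])

def load_hub(con):
    # Hub pattern: select the distinct business keys from the source and
    # insert only those not yet present in the target Hub; the Data
    # Warehouse key is generated on insert, existing keys are untouched.
    con.execute("""
        INSERT INTO hub_product (product_code)
        SELECT DISTINCT s.product_code
        FROM src_product s
        WHERE NOT EXISTS (SELECT 1 FROM hub_product h
                          WHERE h.product_code = s.product_code)
    """)

load_hub(con)
load_hub(con)  # re-running is harmless: the pattern can run at any interval
print(con.execute("SELECT COUNT(*) FROM hub_product").fetchone()[0])  # 2
```

Because every Hub load follows this same shape, only the table and key names vary per source, which is what makes the ETL generatable from the model.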
The role of ETL

While not specified as part of Data Vault, ETL in the new EDW context should be split into the smallest units of work possible. In other words: the more, and smaller, the processes, the better. Smaller units of work also mean more generic ETL and therefore more options to generate the logic instead of developing it manually. This adds to the already effective EDW deployment. Single ETL processes should be self-sufficient and should be designed to run at any time, for any interval, without corrupting data or generally causing problems. It is also possible (and recommended) to step away from the traditional layer-based loading, where all ETL of a specific type must complete before the next batch can start. By organising workflows or batches in a more vertical way, true flexibility in terms of prioritising information and balancing load schedules can be achieved. This will be explained in detail in the next papers. Scheduling-wise, Data Vault as a modeling technique offers tremendous flexibility in (parallel) scheduling due to the early key distribution as well as the rules regarding the use of placeholder values.
(Yet) Another Framework?

No proper EDW implementation can be done without a solid ETL Framework which addresses operational / process metadata, error handling and definitions of what happens where in the ETL architecture. The implementation of an ETL Framework ties in closely with the EDW architecture. An example is error handling: in the Data Vault EDW, with its focus on storing data, no error handling should be implemented on the way into the Data Warehouse, because error handling is essentially a business rule. How the error handling concepts are used and implemented is one of the functions of the ETL Framework. Other functions include (but are not limited to):

- Specifying the ETL templates
- Defining how timelines are managed
- Integrating operational metadata
- Specifying file handling

The ETL Framework itself is an implementation approach for the EDW that links in with the fundamental concepts advocated by Data Vault.

A window on the future

It seems hardly likely that ETL work as we know it today will still exist in a couple of years. More sensible hybrid approaches to data modeling, increased insight into the necessary steps of data and information management, and improvements in the options (APIs) of ETL software will hide the complexities and shift the focus of the work to data modeling. ETL development will move towards either incorporating the identified patterns into the way tools move the data (using a semantic layer to translate the pattern into ETL logic), or brute-forcing the identified patterns into ETL processes following templates (ETL generation). Either way, the days of the pure ETL developer seem to be coming to an end. The exciting thing in both scenarios is that there is no limitation on the type of applications you can link to the EDW, and no lock-in to specific technologies to achieve this.
Right now it is already possible to generate the ETL processes once you understand your data and have defined your EDW model in Data Vault.

Conclusion

By embracing Data Vault, giving it the proper position in the EDW reference architecture and leveraging the deterministic ETL patterns within a solid ETL Framework, the EDW solution becomes more construction than architecture. In other words: once you have designed your model, the implementation is generic. Because this approach works in all situations, it will contribute to the inevitable shift in skills and focus by hiding more and more of the architecture inside a commonplace solution. By taking away the lengthy development times and the associated distrust from the business, this newly styled EDW will both serve the business and provide the future-proof, flexible and easy-to-maintain architecture that satisfies the architects and DBAs. This is the first in a series of articles that aim to give the Enterprise Data Warehouse the image it deserves.
Bringing agility to Business Intelligence Metadata as key to Agile Data Warehousing. 1 P a g e. www.analytixds.com
Bringing agility to Business Intelligence Metadata as key to Agile Data Warehousing 1 P a g e Table of Contents What is the key to agility in Data Warehousing?... 3 The need to address requirements completely....
Building an Effective Data Warehouse Architecture James Serra
Building an Effective Data Warehouse Architecture James Serra Global Sponsors: About Me Business Intelligence Consultant, in IT for 28 years Owner of Serra Consulting Services, specializing in end-to-end
IPL Service Definition - Master Data Management Service
IPL Proposal IPL Service Definition - Master Data Management Service Project: Date: 16th Dec 2014 Issue Number: Issue 1 Customer: Crown Commercial Service Page 1 of 7 IPL Information Processing Limited
Microsoft Data Warehouse in Depth
Microsoft Data Warehouse in Depth 1 P a g e Duration What s new Why attend Who should attend Course format and prerequisites 4 days The course materials have been refreshed to align with the second edition
Data warehouse and Business Intelligence Collateral
Data warehouse and Business Intelligence Collateral Page 1 of 12 DATA WAREHOUSE AND BUSINESS INTELLIGENCE COLLATERAL Brains for the corporate brawn: In the current scenario of the business world, the competition
SAS BI Course Content; Introduction to DWH / BI Concepts
SAS BI Course Content; Introduction to DWH / BI Concepts SAS Web Report Studio 4.2 SAS EG 4.2 SAS Information Delivery Portal 4.2 SAS Data Integration Studio 4.2 SAS BI Dashboard 4.2 SAS Management Console
Course 10777A: Implementing a Data Warehouse with Microsoft SQL Server 2012
Course 10777A: Implementing a Data Warehouse with Microsoft SQL Server 2012 OVERVIEW About this Course Data warehousing is a solution organizations use to centralize business data for reporting and analysis.
The Principles of the Business Data Lake
The Principles of the Business Data Lake The Business Data Lake Culture eats Strategy for Breakfast, so said Peter Drucker, elegantly making the point that the hardest thing to change in any organization
Task Manager. Task Management
Task Management ibpms Business Process Applications (BPAs) are the innovative, new class of Service Oriented Business Applications (SOBAs) that help businesses automate and simplify the management of missioncritical,
Data Management Roadmap
Data Management Roadmap A progressive approach towards building an Information Architecture strategy 1 Business and IT Drivers q Support for business agility and innovation q Faster time to market Improve
A Comprehensive Approach to Master Data Management Testing
A Comprehensive Approach to Master Data Management Testing Abstract Testing plays an important role in the SDLC of any Software Product. Testing is vital in Data Warehousing Projects because of the criticality
Understanding Data Warehousing. [by Alex Kriegel]
Understanding Data Warehousing 2008 [by Alex Kriegel] Things to Discuss Who Needs a Data Warehouse? OLTP vs. Data Warehouse Business Intelligence Industrial Landscape Which Data Warehouse: Bill Inmon vs.
Part 22. Data Warehousing
Part 22 Data Warehousing The Decision Support System (DSS) Tools to assist decision-making Used at all levels in the organization Sometimes focused on a single area Sometimes focused on a single problem
www.ijreat.org Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 28
Data Warehousing - Essential Element To Support Decision- Making Process In Industries Ashima Bhasin 1, Mr Manoj Kumar 2 1 Computer Science Engineering Department, 2 Associate Professor, CSE Abstract SGT
A Service-oriented Architecture for Business Intelligence
A Service-oriented Architecture for Business Intelligence Liya Wu 1, Gilad Barash 1, Claudio Bartolini 2 1 HP Software 2 HP Laboratories {[email protected]} Abstract Business intelligence is a business
A Tipping Point for Automation in the Data Warehouse. www.stonebranch.com
A Tipping Point for Automation in the Data Warehouse www.stonebranch.com Resolving the ETL Automation Problem The pressure on ETL Architects and Developers to utilize automation in the design and management
Presentation at 2006 DAMA / Wilshire Metadata Conference. John R. Friedrich, II, PhD [email protected]
Metadata Management Best Practices and Lessons Learned Presentation at 2006 DAMA / Wilshire Metadata Conference Denver, CO John R. Friedrich, II, PhD [email protected] Slide 1 of??? Outline
Big Data for Data Warehousing
A Point of View Big Data for Data Warehousing Introduction A Data Warehouse (DW) is a central repository of information created by integrating data from multiple disparate sources. DWs store current as
Implementing a Data Warehouse with Microsoft SQL Server 2012
Course 10777A: Implementing a Data Warehouse with Microsoft SQL Server 2012 Length: Audience(s): 5 Days Level: 200 IT Professionals Technology: Microsoft SQL Server 2012 Type: Delivery Method: Course Instructor-led
Data Warehousing with Oracle
Data Warehousing with Oracle Comprehensive Concepts Overview, Insight, Recommendations, Best Practices and a whole lot more. By Tariq Farooq A BrainSurface Presentation What is a Data Warehouse? Designed
Architecting an Industrial Sensor Data Platform for Big Data Analytics
Architecting an Industrial Sensor Data Platform for Big Data Analytics 1 Welcome For decades, organizations have been evolving best practices for IT (Information Technology) and OT (Operation Technology).
www.ducenit.com Analance Data Integration Technical Whitepaper
Analance Data Integration Technical Whitepaper Executive Summary Business Intelligence is a thriving discipline in the marvelous era of computing in which we live. It s the process of analyzing and exploring
Customer Insight Appliance. Enabling retailers to understand and serve their customer
Customer Insight Appliance Enabling retailers to understand and serve their customer Customer Insight Appliance Enabling retailers to understand and serve their customer. Technology has empowered today
CHAPTER SIX DATA. Business Intelligence. 2011 The McGraw-Hill Companies, All Rights Reserved
CHAPTER SIX DATA Business Intelligence 2011 The McGraw-Hill Companies, All Rights Reserved 2 CHAPTER OVERVIEW SECTION 6.1 Data, Information, Databases The Business Benefits of High-Quality Information
Evolving Data Warehouse Architectures
Evolving Data Warehouse Architectures In the Age of Big Data Philip Russom April 15, 2014 TDWI would like to thank the following companies for sponsoring the 2014 TDWI Best Practices research report: Evolving
IST722 Data Warehousing
IST722 Data Warehousing Components of the Data Warehouse Michael A. Fudge, Jr. Recall: Inmon s CIF The CIF is a reference architecture Understanding the Diagram The CIF is a reference architecture CIF
WHITEPAPER. Why Dependency Mapping is Critical for the Modern Data Center
WHITEPAPER Why Dependency Mapping is Critical for the Modern Data Center OVERVIEW The last decade has seen a profound shift in the way IT is delivered and consumed by organizations, triggered by new technologies
Implementing a Data Warehouse with Microsoft SQL Server 2012
Course 10777 : Implementing a Data Warehouse with Microsoft SQL Server 2012 Page 1 of 8 Implementing a Data Warehouse with Microsoft SQL Server 2012 Course 10777: 4 days; Instructor-Led Introduction Data
Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence
Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence Appliances and DW Architectures John O Brien President and Executive Architect Zukeran Technologies 1 TDWI 1 Agenda What
The Data Warehouse ETL Toolkit
2008 AGI-Information Management Consultants May be used for personal purporses only or by libraries associated to dandelon.com network. The Data Warehouse ETL Toolkit Practical Techniques for Extracting,
Big Data Integration: A Buyer's Guide
SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology
B. 3 essay questions. Samples of potential questions are available in part IV. This list is not exhaustive it is just a sample.
IS482/682 Information for First Test I. What is the structure of the test? A. 20-25 multiple-choice questions. B. 3 essay questions. Samples of potential questions are available in part IV. This list is
Data Warehouse design
Data Warehouse design Design of Enterprise Systems University of Pavia 11/11/2013-1- Data Warehouse design DATA MODELLING - 2- Data Modelling Important premise Data warehouses typically reside on a RDBMS
Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole
Paper BB-01 Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole ABSTRACT Stephen Overton, Overton Technologies, LLC, Raleigh, NC Business information can be consumed many
Dimensional Modeling for Data Warehouse
Modeling for Data Warehouse Umashanker Sharma, Anjana Gosain GGS, Indraprastha University, Delhi Abstract Many surveys indicate that a significant percentage of DWs fail to meet business objectives or
www.sryas.com Analance Data Integration Technical Whitepaper
Analance Data Integration Technical Whitepaper Executive Summary Business Intelligence is a thriving discipline in the marvelous era of computing in which we live. It s the process of analyzing and exploring
Cloud application services (SaaS) Multi-Tenant Data Architecture Shailesh Paliwal Infosys Technologies Limited
Cloud application services (SaaS) Multi-Tenant Data Architecture Shailesh Paliwal Infosys Technologies Limited The paper starts with a generic discussion on the cloud application services and security
Gradient An EII Solution From Infosys
Gradient An EII Solution From Infosys Keywords: Grid, Enterprise Integration, EII Introduction New arrays of business are emerging that require cross-functional data in near real-time. Examples of such
Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
Data Modeling in the Age of Big Data
Data Modeling in the Age of Big Data Pete Stiglich Pete Stiglich is a principal at Clarity Solution Group. [email protected] Abstract With big data adoption accelerating and strong interest in NoSQL
Metadata Management for Data Warehouse Projects
Metadata Management for Data Warehouse Projects Stefano Cazzella Datamat S.p.A. [email protected] Abstract Metadata management has been identified as one of the major critical success factor
Master Data Management
Master Data Management Managing Data as an Asset By Bandish Gupta Consultant CIBER Global Enterprise Integration Practice Abstract: Organizations used to depend on business practices to differentiate them
Evaluating Data Warehousing Methodologies: Objectives and Criteria
Evaluating Data Warehousing Methodologies: Objectives and Criteria by Dr. James Thomann and David L. Wells With each new technical discipline, Information Technology (IT) practitioners seek guidance for
THE DATA WAREHOUSE ETL TOOLKIT CDT803 Three Days
Three Days Prerequisites Students should have at least some experience with any relational database management system. Who Should Attend This course is targeted at technical staff, team leaders and project
COURSE SYLLABUS COURSE TITLE:
1 COURSE SYLLABUS COURSE TITLE: FORMAT: CERTIFICATION EXAMS: 55043AC Microsoft End to End Business Intelligence Boot Camp Instructor-led None This course syllabus should be used to determine whether the
