Data Quality in Data warehouse: problems and solution
|
|
|
- Leon Thomas
- 10 years ago
- Views:
Transcription
1 IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: , p- ISSN: Volume 16, Issue 1, Ver. IV (Jan. 2014), PP Data Quality in Data warehouse: problems and solution *Rahul Kumar Pandey, PhD Scholar, Surguja university (Chhattisgarh),India Abstract: In recent years, corporate scandals, regulatory changes, and the collapse of major financial institutions have brought much warranted attention to the quality of enterprise data if we can better understand the problems of quality issues, then we can develop a plan of action to address the problem that is both proactive and strategic. Each instance of a quality issue presents challenges in both identifying where problems exist and in quantifying the extent of the problems. Quantifying the issues is important in order to determine where our efforts should be focused. It is reported that more than $2 billion of U.S. federal loan money had been lost because of poor data quality at a single agency. It also reported that manufacturing companies spent over 25% of their sales on wasteful practices. Over the period of time many researchers have contributed to the data quality issues, but no research has collectively gathered all the causes of data quality problems at all the phases of data warehousing along with their possible solution. problems in different phase of data warehouse i.e.; data sources, data integration & data profiling, Data staging and ETL, data warehouse modeling & schema design are discussed in this paper. The purpose of the paper is to identify the reasons for data deficiencies, non-availability or reach ability problems at all the aforementioned stages of data warehousing and to give some classification of these causes as well as solution for improving data quality through Statistical Process Control (SPC),Quality engineering management. etc I have identified possible set of causes of data quality issues from the extensive literature review and with consultation of the data warehouse practitioners working in renowned IT company on India. I hope this will help developers & Implementers of warehouse to examine and analyze these issues before moving ahead for data integration and data warehouse solutions for quality decision oriented and business intelligence oriented applications. Keywords :Data Quality (DQ), Statistical Process Control (SPC),ETL, Data Staging, Data Warehouse I. Data Warehouse: Data Warehouse (DW) is a collection of technologies aimed at enabling the knowledge worker (executive,manager,analyst, etc) to make better and faster decisions. Many researchers and practitioners share that a data warehouse architecture can be formally understood as layers of materialized views on top of each other. A data warehouse architecture exhibits various layers of data in which data from one layer are derived from data of the lower layer. As defined by the father of data warehouse, William H.Inmon, a data warehouse is a collection of Integrated, Subject-Oriented, Non Volatile and Time Variant databases where each unit of data is specific to some period of time. Data Warehouses can contain detailed data, lightly summarized data and highly summarized data, all formatted for analysis and decision support (Inmon, 1996). In the Data Warehouse Toolkit, Ralph Kimball gives a more concise definition: a copy of transaction data specifically structured for query and analysis (Kimball, 1998). Both definitions stress the data warehouse s analysis focus, and highlight the historical nature of the data found in a data warehouse. fig: five stages of data warehouse 18 Page
2 Ensuring Data Quality Data quality is an increasingly serious issue for organizations large and small. It is central to all data integration initiatives. Before data can be used effectively in a data warehouse, or in customer relationship management,enterprise resource planning or business analytics applications, It need to be analyzed and cleansed.understanding the key data quality dimensions is the first step to data quality improvement. To be processable and interpretable in an effective and efficient manner, data has to satisfy a set of quality criteria. Data satisfying those quality criteria is said to be of high quality. Abundant attempts have been made to define data quality and to identify its dimensions. Dimensions of data quality typically include accuracy,reliability,importance,consistency,precision,timeliness,fineness,understandability, conciseness and usefulness. For this research paper I have under taken the quality criteria by taking 6 key dimensions as Completeness Consistency Validity Conformity Accuracy Integrity Completeness: deals with to ensure is all the requisite information available? Are some data values missing, or in an unusable state? Consistency: Do distinct occurrences of the same data instances agree with each other or provide conflicting information. Are values consistent across data sets? Validity: refers to the correctness and reasonableness of data Conformity: Are there expectations that data values conform to specified formats? If so, do all the values Accuracy: Do data objects accurately represent the real world values they are expected to model? Incorrect spellings of product or person names, addresses, and even untimely or not current data can impact operational and analytical applications. Integrity: What data is missing important relationship linkages? The inability to link related records together may actually introduce duplication Data Quality Issues: In order for the analyst to determine the scope of the underlying root causes of data quality issues and to plan the design the tools which can be used to address data quality issues, it is valuable to understand these common data quality issues. For the purpose of it the classification formed will be highly helpful to the data warehouse and data quality community. Data Quality Issues at Data Sources: The source system consists of all those 'transaction/production' raw data providers, from where the details are pulled out for making it suitable for Data Warehousing. All these data sources are having their own methods of storing data. Some of the data sources are cooperative and some might be non cooperative sources. Because of this diversity several reasons are present which may contribute to data quality problems, if not properly taken care of. A source that offers any kind of unsecured access can become unreliable-and ultimately contributing to poor data quality. Different data Sources have different kind of problems associated with it such as data from legacy data sources (e.g., mainframe-based COBOL programs) do not even have metadata that describe them. The sources of dirty data include data entry error by a human or computer system, data update error by a human or computer system. Part of the data comes from text files, Table1: causes of data quality at Data source Sr.No CAUSES OF DATA QUALITY PROBLEMS AT DATA SOURCES 1 Wrong information entered into source system 2 As time and proximity from the source increase, the chances for getting correct data decrease 3 In adequate knowledge of interdependencie among data sources incorporate DQ problems. 4 Inability to cope with ageing data contribute to data quality problems 5 Varying timeliness of data sources 6 Complex Data warehouse 7 Unexpected changes in source systems cause DQProblems 8 System fields designed to allow free forms (Field not having adequate length). 19 Page
3 9 Missing values in data sources 10 Additional columns 11 Use of different representation formats in data sources 12 Non-Compliance of data in data sources with the Standards 13 Failure to update all replicas of data causes DQ Problems. 14 Approximations or surrogates used in data 15 Different encoding formats (ASCII, EBCDIC,.) 16 Lack of business ownership, policy and planning of the entire enterprise data contribute to data quality problems. Fig:1 Types of data sources Organizations extract data from. Causes of Data Quality Issues at Data Profiling When possible candidate data sources are identified and finalized data profiling comes in play immediately. Data profiling is the examination and assessment of your source systems data quality,integrity and consistency sometimes also called as source systems analysis Table 2:causes of data quality at Data profiling Sr.No CAUSES OF DATA QUALITY PROBLEMS AT DATA PROFILING 1 Insufficient range and distribution of values or threshold analysis for required fields. 2 Unreliable and incomplete metadata of data source 3 UserGeneratedSQLqueriesforthedataprofilingpurpose leave the data quality problems. 4 Inability of evaluation of inconsistent business processes during data profiling cause data quality problems. 5 Inability of evaluation of data structure, data values and data relationships before data integration,propagates poor data quality 6 Inability of integration between Data profiling and ETL causes Data quality problem 7 Inappropriate selection of Automated profiling tool cause data quality issues 8 Insufficient data content analysis against external reference data causes data quality problems. 9 Insufficient structural analysis of the data sources in the profiling stage. 10 Insufficient Pattern analysis for given fields within each data store Data Quality issue at D ata Staging ETL(Extra action), Transformation and Loading) One consideration is whether data cleansing is most appropriate a the source system, during the ETL process, at the staging database,or within the data warehouse..a data cleaning process is executed in the data staging area in order to improve the accuracy of the data warehouse. The data staging area is the place where all grooming' is done on data after it is culled from the source systems. Staging and ETL phase is considered to be most crucial stage of data warehousing where maximum responsibility of data quality efforts resides.it is a prime location for validating data quality from source or auditing and tracking down data issues. Some of the identified area from literature review are shown in Table 20 Page
4 Table3: Causes of Data Quality Issues at Data Staging and ETL Phase Sr.No CAUSES OF DATA QUALITY ISSUES AT DATA STAGING AND ETL PHASE. 1 Data warehouse architecture undertaken affects the data quality (Staging, Non Staging Architecture). 2 Type of staging area,relational or non relational affects the data quality. 3 Different business rules of various data sources Creates problem of data quality.. 4 Business rules lack currency contributes to data quality problems 5 Business rules lack currency contributes to data quality problems 6 Lack of capturing only changes in source files 7 Lack of periodical re freshing of the integrated data storage (Data Staging area) cause data quality degradation 8 Truncating the data staging area cause data quality Problems 9 Disabling data integrity constraints in data staging tables cause wrong data and relationships to be extracted and hence cause data quality problem 10 Purging of data from the Data warehouse \cause data quality problem 11 The inability to restart the ETL process from checkpoints without losing data 12 Lack of Providing internal profiling or integration to third-party data profiling and cleansing tools.[ 13 Lack of automatically generating rules for ETL tools to build mapping s that detect and fix data defects 14 Inability of integrating cleansing tasks into visual work flows and diagrams 15 Unhandled null values causes data quality problem 16 Lack of automated unit testing facility causes data quality problem Causes of Data Quality Problems at Data Modeling (Database Schema Design) Stage. The quality of the information depends on 3 things:(1) the quality of the data itself,(2)the quality of the application programs and (3) the quality of the database schema [19]. Design of the data warehouse greatly influences the quality of the analysis that is possible with data init. So, special attention should be given to the issues of schema design. Some of the issues such as slowly changing dimensions, rapidly changing dimension, and multi valued dimensions etc.a flawed schema impacts negatively on information quality. Table is depicting the listing of some most important causes of data quality issues at data warehouse schema designing Table 4 : Causes of Data Quality Issues at Data Warehouse Schema Modeling Phase Sr.No CAUSESOF DATA QUALITY ISSUES AT DATA WAREHOUSE SCHEM A DESIGN. 1 Incomplete or wrong requirement analysis of the project lea d to poor schema design which further cause data quality problems. 2 Lack of currency in business rules cause poor requirement analysis which leads to poor schema design and contribute to data quality problems. 3 Choice of dimensional modeling (STAR,SNOWFLAKE,FACTCONSTALLATION) schema contribute to data quality. 4 Late identification of slowly changing dimensions contribute to data qualit y problems. 5 Late arriving dimensions cause DQ Problems. 6 Multi valued dimensions cause DQ problems 7 Improper selection of record granularity may lead to poor schema design and thereby affecting DQ. Problem 8 Incomplete/Wrong identification of facts/dimensions, bridge tables or relationship tables or their individual relationships contribute to DQ problems. 9 Inability to support database schema refactoring cause data quality problems 21 Page
5 Suggestion for improving Data quality: Enterprise Architecture Creating and maintaining enterprise architecture (EA) is a effective method for controlling data redundancies as well as process redundancies, and thereby reducing the anomalies and inconsistencies that are inherently produced by uncontrolled redundancies. EA is comprised of models that describe an organization in terms of its business architecture (business functions, business processes, business data, and so on) and technical architecture (applications, databases, and so on). Purpose of these models is to describe the actual business in which the organization engages. EA is applicable to all organizations, large and small. Because EA models are best built incrementally, one project at a time, it is appropriate to develop EA models on DW and BI projects, as well as on projects that simply solve departmental challenges. Data Quality Improvement Process In addition to applying enterprise-wide data quality disciplines, creating an enterprise data model, and documenting metadata, the data quality group should develop their own data quality improvement process. At the highest level, this process must address the six major components shown in figure These components are:.assess Every improvement cycle starts with an assessment. This can either be an initial enterprise-wide data quality assessment, a system-by-system data quality assessment, or a department-by-department data quality assessment. When performing the assessment, do not limit your efforts to profiling the data and collecting statistics on data defects. Analyze the entire data entry or data manipulation process to find the root causes of errors and to find process improvement opportunities. Another type of assessment is a periodic data audit. This type of assessment is usually limited to one file or one database at a time. It involves data profiling as well as manual validation of data values against the documented data domains (valid data values) Fig 2: Data quality improvement cycle.plan After opportunities for improvement have been defined, the improvements should be analyzed, prioritized, approved, funded, staffed, and scheduled. Not all improvements have the same payback and not all improvements are practical or even feasible. An impact analysis should determine which improvements have the most far-reaching benefits. After improvement projects have been prioritized, approved, and funded, they should be staffed and scheduled..implement In some cases, the data quality group can implement the approved improvements, but in many cases, other staff members from both the business side and IT will be required. For example, a decision might have been made that an overloaded column (a column containing data values describing multiple attributes) should be separated in a database. That would involve the business people who are currently accessing the database, the database administrators who are maintaining it, and the developers whose programs are accessing it. Evaluate the best ideas sometimes backfire. Although some impact analysis will have been performed during planning, occasionally an adverse impact will be overlooked. Or worse, the implemented improvement might have inadvertently created a new problem. It is therefore advisable to monitor the implemented improvements and evaluate their effectiveness. If deemed necessary, an improvement can be reversed. 22 Page
6 .Adapt hopefully, most improvements do not have to be reversed, but some may have to be modified before announcing them to the entire organization or before turning them into new standards, guidelines, or procedures..educate The final step is to disseminate information about the new improvement process just implemented. Depending on the scope of the change, education can be accomplished through classroom training, computerbased training, an announcement on the organization s intranet website, an internal newsletter, or simple notification. Data Quality Disciplines: Organizations have a number of data quality disciplines at their disposal, but rarely will they implement all disciplines at once because improving data quality is a process and not an event. Figure shows the common data quality improvement activities in each of the five data quality maturity levels based on Larry English s adaptation of the capability model (CMM) to data quality. The five levels are: Fig 3 data quality improvement activities Level 1: Uncertainty At Level 1, the organization is stumbling over data defects as its programs abend (crash) or its information consumers complain. There is no proactive data quality improvement process, no data quality group, and no funding. The organization denies any serious data quality problems and considers data analysis a waste of time. Or the CIO is ready to retire and doesn t want anything to disrupt it. Basically, the organization is asleep and doesn t want to be awakened. Level 2: Awakening At Level 2, the organization performs some limited data analysis and data correction activities, such as data profiling and data cleansing. There still is no enterprise-wide support for data quality improvement, no data quality group, and no funding. However, a few isolated individuals acknowledge their dirty data and want to incorporate data quality disciplines in their projects. These individuals can be data administrators, database administrators, developers, or business people. Level 3: Enlightenment At Level 3, the organization starts to address the root causes of its dirty data through program edits and data quality training. A data quality group is created and funding for data quality improvement projects is available. The data quality group immediately performs an enterprise-wide data quality assessment of their critical files and databases, and prioritizes the data quality improvement activities. This group also institutes several data quality disciplines and launches a comprehensive data quality training program across the organization. Level 4: Wisdom At Level 4, the organization proactively works on preventing future data defects by adding more data quality disciplines to its data quality improvement program. Managers across the organization accept personal responsibility for data quality. The data quality group has been moved under a chief officer either the CIO, COO, CFO, or a new position, such as a chief knowledge officer (CKO). Metrics are in place to measure the number of data defects produced by staff, and these metrics are considered in the staff s job performance appraisals. Incentives for improving data quality have replaced incentives for cranking out systems at the speed of light. Level 5: Certainty At Level 5, the organization is in an optimization cycle by continuously monitoring and improving its data defect prevention processes. Data quality is an integral part of all business processes. Every job description requires attention to data quality, reporting of data defects, determining the root causes, 23 Page
7 improving the affected data quality processes to eliminate the root causes, and monitoring the effects of the improvement. Basically, the culture of the organization has changed. II. Conclusion In this paper attempt has been made to collect all possible causes of data quality problems that may exist at all the phases of data warehouse. My objective was to put forth such a descriptive classification which covers all the phases of data warehousing which can impact the data quality. And also to provide solution for improving Data quality. The motivation of the research was to integrate all the sayings of different researches which were focused on individual phases of data warehouse. Such as lot of literature is available on dirty data taxonomies This paper is helpful for the data warehouse practitioners, implementers and researchers for taking care of these issues before moving ahead with each phase of data warehousing. It would also be helpful for the vendors and those who are involved in development of data quality tools so as to incorporate changes in their tools to overcome the problems highlighted in classifications Future work Each item of the classification will be converted into a item of the research instrument such as questionnaire and will be empirically tested by collecting views about these items from the data warehouse practitioners, appropriately. References [1] A Descriptive Classification of Causes of Data Quality Problems in Data Warehousing,R Singh,Dr k Singh. [2] [3] "TDWI Data Cleansing: Delivering High-Quality Warehouse Data." [4] Thibodeau, Patrick. "Data Problems Thwart Effort to Count [5] AmolShrivastav,MohitBhaduria, Harsha Rajwanshi (2008), Data Warehouse and Quality Issues, Warehouse-and-Quality-Issues. [6] Ahimanikya Satapathy, Building an ETL Tool, Sun Microsystems 24 Page
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 [email protected]
Data Quality Assessment. Approach
Approach Prepared By: Sanjay Seth Data Quality Assessment Approach-Review.doc Page 1 of 15 Introduction Data quality is crucial to the success of Business Intelligence initiatives. Unless data in source
DATA QUALITY PROBLEMS IN DATA WAREHOUSING
DATA QUALITY PROBLEMS IN DATA WAREHOUSING Diksha Verma, Anjli Tyagi, Deepak Sharma Department of Information Technology, Dronacharya college of Engineering Abstract- Data quality is critical to data warehouse
An Introduction to Data Warehousing. An organization manages information in two dominant forms: operational systems of
An Introduction to Data Warehousing An organization manages information in two dominant forms: operational systems of record and data warehouses. Operational systems are designed to support online transaction
Data Quality C HAPTER 3
03_DataStrat.qxd 5/20/05 5:14 PM Page 47 C HAPTER 3 Data Quality Virtually everything in business today is an undifferentiated commodity, except how a company manages its information. How you manage information
Building a Data Quality Scorecard for Operational Data Governance
Building a Data Quality Scorecard for Operational Data Governance A White Paper by David Loshin WHITE PAPER Table of Contents Introduction.... 1 Establishing Business Objectives.... 1 Business Drivers...
Five Fundamental Data Quality Practices
Five Fundamental Data Quality Practices W H I T E PA P E R : DATA QUALITY & DATA INTEGRATION David Loshin WHITE PAPER: DATA QUALITY & DATA INTEGRATION Five Fundamental Data Quality Practices 2 INTRODUCTION
Data Warehouse Overview. Srini Rengarajan
Data Warehouse Overview Srini Rengarajan Please mute Your cell! Agenda Data Warehouse Architecture Approaches to build a Data Warehouse Top Down Approach Bottom Up Approach Best Practices Case Example
A Review of Contemporary Data Quality Issues in Data Warehouse ETL Environment
DOI: 10.15415/jotitt.2014.22021 A Review of Contemporary Data Quality Issues in Data Warehouse ETL Environment Rupali Gill 1, Jaiteg Singh 2 1 Assistant Professor, School of Computer Sciences, 2 Associate
Enterprise Data Governance
Enterprise Aligning Quality With Your Program Presented by: Mark Allen Sr. Consultant, Enterprise WellPoint, Inc. ([email protected]) 1 Introduction: Mark Allen is a senior consultant and enterprise
CHAPTER SIX DATA. Business Intelligence. 2011 The McGraw-Hill Companies, All Rights Reserved
CHAPTER SIX DATA Business Intelligence 2011 The McGraw-Hill Companies, All Rights Reserved 2 CHAPTER OVERVIEW SECTION 6.1 Data, Information, Databases The Business Benefits of High-Quality Information
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
Information Quality for Business Intelligence. Projects
Information Quality for Business Intelligence Projects Earl Hadden Intelligent Commerce Network LLC Objectives of this presentation Understand Information Quality Problems on BI/DW Projects Define Strategic
Lection 3-4 WAREHOUSING
Lection 3-4 DATA WAREHOUSING Learning Objectives Understand d the basic definitions iti and concepts of data warehouses Understand data warehousing architectures Describe the processes used in developing
THE QUALITY OF DATA AND METADATA IN A DATAWAREHOUSE
THE QUALITY OF DATA AND METADATA IN A DATAWAREHOUSE Carmen Răduţ 1 Summary: Data quality is an important concept for the economic applications used in the process of analysis. Databases were revolutionized
Enterprise Data Quality
Enterprise Data Quality An Approach to Improve the Trust Factor of Operational Data Sivaprakasam S.R. Given the poor quality of data, Communication Service Providers (CSPs) face challenges of order fallout,
Deriving Business Intelligence from Unstructured Data
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 9 (2013), pp. 971-976 International Research Publications House http://www. irphouse.com /ijict.htm Deriving
Sizing Logical Data in a Data Warehouse A Consistent and Auditable Approach
2006 ISMA Conference 1 Sizing Logical Data in a Data Warehouse A Consistent and Auditable Approach Priya Lobo CFPS Satyam Computer Services Ltd. 69, Railway Parallel Road, Kumarapark West, Bangalore 560020,
Data Warehouse (DW) Maturity Assessment Questionnaire
Data Warehouse (DW) Maturity Assessment Questionnaire Catalina Sacu - [email protected] Marco Spruit [email protected] Frank Habers [email protected] September, 2010 Technical Report UU-CS-2010-021
DATA WAREHOUSING AND OLAP TECHNOLOGY
DATA WAREHOUSING AND OLAP TECHNOLOGY Manya Sethi MCA Final Year Amity University, Uttar Pradesh Under Guidance of Ms. Shruti Nagpal Abstract DATA WAREHOUSING and Online Analytical Processing (OLAP) are
Data Warehouse Snowflake Design and Performance Considerations in Business Analytics
Journal of Advances in Information Technology Vol. 6, No. 4, November 2015 Data Warehouse Snowflake Design and Performance Considerations in Business Analytics Jiangping Wang and Janet L. Kourik Walker
Making Business Intelligence Easy. Whitepaper Measuring data quality for successful Master Data Management
Making Business Intelligence Easy Whitepaper Measuring data quality for successful Master Data Management Contents Overview... 3 What is Master Data Management?... 3 Master Data Modeling Approaches...
Key organizational factors in data warehouse architecture selection
Key organizational factors in data warehouse architecture selection Ravi Kumar Choudhary ABSTRACT Deciding the most suitable architecture is the most crucial activity in the Data warehouse life cycle.
US Department of Education Federal Student Aid Integration Leadership Support Contractor January 25, 2007
US Department of Education Federal Student Aid Integration Leadership Support Contractor January 25, 2007 Task 18 - Enterprise Data Management 18.002 Enterprise Data Management Concept of Operations i
ETL-EXTRACT, TRANSFORM & LOAD TESTING
ETL-EXTRACT, TRANSFORM & LOAD TESTING Rajesh Popli Manager (Quality), Nagarro Software Pvt. Ltd., Gurgaon, INDIA [email protected] ABSTRACT Data is most important part in any organization. Data
Enabling Data Quality
Enabling Data Quality Establishing Master Data Management (MDM) using Business Architecture supported by Information Architecture & Application Architecture (SOA) to enable Data Quality. 1 Background &
Data Quality Assurance
CHAPTER 4 Data Quality Assurance The previous chapters define accurate data. They talk about the importance of data and in particular the importance of accurate data. They describe how complex the topic
Business Performance & Data Quality Metrics. David Loshin Knowledge Integrity, Inc. [email protected] (301) 754-6350
Business Performance & Data Quality Metrics David Loshin Knowledge Integrity, Inc. [email protected] (301) 754-6350 1 Does Data Integrity Imply Business Value? Assumption: improved data quality,
Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya
Chapter 6 Basics of Data Integration Fundamentals of Business Analytics Learning Objectives and Learning Outcomes Learning Objectives 1. Concepts of data integration 2. Needs and advantages of using data
Populating a Data Quality Scorecard with Relevant Metrics WHITE PAPER
Populating a Data Quality Scorecard with Relevant Metrics WHITE PAPER SAS White Paper Table of Contents Introduction.... 1 Useful vs. So-What Metrics... 2 The So-What Metric.... 2 Defining Relevant Metrics...
Data Integration and ETL Process
Data Integration and ETL Process Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, second
Enterprise Solutions. Data Warehouse & Business Intelligence Chapter-8
Enterprise Solutions Data Warehouse & Business Intelligence Chapter-8 Learning Objectives Concepts of Data Warehouse Business Intelligence, Analytics & Big Data Tools for DWH & BI Concepts of Data Warehouse
HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT
HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT POINT-AND-SYNC MASTER DATA MANAGEMENT 04.2005 Hyperion s new master data management solution provides a centralized, transparent process for managing critical
Presented By: Leah R. Smith, PMP. Ju ly, 2 011
Presented By: Leah R. Smith, PMP Ju ly, 2 011 Business Intelligence is commonly defined as "the process of analyzing large amounts of corporate data, usually stored in large scale databases (such as a
Data Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc.
Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc. Introduction Abstract warehousing has been around for over a decade. Therefore, when you read the articles
Whitepaper. Data Warehouse/BI Testing Offering YOUR SUCCESS IS OUR FOCUS. Published on: January 2009 Author: BIBA PRACTICE
YOUR SUCCESS IS OUR FOCUS Whitepaper Published on: January 2009 Author: BIBA PRACTICE 2009 Hexaware Technologies. All rights reserved. Table of Contents 1. 2. Data Warehouse - Typical pain points 3. Hexaware
Enterprise Intelligence - Enabling High Quality in the Data Warehouse/DSS Environment. by Bill Inmon. INTEGRITY IN All Your INformation
INTEGRITY IN All Your INformation R TECHNOLOGY INCORPORATED Enterprise Intelligence - Enabling High Quality in the Data Warehouse/DSS Environment by Bill Inmon WPS.INM.E.399.1.e Introduction In a few short
DATA MINING AND WAREHOUSING CONCEPTS
CHAPTER 1 DATA MINING AND WAREHOUSING CONCEPTS 1.1 INTRODUCTION The past couple of decades have seen a dramatic increase in the amount of information or data being stored in electronic format. This accumulation
A Framework for Identifying and Managing Information Quality Metrics of Corporate Performance Management System
Journal of Modern Accounting and Auditing, ISSN 1548-6583 February 2012, Vol. 8, No. 2, 185-194 D DAVID PUBLISHING A Framework for Identifying and Managing Information Quality Metrics of Corporate Performance
Information Governance Workshop. David Zanotta, Ph.D. Vice President, Global Data Management & Governance - PMO
Information Governance Workshop David Zanotta, Ph.D. Vice President, Global Data Management & Governance - PMO Recognition of Information Governance in Industry Research firms have begun to recognize the
DATA QUALITY IN BUSINESS INTELLIGENCE APPLICATIONS
DATA QUALITY IN BUSINESS INTELLIGENCE APPLICATIONS Gorgan Vasile Academy of Economic Studies Bucharest, Faculty of Accounting and Management Information Systems, Academia de Studii Economice, Catedra de
A Knowledge Management Framework Using Business Intelligence Solutions
www.ijcsi.org 102 A Knowledge Management Framework Using Business Intelligence Solutions Marwa Gadu 1 and Prof. Dr. Nashaat El-Khameesy 2 1 Computer and Information Systems Department, Sadat Academy For
Data Warehouse and Business Intelligence Testing: Challenges, Best Practices & the Solution
Warehouse and Business Intelligence : Challenges, Best Practices & the Solution Prepared by datagaps http://www.datagaps.com http://www.youtube.com/datagaps http://www.twitter.com/datagaps Contact [email protected]
The Data Warehouse ETL Toolkit
2008 AGI-Information Management Consultants May be used for personal purporses only or by libraries associated to dandelon.com network. The Data Warehouse ETL Toolkit Practical Techniques for Extracting,
PUSH INTELLIGENCE. Bridging the Last Mile to Business Intelligence & Big Data. 2013 Copyright Metric Insights, Inc.
PUSH INTELLIGENCE Bridging the Last Mile to Business Intelligence & Big Data 2013 Copyright Metric Insights, Inc. INTRODUCTION... 3 CHALLENGES WITH BI... 4 The Dashboard Dilemma... 4 Architectural Limitations
Enterprise Data Quality Dashboards and Alerts: Holistic Data Quality
Enterprise Data Quality Dashboards and Alerts: Holistic Data Quality Jay Zaidi Bonnie O Neil (Fannie Mae) Data Governance Winter Conference Ft. Lauderdale, Florida November 16-18, 2011 Agenda 1 Introduction
Course Outline: Course: Implementing a Data Warehouse with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning
Course Outline: Course: Implementing a Data with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning Duration: 5.00 Day(s)/ 40 hrs Overview: This 5-day instructor-led course describes
Class News. Basic Elements of the Data Warehouse" 1/22/13. CSPP 53017: Data Warehousing Winter 2013" Lecture 2" Svetlozar Nestorov" "
CSPP 53017: Data Warehousing Winter 2013 Lecture 2 Svetlozar Nestorov Class News Class web page: http://bit.ly/wtwxv9 Subscribe to the mailing list Homework 1 is out now; due by 1:59am on Tue, Jan 29.
A Survey on Data Warehouse Architecture
A Survey on Data Warehouse Architecture Rajiv Senapati 1, D.Anil Kumar 2 1 Assistant Professor, Department of IT, G.I.E.T, Gunupur, India 2 Associate Professor, Department of CSE, G.I.E.T, Gunupur, India
Appendix B Data Quality Dimensions
Appendix B Data Quality Dimensions Purpose Dimensions of data quality are fundamental to understanding how to improve data. This appendix summarizes, in chronological order of publication, three foundational
DATA QUALITY MATURITY
3 DATA QUALITY MATURITY CHAPTER OUTLINE 3.1 The Data Quality Strategy 35 3.2 A Data Quality Framework 38 3.3 A Data Quality Capability/Maturity Model 42 3.4 Mapping Framework Components to the Maturity
Creating a Business Intelligence Competency Center to Accelerate Healthcare Performance Improvement
Creating a Business Intelligence Competency Center to Accelerate Healthcare Performance Improvement Bruce Eckert, National Practice Director, Advisory Group Ramesh Sakiri, Executive Consultant, Healthcare
Adapting an Enterprise Architecture for Business Intelligence
Adapting an Enterprise Architecture for Business Intelligence Pascal von Bergen 1, Knut Hinkelmann 2, Hans Friedrich Witschel 2 1 IT-Logix, Schwarzenburgstr. 11, CH-3007 Bern 2 Fachhochschule Nordwestschweiz,
Whitepaper Data Governance Roadmap for IT Executives Valeh Nazemoff
Whitepaper Data Governance Roadmap for IT Executives Valeh Nazemoff The Challenge IT Executives are challenged with issues around data, compliancy, regulation and making confident decisions on their business
THE DATA WAREHOUSE ETL TOOLKIT CDT803 Three Days
Three Days Prerequisites Students should have at least some experience with any relational database management system. Who Should Attend This course is targeted at technical staff, team leaders and project
Mergers and Acquisitions: The Data Dimension
Global Excellence Mergers and Acquisitions: The Dimension A White Paper by Dr Walid el Abed CEO Trusted Intelligence Contents Preamble...............................................................3 The
CHAPTER 1 INTRODUCTION
CHAPTER 1 INTRODUCTION 1. Introduction 1.1 Data Warehouse In the 1990's as organizations of scale began to need more timely data for their business, they found that traditional information systems technology
ETL Process in Data Warehouse. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT
ETL Process in Data Warehouse G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT Outline ETL Extraction Transformation Loading ETL Overview Extraction Transformation Loading ETL To get data out of
Establish and maintain Center of Excellence (CoE) around Data Architecture
Senior BI Data Architect - Bensenville, IL The Company s Information Management Team is comprised of highly technical resources with diverse backgrounds in data warehouse development & support, business
10426: Large Scale Project Accounting Data Migration in E-Business Suite
10426: Large Scale Project Accounting Data Migration in E-Business Suite Objective of this Paper Large engineering, procurement and construction firms leveraging Oracle Project Accounting cannot withstand
Framework for Data warehouse architectural components
Framework for Data warehouse architectural components Author: Jim Wendt Organization: Evaltech, Inc. Evaltech Research Group, Data Warehousing Practice. Date: 04/08/11 Email: [email protected] Abstract:
Effecting Data Quality Improvement through Data Virtualization
Effecting Data Quality Improvement through Data Virtualization Prepared for Composite Software by: David Loshin Knowledge Integrity, Inc. June, 2010 2010 Knowledge Integrity, Inc. Page 1 Introduction The
DATA GOVERNANCE AND DATA QUALITY
DATA GOVERNANCE AND DATA QUALITY Kevin Lewis Partner Enterprise Management COE Barb Swartz Account Manager Teradata Government Systems Objectives of the Presentation Show that Governance and Quality are
Building a Successful Data Quality Management Program WHITE PAPER
Building a Successful Data Quality Management Program WHITE PAPER Table of Contents Introduction... 2 DQM within Enterprise Information Management... 3 What is DQM?... 3 The Data Quality Cycle... 4 Measurements
Build an effective data integration strategy to drive innovation
IBM Software Thought Leadership White Paper September 2010 Build an effective data integration strategy to drive innovation Five questions business leaders must ask 2 Build an effective data integration
Jagir Singh, Greeshma, P Singh University of Northern Virginia. Abstract
224 Business Intelligence Journal July DATA WAREHOUSING Ofori Boateng, PhD Professor, University of Northern Virginia BMGT531 1900- SU 2011 Business Intelligence Project Jagir Singh, Greeshma, P Singh
Presented by: Jose Chinchilla, MCITP
Presented by: Jose Chinchilla, MCITP Jose Chinchilla MCITP: Database Administrator, SQL Server 2008 MCITP: Business Intelligence SQL Server 2008 Customers & Partners Current Positions: President, Agile
TDWI Data Integration Techniques: ETL & Alternatives for Data Consolidation
TDWI Data Integration Techniques: ETL & Alternatives for Data Consolidation Format : C3 Education Course Course Length : 9am to 5pm, 2 consecutive days Date : Sydney 22-23 Nov 2011, Melbourne 28-29 Nov
Dimensional Modeling for Data Warehouse
Modeling for Data Warehouse Umashanker Sharma, Anjana Gosain GGS, Indraprastha University, Delhi Abstract Many surveys indicate that a significant percentage of DWs fail to meet business objectives or
Seeking Data Quality. Using Agile Methods to Test a Data Warehouse
Seeking Data Quality Using Agile Methods to Test a Data Warehouse Copyright Ideaca 2008 Agenda Seeking Data Quality Data Warehouse Overview The Value of a Data Warehouse Agile as Business Value Driver
CONCEPTUALIZING BUSINESS INTELLIGENCE ARCHITECTURE MOHAMMAD SHARIAT, Florida A&M University ROSCOE HIGHTOWER, JR., Florida A&M University
CONCEPTUALIZING BUSINESS INTELLIGENCE ARCHITECTURE MOHAMMAD SHARIAT, Florida A&M University ROSCOE HIGHTOWER, JR., Florida A&M University Given today s business environment, at times a corporate executive
Data Warehousing and Data Mining
Data Warehousing and Data Mining Part I: Data Warehousing Gao Cong [email protected] Slides adapted from Man Lung Yiu and Torben Bach Pedersen Course Structure Business intelligence: Extract knowledge
www.ijreat.org Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 28
Data Warehousing - Essential Element To Support Decision- Making Process In Industries Ashima Bhasin 1, Mr Manoj Kumar 2 1 Computer Science Engineering Department, 2 Associate Professor, CSE Abstract SGT
ETL tools for Data Warehousing: An empirical study of Open Source Talend Studio versus Microsoft SSIS
ETL tools for Data Warehousing: An empirical study of Open Source Talend Studio versus Microsoft SSIS Ranjith Katragadda Unitech Institute of Technology Auckland, New Zealand Sreenivas Sremath Tirumala
OPTIMUS SBR. Optimizing Results with Business Intelligence Governance CHOICE TOOLS. PRECISION AIM. BOLD ATTITUDE.
OPTIMUS SBR CHOICE TOOLS. PRECISION AIM. BOLD ATTITUDE. Optimizing Results with Business Intelligence Governance This paper investigates the importance of establishing a robust Business Intelligence (BI)
Implementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777
Implementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777 Course Outline Module 1: Introduction to Data Warehousing This module provides an introduction to the key components of a data warehousing
The following is intended to outline our general product direction. It is intended for informational purposes only, and may not be incorporated into
The following is intended to outline our general product direction. It is intended for informational purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any
Fundamentals of Measurements
Objective Software Project Measurements Slide 1 Fundamentals of Measurements Educational Objective: To review the fundamentals of software measurement, to illustrate that measurement plays a central role
IST722 Data Warehousing
IST722 Data Warehousing Components of the Data Warehouse Michael A. Fudge, Jr. Recall: Inmon s CIF The CIF is a reference architecture Understanding the Diagram The CIF is a reference architecture CIF
Turkish Journal of Engineering, Science and Technology
Turkish Journal of Engineering, Science and Technology 03 (2014) 106-110 Turkish Journal of Engineering, Science and Technology journal homepage: www.tujest.com Integrating Data Warehouse with OLAP Server
Three Fundamental Techniques To Maximize the Value of Your Enterprise Data
Three Fundamental Techniques To Maximize the Value of Your Enterprise Data Prepared for Talend by: David Loshin Knowledge Integrity, Inc. October, 2010 2010 Knowledge Integrity, Inc. 1 Introduction Organizations
Application Of Business Intelligence In Agriculture 2020 System to Improve Efficiency And Support Decision Making in Investments.
Application Of Business Intelligence In Agriculture 2020 System to Improve Efficiency And Support Decision Making in Investments Anuraj Gupta Department of Electronics and Communication Oriental Institute
Next Generation Business Performance Management Solution
Next Generation Business Performance Management Solution Why Existing Business Intelligence (BI) Products are Inadequate Changing Business Environment In the face of increased competition, complex customer
Data Testing on Business Intelligence & Data Warehouse Projects
Data Testing on Business Intelligence & Data Warehouse Projects Karen N. Johnson 1 Construct of a Data Warehouse A brief look at core components of a warehouse. From the left, these three boxes represent
A Design Technique: Data Integration Modeling
C H A P T E R 3 A Design Technique: Integration ing This chapter focuses on a new design technique for the analysis and design of data integration processes. This technique uses a graphical process modeling
How to Enhance Traditional BI Architecture to Leverage Big Data
B I G D ATA How to Enhance Traditional BI Architecture to Leverage Big Data Contents Executive Summary... 1 Traditional BI - DataStack 2.0 Architecture... 2 Benefits of Traditional BI - DataStack 2.0...
DATABASE SYSTEMS. Chapter 7 Normalisation
DATABASE SYSTEMS DESIGN IMPLEMENTATION AND MANAGEMENT INTERNATIONAL EDITION ROB CORONEL CROCKETT Chapter 7 Normalisation 1 (Rob, Coronel & Crockett 978184480731) In this chapter, you will learn: What normalization
