DATA QUALITY PROBLEMS IN DATA WAREHOUSING
|
|
|
- Lucas Townsend
- 10 years ago
- Views:
Transcription
1 DATA QUALITY PROBLEMS IN DATA WAREHOUSING Diksha Verma, Anjli Tyagi, Deepak Sharma Department of Information Technology, Dronacharya college of Engineering Abstract- Data quality is critical to data warehouse and business intelligence solutions. Better informed, more reliable decisions come from using the right data quality technology during the process of loading a data warehouse. It is important the data is accurate, complete, and consistent across data sources. Data warehousing is gaining in eminence as organizations become awake of the benefits of decision oriented and business intelligence oriented data basesover the period of time many researchers have contributed to the data quality issues, but no research has collectively gathered all the causes of data quality problems at all the phases of data warehousing Viz. 1) data sources, 2) data integration & data profiling, 3) Data staging and ETL, 4) data warehouse modeling & schema design. The state-of-theart purpose of the paper is to identify the reasons for data deficiencies, non-availability or reach ability problems at all the aforementioned stages of data warehousing and to formulate descriptive classification of these causes. We have identified possible set of causes of data quality issues from the extensive literature review and with consultation of the data warehouse practitioners working in renowned IT giants on India. We hope this will help developers & Implementers of warehouse to examine and analyze these issues before moving ahead for data integration and data warehouse solutions for quality decision oriented and business intelligence oriented applications. I. INTRODUCTION Data cleaning, also called data cleansing or scrubbing, deals with detecting and removing errors and inconsistencies from data in order to improve the quality of data. Data quality problems are present in single data collections, such as files and databases, e.g., due to misspellings during data entry, missing information or other invalid data. When multiple data sources need to be integrated, e.g., in data warehouses, federated database systems or global web-based information systems, the need for data cleaning increases significantly. This is because the sources often contain redundant data in different representations. In order to provide access to accurate and consistent data, consolidation of different data representations and elimination of duplicate information become necessary. Implementing a data warehouse infrastructure to support business intelligence (BI) can be a daunting challenge, as highlighted in recent years by the focus on how often data warehouse implementations fail. Data integration technologies such Informatica PowerCenter has gone a long way towards enabling organizations to successfully deliver their data warehouse projects under budget and with greater business adoption rates. Despite this some data warehouse implementation still fail to meet expectations because of a lack of attention to data quality. Moving data from its various sources into an easily accessible repository is only part of the challenge in delivering a data warehouse and BI. Without paying attention to the accuracy, consistency and timeliness of the data as part of the data integration lifecycle, BI quickly leads to poor decision-making, increased cost and lost opportunities. According to industry analyst fi rm Gartner, more than 50 percent of business intelligence and customer relationship management deployments will suffer limited acceptance, if not outright failure, due to lack of attention to data quality issues. The impact of poor data quality is far reaching and its affects are both tangible and intangible. If data quality problems are allowed to persist, your executives grow to mistrust the information in the data warehouse and will be reluctant to use it for decision-making. 7 Sources of Poor Data Quality In recent years, corporate scandals, regulatory changes, and the collapse of major financial institutions have brought much warranted attention to the quality of enterprise information. We have seen the rise and assimilation of tools and methodologies that promise to make data cleaner and more complete. Best practices have been developed and discussed in print and online. Data quality is no longer the domain of just the data warehouse. It is accepted as an enterprise responsibility. If we have the tools, experiences, and best practices, why, then, do we continue to struggle with the problem of data quality? IJIRT INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN TECHNOLOGY 265
2 The answer lies in the difficulty of truly understanding what quality data is and in quantifying the cost of bad data. It isn't always understood why or how to correct this problem because poor data quality presents itself in so many ways. We plug one hole in our system, only to find more problems elsewhere. If we can better understand the underlying sources of quality issues, then we can develop a plan of action to address the problem that is both proactive and strategic. Each instance of a quality issue presents challenges in both identifying where problems exist and in quantifying the extent of the problems. Quantifying the issues is important in order to determine where our efforts should be focused first. A large number of missing addresses may well be alarming but could present little impact if there is no process or plan for communicating by . It is imperative to understand the business requirements and to match them against the assessment of the problem at hand. Consider the following seven sources of data quality issues. 1. Entry quality: Did the information enter the system correctly at the origin? 2. Process quality: Was the integrity of the information maintained during processing through the system? 3. Identification quality: Are two similar objects identified correctly to be the same or different? 4. Integration quality: Is all the known information about an object integrated to the point of providing an accurate representation of the object? 5. Usage quality: Is the information used and interpreted correctly at the point of access? 6. Aging quality: Has enough time passed that the validity of the information can no longer be trusted? 7. Organizational quality: Can the same information be reconciled between two systems based on the way the organization constructs and views the data? A plan of action must account for each of these sources of error. Each case differs in its ease of detection and ease of correction. An examination of each of these sources reveals a varying amount of costs associated with each and inconsistent amounts of difficulty to address the problem. Entry Quality Entry quality is probably the easiest problem to identify but is often the most difficult to correct. Entry issues are usually caused by a person entering data into a system. The problem may be a typo or a willful decision, such as providing a dummy phone number or address. Identifying these outliers or missing data is easily accomplished with profiling tools or simple queries. The cost of entry problems depends on the use. If a phone number or address is used only for informational purposes, then the cost of its absence is probably low. If instead, a phone number is used for marketing and driving new sales, then opportunity cost may be significant over a major percentage of records. Addressing data quality at the source can be difficult. If data was sourced from a third party, there is usually little the organization can do. Likewise, applications that provide internal sources of data might be old and too expensive to modify. And there are few incentives for the clerks at the point of entry to obtain, verify, and enter every data point. Process Quality Process quality issues usually occur systematically as data is moved through an organization. They may result from a system crash, lost file, or any other technical occurrence that results from integrated systems. These issues are often difficult to identify, especially if the data has made a number of transformations on the way to its destination. Process quality can usually be remedied easily once the source of the problem is identified. Proper checks and quality control at each touchpoint along the path can help ensure that problems are rooted out, but these checks are often absent in legacy processes. Identification Quality Identification quality problems result from a failure to recognize the relationship between two objects. For example, two similar products with different SKUs are incorrectly judged to be the same. Identification quality may have significant associated costs, such as mailing the same household more than once. Data quality processes can largely eliminate this problem by matching records, identifying duplicates and placing a confidence score on the similarity of IJIRT INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN TECHNOLOGY 266
3 records. Ambiguously scored records can be reviewed and judged by a data steward. Still, the results are never perfect, and determining the proper business rules for matching can involve trial and error. Integration Quality Integration quality, or quality of completeness, can present big challenges for large organizations. Integration quality problems occur because information is isolated by system or departmental boundaries. It might be important for an auto claims adjuster to know that a customer is also a high-value life insurance customer, but if the auto and life insurance systems are not integrated, that information will not be available. While the desire to have integrated information may seem obvious, the reality is that it is not always apparent. Business users who are accustomed to working with one set of data may not be aware that other data exists or may not understand its value. Data governance programs that document and promote enterprise data can facilitate the development of data warehousing and master data management systems to address integration issues. MDM enables the process of identifying records from multiple systems that refer to the same entity. The records are then consolidated into a single master record. The data warehouse allows the transactional details related to that entity to be consolidated so that its behaviors and relationships across systems can be assessed and analyzed. Usage Quality Usage quality often presents itself when data warehouse developers lack access to legacy source documentation or subject matter experts. Without adequate guidance, they are left to guess the meaning and use of certain data elements. Another scenario occurs in organizations where users are given the tools to write their own queries or create their own reports. Incorrect usage may be difficult to detect and quantify in cost. Thorough documentation, robust metadata, and user training are helpful and should be built into any new initiative, but gaining support for a postimplementation metadata project can be difficult. Again, this is where a data governance program should be established and a grassroots effort made to identify and document corporate systems and data definitions. This metadata can be injected into systems and processes as it becomes part of the culture to do so. This may be more effective and realistic than a bigbang approach to metadata. Aging Quality The most challenging aspect of aging quality is determining at which point the information is no longer valid. Usually, such decisions are somewhat arbitrary and vary by usage. For example, maintaining a former customer's address for more than five years is probably not useful. If customers haven't been heard from in several years despite marketing efforts, how can we be certain they still live at the same address? At the same time, maintaining customer address information for a homeowner's insurance claim may be necessary and even required by law. Such decisions need to be made by the business owners and the rules should be architected into the solution. Many MDM tools provide a platform for implementing survivorship and aging rules. Organizational Quality Organizational quality, like entry quality, is easy to diagnose and sometimes very difficult to address. It shares much in common with process quality and integration quality but is less a technical problem than a systematic one that occurs in large organizations. Organizational issues occur when, for example, marketing tries to "tie" their calculations to finance. Financial reporting systems generally take an account view of information, which may be very different than how the company markets the product or tracks its customers. These business rules may be buried in many layers of code throughout multiple systems. However, the biggest challenge to reconciliation is getting the various departments to agree that their A equals the other's B equals the other's C plus D. A Strategic Approach The first step to developing a data strategy is to identify where quality problems exist. These issues are not always apparent, and it is important to develop methods for detection. A thorough approach requires inventorying the system, documenting the business and technical rules that affect data quality, and conducting data profiling and scoring activities that give us insight in the extent of the issues. After identifying the problem, it is important to assess the business impact and cost to the organization. The downstream effects are not always easy to quantify, especially when it is difficult to detect an issue in the first place. In addition, the cost associated with a IJIRT INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN TECHNOLOGY 267
4 particular issue may be small at a departmental level but much greater when viewed across the entire enterprise. The business impact will drive business involvement and investment in the effort. Finally, once we understand the issues and their impact on the organization, we can develop a plan of action. Data quality programs are multifaceted. A single tool or project is not the answer. Addressing data quality requires changes in the way we conduct our business and in our technology framework. It requires organizational commitment and long-term vision. The strategy for addressing data quality issues requires a blend of analysis, technology, and business involvement. When viewed from this perspective, an MDM program is an effective approach. MDM provides the framework for identifying quality problems, cleaning the data, and synchronizing it between systems. However, MDM by itself won't resolve all data quality issues. An active data governance program empowered by chief executives is essential to making the organizational changes necessary to achieve success. The data governance council should set the standards for quality and ensure that the right systems are in place for measurement. In addition, the company should establish incentives for both users and system developers to maintain the standards. The end result is an organization where attention to quality and excellence permeate the company. Such an approach to enterprise information quality takes dedication and requires a shift in the organization's mindset. However, the results are both achievable and profitable. Data Warehousing Data warehouses are one of the foundations of the Decision Support Systems of many IS operations. As defined by the father of data warehouse, William H. Inmon, a data warehouse is a collection of Integrated, Subject-Oriented, Non Volatile and Time Variant databases where each unit of data is specific to some period of time. Data Warehouses can contain detailed data, lightly summarized data and highly summarized data, all formatted for analysis and decision support (Inmon, 1996). In the Data Warehouse Toolkit, Ralph Kimball gives a more concise definition: a copy of transaction data specifically structured for query and analysis (Kimball, 1998). Both definitions stress the data warehouse s analysis focus, and highlight the historical nature of the data found in a data warehouse. Figure 2: Data Warehousing Structure Stages of Data Warehousing Susceptible to Data Quality Problems The purpose of paper here is to formulate a descriptive taxonomy of all the issues at all the stages of Data Warehousing. The phases are: Data Source Data Integration and Data Profiling Data Staging and ETL Database Scheme (Modeling) Quality of data can be compromised depending upon how data is received, entered, integrated, maintained, processed (Extracted, Transformed and Cleansed) and loaded. Data is impacted by numerous processes that bring data into your data environment, most of which affect its quality to some extent. All these phases of data warehousing are responsible for data quality in the data warehouse. Despite all the efforts, there still exists a certain percentage of dirty data. This residual dirty data should be reported, stating the reasons for the failure in data cleansing for the same. Data quality problems can occur in many different ways [9]. The most common include: Poor data handling procedures and processes. Failure to stick on to data entry and maintenance procedures. Errors in the migration process from one system to another. External and third-party data that may not fit with your company data standards or may otherwise be of unconvinced quality. The assumptions undertaken are that data quality issues can arise at any stage of data warehousing viz. in data sources, in data integration & profiling, in data staging, in ETL and database modeling. Following model is depicting the possible stages which are vulnerable of getting data quality problems IJIRT INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN TECHNOLOGY 268
5 Figure 1: Data Warehousing Structure II. CONCULSION Data quality is an increasingly serious issue for organizations large and small. It is central to all data integration initiatives. Before data can be used effectively in a data warehouse, or in customer relationship management, enterprise resource planning or business analytics applications, it needs to be analyzed and cleansed. To ensure high quality data is sustained, organizations need to apply ongoing data cleansing processes and procedures, and to monitor and track data quality levels over time. Otherwise poor data quality will lead to increased costs, breakdowns in the supply chain and inferior customer relationship management. Defective data also hampers business decision making and efforts to meet regulatory compliance responsibilities. The key to successfully addressing data quality is to get business professionals centrally involved in the process. Informatica Data Explorer and Informatica Data Quality are unique, easy-to-use data quality software products specifi cally designed to bridge the gap between and better align IT and the business, providing them with all they need to be able to control data quality processes enterprise-wide in order to reduce costs, increase revenue, and improve business decisionmaking. REFERENCES 1 A Descriptive Classification of Causes of Data Quality Problems in Data Warehousing by Ranjit Singh, Dr. Kawaljeet Sing, Research Scholar, University College of Engineering (UCoE), Punjabi University Patiala (Punjab), INDIA 2. 7 Sources of Poor Data Quality, 611/2.htm 3. Data Quality in Data Warehouse & Business Intelligence Informatica Data Quality and Informatica Data Explorer 4. Channah F. Naiman, Aris M. Ouksel (1995) AClassification of Semantic Conflicts in Heterogeneous Database Systems, Journal of Organizational Computing, Vol. 5, John Hess (1998), Dealing With Missing Values In The Data Warehouse A Report of Stonebridge Technologies, Inc (1998). 6. Jaideep Srivastava, Ping-Yao Chen (1999) Warehouse Creation-A Potential Roadblock to Data Warehousing, IEEE Transactions on Knowledge and Data Engineering January/February 1999 (Vol. 11, No. 1) pp. 7. Amit Rudra and Emilie Yeo (1999) Key Issues in Achieving Data Quality and Consistency in Data Warehousing among Large Organizations in Australia, Proceedings of the 32nd Hawaii International Conference on System Sciences Jesús Bisbal et all (1999) Legacy Information Systems: Issues and Directions, IEEE Software September/ October Scott W. Ambler (2001) Challenges with legacy data: Knowing your data enemy is the first step in overcoming it, Practice Leader, Agile Development, Rational Methods Group, IBM, 01 Jul Scott W. Ambler The Joy of Legacy Data available at: l IJIRT INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN TECHNOLOGY 269
Proper study of Data Warehousing and Data Mining Intelligence Application in Education Domain
Journal of The International Association of Advanced Technology and Science Proper study of Data Warehousing and Data Mining Intelligence Application in Education Domain AMAN KADYAAN JITIN Abstract Data-driven
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 [email protected]
Data Quality in Data warehouse: problems and solution
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 16, Issue 1, Ver. IV (Jan. 2014), PP 18-24 Data Quality in Data warehouse: problems and solution *Rahul Kumar
An Introduction to Data Warehousing. An organization manages information in two dominant forms: operational systems of
An Introduction to Data Warehousing An organization manages information in two dominant forms: operational systems of record and data warehouses. Operational systems are designed to support online transaction
Why Data Governance - 1 -
Data Governance Why Data Governance - 1 - Industry: Lack of Data Governance is a Key Issue Faced During Projects As projects address process improvements, they encounter unidentified data processes that
JOURNAL OF OBJECT TECHNOLOGY
JOURNAL OF OBJECT TECHNOLOGY Online at www.jot.fm. Published by ETH Zurich, Chair of Software Engineering JOT, 2008 Vol. 7, No. 8, November-December 2008 What s Your Information Agenda? Mahesh H. Dodani,
Enable Business Agility and Speed Empower your business with proven multidomain master data management (MDM)
Enable Business Agility and Speed Empower your business with proven multidomain master data management (MDM) Customer Viewpoint By leveraging a well-thoughtout MDM strategy, we have been able to strengthen
Data Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc.
Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc. Introduction Abstract warehousing has been around for over a decade. Therefore, when you read the articles
Making Business Intelligence Easy. Whitepaper Measuring data quality for successful Master Data Management
Making Business Intelligence Easy Whitepaper Measuring data quality for successful Master Data Management Contents Overview... 3 What is Master Data Management?... 3 Master Data Modeling Approaches...
Whitepaper Data Governance Roadmap for IT Executives Valeh Nazemoff
Whitepaper Data Governance Roadmap for IT Executives Valeh Nazemoff The Challenge IT Executives are challenged with issues around data, compliancy, regulation and making confident decisions on their business
Busting 7 Myths about Master Data Management
Knowledge Integrity Incorporated Busting 7 Myths about Master Data Management Prepared by: David Loshin Knowledge Integrity, Inc. August, 2011 Sponsored by: 2011 Knowledge Integrity, Inc. 1 (301) 754-6350
Build an effective data integration strategy to drive innovation
IBM Software Thought Leadership White Paper September 2010 Build an effective data integration strategy to drive innovation Five questions business leaders must ask 2 Build an effective data integration
Data Quality Assessment. Approach
Approach Prepared By: Sanjay Seth Data Quality Assessment Approach-Review.doc Page 1 of 15 Introduction Data quality is crucial to the success of Business Intelligence initiatives. Unless data in source
Three Fundamental Techniques To Maximize the Value of Your Enterprise Data
Three Fundamental Techniques To Maximize the Value of Your Enterprise Data Prepared for Talend by: David Loshin Knowledge Integrity, Inc. October, 2010 2010 Knowledge Integrity, Inc. 1 Introduction Organizations
The following is intended to outline our general product direction. It is intended for informational purposes only, and may not be incorporated into
The following is intended to outline our general product direction. It is intended for informational purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any
Enabling Data Quality
Enabling Data Quality Establishing Master Data Management (MDM) using Business Architecture supported by Information Architecture & Application Architecture (SOA) to enable Data Quality. 1 Background &
Enterprise Data Quality
Enterprise Data Quality An Approach to Improve the Trust Factor of Operational Data Sivaprakasam S.R. Given the poor quality of data, Communication Service Providers (CSPs) face challenges of order fallout,
DATA GOVERNANCE AND DATA QUALITY
DATA GOVERNANCE AND DATA QUALITY Kevin Lewis Partner Enterprise Management COE Barb Swartz Account Manager Teradata Government Systems Objectives of the Presentation Show that Governance and Quality are
Five Fundamental Data Quality Practices
Five Fundamental Data Quality Practices W H I T E PA P E R : DATA QUALITY & DATA INTEGRATION David Loshin WHITE PAPER: DATA QUALITY & DATA INTEGRATION Five Fundamental Data Quality Practices 2 INTRODUCTION
Enterprise Information Management Capability Maturity Survey for Higher Education Institutions
Enterprise Information Management Capability Maturity Survey for Higher Education Institutions Dr. Hébert Díaz-Flores Chief Technology Architect University of California, Berkeley August, 2007 Instructions
OPTIMUS SBR. Optimizing Results with Business Intelligence Governance CHOICE TOOLS. PRECISION AIM. BOLD ATTITUDE.
OPTIMUS SBR CHOICE TOOLS. PRECISION AIM. BOLD ATTITUDE. Optimizing Results with Business Intelligence Governance This paper investigates the importance of establishing a robust Business Intelligence (BI)
Deriving Business Intelligence from Unstructured Data
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 9 (2013), pp. 971-976 International Research Publications House http://www. irphouse.com /ijict.htm Deriving
A Review of Contemporary Data Quality Issues in Data Warehouse ETL Environment
DOI: 10.15415/jotitt.2014.22021 A Review of Contemporary Data Quality Issues in Data Warehouse ETL Environment Rupali Gill 1, Jaiteg Singh 2 1 Assistant Professor, School of Computer Sciences, 2 Associate
SAP BusinessObjects Information Steward
SAP BusinessObjects Information Steward Michael Briles Senior Solution Manager Enterprise Information Management SAP Labs LLC June, 2011 Agenda Challenges with Data Quality and Collaboration Product Vision
Knowledgent White Paper Series. Developing an MDM Strategy WHITE PAPER. Key Components for Success
Developing an MDM Strategy Key Components for Success WHITE PAPER Table of Contents Introduction... 2 Process Considerations... 3 Architecture Considerations... 5 Conclusion... 9 About Knowledgent... 10
Getting started with a data quality program
IBM Software White Paper Information Management Getting started with a data quality program 2 Getting started with a data quality program The data quality challenge Organizations depend on quality data
Assessing and implementing a Data Governance program in an organization
Assessing and implementing a Data Governance program in an organization Executive Summary As companies realize the importance of data and the challenges they face in integrating the data from various sources,
A Survey on Data Warehouse Architecture
A Survey on Data Warehouse Architecture Rajiv Senapati 1, D.Anil Kumar 2 1 Assistant Professor, Department of IT, G.I.E.T, Gunupur, India 2 Associate Professor, Department of CSE, G.I.E.T, Gunupur, India
Turkish Journal of Engineering, Science and Technology
Turkish Journal of Engineering, Science and Technology 03 (2014) 106-110 Turkish Journal of Engineering, Science and Technology journal homepage: www.tujest.com Integrating Data Warehouse with OLAP Server
Lection 3-4 WAREHOUSING
Lection 3-4 DATA WAREHOUSING Learning Objectives Understand d the basic definitions iti and concepts of data warehouses Understand data warehousing architectures Describe the processes used in developing
Enterprise Solutions. Data Warehouse & Business Intelligence Chapter-8
Enterprise Solutions Data Warehouse & Business Intelligence Chapter-8 Learning Objectives Concepts of Data Warehouse Business Intelligence, Analytics & Big Data Tools for DWH & BI Concepts of Data Warehouse
Harness the value of information throughout the enterprise. IBM InfoSphere Master Data Management Server. Overview
IBM InfoSphere Master Data Management Server Overview Master data management (MDM) allows organizations to generate business value from their most important information. Managing master data, or key business
Master Data Management. Zahra Mansoori
Master Data Management Zahra Mansoori 1 1. Preference 2 A critical question arises How do you get from a thousand points of data entry to a single view of the business? We are going to answer this question
APPROACH TO EIM. Bonnie O Neil, Gambro-BCT Mike Fleckenstein, PPC
USING A FRAMEWORK APPROACH TO EIM Bonnie O Neil, Gambro-BCT Mike Fleckenstein, PPC AGENDA The purpose of an EIM Framework Overview of Gartner's Framework Elements of an EIM strategy t Implementation of
Hadoop Data Hubs and BI. Supporting the migration from siloed reporting and BI to centralized services with Hadoop
Hadoop Data Hubs and BI Supporting the migration from siloed reporting and BI to centralized services with Hadoop John Allen October 2014 Introduction John Allen; computer scientist Background in data
Creating a Business Intelligence Competency Center to Accelerate Healthcare Performance Improvement
Creating a Business Intelligence Competency Center to Accelerate Healthcare Performance Improvement Bruce Eckert, National Practice Director, Advisory Group Ramesh Sakiri, Executive Consultant, Healthcare
Point of View: FINANCIAL SERVICES DELIVERING BUSINESS VALUE THROUGH ENTERPRISE DATA MANAGEMENT
Point of View: FINANCIAL SERVICES DELIVERING BUSINESS VALUE THROUGH ENTERPRISE DATA MANAGEMENT THROUGH ENTERPRISE DATA MANAGEMENT IN THIS POINT OF VIEW: PAGE INTRODUCTION: A NEW PATH TO DATA ACCURACY AND
Effecting Data Quality Improvement through Data Virtualization
Effecting Data Quality Improvement through Data Virtualization Prepared for Composite Software by: David Loshin Knowledge Integrity, Inc. June, 2010 2010 Knowledge Integrity, Inc. Page 1 Introduction The
MDM and Data Warehousing Complement Each Other
Master Management MDM and Warehousing Complement Each Other Greater business value from both 2011 IBM Corporation Executive Summary Master Management (MDM) and Warehousing (DW) complement each other There
Compunnel. Business Intelligence, Master Data Management & Compliance (Healthcare) Largest Health Insurance Company in New Jersey.
Compunnel Business Intelligence, Master Data Management & Compliance (Healthcare) Largest Health Insurance Company in New Jersey Business Intelligence, Master Data Management & Compliance (Healthcare)
An RCG White Paper The Data Governance Maturity Model
The Dataa Governance Maturity Model This document is the copyrighted and intellectual property of RCG Global Services (RCG). All rights of use and reproduction are reserved by RCG and any use in full requires
Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram
Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram Cognizant Technology Solutions, Newbury Park, CA Clinical Data Repository (CDR) Drug development lifecycle consumes a lot of time, money
IBM Analytics Make sense of your data
Using metadata to understand data in a hybrid environment Table of contents 3 The four pillars 4 7 Trusting your information: A business requirement 7 9 Helping business and IT talk the same language 10
www.ijreat.org Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 28
Data Warehousing - Essential Element To Support Decision- Making Process In Industries Ashima Bhasin 1, Mr Manoj Kumar 2 1 Computer Science Engineering Department, 2 Associate Professor, CSE Abstract SGT
CHAPTER 1 INTRODUCTION
CHAPTER 1 INTRODUCTION 1. Introduction 1.1 Data Warehouse In the 1990's as organizations of scale began to need more timely data for their business, they found that traditional information systems technology
Data Warehouse Overview. Srini Rengarajan
Data Warehouse Overview Srini Rengarajan Please mute Your cell! Agenda Data Warehouse Architecture Approaches to build a Data Warehouse Top Down Approach Bottom Up Approach Best Practices Case Example
8 Characteristics of a Successful Data Warehouse
Abstract Paper TU01 8 Characteristics of a Successful Data Warehouse Marty Brown, Lucid Analytics Corp Most organizations are well aware that a solid data warehouse serves as the foundation from which
Business Intelligence
Transforming Information into Business Intelligence Solutions Business Intelligence Client Challenges The ability to make fast, reliable decisions based on accurate and usable information is essential
A discussion of information integration solutions November 2005. Deploying a Center of Excellence for data integration.
A discussion of information integration solutions November 2005 Deploying a Center of Excellence for data integration. Page 1 Contents Summary This paper describes: 1 Summary 1 Introduction 2 Mastering
Building a Data Quality Scorecard for Operational Data Governance
Building a Data Quality Scorecard for Operational Data Governance A White Paper by David Loshin WHITE PAPER Table of Contents Introduction.... 1 Establishing Business Objectives.... 1 Business Drivers...
Bringing agility to Business Intelligence Metadata as key to Agile Data Warehousing. 1 P a g e. www.analytixds.com
Bringing agility to Business Intelligence Metadata as key to Agile Data Warehousing 1 P a g e Table of Contents What is the key to agility in Data Warehousing?... 3 The need to address requirements completely....
Why is Master Data Management getting both Business and IT Attention in Today s Challenging Economic Environment?
Why is Master Data Management getting both Business and IT Attention in Today s Challenging Economic Environment? How Can You Gear-up For Your MDM initiative? Tamer Chavusholu, Enterprise Solutions Practice
Master Data Management
Master Data Management David Loshin AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO Ик^И V^ SAN FRANCISCO SINGAPORE SYDNEY TOKYO W*m k^ MORGAN KAUFMANN PUBLISHERS IS AN IMPRINT OF ELSEVIER
Successful Outsourcing of Data Warehouse Support
Experience the commitment viewpoint Successful Outsourcing of Data Warehouse Support Focus IT management on the big picture, improve business value and reduce the cost of data Data warehouses can help
Business Intelligence for the Chief Data Officer
Aug 20, 2014 DAMA - CHICAGO Business Intelligence for the Chief Data Officer Don Soulsby Sandhill Consultants Who we are: Sandhill Consultants Sandhill is a global company servicing the data, process modeling
Data virtualization: Delivering on-demand access to information throughout the enterprise
IBM Software Thought Leadership White Paper April 2013 Data virtualization: Delivering on-demand access to information throughout the enterprise 2 Data virtualization: Delivering on-demand access to information
Explore the Possibilities
Explore the Possibilities 2013 HR Service Delivery Forum Best Practices in Data Management: Creating a Sustainable and Robust Repository for Reporting and Insights 2013 Towers Watson. All rights reserved.
POLAR IT SERVICES. Business Intelligence Project Methodology
POLAR IT SERVICES Business Intelligence Project Methodology Table of Contents 1. Overview... 2 2. Visualize... 3 3. Planning and Architecture... 4 3.1 Define Requirements... 4 3.1.1 Define Attributes...
White Paper. Thirsting for Insight? Quench It With 5 Data Management for Analytics Best Practices.
White Paper Thirsting for Insight? Quench It With 5 Data Management for Analytics Best Practices. Contents Data Management: Why It s So Essential... 1 The Basics of Data Preparation... 1 1: Simplify Access
Data Governance: A Business Value-Driven Approach
Global Excellence Governance: A Business Value-Driven Approach A White Paper by Dr Walid el Abed CEO Trusted Intelligence Contents Executive Summary......................................................3
Data Governance: The Lynchpin of Effective Information Management
by John Walton Senior Delivery Manager, 972-679-2336 [email protected] Data Governance: The Lynchpin of Effective Information Management Data governance refers to the organization bodies, rules, decision
5 Best Practices for SAP Master Data Governance
5 Best Practices for SAP Master Data Governance By David Loshin President, Knowledge Integrity, Inc. Sponsored by Winshuttle, LLC 2012 Winshuttle, LLC. All rights reserved. 4/12 www.winshuttle.com Introduction
ORACLE ENTERPRISE DATA QUALITY PRODUCT FAMILY
ORACLE ENTERPRISE DATA QUALITY PRODUCT FAMILY The Oracle Enterprise Data Quality family of products helps organizations achieve maximum value from their business critical applications by delivering fit
Framework for Data warehouse architectural components
Framework for Data warehouse architectural components Author: Jim Wendt Organization: Evaltech, Inc. Evaltech Research Group, Data Warehousing Practice. Date: 04/08/11 Email: [email protected] Abstract:
Tapping the benefits of business analytics and optimization
IBM Sales and Distribution Chemicals and Petroleum White Paper Tapping the benefits of business analytics and optimization A rich source of intelligence for the chemicals and petroleum industries 2 Tapping
Business Performance & Data Quality Metrics. David Loshin Knowledge Integrity, Inc. [email protected] (301) 754-6350
Business Performance & Data Quality Metrics David Loshin Knowledge Integrity, Inc. [email protected] (301) 754-6350 1 Does Data Integrity Imply Business Value? Assumption: improved data quality,
IBM Software A Journey to Adaptive MDM
IBM Software A Journey to Adaptive MDM What is Master Data? Why is it Important? A Journey to Adaptive MDM Contents 2 MDM Business Drivers and Business Value 4 MDM is a Journey 7 IBM MDM Portfolio An Adaptive
Measure Your Data and Achieve Information Governance Excellence
SAP Brief SAP s for Enterprise Information Management SAP Information Steward Objectives Measure Your Data and Achieve Information Governance Excellence A single solution for managing enterprise data quality
THE QUALITY OF DATA AND METADATA IN A DATAWAREHOUSE
THE QUALITY OF DATA AND METADATA IN A DATAWAREHOUSE Carmen Răduţ 1 Summary: Data quality is an important concept for the economic applications used in the process of analysis. Databases were revolutionized
Top 10 Business Intelligence (BI) Requirements Analysis Questions
Top 10 Business Intelligence (BI) Requirements Analysis Questions Business data is growing exponentially in volume, velocity and variety! Customer requirements, competition and innovation are driving rapid
Unveiling the Business Value of Master Data Management
:White 1 Unveiling the Business Value of (MDM) is a response to the fact that after a decade of enterprise application integration, enterprise information integration, and enterprise Data warehousing most
Finding, Fixing and Preventing Data Quality Issues in Financial Institutions Today
Finding, Fixing and Preventing Data Quality Issues in Financial Institutions Today FIS Consulting Services 800.822.6758 Introduction Without consistent and reliable data, accurate reporting and sound decision-making
IBM InfoSphere Discovery: The Power of Smarter Data Discovery
IBM InfoSphere Discovery: The Power of Smarter Data Discovery Gerald Johnson IBM Client Technical Professional [email protected] 2010 IBM Corporation Objectives To obtain a basic understanding of the
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
Data Governance: A Business Value-Driven Approach
Data Governance: A Business Value-Driven Approach A White Paper by Dr. Walid el Abed CEO January 2011 Copyright Global Data Excellence 2011 Contents Executive Summary......................................................3
Service Oriented Data Management
Service Oriented Management Nabin Bilas Integration Architect Integration & SOA: Agenda Integration Overview 5 Reasons Why Is Critical to SOA Oracle Integration Solution Integration
BUSINESS INTELLIGENCE AS SUPPORT TO KNOWLEDGE MANAGEMENT
ISSN 1804-0519 (Print), ISSN 1804-0527 (Online) www.academicpublishingplatforms.com BUSINESS INTELLIGENCE AS SUPPORT TO KNOWLEDGE MANAGEMENT JELICA TRNINIĆ, JOVICA ĐURKOVIĆ, LAZAR RAKOVIĆ Faculty of Economics
Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya
Chapter 6 Basics of Data Integration Fundamentals of Business Analytics Learning Objectives and Learning Outcomes Learning Objectives 1. Concepts of data integration 2. Needs and advantages of using data
Riversand Technologies, Inc. Powering Accurate Product Information PIM VS MDM VS PLM. A Riversand Technologies Whitepaper
Riversand Technologies, Inc. Powering Accurate Product Information PIM VS MDM VS PLM A Riversand Technologies Whitepaper Table of Contents 1. PIM VS PLM... 3 2. Key Attributes of a PIM System... 5 3. General
Logical Modeling for an Enterprise MDM Initiative
Logical Modeling for an Enterprise MDM Initiative Session Code TP01 Presented by: Ian Ahern CEO, Profisee Group Copyright Speaker Bio Started career in the City of London: Management accountant Finance,
How Global Data Management (GDM) within J&J Pharma is SAVE'ing its Data. Craig Pusczko & Chris Henderson
How Global Data Management (GDM) within J&J Pharma is SAVE'ing its Data Craig Pusczko & Chris Henderson Abstract See how J&J Pharma organizational alignment drove the evolution of Global Data Management
Data Governance for Master Data Management and Beyond
Data Governance for Master Data Management and Beyond A White Paper by David Loshin WHITE PAPER Table of Contents Aligning Information Objectives with the Business Strategy.... 1 Clarifying the Information
NCOE whitepaper Master Data Deployment and Management in a Global ERP Implementation
NCOE whitepaper Master Data Deployment and Management in a Global ERP Implementation Market Offering: Package(s): Oracle Authors: Rick Olson, Luke Tay Date: January 13, 2012 Contents Executive summary
Master Your Data and Your Business Using Informatica MDM. Ravi Shankar Sr. Director, MDM Product Marketing
Master Your and Your Business Using Informatica MDM Ravi Shankar Sr. Director, MDM Product Marketing 1 Driven Enterprise Timely Trusted Relevant 2 Agenda Critical Business Imperatives Addressed by MDM
Data Quality Assurance
CHAPTER 4 Data Quality Assurance The previous chapters define accurate data. They talk about the importance of data and in particular the importance of accurate data. They describe how complex the topic
Healthcare Data Management
Healthcare Data Management Expanding Insight, Increasing Efficiency, Improving Care WHITE PAPER This document contains Confidential, Proprietary and Trade Secret Information ( Confidential Information
Enterprise Information Management and Business Intelligence Initiatives at the Federal Reserve. XXXIV Meeting on Central Bank Systematization
Enterprise Information Management and Business Intelligence Initiatives at the Federal Reserve Kenneth Buckley Associate Director Division of Reserve Bank Operations and Payment Systems XXXIV Meeting on
DATA QUALITY MATURITY
3 DATA QUALITY MATURITY CHAPTER OUTLINE 3.1 The Data Quality Strategy 35 3.2 A Data Quality Framework 38 3.3 A Data Quality Capability/Maturity Model 42 3.4 Mapping Framework Components to the Maturity
Make the maturity model part of the effort to educate senior management, so they understand the phases of the EIM journey.
Research Publication Date: 5 December 2008 ID Number: G00160425 Gartner Introduces the EIM Maturity Model David Newman, Debra Logan Organizations cannot implement enterprise information management (EIM)
A Model-based Software Architecture for XML Data and Metadata Integration in Data Warehouse Systems
Proceedings of the Postgraduate Annual Research Seminar 2005 68 A Model-based Software Architecture for XML and Metadata Integration in Warehouse Systems Abstract Wan Mohd Haffiz Mohd Nasir, Shamsul Sahibuddin
Existing Technologies and Data Governance
Existing Technologies and Data Governance Adriaan Veldhuisen Product Manager Privacy & Security Teradata, a Division of NCR 10 June, 2004 San Francisco, CA 6/10/04 1 My Assumptions for Data Governance
Why You Still Need to Master Your Data Before You Master Your Business (Intelligence) Business Imperatives Addressed By Reliable, Integrated View
Why You Still Need to Master Your Data Before You Master Your Business (Intelligence) Business Imperatives Addressed By Reliable, Integrated View David Jordan Data Management Product Specialist 1 2 A simple
Measuring and Monitoring the Quality of Master Data By Thomas Ravn and Martin Høedholt, November 2008
Measuring and Monitoring the Quality of Master Data By Thomas Ravn and Martin Høedholt, November 2008 Introduction We ve all heard about the importance of data quality in our IT-systems and how the data
