What You Don't Know Does Hurt You: Five Critical Risk Factors in Data Warehouse Quality. An Infogix White Paper




What You Don't Know Does Hurt You: Five Critical Risk Factors in Data Warehouse Quality

Executive Summary

Data warehouses are becoming increasingly large, increasingly complex and increasingly important to the businesses that implement them. They are growing larger and more complex because they draw more data from a greater number of more diverse sources across the enterprise to create larger, richer assemblages of both alphanumeric and financial information. They are growing more important to the business because they are being leveraged by a wider range of users to support a greater number of decisions that impact the bottom line every day.

That's why it's essential for businesses to rigorously ensure the quality of the data in their data warehouses. If they don't, users will make faulty decisions based on incorrect data. Over time, their confidence in the data will erode to the point where they won't use the business intelligence tools and other applications that rely on the data warehouse, which means huge investments in IT will be wasted. Just as important, any business using financial data in its data warehouse must be able to withstand the scrutiny of auditors and regulators. In other words, without effective data quality management measures in place to support their data warehouses, businesses remain highly vulnerable to operational, financial and regulatory risks.

Unfortunately, few companies have adequate safeguards in place for their data warehouses today. They may have conventional tools for validating certain types of data (such as customer names and addresses) once they're in the data warehouse, but they lack the controls necessary to prevent bad data from getting there in the first place, to properly validate financial data, to discover and remediate the root causes of chronic data quality problems, or to document data quality management measures to third parties such as auditors and regulators.
This white paper exposes five of the top risk factors associated with today s complex data warehouses. It also outlines a strategy for addressing those risk factors and others. By understanding these risk factors and taking informed action to eliminate them, businesses will be able to avoid wasteful spending, improve total performance, and better maintain regulatory compliance. 2006, Infogix, Inc. All Rights Reserved. Page 2 of 9

What You Don't Know Does Hurt You

Data warehouses play a central role in corporate information strategies. Rather than managing enterprise information resources in disparate, fragmented systems, CIOs have learned that they're better off consolidating data into a unified data warehouse environment from which it can be appropriately sliced and diced to meet the needs of various types of business users. This unified approach has been particularly important in the rise of Business Intelligence (BI) as a strategic technology for leveraging information to optimize business performance and capitalize on emerging market opportunities. In fact, according to market research firm IDC, the worldwide data warehouse market is expected to grow to $13.5 billion in 2009 at a nine percent compound annual growth rate. Several other recent studies indicate that data warehousing is an active technology initiative at more than three-quarters of all corporate IT organizations.

Even as they grow in importance, data warehouses are reaching dizzying heights of scale and complexity. Data from more and more sources across the enterprise is being pulled into the data warehouse to achieve goals such as a single version of the truth and/or a 360-degree view of the customer. In addition, the volume of data being generated by these sources is staggering, as call centers track every customer interaction and point-of-sale systems track every in-store transaction.

The problem for many companies is that their ability to safeguard the quality of this data has not grown at the same pace as its scale, complexity or importance. Most IT organizations are still entirely dependent on conventional data quality tools that, while useful, don't address the specific problems associated with data warehouses that draw from a wide range of diverse source applications.
For one thing, most quality initiatives have historically focused on customer data. They have therefore been designed to ensure the consistency of alpha information rather than the accuracy of numeric information. Also, such initiatives have almost always centered on the data as it exists once it is already within the warehouse, rather than ensuring that it jibes with the source systems from which it is continually being drawn. As a result, most data warehouses are extremely susceptible to data quality problems. These data quality problems have serious consequences, including:

Financial Losses Due to Impaired Business Performance. When marketers, supply-chain managers and finance departments act upon inaccurate or flawed data, the business suffers. Mailings are sent to non-existent prospects. Products aren't on the shelves when and where customers need them. Capital gets poorly allocated. The Data Warehousing Institute estimates that companies lose more than $600 billion every year to these data quality problems through lost productivity, lost revenue and lost customers.

Reduced Use and Reduced ROI for IT Investments. It doesn't take many bad experiences to turn users off to an IT system, especially if the consequences of those experiences are significant. Bad data can therefore quickly reduce utilization of data warehouses and the various resources they support, including BI, dashboards, CRM applications and business performance management (BPM) tools. This robs the business of the potential benefits of these systems and quickly erodes total returns on the sizeable investments IT makes in their development and maintenance.

Legal and Regulatory Risk. When data warehouses play a role in corporate reporting on finance and operations, it becomes essential to ensure the validity of the data they contain. The consequences of even relatively small data problems in these cases can include costly restatements, loss of investor confidence, damage to corporate reputation, financial penalties from regulatory agencies, and even possible criminal proceedings.

Given these stakes, it's clear that every company must take the measures necessary to ensure that the data in its data warehouses is valid, accurate, consistent and up-to-date. Unfortunately, current data warehouse quality management practices typically ignore several key risk factors. Most businesses therefore remain vulnerable to data quality problems and their significant potential consequences.

Five Critical Risk Factors in Data Warehouse Quality

Of course, every IT organization uses some form of ETL (extract, transform and load) technology when it implements a data warehouse. It also takes some basic measures to ensure data quality. However, given the increased complexity of data warehouse environments and the growing downside risk associated with bad data, the ETL and data quality solutions commonly used today simply do not provide adequate protection for the business. There are several reasons for this.
First, ETL and data quality solutions have mainly focused on alpha-based information, even though today's data warehouses are increasingly being used for financial and other numerical information. Second, the checks that current tools typically execute are somewhat rudimentary. They may ensure that customer data is updated or current, that there are no duplicate records, and/or that data conforms to basic parameters (such as phone numbers having seven digits and containing known area codes). This is not the same thing as reconciling dollar balances with source data or validating account totals against source systems.
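The difference between a format check and a reconciliation can be sketched in a few lines of Python. This is an illustrative sketch, not Infogix's product: the account numbers and balances are invented, and a real control would read both sides from the actual systems.

```python
from decimal import Decimal

def reconcile_balances(source_totals, warehouse_totals, tolerance=Decimal("0.00")):
    """Compare per-account dollar balances between a source system and the
    warehouse; return the accounts whose totals disagree or are missing."""
    exceptions = []
    for account, src_total in source_totals.items():
        wh_total = warehouse_totals.get(account)
        if wh_total is None:
            exceptions.append((account, src_total, None, "missing in warehouse"))
        elif abs(src_total - wh_total) > tolerance:
            exceptions.append((account, src_total, wh_total, "balance mismatch"))
    return exceptions

# Hypothetical general-ledger accounts: 4200 has drifted by $1.00 in the warehouse.
source = {"4100": Decimal("125000.00"), "4200": Decimal("98210.50")}
warehouse = {"4100": Decimal("125000.00"), "4200": Decimal("98209.50")}
print(reconcile_balances(source, warehouse))
```

A phone-number regex would pass both rows above; only comparing totals back to the source catches the $1.00 discrepancy.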

Third, these existing checks focus almost entirely on validating the data only after it has already entered the data warehouse. They therefore do little or nothing to prevent bad data from entering the warehouse in the first place. Nor do these checks help ensure the integrity and validity of the various transformations and exchanges of data that feed the warehouse and its derivative data marts. So they don't help IT organizations pinpoint and remedy the underlying causes of data quality problems. Because of these shortcomings and others, most businesses using data warehouses remain exposed to five critical risk factors:

1) Insufficient safeguards against quality problems in source systems and/or ETL processes. IT organizations that only validate data once it is in the data warehouse are violating a basic principle of the quality gospel according to Deming; they're simply spotting defects rather than fixing the process. They thus put an inordinate amount of trust in the process, much to their own peril. By failing to implement appropriate controls in source systems and at each of the various steps between those source systems and the data warehouse, IT organizations are virtually guaranteeing that problems will arise with data in the warehouse itself. Also, in addition to going through extensive transformation, data often moves through a variety of systems and/or data marts before and after landing in the warehouse. These successive transformations introduce even greater risks to the quality of data. Yet most companies still do not put in place the checks necessary to ensure that this data accurately reflects the source system. This is unfortunate and unnecessary, since even highly transformed information can be verified against source systems with the appropriate technology.

2) Inadequate controls for financial and numerical data. The validation of financial and numerical data requires a variety of specific and often relatively sophisticated analytical capabilities.
These range from checking total sums against source systems to verifying the accuracy of currency conversions to applying appropriate formulas to determine the cost of capital. Such checks are essential for good governance and regulatory compliance. In fact, given the rigorous financial reporting requirements now mandated by law, there is virtually no room for error when it comes to financial data in the warehouse. This is a radically different situation than with alpha data, and it thus requires far more rigorous quality controls. For public companies especially, the consequences of allowing and replicating errors in financial data in and beyond the warehouse are simply unacceptable. New controls are therefore essential to protect against any degradation in data quality.
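One of the checks named above, verifying a currency conversion, can be sketched as follows. The amounts, rate and rounding tolerance are illustrative assumptions; a production control would pull the rate from the same table the ETL job used and tune the tolerance to the rounding rules in force.

```python
from decimal import Decimal, ROUND_HALF_UP

def verify_conversion(amount, rate, reported, tolerance=Decimal("0.01")):
    """Recompute a currency conversion to two decimal places and flag the
    row if the reported converted amount drifts beyond a rounding tolerance."""
    expected = (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return abs(expected - reported) <= tolerance

# EUR 1,000.00 at an assumed rate of 1.2345 should load as 1,234.50.
print(verify_conversion(Decimal("1000.00"), Decimal("1.2345"), Decimal("1234.50")))  # True
print(verify_conversion(Decimal("1000.00"), Decimal("1.2345"), Decimal("1236.00")))  # False
```

Using `Decimal` rather than floating point matters here: a control that itself rounds differently than the ledger will generate false exceptions.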

3) Poor or non-existent auditing of source-to-warehouse information flows. In today's highly sensitive financial reporting environment, maintaining data quality is not enough. Corporate IT organizations must also be able to audit end-to-end information flows and document the activity of data quality controls. This auditability must include validation that the information in a source system (such as a subledger) is accurate, so that there is a high level of confidence that it can be trusted when it is fed to the warehouse. In other words, in today's regulatory environment, it's not enough to simply make a best-effort attempt to optimize the quality of the data in the warehouse. Companies must also be able to prove to auditors and regulators that appropriate measures have been taken to protect data quality across its entire lifecycle, from source to warehouse. Companies that can't provide these types of audit logs leave themselves exposed to additional legal and financial risks above and beyond the operational problems associated with bad data.

4) Underestimating the volatility of the data warehouse/BI environment itself. Data sources, data warehouses and the applications that leverage them are not static. The types of transformation that data must undergo as it moves between data marts and is loaded into analytical cubes will also change in accordance with shifting business requirements. IT organizations that only perform quality checks on data at a single point in the warehouse will therefore probably fail to adequately protect themselves from the data quality problems that emerge when information is exchanged between all of these moving parts. Again, financial reporting is particularly vulnerable to these volatility-related problems, which can easily create disparity between the data in a continuously changing warehouse and the data in continuously changing source systems.
Companies that can't keep this data in sync or, just as important, can't prove to third parties that it is in sync will not be able to ensure the accuracy of the information presented to end-users, customers and regulators.

5) Failure to implement controls that are independent and adaptable. In the final analysis, IT organizations fail to ensure the quality of data in the data warehouse because they rely too much on the mechanisms within the warehouse itself, as well as on source systems, to perform their assigned functions without error. This approach is unacceptable given the complexity of today's warehouses and the importance of getting information right everywhere across the enterprise. Quality controls must ultimately operate independently of the warehouse in order to safeguard the warehouse. Controls must also be sufficiently adaptable and manageable to allow IT to quickly modify them as required to respond to the addition of new data sources, changes in business rules, and the discovery of vulnerabilities in information flows.

Not every IT organization is ready and willing to confront these risk factors. Some simply have too much confidence in how well they've engineered the data warehouse. Others are overly concerned with minimizing CPU cycles and processing costs and will therefore resist anything that adds to the computing intensity of the environment. But these risk factors are very real and must be addressed if the business is to be optimally protected from the downside impact of poor data quality. If they're not, the substantial investments made in data warehouses and BI simply won't pay off. Even worse, the business can wind up operating inefficiently, losing customers, and incurring the wrath of dissatisfied regulators.

Best Practices for Safeguarding Data Quality in the Data Warehouse

Given the importance of maintaining data quality in the data warehouse, and given the complexity of today's data warehousing environments, how can IT organizations best protect the business from information risk? How can they expand their data quality strategies to meet the growing challenges posed by their ever-evolving data warehouse implementations? Based on the experience of successful data quality innovators, three key best practices have emerged:

Prevent Bad Data from Entering the Data Warehouse in the First Place. Rather than simply waiting until data quality problems emerge in the data warehouse itself, IT organizations are discovering that it's much more effective to prevent bad data from getting there in the first place. Typically, this is done by checking information both before and after all ETL processes. By pushing quality controls out into the information pathways that feed the warehouse, these organizations have found that they can create a layered defense, much as they do with information security. This proactive approach enhances their ability to discover the root causes of quality problems and allows them to take immediate, appropriate remedial action.
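The "check before and after ETL" practice is often implemented with control records: a row count and summed control total taken on each side of a transfer, then compared. A minimal sketch, with an invented batch and field name:

```python
def control_totals(records, amount_field="amount"):
    """Compute a simple control record (row count plus summed amount)
    for a batch, taken once before and once after an ETL step."""
    return {"count": len(records),
            "total": sum(r[amount_field] for r in records)}

def check_transfer(before, after):
    """Pass only if the post-load batch matches the pre-load control record."""
    return before == after

staged = [{"amount": 100}, {"amount": 250}, {"amount": 75}]
loaded = [{"amount": 100}, {"amount": 250}]  # one record dropped in transit
print(check_transfer(control_totals(staged), control_totals(loaded)))  # False
```

Because each hop gets its own control record, a failure localizes the problem to one step in the pipeline instead of surfacing weeks later as a vague discrepancy in a report.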
Early detection and correction also reduces the overall cost of data quality management.

Automate Controls at Every Point of Information Exchange. Current advances in data quality technology enable IT organizations to implement highly automated quality controls at every point in the information supply chain, from the various data sources that feed the data warehouse to the applications and data marts that the data warehouse supports. These controls use sophisticated algorithms to validate, balance and reconcile all types of data, including financials. They also provide the error/exception reporting IT organizations need to quickly and effectively discover the root cause of data quality problems. Plus, because these controls can be fully automated, they significantly reduce the staff workloads associated with data quality management.

[Figure: End-to-end information flow with control points. Source systems (external, enterprise, customer and legacy data) pass through extract, transform and load phases into the data warehouse, data marts and OLAP cubes consumed by BI tools and report generators. Controls validate data against source systems and between phases, with a metadata repository tracking each control point.]

Build Auditing into the End-to-End Information Supply Chain. The above-mentioned controls can also be used to integrate auditing capabilities into the full end-to-end information supply chain that flows into and out of the data warehouse. This audit trail ensures visibility into data quality and controls at all points. It also ensures the viability and legitimacy of the data warehouse as a source for financial reporting.

It is important to note that these best practices are required to complement existing ETL solutions and conventional data quality tools. ETL solutions typically do not identify errors in information exchanges and are not effective for real-time transaction processing. They simply provide a mechanism for getting data from various sources into the warehouse. Conventional data quality tools, for their part, are typically designed to address issues such as duplicate data, incomplete data, standardization of data, and data cleansing within the warehouse itself. These capabilities don't safeguard the quality of data before it enters the warehouse environment or after it leaves. Companies that want to safeguard the quality of the data at the end-user desktop must therefore take more proactive measures to prevent error from creeping into the process at any of the many complex steps between data sources and warehouse-driven business applications. Without these measures, all data passing into and out of the warehouse will be suspect.
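An audit trail of the kind described above is, at its simplest, a timestamped, machine-readable record of every control execution and its outcome. The sketch below uses invented control and flow identifiers and is only one plausible shape for such an entry:

```python
import json
from datetime import datetime, timezone

def log_control_result(control_id, flow, passed, detail, audit_log):
    """Append a timestamped JSON entry for one control execution so that
    auditors can trace every source-to-warehouse check after the fact."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "control_id": control_id,       # hypothetical control identifier
        "flow": flow,                   # e.g. "subledger -> warehouse"
        "status": "PASS" if passed else "FAIL",
        "detail": detail,
    }
    audit_log.append(json.dumps(entry))
    return entry

audit_log = []
log_control_result("GL-REC-01", "subledger -> warehouse", True,
                   "balances reconciled", audit_log)
print(len(audit_log))  # 1
```

In practice the log would go to durable, tamper-evident storage rather than an in-memory list, but the essential point stands: every check leaves evidence, whether it passes or fails.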

About Infogix

For more than 20 years, Infogix, Inc. (formerly Unitech Systems, Inc.) has provided software solutions that help organizations eliminate Information Risk and deliver Information Integrity for all key stakeholders. As the leading provider of automated information control solutions, Infogix has implemented millions of automated information controls into the information supply chains of hundreds of organizations to ensure the accuracy, consistency, and reliability of information within business processes. Infogix serves hundreds of customers, including eight of the top ten financial services companies worldwide and seven of the top ten U.S. life insurance companies. The company maintains its home office in Naperville, Illinois, and has regional offices in the U.S., Canada, and Western Europe, and affiliates in Italy, Chile, and Australia. For more information, call +1.630.505.1800 or visit www.infogix.com.

Content from this white paper is the property of Infogix, Inc. Any reprint or reproduction in any format without permission is strictly prohibited.