White Paper Data Quality: Improving the Value of Your Data
This document contains Confidential, Proprietary and Trade Secret Information ("Confidential Information") of Informatica Corporation and may not be copied, distributed, duplicated, or otherwise reproduced in any manner without the prior written consent of Informatica. While every attempt has been made to ensure that the information in this document is accurate and complete, some typographical errors or technical inaccuracies may exist. Informatica does not accept responsibility for any kind of loss resulting from the use of information contained in this document. The information contained in this document is subject to change without notice. The incorporation of the product attributes discussed in these materials into any release or upgrade of any Informatica software product as well as the timing of any such release or upgrade is at the sole discretion of Informatica. Protected by one or more of the following U.S. Patents: 6,032,158; 5,794,246; 6,014,670; 6,339,775; 6,044,374; 6,208,990; 6,208,990; 6,850,947; 6,895,471; or by the following pending U.S. Patents: 09/644,280; 10/966,046; 10/727,700.

This edition published November 2014
Table of Contents

Introduction
High Quality Data = Highly Valuable
Customer Data is Key
The Many Facets of Data Quality
Cost of Bad, Dirty Data
Lather, Rinse, and Repeat
Stay Clean from the Get-Go
Enhance Your Database
Conclusion
Introduction

Information and data are an organization's strategic assets. The ability to harness and mine one's business data is critical for solid decision-making. According to a report by The Data Warehousing Institute (TDWI), "Intellectual capital and know-how are more important assets than physical infrastructure and equipment."

With regard to IT, enterprises spend months and even years determining which computer hardware and software solutions will help them grow their business. However, some organizations fail to devote equal attention to the quality of the data that will support their investments in these systems. TDWI claims that information is the currency of the new economy and data is the critical raw material needed for success. Without this input, businesses are crippled. The needle cannot be moved if companies are plagued by bad or dirty data.

Dirty data refers to information that is misleading, incorrect, or inconsistently formatted. Unfortunately, no industry or organization is immune to it, and it affects companies of all sizes. If not identified and corrected early on, defective data can pose serious threats.

High Quality Data = Highly Valuable

Data is paramount. Therefore, data quality should be a key initiative on your company's radar. Why? Here are a few ways high-quality data can improve a business:

- Efficient operations
- Enhanced customer experience
- Increased revenue and reduced costs

According to Forrester Research, "Business drivers for data quality implementation are plentiful and often differ based on industry." No matter the industry, many organizations build their business case for data quality investments to increase revenue through improved direct marketing and account management, reduce costs through improvements to operational efficiencies, and mitigate and control regulatory and financial risk.

High-quality data allows greater confidence in analytic systems and decreases the time spent reconciling data.
It enables a more uniform version of the truth, giving stakeholders the ability to identify and implement necessary changes. This in turn helps companies cut costs and increase ROI.

Customer Data is Key

In a recent report, companies identified these as the main types of data prone to quality problems:

- Customer data (74%): names, addresses, phone numbers, social security numbers, etc.
- Product data (43%): part numbers, descriptions, quantities, supplier codes, etc.
- Financial data (36%): dates, loan values, balances, titles, account numbers, and types of account
- Sales contact data (27%)
- Data from ERP systems (25%)
- Employee data (16%)
- International data across multinational companies (12%)
- Other (10%)

The study showed that customer data was significantly more prone to errors and inconsistencies than other data types. Customer contact data is notoriously volatile and difficult to maintain at high accuracy levels. Experts estimate that 2% of records in a CRM database become obsolete each month due to customers dying, divorcing, marrying, or moving. To put this statistic into perspective, assume that a company has 500,000 customers or leads in its CRM database. This means that each month 10,000 customer records become obsolete, culminating in 120,000 out-of-date records every year. If no action is taken, then within two years about 240,000 records, roughly half of the database, will be outdated.

While other data is important, we will focus on customer contact data, since it is the key data type that companies struggle to monitor and maintain. These data elements are the most problematic because their quality degenerates quickly over time.

The Many Facets of Data Quality

Since bad data is a multi-faceted and costly problem, companies rely on a variety of solutions and processes aimed at improving data quality. First, let's take a step back and define what data quality entails.
Data quality takes into account the following:

- Existence: whether the organization has the data
- Validity: whether the data values fall within an acceptable range or domain
- Consistency: whether the same piece of data stored in multiple locations contains the same values
- Integrity: completeness of relationships between data elements and across data sets
- Accuracy: whether the data describes the properties of the object it is meant to model
- Relevance: whether the data is the appropriate data to support the business objectives

In short, data quality solutions and processes are aimed at improving the accuracy and completeness of the information your organization receives. This involves cleansing and transforming data by removing inaccuracies and standardizing on common values.
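A few of the dimensions above lend themselves to simple automated checks. The following Python sketch is illustrative only: the sample records, the field names, and the ZIP code rule are assumptions made for the example and are not taken from any product. It shows one possible check each for existence, validity, and consistency.

```python
import re
from collections import defaultdict

# Hypothetical sample records; field names are assumptions for illustration.
RECORDS = [
    {"id": 1, "source": "crm", "email": "ann@example.com", "zip": "94063"},
    {"id": 1, "source": "billing", "email": "ann@example.org", "zip": "94063"},
    {"id": 2, "source": "crm", "email": "", "zip": "9406"},
]

# Assumed validity rule: a US ZIP code (5 digits) or ZIP+4.
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")


def missing_fields(record, required=("email", "zip")):
    """Existence: list the required fields that are absent or blank."""
    return [field for field in required if not record.get(field)]


def invalid_zip(record):
    """Validity: True when the ZIP value does not match the accepted pattern."""
    return not ZIP_RE.match(record.get("zip", ""))


def inconsistent_ids(records, field="email"):
    """Consistency: ids whose value for `field` differs between sources."""
    seen = defaultdict(set)
    for record in records:
        if record.get(field):
            seen[record["id"]].add(record[field].lower())
    return sorted(i for i, values in seen.items() if len(values) > 1)
```

With the sample data, the record with id 2 would be flagged for a missing email and an invalid ZIP code, and id 1 would be flagged because its email differs between the two source systems. Integrity, accuracy, and relevance generally need reference data or business context and are not covered by this sketch.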
With regard to customer contact data, data quality practices enable companies to communicate with and accurately profile customers. In addition to marketing effectiveness, high-quality customer contacts mean increased sales performance, operational excellence, customer satisfaction, and cost savings. There is no such thing as perfect or defect-free data. Nevertheless, your company should not discount the importance of improving data quality to the best of its ability.

Cost of Bad, Dirty Data

If left alone, defective data can contaminate systems and information assets. The end result is a range of problems, from high costs and jeopardized customer relationships to imprecise forecasts and poor business decisions. According to Gartner, Fortune 1000 enterprises will lose more money in operational inefficiency due to data quality issues than they will spend on data warehouse and customer relationship management (CRM) initiatives.

Dirty data is a costly problem that affects all verticals. The SiriusDecisions 1-10-100 Rule helps demonstrate just how costly defective data can be. It posits that it takes $1 to verify a record as it is being entered (and cloud-based data quality solutions have decreased this cost substantially). If the record is cleansed and de-duplicated later, it will cost $10. If a company decides to do nothing, it will incur a cost of $100 as the ramifications of the mistake are repeatedly felt.

As you can imagine, even a low data error rate can add up. According to TDWI, bad customer data costs U.S. businesses more than $611 billion each year. TDWI explains that this happens because most organizations overestimate the quality of their data and underestimate the impact that errors and inconsistencies can have on their bottom line.

Lather, Rinse, and Repeat

Now that you understand the critical importance of data quality, let's discuss ways to start cleaning your customer contact records. The first step is to build a team.
Too often there is a divide between IT and business stakeholders. Much to the detriment of organizations, departments exist as silos. This creates a barrier between trusted data (typically driven by IT) and process transformation initiatives (typically driven by business leaders). Effective collaboration between business process and data management professionals is the key to success for data quality. Business processes will break down if they do not use trusted data, and data quality initiatives will fail to deliver business value if the data does not support your organization's most critical business processes and decisions. In short, it is important to build a data quality team across all departments. An initiative cannot thrive without the proper support.

After building a team, the next step is to verify all existing customer contact data in your database. There is a variety of validation and cleansing algorithms that improve data quality.
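As one illustration of what cleansing existing contact records can involve, the following is a minimal Python sketch, not any vendor's algorithm. The field names (`name`, `email`, `phone`) and the matching rule (treating records with the same normalized email as duplicates) are assumptions chosen for the example; real matching logic is usually more elaborate.

```python
import re


def normalize_contact(record):
    """Return a cleaned copy of a contact record.

    Collapses extra whitespace and title-cases the name, lower-cases the
    email, and strips everything but digits from the phone number.
    """
    name = " ".join(record.get("name", "").split()).title()
    email = record.get("email", "").strip().lower()
    phone = re.sub(r"\D", "", record.get("phone", ""))
    return {"name": name, "email": email, "phone": phone}


def deduplicate(records):
    """Keep the first record per normalized email.

    Records without an email cannot be matched by this simple rule,
    so they are all kept.
    """
    seen = set()
    unique = []
    for raw in records:
        clean = normalize_contact(raw)
        key = clean["email"]
        if key and key in seen:
            continue  # duplicate of a record already kept
        if key:
            seen.add(key)
        unique.append(clean)
    return unique


contacts = [
    {"name": "  ann   lee ", "email": "Ann@Example.com ", "phone": "(555) 010-0199"},
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555.010.0199"},
]
cleaned = deduplicate(contacts)
```

Here the two input rows differ only in formatting, so after standardization they collapse into a single record. Simple title-casing mishandles some names (for example "McDonald"), which is one reason production cleansing tends to rely on reference data rather than string rules alone.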
Stay Clean from the Get-Go

However, one of the most important data quality lessons to learn is that you need to focus on preventing errors at their source, not on finding and fixing them as they crop up. As the saying goes, "Garbage in, garbage out." If you put bad data into your database, expect poor results. For this reason, it is important to rely on data quality solutions that work in real time (i.e., cloud-based) rather than on those from on-premises database vendors. Validating and managing data in the earliest stages of collection can lead to better lead scoring and lift conversion rates by about 25% between the customer inquiry stage and the point where marketing and sales qualify the leads.

Data entry errors can be prevented by using validation routines that check data as it is entered into Web, client/server, or terminal-host systems. Think about all the ways you obtain and enter data into a database. Figure out ways to implement data quality solutions at each point of capture.

Enhance Your Database

In addition to verifying and validating data at all sources, it is important to rely on an append solution that can enhance your contact records. Even if you cleanse the data you currently have available, there is a wealth of knowledge to be gained from the information you do not have at your disposal. Append solutions bridge the gap between what you know and what you'd like to know about customers. The businesses that do the best job of closing this gap will be the most successful.

Conclusion

There is no better time to recognize the huge costs of dirty data. As discussed, ignoring the problem and letting it go unchecked only magnifies the issue. Build a team that will develop successful data quality policies and programs. After all, data will only continue to grow, and businesses and institutions will rely on it further.
Ultimately, the goal for companies is to manage the quality of data with the same attention devoted to other critical resources. Once a company values data as a significant raw material, it will see the natural progression to making a corporate commitment to manage data quality. This commitment demands establishing a program that organizes processes, systems, and data quality tools to achieve a common goal of high-quality data.
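The cost figures cited earlier in this paper (2% of CRM records going stale each month, and the 1-10-100 Rule) can be worked through numerically. The Python sketch below is a rough illustration; the function names are hypothetical, and the compounding variant is an added assumption that the paper itself does not make. It estimates record decay two ways and applies the per-record costs of the 1-10-100 Rule.

```python
def stale_linear(total, monthly_rate, months):
    """Stale records if a fixed share of the original total decays each month."""
    return min(total, round(total * monthly_rate * months))


def stale_compound(total, monthly_rate, months):
    """Stale records if the rate applies only to records that are still current."""
    return round(total * (1 - (1 - monthly_rate) ** months))


# Per-record costs in dollars, as stated by the 1-10-100 Rule.
COST_PER_RECORD = {"verify_at_entry": 1, "cleanse_later": 10, "do_nothing": 100}


def cost_of_errors(bad_records, strategy):
    """Total cost of handling `bad_records` under the chosen strategy."""
    return bad_records * COST_PER_RECORD[strategy]


one_year = stale_linear(500_000, 0.02, 12)    # 120,000 records
two_years = stale_linear(500_000, 0.02, 24)   # 240,000 records
two_years_compound = stale_compound(500_000, 0.02, 24)
```

Under the straight-line assumption used in the text, 500,000 records at 2% per month yields 240,000 stale records after 24 months (48%, the "about half" cited earlier). If the 2% instead applies only to records that are still current, the figure is closer to 38%. Either way, applying the 1-10-100 Rule to the first year's 120,000 stale records gives $1.2 million to cleanse them later versus $12 million to do nothing.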
About Informatica

Informatica Corporation (Nasdaq:INFA) is the world's number one independent provider of data integration software. Organizations around the world rely on Informatica to realize their information potential and drive top business imperatives. Informatica Vibe, the industry's first and only embeddable virtual data machine (VDM), powers the unique "Map Once. Deploy Anywhere." capabilities of the Informatica Platform. Worldwide, over 5,000 enterprises depend on Informatica to fully leverage their information assets from devices to mobile to social to big data residing on-premise, in the Cloud and across social networks. For more information, call +1 650-385-5000 (1-800-653-3871 in the U.S.), or visit www.informatica.com.

Worldwide Headquarters, 100 Cardinal Way, Redwood City, CA 94063, USA
Phone: 650.385.5000 Fax: 650.385.5500 Toll-free in the US: 1.800.653.3871
informatica.com linkedin.com/company/informatica twitter.com/informaticacorp

© 2013 Informatica Corporation. All rights reserved. Informatica and Put potential to work are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks.