Data Quality Governance: Proactive Data Quality Management Starting at Source

By Paul Woodlock, Clavis Technologies

About the Author: Paul Woodlock is a business process and management expert with nearly 25 years' experience delivering business-focused information technology solutions to the manufacturing and financial services sectors. He has a deep understanding of Data Quality and works closely with Clavis customers implementing lasting and effective Data Quality Governance programs. Throughout his career Paul has focused on the intersection of business and technology, initially delivering specialized applications for financial services clients. Throughout the 1990s Paul managed the design and deployment of large multinational supply chain solutions in the US, Canada and Western Europe. Prior to joining Clavis, Paul held a senior business development position at data integration and Data Quality market leader Informatica. Before that he worked at a global consulting company, where he was responsible for delivering enterprise application implementation projects and data and information quality solutions for its clients.

© 2008 Clavis Technologies. All Rights Reserved. Reproduction and distribution of this publication in any form without prior written permission is forbidden. Clavis Technologies offers no specific guarantee regarding the accuracy or completeness of the information presented, but the author, Clavis Technologies and its Associates make every reasonable effort to present the most reliable information available and to meet or exceed any applicable industry standards.
Contents

1: Overview
2: The Data Collection Challenge
3: Data Quality at Source, Right First Time
4: Three Ps in Prevention
4.1: People
4.2: Processes
4.3: Policies
5: The Clavis Solution
6: Summary
1: Overview

The argument for stringent quality management, control and governance in manufacturing and services was accepted decades ago. Indeed, the application of modern quality concepts, promoted by W. Edwards Deming and Joseph Juran, is credited with revitalizing Japanese manufacturing after the devastation of World War II. A focus on quality from the start of the production process led, in a relatively short period of time, to Japan becoming the second largest economy in the world.

In today's market, where information is arguably the most important resource, organizations need to apply the same quality rigor and governance to their information and data administration processes as they do to the quality of the products and services that represent their brand to customers and trading partners. Data and information are at the heart of a successful organization, underpinning day-to-day operations and business decisions. In a world of financial uncertainty, growing competition, more demanding customers and increased regulation, the focus on data and critical data processes is all the more important. Organizations striving to drive down costs and increase productivity can only do so if they have the highest quality data.

Just as manufacturing and services quality regimes moved away from inspection and sampling (i.e. removing defective products from production lines after they have been made) to a more proactive approach that emphasizes prevention, so too must Data Quality. Mindful of the success that proactive quality control and management spawned in manufacturing, data-centric organizations are starting to take the same upfront, process-driven approach to Data Quality. Some go as far as to use standard quality methodologies such as Six Sigma and Total Quality Management (TQM) to identify and address the root causes of Data Quality defects in their organizations and operations. This type of approach enables process excellence and continuous improvement, and drives efficiencies and tangible benefits.

By addressing the root causes of Data Quality defects, focusing on data collection and creation processes, and empowering key stakeholders (data providers, business process owners and downstream consumers of the data) with a combination of processes, procedures and policies, organizations elevate their efforts to the level of Data Quality Governance. By getting data right at its source and maintaining it through Data Quality Governance, the people who are tasked with and accountable for Data Quality are finally able to deliver high quality data to the business.
2: The Data Collection Challenge

The data collection process in most large organizations is complex. It typically involves personnel from across the organization (and sometimes from outside it) and depends on many processes, policies and procedures, each devised to achieve its own specific goals. The people who operate these processes may not be aware of the end customer of the data, or of the other parties who rely on the data they create. The end-to-end data collection process can be difficult to control and manage.

Bringing together a consolidated view of a new product, for example, to upload into an enterprise application will involve collecting different data elements: descriptions, dimensions, specifications, regulatory information and so on. These disparate data elements are originally created in many different systems, dispersed across separate departments, and much of the data may even come from third parties. The creation of master data objects such as Customer or Vendor may involve relatively common processes across the organization, but depending on the industry, other master data objects such as Materials, Finished Goods, loan or mortgage product definitions, or the definition of services or jobs will be more complex and involve many parties across different areas of the business.

Data creation processes have to serve the needs and timelines of all the different interested parties, including the customers and consumers of the information, and at the same time adhere to the rules enforced by the systems and applications that use the data. It is often only at this point, late in the process, that Data Quality defects and inconsistencies come to light.

3: Data Quality at Source, Right First Time

Philip Crosby, a businessman, thought leader and quality management evangelist, described four major principles for achieving quality, one of which states that "the system of quality is prevention." In the context of Data Quality, experience has shown that this simple thesis holds a lot of water. The benefits gained from applying Data Quality controls at source, or as close to the source as possible, are far reaching and go beyond those of having a high standard of Data Quality alone.

However, given the complexity of the data creation process described above, implementing stringent Data Quality standards at source has proved difficult in many organizations. Even in organizations where one person (or, more likely, a dedicated team of people) is accountable for data acquisition, it is rare for them to be able to oversee the entire process, if only because of the number of different constituencies involved. It can be difficult to identify bottlenecks, or where poor Data Quality is being entered and by whom.
To deliver sustainable Data Quality and support Data Governance, the key stakeholders (data providers, data stewards, process owners and the customers of this data) need greater understanding and oversight of the entire process than today's Data Quality solutions and techniques can provide. And the solution needs to address the problem upfront, at source, when the data is being created.

4: Three Ps in Prevention

Getting things right from the start sounds like a fairly simple mantra. And it is, once it is approached in the right way. Clavis believes there are three Ps in Prevention and Data Quality Governance: People, Processes and Policies.

4.1: People

One of the most common barriers to adoption for Data Quality Governance initiatives is the challenge of empowering the stakeholders, such as data providers and data stewards. They need visibility to see when and where defects are introduced and where processes are broken, and the ability to effect real change to improve data processes and prevent defects. In terms of Data Quality Governance, the key people are the information consumers or subject matter experts who interact with or are affected by the data capture process (or its outputs):

- Data providers: people, within the organization or at a third party, who populate forms or are accountable for the collection and capture of data.
- Data stewards: people responsible for the definition of business entities and relationships, the business rules, the Data Quality requirements at each stage in the process, and the standards and policies to be employed.
- Process owners: the people accountable for key data processes or the end-to-end process (which may include data from many different providers and sources), the definition of the process lifecycle, the key milestones and the requirements at each milestone.
- Other stakeholders: those who may not interact directly with the process or its sub-processes, but who need to be informed of progress, incidents or defects at different stages.
In most organizations, data creation and administration processes involve many different business functions. Each area of the business (engineering, packaging, documentation, marketing, finance) has different objectives, timelines and priorities, resulting in a staged process with gates and milestones along the way. To deal with this, the Data Quality process needs to support the phased nature of data collection and the dispersed nature of the people involved.

4.2: Processes

By tackling Data Quality at source, as part of the data creation process, organizations have the opportunity to implement effective Data Governance strategies, ensuring that data attributes and business rules are applicable for each of the business areas involved and are consistent across departments. For example, packaging details on sustainability can be checked for consistency with engineering details on battery life and recycling information. All too often in long-running business processes, such as the introduction of a new product, inconsistency in data is not identified until the deadline for the product data to be passed to operational systems, or to trading partners, is imminent. The nature of most products today means that key data will change many times during development. The key stakeholders responsible for Data Quality and Data Governance need visibility across the entire process so they can take action early, before it's too late and deadlines are pressing.

Data Quality controls and business rules need to be applied at the appropriate stage in the process; this is critical to governance and the ongoing quality of information. To be capable in the context of a process, a business rule should only be applied when it is logically or practically appropriate. This is where a number of enterprise applications fall down: an attribute is mandated, but the data provider may not have access to the information at the stage in the process when the data is demanded. This leads to default or incorrect values being entered, and Data Quality immediately suffers.

Carrying out Data Quality controls at the time of capture, and applying business rules at the point in the process where they are relevant (not necessarily the first occasion a data provider enters data), gives the process owner point-in-time visibility across functional areas and organizational silos. The stakeholders can review progress and proactively facilitate root cause analysis and collaboration to rectify issues and conflicts at the right time, as illustrated in the sketch below.
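The following sketch illustrates the idea of stage-aware business rules. The stage names, attributes and rule structure are hypothetical, introduced only for illustration; the point is that a rule is held back until the milestone at which a data provider can realistically supply the value, rather than forcing defaults at first entry.

```python
from dataclasses import dataclass
from typing import Callable, List

# Milestones in a simple new-product data collection process (hypothetical).
STAGES = ["engineering", "packaging", "marketing", "release"]

@dataclass
class Rule:
    attribute: str
    check: Callable[[dict], bool]   # returns True when the record passes
    applies_from: str               # earliest stage at which the rule is enforced
    message: str

def active_rules(rules: List[Rule], stage: str) -> List[Rule]:
    # Only enforce rules whose milestone has been reached, so a data provider
    # is never forced to enter default or invented values for information
    # they cannot yet know.
    reached = STAGES[: STAGES.index(stage) + 1]
    return [r for r in rules if r.applies_from in reached]

def validate(record: dict, rules: List[Rule], stage: str) -> List[str]:
    return [r.message for r in active_rules(rules, stage) if not r.check(record)]

# Hypothetical rules: net weight is an engineering fact; case dimensions only
# become mandatory once packaging has been defined.
rules = [
    Rule("net_weight_kg", lambda r: r.get("net_weight_kg", 0) > 0,
         "engineering", "Net weight must be a positive number"),
    Rule("case_dimensions_cm", lambda r: "case_dimensions_cm" in r,
         "packaging", "Case dimensions are required before release"),
]

record = {"sku": "AB-100", "net_weight_kg": 1.2}
print(validate(record, rules, "engineering"))  # [] - the packaging rule is not yet enforced
print(validate(record, rules, "packaging"))    # ['Case dimensions are required before release']
```

Because each rule carries its own "applies from" milestone, adding a new gate to the process becomes a change to the rule definitions rather than to the validation logic itself.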
4.3: Policies

Policies are the standards and definitions that govern the process, ensuring that data is managed securely, accurately and responsibly, both within and outside the organization, in conformance with internal standards as well as industry or regulatory requirements. They include business rules, reference data, controls and reports.

Although different business processes will have their own context and priorities, it is important to apply business rules and Data Quality controls consistently across them. Best practice dictates having one place where these critical definitions and business rules are managed and stored. Today, many organizations use complex and unwieldy spreadsheets in an attempt to achieve this, with obvious limitations. In an effort to govern data across different processes and departments, a Data Steward might be responsible for documenting and defining rules and policies for specific data attributes, their relationships with each other, and their relationships with still other attributes and processes that he or she has no control over. Therefore, not only should data definitions and rules be maintained and managed in one place, but they must also be accessible, understandable and usable across the organization.

For example, a Data Steward for weights, dimensions and configuration information will define policies about weights and dimensions, such as the units of weight, length and volume, for each product category. Parameters might include:

- Where it is acceptable (if at all) to use imperial or metric units
- What the accepted customer, selling and shipping units of measure are
- What the standards are for each shipping unit and each product category (Inner Pack, Pack, Case, Layer, Pallet etc.), including rollups, tolerances and so on

These fundamental characteristics are critical to many parts of the organization and can touch almost every operational process, from design, packaging, logistics and costing to sales and pricing. It is essential that consistent policies and rules are used, and managed centrally, to govern this data appropriately across all areas of the business.

The ability to monitor and report on process performance across organizational barriers, and to apply standards and policies for core business rules and data standards consistently across processes and departments, is the key to governing Data Quality and achieving capable data processes.
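As a rough illustration of what a centrally managed weights-and-dimensions policy might look like, the sketch below represents one product category's unit-of-measure rules as a single shared definition. The category name, units and tolerance value are assumptions made for the example, not prescribed standards.

```python
from dataclasses import dataclass

# A centrally managed unit-of-measure policy, keyed by product category.
# The category, units and tolerance below are illustrative assumptions.
@dataclass(frozen=True)
class UomPolicy:
    category: str
    allowed_weight_units: tuple   # e.g. metric only, or metric and imperial
    shipping_units: tuple         # accepted packaging hierarchy
    rollup_tolerance_pct: float   # acceptable tolerance on weight rollups

POLICIES = {
    "consumer_electronics": UomPolicy(
        category="consumer_electronics",
        allowed_weight_units=("kg", "g"),
        shipping_units=("Inner Pack", "Pack", "Case", "Layer", "Pallet"),
        rollup_tolerance_pct=2.0,
    ),
}

def weight_unit_allowed(category: str, unit: str) -> bool:
    # Every department (design, logistics, costing, pricing) validates
    # against the same central definition rather than a local copy.
    return unit in POLICIES[category].allowed_weight_units

print(weight_unit_allowed("consumer_electronics", "kg"))  # True
print(weight_unit_allowed("consumer_electronics", "lb"))  # False - imperial not allowed here
```

Holding the definitions in one shared structure (in practice, a shared rules repository or service rather than a spreadsheet) means every department validates against the same policy instead of maintaining its own copy.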
5: The Clavis Solution

Most of today's Data Quality solutions are designed to inspect and correct data after it has been created and used; in other words, they follow a quality control strategy long outmoded in the physical world. Few deal adequately with data at source, with the notable exception of some name and address Data Quality solutions. The diversity of systems, applications and processes involved in capturing and bringing data together poses a significant challenge for Data Governance and Data Quality at source solutions. They need to be able to encompass everything from manual paper processes, Microsoft Excel and tactical databases to Business Process Management (BPM) and workflow applications, Enterprise Resource Planning (ERP), Customer Relationship Management (CRM) and Master Data Management (MDM) systems.

Clavis focuses on enabling existing data administration, Data Quality and Data Governance processes and providing an easily accessible solution that dramatically improves the way in which disparate systems and diverse processes are governed and managed. Our Data Quality Governance solution is delivered as software as a service (SaaS), with little or no need for changes to processes, forms or applications. It uses simple but flexible interfaces and light-touch integration so it can be easily deployed with minimal investment in IT resources. This approach, coupled with SaaS delivery, removes many of the traditional barriers to effective Data Quality Governance: enabling active participation by all the Data Quality stakeholders in an organization, driving standardization, re-use and teamwork across business processes, and facilitating cross-enterprise collaboration.

The solution empowers business stakeholders, process owners and data stewards to monitor and govern business entity Data Quality within a data process or across multiple processes. More than that, it actively supports and guides data entry personnel and processes to ensure that the data is entered correctly. By providing prompts, reports and feedback to the person entering the data, to stakeholders in the process, and to people or applications that interact with the processes, the Clavis Data Quality Governance solution offers a level of oversight, management and control not previously available in Data Quality solutions.

As a Software-as-a-Service offering, Clavis Data Quality Governance enables organizations to centrally manage Data Quality standards, business rules and content (whether internally or externally generated) and deliver them wherever they are needed. The model allows the business stakeholders to apply the relevant business rules or controls at the appropriate time in each process. This facilitates prioritization and visibility within and across processes, and ensures the people closest to the source data are available to correct issues, collaborate to rectify conflicts, and identify the root causes of defects in the data as it's captured.
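To illustrate the general pattern of light-touch, service-based validation at the point of capture, the sketch below sends a record to a hosted rules service and returns prompts for the data provider. The endpoint, payload shape and response fields are assumptions made for the example and do not represent the actual Clavis API.

```python
import json
from urllib import request

# Illustrative only: this URL, payload shape and response structure are
# assumptions for the example, not the actual Clavis service interface.
SERVICE_URL = "https://dq-governance.example.com/api/validate"

def validate_at_capture(entity: str, stage: str, record: dict) -> dict:
    # Send the record to the hosted rules service as it is captured and return
    # any prompts, so the data provider can correct issues immediately rather
    # than the defects being discovered downstream.
    payload = json.dumps({"entity": entity, "stage": stage, "data": record}).encode()
    req = request.Request(SERVICE_URL, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)

# A capture form or enterprise application would call this on save and show
# the returned prompts inline, for example:
#
#   result = validate_at_capture("Material", "packaging", {"sku": "AB-100"})
#   for prompt in result.get("prompts", []):
#       print(prompt["attribute"], "-", prompt["message"])
```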
6: Summary

To deliver sustainable Data Quality improvement, Data Quality problems must be tackled at source. This means focusing on the people, processes and policies that control and supply data to the organization. Empowering the business stakeholders with the right solutions and support is the first step towards the accountability required to move Data Quality into the realm of sustainable Data Quality Governance.

Prevention is critical to a successful Data Quality Governance program, which, if implemented correctly, has the power to deliver significant value to any large organization while at the same time reducing cost and mitigating risk. The combination of mature Data Quality practices with newer enabling technologies and delivery methods such as SaaS makes Data Quality Governance, through Data Quality at source, an achievable goal for the first time.