Master Data Management
Managing Data as an Asset

By Bandish Gupta
Consultant, CIBER Global Enterprise Integration Practice

A CIBER Data Management Best Practices Whitepaper

Abstract: Organizations used to depend on business practices to differentiate them in the market; then various technology systems and applications came along, bringing in automation of business processes. Today we see that most organizations are automating their business processes, and that more mature organizations differentiate themselves from others in how they use and manage their data. The advent of Service Oriented Architecture and advancements such as Cloud Computing have begun a shift in the IT industry from application-centric solutions to data-centric solutions. With an increase in the pace of business, organizations have already built up Business Intelligence systems to aid efficient decision-making. Globalization and mergers and acquisitions have served as catalysts for organizations to realize the criticality of data and data integration. With this background, many organizations have begun to treat data as one of their key assets. This white paper explains how core business entities known as master data can be considered organizational assets and how to manage these entities holistically.
The What's and How's of ETL Architecture

Introduction

As organizations have expanded, acquired, and merged, their systems and applications have grown increasingly complex. Often, organizations realize that something is going wrong or out of control. To understand this, consider the following situation:

Due to an economic downturn leading to cost cutting, a retail company's business head decided to send physical product promotional catalogs to only their top customers to maintain profitable relationships with them. With this in mind, he conveyed this assignment to his executive team and asked them to make it happen. The executive team contacted the representatives of their Customer Relationship Management (CRM), billing, and sales systems to find such customers. Different systems needed to interoperate to come up with an answer, and they were not able to reconcile and agree on the customer information they stored. The systems did not have a single true view of their customers.

What went wrong in this company's IT systems? What could be the reason for ambiguity in customer data? Before answering these questions, let's examine the current business scenarios that exist in enterprises.

Business operations in most enterprises are driven by data, and it can even be considered the lifeblood of the business. Data enters enterprise systems and applications through many channels, such as messages and electronic files. It then flows through various systems, such as CRM, sales, and billing, gets transformed, and is stored in those systems in a variety of formats, either partially or wholly, as required by each system. When a new enterprise application is added to carry out a new requirement, this data gets migrated and stored there as well. The result is a set of enterprise applications, each with its own set of data encapsulated within its system, even when the scope of the data is enterprise-wide.
The retail company soon figured out that, while focusing on growth, it had lost focus on its data as an enterprise asset and on the need for an enterprise-wide data management approach.

Figure 1 - Fragmented Data

Master Data Management

Master Data Management (MDM) is a set of policies, procedures, tools, and infrastructure used to capture, integrate, and share master data in a consistent, accurate, complete, and timely manner. Master data are the reference data elements of an enterprise; customer, product, and employee are all master data, as opposed to transactional data such as order, reservation, or claim. Some data, such as a list of states, units, and production composition, remains static and may not require management at an enterprise level. MDM deals with the issue of scattered and fractured master data from both a business and a technical perspective.

Data as an Organizational Asset

What is an organizational asset?

In any organization an asset is considered to be:
- Something that has value. For example, a company's inventory has value.
- Something whose value can be measured. A computer system has some quantifiable value on its own. Even when it's not in use, its cost can be measured.
- Something that is required for day-to-day operations and helps an organization to achieve its objectives.

Usually the term "assets" brings financial and tangible assets to mind. The focus of asset management has always been on tangible objects like cash, inventory, tools, and equipment. Several factors make it difficult for organizations to treat their data the same way they treat other assets. Data is not tangible; it is not locked physically in a vault. It does not have intrinsic value; the value comes from how you use it. The generally accepted accounting principles do not recognize data as an asset in an organization's financial record unless it has been purchased. In addition, different users have different perceptions of the importance of data, so it is not managed and valued consistently across the enterprise.

So while many organizations will readily agree that their data is an important asset, when they are asked what they are actually doing to put this belief into action, the reality doesn't match the claims. Organizations must not only value tangible assets for their inherent contribution to business success, but must also actively and carefully consider the intangible data asset as one of the key differentiators in achieving business goals.

Why treat data as an asset?

As organizations move quickly to adopt new technologies, trends, and techniques as a way of responding faster to business needs, the one thing that remains unchanged is data. This is a valid reason to give data more importance rather than treating it as only a piece of information.
When an organization starts treating its data as an asset, it turns its focus from the effort and expense associated with only storing and processing data towards a full strategic lifecycle of data as an asset and the business value that can be obtained from using it. Master Data Management emphasizes the data-as-an-asset paradigm and its various facets instead of just business process perspectives.

MDM: Master Data as an Asset

Managing data as an asset requires data to be defined, secured, and controlled in a business environment. The following diagram illustrates a solution for master data management with the data-as-an-asset perspective.

Figure 2 - Data as an Asset (diagram labels: Master Data; Data Governance; ETL; ESB; Views; Portals, Portlets; Stored Procedures; Change Notifications; Maintenance; Web Services)
Data Governance

A key tenet of MDM states that the business must be an integral part of any MDM project. Data governance is the manifestation of that involvement in the process, where business and IT come together. Data governance is where the policies and procedures are created to regulate data creation and maintenance. The governance committee develops rules for data quality and stewardship and, ultimately, drives the enterprise towards treating data as an enterprise asset. To get the full benefit from a data-centric approach, data governance must be the foundation of your data management strategy.

Data Architecture

As master data is identified, it is important to establish a common business vocabulary for all business entities. This business vocabulary can be developed through enterprise data modeling and results in understandable and shared data definitions for all users across the enterprise.

Data Ownership

Creating a master data repository creates a single version of truth, but to maintain this data, each data domain and its associated data elements should have a clear operational owner. It's the responsibility of the business data owner to oversee the definitions, terminology, calculations, and usage of their data. The data owners ensure that the processes used to maintain and modify their domain data result in consistent data while satisfying business needs. They also monitor data security, privacy, and data quality levels.

Data Stewardship

Data stewards are established to provide on-the-ground coordination for governance activities. These stewards work with the governance team, business data owners, and data governors to support their directives for data creation and usage, data quality, and data security.

Data Quality, Security and Privacy

The data governance team also oversees the accuracy, integrity, cleanliness, correctness, completeness, and consistency of data across the organization.
Security issues such as network security, physical control, system logs, incident response, and security audits are also addressed. Based on their analysis, problem reports, and other feedback, the governance team will work with the business data owners to establish reactive and proactive activities to maintain and improve the quality of the enterprise's data.

Awareness, Sponsorship and Training

Finally, the governance team has the responsibility of promoting data governance awareness and acting as a key sponsor of governance-based initiatives. As the governance processes are applied to each business area, the governance team, in conjunction with the training organization, provides executive, stakeholder, steward, and user training to support the governance activities.

MDM Architecture

Two primary architectures have emerged for MDM: System of Record and System of Reference. An organization may adopt one of these approaches or a combination of the two to manage its master data. Both architectures consolidate master data and make it available to the enterprise in a form that is standardized according to agreed-upon guidelines.

System of Reference

The System of Reference architecture views master data as continuously updated reference data. This architecture aggregates master data in a central repository that acts as a reference across the enterprise. The data may enter through any business system and is accessible to other systems through the central reference repository.

Figure 3 - System of Reference (diagram labels: Data Integration, Reference MDM)
In this style of implementation, a copy of master data remains in the transactional systems. As a variation, a registry can be created that maps the master data, creating a common key for reference across the enterprise.

System of Record

The System of Record architecture assumes record-keeping functionalities for master data, maintaining tight control of Create, Read, Update, and Delete (CRUD) actions. The MDM system becomes the point of entry, the custodian, and the authoritative reference for master data. Alternatively, systems and applications that receive master data may collaborate with the MDM system to author master data in the centralized repository. In the System of Record architecture, individual applications no longer maintain master data in their environment, except for technical reasons (such as caching for performance). Note that each of these applications retains a dedicated data store for application-specific data such as transactions or logs.

Figure 4 - System of Record (diagram labels: Data Entry, Reference MDM)

Hybrid Architecture

As the name suggests, this architecture is a combination of both the System of Record and the System of Reference. In reality, not all applications may be able to offload record-keeping functionalities to another system. Such systems will use the MDM system as a reference, while other systems may offload record-keeping and use the System of Record capabilities of the MDM system.

Metadata

All master data entities identified for an organization should capture descriptive information about the enterprise data, known as metadata. This includes:

Business Metadata

This includes a dictionary or glossary of business terms, data elements, acronyms, and abbreviations. It is all about making meaning explicit and providing the business description, terminology, aliases, limits, constraints, calculations, privacy, and usage of information.
Technical Metadata

Technical metadata includes the internal data types and structures of the information, its storage location, the systems that affect it, and more.

Operational Metadata

This includes operational run-time and performance statistics.

Data Services

To control and secure the MDM repository, it should be accessed and updated only through a collection of data services. These services implement the business operations and provide authentication, security, access control, and audits in support of organizational goals. Any change to the data repository has to flow through the data services layer. This layer can contain services such as ETL (Extract, Transform and Load), views, ESB (Enterprise Service Bus) adapters, web services, portals, notification services, and maintenance and enhancement services. Each of these services can be implemented without affecting the other services and their functionality. If the organization wishes to switch to some emerging trend or technology, the existing service can be modified to adapt to the
new technology without affecting other business functionality. Some of the services are:

Extract, Transform and Load (ETL)

These services are utilized for batch processing of data. They can extract data from source systems, stage it for cleansing, standardization, enhancement, and other data quality checks, and finally load the cleansed data into the repository.

Notification Services

These are outbound services that provide a common mechanism to notify subscribers (application systems) of changes to the Master Data Repository. Using these services, an application can ensure that it is aware of the latest information on any master data entity, regardless of which application initially recorded the updated information.

Web Services

Web services provide an input/output interface to the data repository. These can be utilized by middleware technologies to access and update data in near real time. Web services use XML-formatted messages to communicate; a data model is defined to exchange data to and from the data repository.

Maintenance and Enhancement Services

These services perform periodic operations to ensure the quality and integrity of the Data Repository.
Examples of maintenance and enhancement services are:
- Data Profiling
- Entity Matching
- Data Enrichment and Enhancement
- Data Quality Audits
- Backup and Recovery
- Archival and Purging

Key Benefits of MDM

Properly implemented, MDM promises to improve an enterprise's:

Operational Efficiency
- A clean, unambiguous, and consolidated view of data helps to improve the efficiency of business processes
- Better control over data through ownership and stewardship: modification, flow, and maintenance of data happen in a controlled manner
- Avoiding duplicated effort in maintaining and storing data saves money and management overhead

Stakeholder Satisfaction
- Better engagement and satisfaction levels from customers, business users, and technical teams

Risk Management
- Better compliance with business, technical, and legal requirements

Conclusion

Data has always been the lifeblood of organizations, but typically business processes have received more attention. The perspective of data as an asset provides seamless control over the quality, security, management, and lifecycle of data, which in turn provides improved capabilities to the business. With the increasing pace of business operations and demands for quality data and high availability, it is time to focus on data as the backbone of organizations. Though the initial effort of establishing data governance and data management disciplines involves a lot of time and effort from business and IT stakeholders, once all policies, procedures, and infrastructure are in place, the business becomes more nimble in meeting customer needs and business objectives. Treating data as an asset provides a single, centralized point of control, easier maintenance, and a single version of truth. Data management along with data governance provides a framework within which complex business functions can be carried out effectively and tracked to completion.
And finally, treating data as a corporate asset gives a sense of satisfaction to everyone from top management to IT stakeholders, while satisfying clients and customers at the same time.
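
As a closing illustration of the Data Services section, the following minimal Python sketch shows the idea that every change to the master data repository flows through one service layer, which cleanses incoming data, records an audit entry, and notifies subscribing applications. It is only a sketch under stated assumptions: the names MasterDataService, upsert, and load_batch are hypothetical and do not come from any MDM product, the repository is an in-memory dictionary, and cleansing is reduced to trimming whitespace.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class MasterDataService:
    """Hypothetical single entry point for changes to a master data repository."""

    records: Dict[str, dict] = field(default_factory=dict)
    audit_log: List[Tuple[str, str]] = field(default_factory=list)
    subscribers: List[Callable[[str, dict], None]] = field(default_factory=list)

    def subscribe(self, callback: Callable[[str, dict], None]) -> None:
        """Register an application that wants to hear about changes."""
        self.subscribers.append(callback)

    def upsert(self, key: str, attributes: Dict[str, str], source: str) -> bool:
        """Cleanse, store, audit, and notify. Returns True if data changed."""
        # Assumption: all attribute values are strings; cleansing = trimming.
        cleansed = {name: value.strip() for name, value in attributes.items()}
        if self.records.get(key) == cleansed:
            return False  # nothing new, so no audit entry and no notification
        self.records[key] = cleansed
        self.audit_log.append((source, key))
        for notify in self.subscribers:
            notify(key, dict(cleansed))  # pass a copy to each subscriber
        return True

    def load_batch(self, rows: List[Dict[str, str]], source: str) -> int:
        """ETL-style batch load. Returns the number of records that changed."""
        changed = 0
        for row in rows:
            attributes = {k: v for k, v in row.items() if k != "id"}
            if self.upsert(row["id"], attributes, source):
                changed += 1
        return changed


# Usage: a billing application subscribes, then a CRM extract is loaded.
service = MasterDataService()
received: List[str] = []
service.subscribe(lambda key, record: received.append(key))

changed = service.load_batch(
    [
        {"id": "C001", "name": "  Acme Retail "},
        {"id": "C001", "name": "Acme Retail"},  # same customer after cleansing
    ],
    source="crm",
)
```

Because the second row is identical to the first once cleansed, only one change is stored, audited, and announced. A real data services layer would add authentication, access control, and persistent storage, which this sketch deliberately leaves out.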
About The Author

Bandish Gupta has been involved with the technical and business aspects of building data warehouses. She has good exposure to various tools and techniques in the BI/DW space, including extract-transform-load (ETL) and database tools. She has worked with organizations in the retail and healthcare domains. Her current interests include business intelligence, data profiling, data quality, data governance, and metadata. She is based out of Bangalore, India.
CIBER, Inc. (NYSE: CBR) is a pure-play international system integration consultancy and outsourcing company with superior value-priced services and reliable delivery for both private and government sector clients. CIBER's services are offered globally on a project or strategic-staffing basis, in both custom and enterprise resource planning (ERP) package environments, and across all technology platforms, operating systems, and infrastructures. Founded in 1974 and headquartered in Greenwood Village, Colo., CIBER now serves client businesses from over 40 U.S. offices, 25 European offices, and seven offices in Asia/Pacific. Operating in 18 countries, with more than 8,500 employees and annual revenue of approximately $1.2 billion, CIBER and its IT specialists continuously build and upgrade clients' systems to competitive advantage status. CIBER is included in the Russell 2000 Index and the S&P Small Cap 600 Index. CIBER, the Reliable Global IT Services Partner.

CIBER, Inc
South Fiddler's Green Circle Suite 1400
Greenwood Village, CO

CIBER, Inc. All rights reserved. CIBER and the CIBER logo are registered trademarks of CIBER, Inc. CIBER stock is publicly traded under the symbol CBR on the NYSE.