Uncovering four strategies to approach master data management

Anne Cleven, Institute of Information Management, University of St. Gallen, Switzerland, anne.cleven@unisg.ch
Felix Wortmann, Institute of Information Management, University of St. Gallen, Switzerland, felix.wortmann@unisg.ch

Abstract

Recently, much Information Systems (IS) research has focused on master data management (MDM), which promises to increase an organization's overall core data quality. Beyond any doubt, however, MDM initiatives confront organizations with multi-faceted and complex challenges that call for a more strategic approach to MDM. In this paper we introduce a framework for approaching MDM projects that has been developed in the course of a design science research study. The framework distinguishes four major strategies for initiating MDM projects, each featuring its specific assets and drawbacks. The usefulness of our artifact is illustrated in a short case narrative.

1. Introduction

Master data management (MDM) represents one of the latest hot topics in the information systems discipline [31, 44]. As a comprehensive and enterprise-wide approach, MDM is meant to provide organizations with the ability to integrate, analyze and exploit the value of their data assets, regardless of where that information was collected [41, p. 218]. The need for such an approach is rooted in the severe problems organizations face with (master) data quality. Decades of using and maintaining data in disparate data stores have led to a multitude of inconsistencies in data definitions, data formats, and data values, which makes it next to impossible for an organization to understand and use its key data [36, p. 65]. Problems like duplicate customer numbers, incomplete or missing attributes, deviations of actual values from denoted meta-labels, and the use of free text forms proliferate with a growing number of different systems and an increasing volume of data stored and maintained in them [42]. A recent survey discloses that organizations expect data error rates of up to 30% [11]. Even worse are the results of another survey indicating that 83% of the companies polled have suffered significantly as a result of having poor master data quality [32].

MDM as briefly outlined above shows great promise for overcoming these challenges. However, as we have witnessed in different projects with partner organizations, there is no one-size-fits-all approach for implementing MDM. Depending on the master data objectives, some approaches may be oversized, others undersized. We thus argue that different objectives require different MDM approaches and propose four different strategies for setting up an MDM quality improvement project. We do so by first providing a theoretical background for our contribution (section 2). The theoretical background includes a data taxonomy that delineates master data from other business data like transactional data or metadata and outlines key master data characteristics. Section 2 further briefly introduces the essential components of a holistic approach to implementing MDM. Subsequently, we sketch the methodology of our research project, which is based on the design science research (DSR) approach (section 3). Our framework of four different strategies for initiating MDM projects is presented in section 4 and rounded off with a small case narrative from one of our partner companies (section 5). The concluding section 6 summarizes and discusses our approach and gives an outlook on future work.
2. Theoretical background

2.1. A data taxonomy

The term, or more precisely the concept, of data is delicate and fuzzy and has thus provoked a number of competing definitions [34]. Most commonly, however, data is defined as symbols, numbers, or other representations of facts and as such the raw material for information needed for everyday operations and adequate decision making [21]. In order to facilitate and support an adequate and systematic management of organizational data, a further itemization of the concept of data is necessary. Following TOZER, data can most fundamentally be divided into domain (or subject area) data and metadata [39]. While domain data represents the business domain at hand, metadata in turn relates to the domain data [12].
TOZER furthermore distinguishes between two non-disjoint metadata classes: operational and informational metadata [39]. The purpose of operational metadata is to enable the design and technical operation of information systems [39]. Informational metadata on the other hand facilitates the understanding and access of domain data and is maintained for end users [39]. FOSHAY ET AL. further elaborate on informational metadata and differentiate between definitional ("What does this data mean, from a business perspective?"), data quality ("Does this data possess sufficient quality for me to use it for a specific purpose?"), navigational ("Where can I find the data I need?") and lineage metadata ("Where did this data originate, and what's been done to it?") [12].

Domain data can be divided into master and transactional data. Master data refers to core business entities a company uses repeatedly across many business processes and systems [6]. Typical master data entities are customers, suppliers, or products. The following characteristics are ascribed to master data:

Independent existence [9]: In contrast to transactional data, master data objects are independent of other objects. Master data objects can exist without any other object. A sales order, for example, cannot exist without a customer, whereas a customer does not need any other object to exist.

Low change frequency [28]: Master data objects are stable compared to transactional data. Most attributes of a master data object are not changed during its lifecycle. While the sales order naturally changes during its lifecycle (e.g. its status changes from "released" to "in production" to "fulfilled"), the customer master data remains unchanged during business transactions.

Constant data volume [28]: The number of master data objects (e.g. products) remains relatively constant compared to transactional data objects (e.g. sales orders).

Transactional data represents business transactions [9]. Typical transactional data are sales orders, production requests or invoices. In contrast to master data, transactional data naturally changes during its lifecycle (e.g. status changes as described above). Furthermore, the volume of transactional data (e.g. the number of sales orders) increases with ongoing business activity.

Reference data represents an agreed-upon set of values, e.g. the abbreviation for a state, currency or gender, that is used across multiple organizational units or systems to ensure consistent values for attributes of master data, transactional data or metadata [9]. Reference data has characteristics similar to those of master data. It can exist without any other data (independent existence), its volume is low, and changes to reference data occur even more rarely. Nevertheless, there is a fundamental difference to master data. While master data refers to core business entities, reference data is more fine-grained and not limited to domain data.

Figure 1 summarizes the argumentation and relates the constituent components of the introduced data taxonomy to each other. Master and transactional data represent the two subtypes of domain data, whereas metadata can be differentiated into informational and operational metadata. All four data types refer to an agreed-upon set of reference data.

Figure 1. Data taxonomy
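To make the taxonomy tangible, the following minimal sketch, with illustrative entity and attribute names of our own choosing rather than ones taken from the cited sources, expresses the distinctions in Python: the customer exists independently and changes rarely, the sales order cannot exist without a customer and changes status during its lifecycle, and both draw on an agreed-upon set of reference values.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Currency(Enum):
    """Reference data: an agreed-upon set of values shared across systems."""
    EUR = "EUR"
    USD = "USD"
    CHF = "CHF"


@dataclass
class Customer:
    """Master data: independent existence, low change frequency."""
    customer_id: str      # e.g. "C0042"
    name: str
    country: str


@dataclass
class SalesOrder:
    """Transactional data: depends on master data and changes during its
    lifecycle (e.g. status from "released" to "in production" to "fulfilled")."""
    order_id: str
    customer_id: str      # cannot exist without a customer
    amount: float
    currency: Currency
    order_date: date
    status: str = "released"
```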
2.2. Master data domains

Each industry brings its own set of master data requirements and characteristics [4, 46]. Nevertheless, three core master data domains can be identified:

Party [2, 4, 28, 46]: This domain contains all business partner related master data, e.g. for customers, suppliers, distributors, employees or citizens. A business partner can be any kind of person or organization. Typically, business partner master data contains contact and bank information. Moreover, relationships between organizations and persons (e.g. "works for") are covered.

Thing [34, 37, 46]: This domain contains all master data that relates to the products, services, or assets a company offers and owns. While there is a common set of views on master data entities in this domain, the master data attributes are heavily driven by industries and their product characteristics. Typical views on products and services are sales (e.g. prices), planning (e.g. lead times), purchasing (e.g. prices) or financials (e.g. valuations).

Location [34, 46]: This domain contains all master data that relates to places, sites or regions. A location can be a sales territory, a city, an office, a production facility or a shelf in a store. Location master data is often used together with party or thing master data to answer questions like "Where is a product produced?", "Where is a product sold?" or "To which sales territory does a customer belong?"

While there is a common understanding about and agreement on these three core master data domains, DREIBELBIS et al. suggest two more, relationship-based master data domains [9]:
Domain-specific groupings: This domain covers relationships between master data objects of one domain, including groupings (e.g. service categories), hierarchies (e.g. product hierarchies) as well as dimensional master data that categorizes core business entities for analytics and reporting.

Cross-domain relationships: This domain covers all relationships between master data objects of different domains, describing how parties, things and locations relate to each other. Typical questions addressed are "Who is responsible for a customer?" and "Which products do we source from a specific supplier?"

Each core master data type can be related both to further master data in its own domain and to master data from other domains. Figure 2 summarizes the argumentation and visualizes the introduced master data domains and their relationships.

Figure 2. Master data domains

2.3. Master data management

Master data represents one of the key assets and most valuable resources an organization owns. Consequently, the lack of adequate master data management may lead to a multitude of severe problems like operational malfunctions, inadequate decision making, and unnecessarily spent (human) resources and time [1]. There are, however, several reasons why organizations still struggle to manage their master data: Due to the rapid improvements in information processing and digital storage capabilities over the last few decades, organizations have moved from having next to no data to having far too much [10]. Moreover, the evolutionary growth of application landscapes, not least caused by an increasing number of mergers and acquisitions, has brought about a broad dispersion of master data among a variety of different systems [9]. Especially customer master data is frequently stored in "an array of systems" [37, p. 39], each serving different purposes like sales force automation, customer relationship management, and enterprise resource planning. All of these factors indeed turn the enterprise-wide consistent use, flexible exchange, timely synchronization, and guaranteed high quality of master data into a tough challenge [32].

Consequently, navigating the sea of MDM and making the entirety of a business's core data "unique, consistent, reliable, and traceable" [31, p. 7] calls for a holistic approach addressing technical as well as organizational aspects. This, however, first requires an in-depth understanding of both the fundamental elements of MDM and their interplay. The following two definitions give some indication of what an integrated approach to MDM should encompass. The first definition is offered by SMITH and MCKEEN: "Master data management (MDM) is an application-independent process which describes, owns and manages core business data entities. It ensures the consistency and accuracy of these data by providing a single set of guidelines for their management and thereby creates a common view of key company data, which may or may not be held in a common data source" [36, pp. 22 f.]. This definition may, however, be complemented with characteristics from a different definition provided by BERSON and DUBOV.
They define MDM as follows: "Master Data Management (MDM) is the framework of processes and technologies aimed at creating and maintaining an authoritative, reliable, sustainable, accurate and secure data environment that represents a single version of truth, an accepted system of record used both intra- and inter-enterprise across a diverse set of application systems, lines of business, and user communities" [4, p. 11].

Each of the two definitions emphasizes, in a more or less explicit way, that MDM is not only about selecting the right technology but also about establishing a supportive organizational environment as well as adequate processes. This clear distinction accounts for the fact that prior efforts to consolidate enterprise master data have been criticized for being by and large IT-driven, and that for an MDM initiative to be successful an "organizational preparedness" [28, p. 15] represents a key issue. Based on the perception that technical and organizational aspects are two mutually dependent design areas, current MDM literature identifies five fundamental components to be configured when starting an MDM initiative, namely: master data structure, master data systems architecture, master data governance, master data processes, and master data quality [4, 9, 19, 28].

Setting up the master data structure means establishing an agreed-upon understanding of each data object's definition as well as modeling the relations between these objects [19]. This effort particularly serves the consistent use of master data throughout the whole organization.
The master data systems architecture deals with the design of adequate systems to support each step within the master data object lifecycle, i.e. the creation, storage, access, archiving, and retirement of master data. Not least, this MDM component needs to address master data security aspects like privacy and retention. Different approaches have been suggested for the maintenance and provision of master data, e.g. a central master data system, a leading system, the definition of master data standards that each system must comply with, or a master data repository [27].

Establishing master data governance requires the definition of a clearly articulated mission statement as well as the assembly of appropriate organizational structures. The latter includes well-defined roles and stewardships, activities and decision areas, as well as responsibilities [26, 37]. A very popular means for the allocation of responsibilities is the RACI chart [43].

Master data processes partly represent the organizational counterpart to the master data systems architecture. They describe and prescribe how the primary activities of creating, using, maintaining, and archiving master data objects have to be executed. These core master data processes need to be embedded into an organization's daily business processes [23]. Moreover, master data processes outline how communication, support and training for MDM have to be conducted.

The fifth component of a holistic MDM approach is the master data quality component. While none of the aforementioned components is to be understood independently of the others, the quality component exhibits the strongest interconnection with each of the other MDM elements. Since data quality is both of "utmost concern" [9, p. 486] for master data on the one hand and the entry point for the strategies proposed herein on the other, it is subsequently described in more detail. Figure 3 summarizes the argumentation and depicts the interplay of the five core components of MDM.

Figure 3. Core elements of master data management

Quality improvement of enterprise master data is an ongoing endeavor that is conducted in an iterative course of action (cf. Figure 3). It follows the three main processes of strategic management, namely: analyze, implement and control [20]. During the analysis phase, key organizational data objects are identified. Moreover, the state of master data quality is assessed in order to provide an accurate picture of the current situation [3]. Subsequently, within the implementation phase, a semantic harmonization of each core data entity must be developed [15]. Data integration then involves matching, normalizing, cleansing and synchronizing master data from different sources across the organization for the purpose of providing a consolidated master data base. Thus, master data is first integrated from multiple sources and then itself becomes the definite source of that data for the enterprise [3, p. 75]. Subsequent to the integration process, master data may further be enriched by adding organizational and/or technical metadata as well as external data in order to supply additional value [25]. The third phase of the master data quality management cycle addresses quality control. This phase is meant to ensure that achieved improvements are neither impaired by a recurrent semantic divergence of master data nor weakened by the potential return of data errors.
Thus, in this process master data quality is monitored with respect to defined standards, policies, metrics and performance indicators [22]. The two phases control and analyze are strongly related and partly even apply the same methods and technologies, such as data profiling for the purpose of (a) initially identifying problematic master data or (b) continuously monitoring master data quality. Unsatisfactory results from quality monitoring during master data control may again induce a new analysis and, thus, lead to another iteration of the master data quality management cycle.
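As a simple illustration of the control phase, the following sketch (hypothetical table, metrics and thresholds; not prescribed by the cited literature) computes two elementary master data quality indicators and flags whether a new analysis iteration should be triggered.

```python
import pandas as pd

# Hypothetical customer master data extract.
customers = pd.DataFrame({
    "customer_id": ["C001", "C002", "C002", "C004"],
    "email":       ["a@example.com", None, "b@example.com", ""],
})

# Elementary quality indicators, monitored against agreed thresholds.
completeness = (customers["email"].fillna("") != "").mean()        # share of filled e-mail values
uniqueness = customers["customer_id"].nunique() / len(customers)   # share of distinct identifiers

thresholds = {"completeness": 0.95, "uniqueness": 1.0}
results = {"completeness": completeness, "uniqueness": uniqueness}

# Unsatisfactory results trigger another iteration of the quality cycle.
needs_new_analysis = any(results[k] < thresholds[k] for k in thresholds)
print(results, "trigger new analysis:", needs_new_analysis)
```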
3. Research methodology

With this paper we follow a DSR approach [18, 29, 45]. The information systems discipline has long been dominated by the behavioral science approach, which strives for the development and verification of theories, i.e. for truth. The design science approach, to the contrary, focuses on the development of effective solutions for practical problems, i.e. on accomplishing utility [18]. Organizational engineering in turn can be seen as one vital aspect of DSR that does not primarily aim at building information systems components, but at creating methods and techniques to analyze, model, and shape the interface between information technology (IT) and the changing organization [5].

A number of researchers have investigated prerequisites and approaches for doing high quality DSR [17, 24], and it is meanwhile common sense that relevance and rigor are the foundation for building valuable DSR artifacts. Usually, DSR projects begin with an opportunity or a problem found in an actual application environment, which makes the development of a solution relevant to a certain group of stakeholders. In the desire to solve the given problem and provide an efficient solution, the researcher draws on the knowledge, experience, and expertise as well as existing artifacts and processes found in the respective application domain and thereby ensures research rigor [17]. The essential core of the DSR project consists in actually creating the artifact. This design activity can be understood as an iterative process that starts with gathering requirements, continues with constructing an initial version of the artifact, then proceeds with incorporating feedback, which again leads to another iteration of the cycle.

The fundamentals for conducting DSR projects have been summarized by HEVNER ET AL. in seven guidelines [18]. Subsequently, we briefly explain how we applied these guidelines in our work.

Guideline 1 - Design as an artifact: In our current research project we develop a method for supporting enterprises in introducing MDM and improving master data quality on both the organizational and the technical level. Methods are recognized as an essential artifact in DSR as well as in organizational engineering [5, 18]. In this paper, we propose the first fragment of the method, which deals with MDM strategy selection.

Guideline 2 - Problem Relevance: We conduct our research project in the context of a research consortium consisting of the university on the one hand and industrial partner companies on the other. This special setting allows us to directly address practical problems and gather real-world requirements.

Guideline 3 - Design Evaluation: Evaluation plays a central role in our research project and is realized both formatively and summatively [38]. Formative evaluation in this case is based on interviews held with central stakeholders. A summative evaluation of the whole method will in the end be accomplished in the form of a comparative case study.

Guideline 4 - Research Contributions: With our research we aim at providing a valuable method for conducting sustainable MDM projects.

Guideline 5 - Research Rigor: For the development of our method we draw on a broad knowledge base of various data management topics like data quality, data governance, or data maintenance on the one hand [e.g. 4, 7, 28]. On the other hand, we apply established research approaches for method engineering [40].

Guideline 6 - Design as a Search Process: During our research project we are refining our method by constantly incorporating feedback and considering new requirements. For the purpose of enabling a replicable and comprehensible evaluation against the requirements defined by the stakeholders, we accompany the development process with a detailed documentation.

Guideline 7 - Communication of Research: The insights and results gained in the course of our research will be reported via publications.
We hope to thereby support practitioners in realizing better MDM implementations as well as researchers in gaining further knowledge and understanding of how to effectively use our four strategies.

4. Four strategies to approach master data management

As FUNG-A-FAT cuts right to the chase of the matter, MDM is "the ability of a company to control and validate data used to conduct business" [13, p. 23]. This ability, however, can be achieved in different ways. We propose the following framework that introduces four different strategies for both initially identifying key data entities and continuously monitoring and controlling master data quality (Figure 4). Thus, our strategies can be located in the analyze phase as well as the control phase of the master data quality management cycle introduced in section 2.3.

Based on and motivated by both extant MDM literature and the experiences and insights gained from projects with partner companies, we distinguish two different perspectives from which to set out when entering an MDM (quality improvement) project, namely: the object to start from and the course leading the project. Each of the two perspectives again subdivides into two characteristics. With regard to the object perspective, a data-driven and a process-driven approach are distinguished. This distinction is very commonly made in existing MDM literature and accounts for the fact that MDM is both a technical and an organizational issue [4, 28, 30, 35, 37]. BERSON and DUBOV, for example, differentiate an "information-focused" [4, p. 259] and a "process-focused" [4, p. 276] approach for analyzing and integrating master data.
Within the herein proposed framework, the data-driven approach largely seizes on early, mainly IT-driven attempts to address MDM issues, whereas the process-driven approach picks up on the latest voices in the field stating that "an MDM initiative is primarily intended to support an organization's business needs" [28, p. 9]. The term process in this context refers to a company's daily business processes, which are taken as a starting point for effectively addressing specific business problems and providing tangible business value [35]. By integrating both approaches, we aim at holistically addressing the introduction, establishment, quality assurance and maintenance of enterprise-wide MDM.

With regard to the course perspective, in turn, we make the distinction between a problem-oriented and a solution-oriented approach. While the problem-oriented strategy is understood as a bottom-up approach, taking the as-is situation as the starting point, the solution-oriented strategy promotes the top-down introduction of a to-be situation detached from specific one-time jobs. These two alternatives are very common in the field of business process re-engineering, where they are called evolutionary and revolutionary [8, 16]. While the evolutionary approach strives for an incremental enhancement of the prevailing situation, the revolutionary approach aims at radical change by introducing a completely new and preferably optimal solution. We consider these two approaches for implementing change a suitable counterpart to the object perspective introduced above. The pair-wise combination of one characteristic per perspective then constitutes one strategy to approach master data management and improve master data quality.

Figure 4. Four strategies to approach MDM

The different strategies identified here can also be interpreted as design patterns as they are known and frequently used in the world of object-oriented programming. A pattern describes the pair of a regularly recurring problem and a respective general solution. The general solution is not meant to solve one specific problem, but to serve as a template that can be reused for speeding up the problem-solving process for a certain problem class [14]. The following sections provide a detailed description as well as an elucidating example for each of the four strategies.

4.1. Data-driven, problem-oriented strategy

Taking a data perspective for setting up an MDM (quality improvement) project is an obvious and straightforward approach. The basic idea of the data-driven, problem-oriented strategy is to analyze and control existing master data in an explorative way. Identified master data quality issues then serve as a basis for improvement. To ensure a low-effort analysis, each master data source is analyzed in an isolated manner, even though master data is often stored redundantly and master data inconsistencies between different master data sources are fundamental master data issues. Following RAHM and DO, two kinds of data quality problems can be identified applying this approach [33]:

Schema-related problems: The lack of appropriate model-specific or application-specific integrity constraints leads to schema-related problems. Root causes are typically data model limitations, poor schema design or insufficient integrity constraints.

Instance-specific problems: Instance-specific problems are errors and inconsistencies that cannot be prevented at the schema level, like missing values, duplicated records or invalid references (see the sketch below).
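To make these problem types concrete, the sketch below (hypothetical table, column names and rules of our own) derives simple single-attribute profiling statistics and applies two instance-level checks, duplicate identifiers and pattern conformance, of the kind that data profiling, introduced next, would automate.

```python
import pandas as pd

# Hypothetical customer master data from a single source system.
customers = pd.DataFrame({
    "customer_id": ["C001", "C002", "C002", "c04", None],
    "gender":      ["F", "M", "M", "X", "F"],
})

# Single-attribute profiling: null values, distinct values and their frequencies.
profile = {
    col: {
        "nulls": int(customers[col].isna().sum()),
        "distinct": int(customers[col].nunique()),
        "frequencies": customers[col].value_counts(dropna=False).to_dict(),
    }
    for col in customers.columns
}

# Instance-specific findings: duplicate identifiers and identifiers violating
# the expected pattern (here: one letter followed by three digits, e.g. "C001").
duplicate_ids = customers[customers.duplicated(subset=["customer_id"], keep=False)]
bad_pattern = customers[~customers["customer_id"].str.match(r"^[A-Z]\d{3}$", na=False)]

print(profile)
print(len(duplicate_ids), "duplicate ids,", len(bad_pattern), "ids violating the pattern")
```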
Data profiling is used as a means to uncover these master data quality issues. It represents a set of algorithms for statistical anomaly analysis and assessment of data values [28]. Data profiling focuses on the instance analysis of individual attributes. It derives information such as the data type, length, value range, discrete values and their frequency, variance, uniqueness, occurrence of null values and typical strings [33, p. 6]. This information is then used as a basis to detect master data quality problems. Typical checks are [33]:

Identifier: Identifiers like the product id are first of all checked for uniqueness. Furthermore, all identifier attribute values are checked for pattern conformance, as there are typically basic rules for how an identifier has to be specified, e.g. the product id is a single alpha character followed by three numbers.

Descriptor: Descriptors like the product description are checked for uniqueness, too. Fuzzy logic is applied to overcome misspellings and variations. Furthermore, pattern analysis is regularly performed on these kinds of attributes.

Classification code: In the case of classification attributes, the cardinality of attribute values is often limited to a specific number, e.g. there are no more than two genders. In order to assess the quality of classification codes, typically all classification code values and their frequencies are determined.
Table 1 concludes the argumentation and depicts usage scenarios, advantages as well as disadvantages of the data-driven, problem-oriented strategy.

Table 1. Characteristics of strategy I: data-driven, problem-oriented strategy
Usage scenarios: problem-oriented data analysis as an initial root cause analysis; problem-oriented data analysis as a low-budget root cause analysis.
Advantages: root cause approach to address master data issues in multiple dependent business processes; low effort; approach to determine an unbiased, neutral master data status quo.
Disadvantages: no focus on the business impact of master data quality issues; no systematic approach.

4.2. Data-driven, solution-oriented strategy

The basic idea of the data-driven, solution-oriented strategy is to analyze and assess existing master data in a systematic, top-down approach. Thus, master data objects are analyzed with a clear perception of the to-be situation in mind, and deviations thereof are methodically eliminated. To this end, a set of business rules is defined for each master data object [9]. Examples of product master data business rules are:

Product uniqueness: Each real-world product is represented by exactly one entity across all systems.

Product description: All product descriptions are maintained (no missing or empty values) and conform to a defined pattern, e.g. a string with a maximum length of 150 characters.

Product category: The product category of each product is maintained (no missing or empty values) and relates to a valid product category.

A prerequisite for applying and evaluating these rules is an integrated, cross-system master data approach [27]. At the schema level, data model and schema design differences have to be addressed by the steps of schema translation and schema integration [33]. At the instance level, data conflicts, i.e. different value representations or different interpretations of values across sources, have to be resolved [33]. A sketch of how such rules might be evaluated is given below.
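The following sketch (hypothetical rule set and data; pandas is assumed to be available) shows how the three business rules above could be evaluated against an integrated, cross-system product extract; each rule yields the set of violating records, which then drives the improvement effort.

```python
import pandas as pd

# Hypothetical integrated product master data from several source systems.
products = pd.DataFrame({
    "product_id":  ["A100", "A100", "B200", "C300"],
    "description": ["Pump 50W", "Pump 50 Watt", "", "Valve DN25"],
    "category_id": ["CAT1", "CAT1", None, "CAT7"],
})
valid_categories = {"CAT1", "CAT2", "CAT3"}   # agreed reference data

# Product uniqueness: each real-world product is represented exactly once.
uniqueness_violations = products[products.duplicated(subset=["product_id"], keep=False)]

# Product description: maintained and conforming to the defined pattern
# (here: a non-empty string of at most 150 characters).
description_ok = products["description"].str.match(r"^.{1,150}$", na=False)
description_violations = products[~description_ok]

# Product category: maintained and referring to a valid product category.
category_violations = products[~products["category_id"].isin(valid_categories)]

for rule, violations in [("uniqueness", uniqueness_violations),
                         ("description", description_violations),
                         ("category", category_violations)]:
    print(rule, "violations:", len(violations))
```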
Table 2 concludes the argumentation and depicts usage scenarios, advantages as well as disadvantages of the data-driven, solution-oriented strategy.

Table 2. Characteristics of strategy II: data-driven, solution-oriented strategy
Usage scenarios: solution-oriented data analysis as a detailed, systematic root cause analysis; solution-driven data analysis as the solid foundation for a master data change project.
Advantages: systematic approach to determine the status quo with regard to a well-defined target solution; root cause approach to address master data issues in multiple dependent business processes.
Disadvantages: high effort; no focus on the business impact of master data quality issues.

4.3. Process-driven, problem-oriented strategy

The third strategy we propose to address MDM again exhibits an explorative, bottom-up character. In this case, however, business processes are taken as the point of origin to deal with master data problems in order to identify the major pain points the organization experiences with bad master data. This strategy takes into account that a purely data-driven approach indeed enables the organization to get started quickly with MDM, but may perhaps not effectively solve the actual business problems or ensure the intended business value [35]. SHANKAR argues in a similar way, stating that "MDM solves business problems by efficiently managing master data that is critical to a company's business operations. Consequently, the way an MDM solution is implemented depends foremost on which business problems are being tackled" [35, p. 38]. Thus, in this case the MDM project adopts a strong business attitude, analyzes which business processes need to be improved first, and goes down to the data and systems only in a second step.

The realization of this strategy requires interviews with employees of each department. The staff is requested to report on those processes that are most severely affected by bad master data quality. Furthermore, estimations have to be made of the impact of bad data quality, e.g. in terms of process performance metrics like cycle time or in terms of the number of dissatisfied customers or lost orders. Based on these figures, the impact of bad data quality can be quantified and measured. Subsequently, a source analysis of the low-quality master data objects is conducted in order to spot where data quality issues originate. This approach represents a low-effort and thus low-cost assessment of the most serious master data quality issues.
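As a simple illustration of such a quantification (all figures hypothetical and merely for demonstration), the yearly impact of bad customer master data on an order process might be estimated from the interview figures as follows.

```python
# Hypothetical figures gathered from department interviews.
orders_per_year = 120_000
share_affected_by_bad_data = 0.04     # estimated share of orders hit by bad master data
rework_cost_per_order = 35.0          # cost of manual correction and re-processing
lost_orders_per_year = 300            # orders lost due to dissatisfied customers
avg_margin_per_order = 80.0

rework_cost = orders_per_year * share_affected_by_bad_data * rework_cost_per_order
lost_margin = lost_orders_per_year * avg_margin_per_order

print(f"Estimated yearly impact: {rework_cost + lost_margin:,.0f} "
      f"(rework {rework_cost:,.0f}, lost margin {lost_margin:,.0f})")
```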
Table 3 concludes the argumentation and depicts usage scenarios, advantages as well as disadvantages of the process-driven, problem-oriented strategy.

Table 3. Characteristics of strategy III: process-driven, problem-oriented strategy
Usage scenarios: problem-oriented process analysis as an initial step to identify the business impact of bad master data; problem-oriented process analysis as a low-budget impact analysis.
Advantages: focus on the business impact of master data quality issues; low effort; approach is not seen as yet another IT initiative and is thus better supported by the business side.
Disadvantages: focus on business processes may not uncover all major master data root cause problems; no systematic approach.

4.4. Process-driven, solution-oriented strategy

The fourth strategy of the proposed framework represents a process-driven, solution-oriented approach. As such, it is characterized as a systematic, top-down approach. The basic idea is to draw on either the company's process handbook or documentation or a comparable business process reference model as a to-be reference in order to holistically analyze the organization's business process landscape and derive therefrom the related master data lifecycle processes. Subsequently, business processes need to be ranked with regard to their share of the added value in order to develop a prioritization of master data quality measures. This approach is especially valuable if an enterprise is willing to spend effort and money on systematically and holistically addressing its master data quality issues. A major advantage of the approach is its business orientation, which signals that MDM is not yet another IT initiative, but an approach to support the employees in doing their daily business. This again leads to much higher project support from the business side. However, the problems of disparate business units having to come together to a higher degree than they are typically used to must not be neglected [30]. Table 4 concludes the argumentation and depicts usage scenarios, advantages as well as disadvantages of the process-driven, solution-oriented strategy.

Table 4. Characteristics of strategy IV: process-driven, solution-oriented strategy
Usage scenarios: solution-driven process analysis as a detailed, systematic business impact analysis; solution-driven process analysis as the solid foundation for a master data change project.
Advantages: focus on the business impact of master data quality issues; systematic approach to determine the business impact of bad master data quality; approach is not seen as yet another IT initiative and is thus better supported by the business side.
Disadvantages: focus on business processes may not uncover all major master data root cause problems; high effort.

The strategies introduced in the herein proposed framework must not be interpreted as mutually exclusive. Rather, it is suggested that, subject to the pursued MDM and master data quality objectives, different strategies may be applied either in parallel or in sequence. A brief example is given in the subsequent case narrative.

5. Case narrative

One of our partner companies, a medium-sized organization in the leasing industry, recently set up a project for master data quality improvement. Its IT landscape is characterized by a high degree of heterogeneity due to the use of several different contract systems for different product lines.
Though the overall master data quality was perceived as poor and some departments were complaining about inconsistencies, there was initially little to no substantiated knowledge about the prevailing master data quality situation. In order to get a quick first impression with low effort, it was thus decided to initially implement the data-driven, problem-oriented strategy (1). The results from the system-by-system explorative data analysis were alarming and proved that many data records were incomplete and inconsistent. It was thus decided to subsequently combine the process-driven, problem-oriented strategy (2a), in order to buy in the business side and quickly identify the major pain points, with the data-driven, solution-oriented strategy (2b), in order to systematically create a uniform to-be master data landscape across the whole organization. Figure 5 visualizes how the strategies were combined.

Figure 5. Exemplary combination of strategies
6. Conclusion and future work

With this paper we proposed a framework consisting of four different strategies to approach MDM. We developed these strategies as the first fragment of a method for implementing MDM projects and improving master data quality in the course of a collaborative research endeavor with organizations from different industries. The short case narrative shows that the approach provides valuable assistance for organizations addressing the topic of MDM. However, in the current version the strategies are not yet deeply elaborated and approved, and need further evaluation and refinement. Furthermore, though based on both the current body of knowledge in MDM and the experiences made in a number of research projects with our partner companies, we do not claim the introduced perspectives to be exhaustive. There may be further perspectives that play a role when setting up an MDM (quality improvement) initiative. Moreover, it has to be investigated how other components of the MDM framework are affected by decisions taken within the analysis phase.

7. References

[1] A. Andreescu, and M. Mircea: "Combining Actual Trends in Software Systems for Business Management", in: Proceedings of the International Conference on Computer Systems and Technologies (CompSysTech'08), Gabrovo, Bulgaria 2008, pp. V.9-1-V.9-6.
[2] C. Beasty: "The Master Piece", CRM Magazine, Vol. 12, 2008, pp. 39-42.
[3] P.A. Bernstein, and L.M. Haas: "Information Integration in the Enterprise", Communications of the ACM, Vol. 51, 2008, pp. 72-79.
[4] A. Berson, and L. Dubov, Master Data Management and Customer Data Integration for a Global Enterprise, McGraw-Hill, 2007.
[5] C. Braun, F. Wortmann, M. Hafner, and R. Winter: "Method construction - a core approach to organizational engineering", in: Proceedings of the 2005 ACM Symposium on Applied Computing, Santa Fe, New Mexico 2005, pp. 1295-1299.
[6] J.-S. Brunner, L. Ma, C. Wang, L. Zhang, D.C. Wolfson, Y. Pan, and K. Srinivas: "Explorations in the use of semantic web technologies for product information management", in: Proceedings of the 16th International Conference on World Wide Web, Banff, Alberta, Canada 2007, pp. 747-756.
[7] DAMA International, The DAMA Guide to the Data Management Body of Knowledge (DAMA-DMBOK), Technics Publications, LLC, 2009.
[8] T.H. Davenport, and J.E. Short: "The New Industrial Engineering - Information Technology and Business Process Redesign", Sloan Management Review, Vol. 31, 1990, pp. 11-27.
[9] A. Dreibelbis, E. Hechler, I. Milman, M. Oberhofer, P. van Run, and D. Wolfson, Enterprise Master Data Management: An SOA Approach to Managing Core Information, IBM Press, 2008.
[10] J. Dyché, and E. Levy: "Not Your Father's List Management: MDM Matures, Part 1", DM Review, Vol. 18, 2008, pp. 14-16.
[11] W. Fan, F. Geerts, and X. Jia: "A Revival of Integrity Constraints for Data Cleaning", in: Proceedings of the 34th International Conference on Very Large Data Bases, Auckland, New Zealand 2008, pp. 1522-1523.
[12] N. Foshay, A. Mukherjee, and A. Taylor: "Does Data Warehouse End-User Metadata Add Value?", Communications of the ACM, Vol. 50, 2007, pp. 70-77.
[13] M. Fung-A-Fat: "Why is Consistency so Inconsistent? The Problem of Master Data Management", Cutter IT Journal, Vol. 20, 2007, pp. 23-29.
[14] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley Professional, 1994.
[15] J. Griffin: "Overcoming Challenges to Master Data Management Implementation", DM Review, Vol. 16, 2006, p. 17.
[16] M. Hammer, and J. Champy, Reengineering the Corporation - A Manifesto for Business Revolution, HarperCollins Publishers, New York, 1993.
[17] A.R. Hevner: "A Three Cycle View of Design Science Research", Scandinavian Journal of Information Systems, Vol. 19, 2007, pp. 87-92.
[18] A.R. Hevner, S.T. March, J. Park, and S. Ram: "Design Science in Information Systems Research", MIS Quarterly, Vol. 28, 2004, pp. 75-105.
[19] S. Hoberman, D. Burbank, and C. Bradley, Data Modeling for the Business. A Handbook for Aligning the Business with IT Using High-Level Data Models, Technics Publications, LLC, Bradley Beach, NJ, 2009.
[20] C.W. Hofer, R.A. Pitts, E.A. Murray, and R. Charan, Strategic Management: A Casebook in Business Policy and Planning, West Publishing Co, 1980.
[21] K.-T. Huang, Y.W. Lee, and R.Y. Wang, Quality Information and Knowledge, Prentice Hall, Upper Saddle River, NJ, 1999.
[22] K.M. Hüner, M. Ofner, and B. Otto: "Towards a Maturity Model for Corporate Data Quality Management", in: Proceedings of the 24th Annual ACM Symposium on Applied Computing, Shin, D. (ed.), Honolulu, Hawaii, USA 2009, pp. 231-238.
[23] K.M. Hüner, and B. Otto: "Functional Reference Architecture for Corporate Master Data Management", BE HSG / CC CDQ / 21, St. Gallen, Switzerland 2009.
[24] J. Iivari: "A Paradigmatic Analysis of Information Systems as a Design Science", Scandinavian Journal of Information Systems, Vol. 19, 2007, pp. 39-64.
[25] B. Inmon, B. O'Neil, and L. Fryman, Business Metadata: The Quest for Business Clarity, Morgan Kaufmann Publishers, Burlington, MA, 2007.
[26] A. Joshi: "MDM Governance: A Unified Team Approach", Cutter IT Journal, Vol. 20, 2007, pp. 30-35.
[27] C. Loser, C. Legner, and D. Gizanis: "Master Data Management for Collaborative Service Processes", in: Proceedings of the International Conference on Service Systems and Service Management, Chen, J. (ed.), Beijing 2004.
[28] D. Loshin, Master Data Management, Morgan Kaufmann, 2008.
[29] S.T. March, and G.F. Smith: "Design and natural science research on information technology", Decision Support Systems, Vol. 15, 1995, pp. 251-266.
[30] W. McKnight: "Justifying and Implementing Master Data Management for the Enterprise", DM Review, Vol. 16, 2006, pp. 12-14.
[31] L.T. Moss: "Critical Success Factors for Master Data Management", Cutter IT Journal, Vol. 20, 2007, pp. 7-12.
[32] S. Neil: "A new structure for corporate data", Managing Automation, Vol. 22, 2007, pp. 40-43.
[33] E. Rahm, and H.H. Do: "Data Cleaning: Problems and Current Approaches", IEEE Bulletin of the Technical Committee on Data Engineering, Vol. 23, 2000, pp. 3-13.
[34] T.C. Redman, Data Quality: The Field Guide, Digital Press, Woburn, MA, 2000.
[35] R. Shankar: "Master Data Management Strategies to Start Small and Grow Big", Business Intelligence Journal, Vol. 13, 2008, pp. 37-47.
[36] H.A. Smith, and J.D. McKeen: "Developments in Practice XXX: Master Data Management: Salvation Or Snake Oil?", Communications of the Association for Information Systems, Vol. 23, 2008, pp. 63-72.
[37] C. Snow: "Embrace the role and value of master data management", Manufacturing Business Technology, Vol. 26, 2008, pp. 38-40.
[38] D.L. Stufflebeam: "Evaluation Checklists: Practical Tools for Guiding and Judging Evaluations", American Journal of Evaluation, Vol. 22, 2001, p. 71.
[39] G. Tozer, Metadata Management for Information Control and Business Success, Artech House, Norwood, 1999.
[40] D. Truex, and D.E. Avison: "Method Engineering - Reflections on the Past and Ways Forward", in: Proceedings of the Ninth Americas Conference on Information Systems, DeGross, J.I. (ed.), Atlanta, Georgia, USA 2003, pp. 508-514.
[41] S. Tuck: "Is MDM the route to the Holy Grail?", Journal of Database Marketing & Customer Strategy Management, Vol. 15, 2008, pp. 218-220.
[42] J. Vosburg, and A. Kumar: "Managing dirty data in organizations using ERP: lessons from a case study", Industrial Management & Data Systems, Vol. 101, 2001, pp. 21-31.
[43] K. Wende: "A Model for Data Governance - Organising Accountabilities for Data Quality Management", in: Proceedings of the 18th Australasian Conference on Information Systems (ACIS 2007), Toleman, M., Cater-Steel, A., and Roberts, D. (eds.), Toowoomba, Queensland 2007.
[44] A. White, J. Radcliffe, and C. Eschinger: "Predicts 2009: Master Data Management Is Applicable in Down Economies and in Times of Growth", GartnerGroup Research, 2008, ID # G00164023.
[45] R. Winter: "Design science research in Europe", European Journal of Information Systems, Vol. 17, 2008, pp. 470-475.
[46] A. Zornes: "The Fourth Generation of MDM", DM Review, Vol. 17, 2007, pp. 26-37.