Master Data Management Architecture

Version: Draft 1.0
TRIM file number: -
Short description: Describes the architecture for managing data that is shared between organisational systems that is referred to as master data. Incorporating management requirements for each phase of the master data lifecycle. The architecture to support the delivery of the DIT business model of how information/data is to be shared between CSU information systems.
Relevant to: CSU DIT, Data Custodians, Data Consumers, Business Analysts, Solution Architects, Data Governance Committee, Developers, System Officers.
Authority: Executive Director, Division of Information Technology
Responsible officer: Enterprise Architect - Information
Responsible office: Enterprise Architecture & Liaison, Division of Information Technology
Date introduced: April 2012
Date(s) modified:
Next scheduled review date: December
Related University documents:
- CSU Enterprise Architecture Principles
- Data Standards
- Master Data Integration Standards
- Application Standards
- CSU Identity Standards
- CSU IT Infrastructure Standards
- CSU Security Standards
- Master Data Definitions
- Master Data Governance Framework
- CSU Data Principles
- CSU Information Strategy
- Enterprise Architecture Glossary of Terms
Related legislation: State Records Act 1998 (NSW); Privacy Act
Key words: data, data asset, data architecture, enterprise data, custodians, source systems, guidelines, rules, master data, shared data, data principles, data standards
1 Definition

Master Data Management

"Master data management (MDM) is a technology-enabled discipline in which business and IT work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise's official, shared master data assets." (Gartner)

Master data management applies to all enterprise data that is (electronically) shared to support the delivery of required business operations and strategic goals.

Master data can be categorised as:
- Structured & unstructured data
- Data located across multiple subject areas (Multi Domain environment)
- Operational data set
- Authoritative data

Master data has an origin system, destination system/s and a defined enterprise master data schema. It is informed by the CSU Information Strategy.

2 Purpose & Benefit

"Master Data Management is to ensure that master data is a reusable asset that is semantically assured across the enterprise." (Gartner)

The primary purpose of this architecture is to support the organisation in sharing the right data between CSU systems and Organisational Units in an effective and sustainable way, to enable the timely delivery of business data needs.

Core benefits to the organisation are:
- Single source of truth established
- Consistency in the data shared
- Sharing of the right data
- Supports data integrity and protection
- Improves availability of data
- Leveraging master data management for cost optimisation, enabling growth and agility, risk management and regulatory compliance
3 Scope

To realise purpose and benefits, the Master Data Management architecture will deliver the following functions:

Master Data Lifecycle
- Single lifecycle model to meet CSU needs
- Defined stages, associated processes, roles & responsibilities
- Master data is managed within each defined stage
- Organisational change process/es mapped to stages
- Tracking events
- Reporting

Identification of master data
- Meet business needs (one university, knowledge based organisation, plug n play, authoritative data shared)
- Authoritative data source
- Authoritative data lifecycle management processes [1]
- Data Custodian

Data Description & Context
Definition & classification
- Organisational view
- Common view of data
- What data means & purpose [1]
- Data Domain & relationships
- Business logic [1]
- At a level of abstraction that is open, non proprietary
- Associated CSU policies & legislative requirements
- Security classification
Note: Informed by relevant CSU policies and legislative requirements.

Data Sharing
- Master data definitions available (search)
- Accurate matching of data requirement to master data
- Process for approved access & use
- Data security classification
- Master data schema
- Instantiation of master data (copy stored within the MDC [2])
Dependencies exist with other contributing architectures for:
- Quality Assurance methods
- Data Issues Register
- Data integration methods
- Available data integration services
- Infrastructure standards (security, reliability, performance)

Maintenance of shared data sets
- Standard managed procedure
- Auditing of integrity and protection mechanisms

Collaborative Management
- Business Stakeholders for data rules & governance
- Data Custodians
- Data Consumers
- Information Architect
- Enabling Technologists
- Data Governance Committee
- Effective match of role to responsibility

Communication & Culture
- Availability of resources to support Data Stakeholders (in respective role responsibilities)
- Access to repository of master data artefacts
- Supports collaborative work (shared data, shared understanding, one university)
- Informs and educates University community on master data and relevant updates
- Master Data Advisory Services

[1] Requires input from a contributing business architecture, refer to section 11 for more details.
[2] MDC: Master Data Cache, an Oracle database that stores a copy of every populated master data definition, independent of any application.
4 Past States

With the introduction of the concept of master data into CSU in 2006 through the Data Architecture Project, the management of master data was limited to the identification and definition of a core set of master data entities managed by the Information Architect in consultation with the respective Data Custodian. The project also delivered a technology solution enabling a standard method of data integration (webMethods) between disparate systems, and an additional database set up to store a copy of shared (master) data, known as the Master Data Registry (MDR). The original purpose of this database was as a backup source for master data. A CSU Data Governance Committee (DGC) was also established to provide governance over data issues that limited or impeded the population and improvement of master data.

This project established the close relationship between the Data and Integration Architectures. At this initial stage the concept of master data management was predominantly focused on data and technology, with less focus on business processes and people.

The conclusion of the project included the release of a first version of the data integration standards, and each new technology project started to reference these new standards. As a consequence, this initiated the use of (defined) master data by information systems (destination systems) and the need for additional master data (definitions).

As the number of information systems requiring access to source data increased, the role and responsibilities of the Data Custodian came more into focus, given the need to ensure that the right master data was defined, that an acceptable level of integrity existed and that an appropriate security/protection classification was applied. The Data Custodian worked with the Information Architect on security classification, integrity measures and other aspects relating to each master data set, which also directly supported the organisation in its one university principle.
For each project, a clear description of the business requirements was essential to enable the accurate identification of shared data requirements, and so the Business Analyst role started to contribute to the management of master data. Of critical importance was having clarity on the needs of the business and associated rules.

With master data definitions and use expanding, other processes and roles that contribute to the management of master data increasingly came into view, such as:
- Software Development Lifecycle (SDLC) and the roles of the Solution Architect, Developer and Testers.
- System Maintenance and the roles of the Application Custodian, Systems Officer, and Data Custodian.
- Organisational Change Management Processes

Variations in use of the MDR to support data integration services began to occur for various reasons, which created some confusion about the source of truth for particular master data. This triggered a review of the implementation of master data in the context of the integration architecture, resulting in clarification of the role of the MDR as a stored copy of master data. It was subsequently renamed the Master Data Cache (MDC).

As time moved on, the enterprise view, classification and integration of master data began to be tested by individual project data requirements and timeframes. This was a strong indicator that a master data management solution was not sufficiently in place, or not at a level of maturity, to support all phases of the master data lifecycle given the current CSU business and technology environments. From a business perspective, this included such things as the need to be agile in the delivery of data that was accurate, timely, reliable and suitably protected. From a technology perspective, the use of and adherence to standards is important to support reliable, cost effective and maintainable solutions. Concurrently, there is the influence of industry trends in both business and technology sectors.
The broader group of stakeholders and the growing number of master data integration services have increased the need for an expanded set of metadata that can be made available in real time to enable the timely management of master data as required by Architects, Data Custodians, Developers and Data Consumers. The growing urgency for metadata has also initiated a new discussion around the role of the MDC and the value of, or need for, developing an alternate view of the technical architecture that is to enable master data management, as distinct from the Integration Architecture view.

The status and maturity of master data management capabilities have also been influenced by dependencies on other contributing architectures. The primary contributors are:
- Business architecture, for the availability of business needs, processes and rules to allow accurate identification of the right master data, and of who can share this data and how.
- Data architecture, for principles, standards, issues register and data classifications.
- Integration architecture, for data integration principles and standards.
- Application architecture, for the role of the Application Custodian, data formats & standards.

As master data definitions and use have expanded, a number of management components have been recognised and developed to enable master data management to occur more at an organisational level. In collaboration with respective Stakeholders, changes have also occurred in other contributing architectures that directly support the management of master data.
5 Present State

The current state architecture can be characterised by the following components:

Approach
- Change driven by business priorities as per IP:ISI and ICT:SWR
- Enterprise view with alignment to CSU Strategic Plan
- Management evolving and maturing as guided by business needs around master data, benefits realisation and sustainability. No formal master data management strategy documentation exists.
- Fostering closer association with business process architecture to support informed use and management of master data
- Business Data Custodians remain responsible for the respective management of master data at the origin system and in associated destination system/s, managing data lifecycle activities in accordance with relevant processes as informed by policies, rules, legislation etc.

Master Data Lifecycle
- Distributed management and responsibilities for master data in the various lifecycle phases. No organisationally agreed & published view of the CSU master data lifecycle exists, which indicates a need for improvement.
- Change management process for master data entities is closely interlinked with project, SDLC & RFC change processes. This adds complexity to management and can create a level of uncertainty on role responsibilities.
- No central technology solution/s that capture and track master data lifecycle activities to support master data management

Master Data Use Pattern
- Operational data transactions
- Multi domain (broad as per CSU priorities; not restricted to student or research domain, etc.)

NOTE: The increasing business need for a broader set of enterprise business intelligence and analytical services will require the use of master data to align with the one university principle. This will likely expand the types of change process that trigger master data requirements and may introduce a new category of data integration services that are based on date parameters.
Scope
- Structured data
- Unstructured data: limited sharing

Business requirements, scope & timeframes driven by projects
- Targets priority needs of business with an enterprise view
- Hard to get ahead of project master data needs as detailed business requirements are unknown, therefore the project timeline directly influences master data timelines

Data Domain Dimension (multi domain)
- Accommodates a range of subject data domains
- Determined by business requirements
- Common domains with application & process architectures

Master Data Assets
- Master Data Cache (MDC) is in place, with a current role description of: stored copy of every populated master data definition, independent of any application.
- Sharing master data assets has a dependency on the Integration architecture
- Earlier methods of sharing data between applications still exist, and not all shared data elements are known or have a corresponding master data definition.
Metadata
- Modelling notation & levels
- Descriptive, functional details captured for each Master Data entity (Definition Template)
Data Rules (extraction/manipulation/transformation logic):
- Not visible to Stakeholders, in particular Data Custodians
- No central store that is searchable
- Level of duplication between definition and data integration service documentation
No specialist management tools in place for:
- Master data definitions
- Version control
- Workflow management
Key Publications
- Master Data Definitions Catalogue
- Master Data Dictionary (list view)

Governance
Data Governance Framework (from Data Architecture)
- Data Governance Committee (DGC) established & operational. Meets every 6 weeks.
- Data Principles established
- Data conflict resolution process (DGC)
- Governance Roles (Data Custodian, DGC, Information Architect, Data Consumer)
Data Security Classification Scheme
- Classification scheme available and referenced
Other DIT Change Governance Mechanisms used:
- Initiatives Handling
- Project Governance
- SDLC Governance
- CAB Governance

Communication & Culture
- Selected resources published to EA&L website
- Restricted resources located on S drive
- Information and education sessions about changes associated with master data management are irregular and vary across the different stakeholder groups.
- Working party activities coordinated to address common stakeholder data issues

The complexity of master data management is emerging and increasing over time as more systems are integrated to support the one university view by accessing the authoritative source systems for data requirements.

A common master data lifecycle is not sufficiently visible to, or shared by, Stakeholders, which increasingly creates gaps in managing master data. There is a good base set of roles & responsibilities in place; however, depending on the change process (eg. project vs. project lite vs. maintenance), gaps increasingly appear in tracking origin & destination systems, QA, and governance. This also hinders the identification and implementation of appropriate metrics to measure the value & benefit of MDM to CSU, and the delivery of the right set of metrics to the right stakeholder group.

Easier visibility of data extraction rules and data consuming/sharing rules is required for governance, reuse, agility, value, etc., particularly for the stakeholder roles of Custodian, Information Architect, Developer and Solution Architect (others to be confirmed).

Lack of timely communication between master data stakeholders can occur, often because of different approaches to managing change activities or schedules. This should improve once a published CSU master data lifecycle and respective role responsibilities are agreed. The establishment of a community of practice or other stakeholder reference group may further advance and support communication.
6 Target State

Following is a general description of the currently known requirements for the future state of the master data management architecture, in order to enable the delivery of purpose and benefits. Identified improvements and extensions to the present state are presented below. Where no change is listed for a present state element, it will remain as is into the foreseeable future, eg. the existing elements under Approach. For details on when and how changes will be made to move from the present to the target state, and on their prioritisation, refer to the Master Data Management Roadmap.

Master Data Management Processes
Within the master data lifecycle there are a number of different processes, process owners, stakeholders and change schedules that contribute to master data management. To improve collaboration and opportunities for efficiency gains, the objectives are:
- Established CSU Master Data Lifecycle, agreed at an organisational level
- Clear mapping of processes & outputs supporting master data management
- Processes that can trigger (gateway to) master data management processes
- Contributing roles and responsibilities known and supported
- Structured & disciplined approach
- Improve alignment to industry best/good practice
- Sustainable solution
- Automate where possible
- Auditable & trackable changes
- Every management or change process has a communication action

Master Data Use Pattern
Expanded: now extends to support organisational analytical data (data warehousing) requirements through appropriate engagement points within Planning & Audit's relevant change process.

Scope
Unstructured Data
Improve the capability for identifying and describing unstructured master data. The ability to effectively identify and share unstructured data will be influenced by the available metadata, and will therefore require collaboration and contribution from other architectures and disciplines.
For example, collaboration with the information and records management disciplines, with the aim of achieving an organisational information classification scheme/s that can support the discovery, management and appropriate sharing of the wide range of CSU's unstructured data collections.

Integrity
Improved management of master data integrity, both at definition and as it is shared between an origin and destination system as a data integration service. As the integrity of master data is influenced by business requirements and activities, together with technology solutions, improvement will need to consider integrity validation from both a business and a technology perspective. Identified areas for improvement include:
- Initial development and QA testing
- Viewable registry of Custodian and Consumer signoff for relevant integrity checkpoints
- Service requests, in particular job queue, response times and availability of relevant knowledge articles
- Minor Change and Maintenance request process
- Source data lifecycle management processes

Dependencies between Origin and Destination Processes
The sharing of data between (origin & destination) systems, and consequently business processes, will automatically create a dependency between processes. The integrity and currency of master data services are often assumed to be at fault when there is unexpected data as a result of a data management process (either origin or destination). Each process will have associated policies, rules, volatility, schedule, user groups, criticality of service, etc. Improved access to information on the business process layer will support improved assessment, planning and implementation of shared master data services.

Master Data Cache (MDC)
The MDC is managed appropriately to support its defined role within the master data management technical architecture.

Metadata
An expanded set of metadata is available to enable increased management efficiency, collaboration and sustainability.

Definition (Template)
- Alternative view/s are available to improve support for different Stakeholder audiences, namely Custodian, Solution Architect, Developer, and Business Analyst.
- Introduction of an organisational classification scheme/s to support data sharing, discoverability, management and protection.

Data Sharing
Managed access by required Stakeholder roles to information about master data elements such as:
- Origin system/s
- Destination systems

Data Rules (Layer)
Single copy, expanded to provide:
- Improved visibility of data rules applied for the extraction of master data from source systems
- Organisational classification (scheme/s) applied to data element
- Improved testing & validation
- Entity rules
- Aggregation & split rules
- Synchronisation rules
- Transformation rules
- Managed access by required Stakeholder roles

Enabling Technology
- Introduce a centralised or hybrid metadata repository to capture, manage and report on master data.

Governance
Master Data Management (MDM)
- The CSU Master Data Lifecycle is agreed by Stakeholders and endorsed by the Data Governance Committee (DGC).
- A Master Data Management Strategy exists that informs the Master Data Management Architecture. This strategy is authorised by the Data Governance Committee (DGC).
MDM Stakeholder Roles & Responsibilities
- The evolution of master data at CSU has confirmed that Business and IT staff are both responsible for master data. To support the ongoing effective management of master data, a clear understanding of roles and responsibilities must therefore be agreed and actioned. The aim is the right balance of management contribution and collaboration from Stakeholders that delivers a functional and sustainable MDM solution for CSU.
- Establish an MDM Stakeholder Representative Group that meets regularly to support MDM practices. A terms of reference or guideline for the purpose of the group needs to be developed, and possibly endorsed by the DGC. Example activities may include discussion of any MDM issues that need addressing, or working on an improvement task.
Data Security Classification Scheme
- A revised security classification scheme is in place to improve security, protection and privacy assessment and decision-making for data and user access, in particular those associated with externally hosted systems, cloud computing services, web 2.0, mobile applications, and other mobility technologies.
- Master data reclassified according to revised security classifications

Organisational Information Classification Scheme/s
A classification scheme or schemes exist to support identification of unstructured data, which will support management of information within CSU, including the sharing of master data. A classification scheme/s would likely cover identification by content type, Org Unit ownership, record type, privacy, etc.

Communication
- A communication plan, process and associated publication channels are in place and operational.
- Use of one Artefact Repository that has a publication interface allowing role based access control to a range of resources.
- All supporting artefacts are appropriately branded and provide a brief description of purpose, audience and version.

Technical Architecture
- A developed and implemented technical architecture that delivers the technology required to effectively and efficiently support CSU's master data management requirements.

Management Technology Toolset/s
Additional technology and automation is required to keep pace with the increasing demand for master data and the subsequent growing complexity of the shared data environment. The target is a right size fit to CSU management requirements that can deliver effective and sustainable management of master data.
A general description of a technology solution would include:
- Centralised master data management toolset/s
- Supports a range of management processes and role responsibilities
- Use of role based access control to support a range of user types, with self service capabilities
- Automation and workflow capabilities
- Delivers agility, integrity, governance and timeliness in management of master data
- Integration with other architecture domain solutions (eg. integration, process, application) to gain more efficiencies and agility
- Sustainable solution (from aspect of administration, ongoing cost, skill set, etc.)
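As an illustration only, the target state elements above (a centralised metadata repository, the Data Rules layer categories, origin and destination system tracking, and role based managed access) can be sketched in Python. This is a minimal sketch under stated assumptions, not a description of any CSU system: every class, function and permission name is hypothetical, and the role-to-permission mapping is invented for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class RuleType(Enum):
    """Rule categories taken from the Data Rules (Layer) list."""
    ENTITY = "entity"
    AGGREGATION_SPLIT = "aggregation and split"
    SYNCHRONISATION = "synchronisation"
    TRANSFORMATION = "transformation"


@dataclass
class DataRule:
    rule_type: RuleType
    description: str


@dataclass
class MasterDataDefinition:
    """One master data entity with its origin and destination systems."""
    name: str
    custodian: str
    origin_system: str
    destination_systems: list = field(default_factory=list)
    rules: list = field(default_factory=list)


# Hypothetical permissions per stakeholder role (assumption, not CSU policy).
ROLE_PERMISSIONS = {
    "Data Custodian": {"view_definition", "view_rules"},
    "Information Architect": {"view_definition", "view_rules"},
    "Developer": {"view_definition", "view_rules"},
    "Data Consumer": {"view_definition"},
}


class MetadataRepository:
    """Single searchable store of definitions and their data rules."""

    def __init__(self):
        self._definitions = {}

    def register(self, definition):
        self._definitions[definition.name] = definition

    def _check(self, role, permission):
        # Unknown roles get an empty permission set and are refused.
        if permission not in ROLE_PERMISSIONS.get(role, set()):
            raise PermissionError(f"{role!r} is not permitted to {permission}")

    def view_rules(self, role, name, rule_type=None):
        """Return the rules for one definition, optionally filtered by type."""
        self._check(role, "view_rules")
        rules = self._definitions[name].rules
        if rule_type is None:
            return list(rules)
        return [r for r in rules if r.rule_type is rule_type]

    def destinations_by_origin(self, role, origin_system):
        """Map each definition sourced from origin_system to its destinations."""
        self._check(role, "view_definition")
        return {
            d.name: list(d.destination_systems)
            for d in self._definitions.values()
            if d.origin_system == origin_system
        }
```

Keeping the rules in one store alongside the origin and destination systems is what gives the "single copy" and "managed access" properties listed under Data Rules (Layer); a real toolset would add persistence, versioning and workflow, which are out of scope for this sketch.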
7 Stakeholders

[Figure: Master Data Management Architecture Stakeholder Model. Version: Draft 1. Author: C Middleton. Date: 6 Jan. Diagram label: End to End Change Process.]

Customers: recipients of a product output (service, product, and information)
- System Officer, Developer, Business Analyst, Solution Architects, Solution Coordinators, Data Custodians, Data Consumers

Partners: those that are jointly engaged in the delivery of the product
- Information Architect, Integration Architect, Application Architect, Solution Architect, Business Architect, Data Consumers, Business Analysts, Data Custodians, Developer

Governance: the systems and processes in place for ensuring proper accountability and openness in the conduct of the University's business
- Data Governance Committee, SEC, Application Custodians, Data Custodians, DIT Exec Director, EA&L Director & Manager

Service Providers / Enablers / Suppliers: provide resources and support mechanisms to enable the product delivery
- DIT Resources, External Hosts, Service Providers, CSU Privacy Officer, Industry Vendors, Contractors

8 Governance

Governance Bodies:
- Data Governance Committee (see Terms of Reference)
- SEC: decisions on the prioritisation of change activities on the IP:ISI
- Data Custodians: approve access and use of data for individual sharing requests
- Application Custodians: approve access and any changes to application data structure
- DIT Executive Director, EA&L Director & Managers: approve the data architecture, principles, standards & roadmap in principle and implementation

Governance Mechanisms:
- CSU Policies
- Legislative Requirements
- Architecture Principles & Standards suite
- CSU business rules
- CSU processes

9 Influences

Technology:
- Web 2.0 technologies
- Cloud computing
- Mobile apps
- Agile development methodology
- Big Data, Research Data
- BYOD (for organisational activities)
- NBN (improved user access, expanding user needs, expectations)

Business:
- CSU's one university approach
- CSU's Strategic Plan and Policies
- 2012 uncapping of student loads
- Legislative compliance

10 Opportunities & Risks

- Overlap with integration architecture in master data lifecycle activities/architecture elements: opportunity for rationalisation
- Inconsistent and not readily available data mapping through the master data lifecycle (source to multiple destination systems)
- Limited tracking in change processes for signoff of development stages, improvements, reuse
- Challenge: align MDM maturity activities & resources with business priorities & projects
- Risk that legislative or business economic factors may result in compromise of the optimal data architecture solution. These need to be accommodated whilst understanding the immediate and future impact on CSU's master data architecture. (Possible roadmap item: development of a simple matrix to measure impact/debt to architecture; visibility & credit tags.)
- The complexity of master data management is emerging and increasing (see note in present state)
- Data Protection: right classification, technology capability & monitoring, consistently applied at origin system, integration services and destination systems
- Data Privacy: same comment as above
- Communication: many stakeholders, many processes, many changes

11 Contributing Architectures

Strong interdependencies with other architectures as listed in Table 1.

Table 1: Summary of interdependencies
- Business (incl. process): Shared data requirements, availability, and quality, associated rules for input, extraction, security & use. Custodians, Stakeholders, User Groups, data lifecycle mgmt, associated policies, legislative requirements, record mgmt, governance, associated services & resources.
- Information Architecture: CSU Information Strategy and associated policies.
- Records Mgmt: Identification of data sets that come under the classification of a record and must align to records mgmt policies & legislative requirements, in particular to security, authorisation, and other governance classifications.
- Data Architecture: Data principles & standards, issue register, Data Governance Committee & procedures.
- Application Architecture, Online Architecture, IdM Architecture: Source, target, formats, translation requirements, transfer options, data mgmt processes &/or workflows, logic to implement business rules, reporting tools, automated provisioning.
- Integration: Physical implementation of master data domain model, population of master data entities, integration methods, availability, security.
- Infrastructure, Security: Data storage, connectivity, performance, BC, DR, security.
12 Supporting Artefacts

- CSU Policies
- Legislative compliance
- Business rules
- Data Standards
- Data Principles
- Data Security Classifications
- Data Architecture Roadmap
- Data Issues Register
- Enterprise Architecture Glossary of Terms
- Data Governance Committee Terms of Reference
- Data Architecture Glossary of Terms

Catalogs:
- Master Data Definitions Catalogue
- Master Data Dictionary
- Master Data Schedule of Work

Matrices:
- Information Domain/Master Data
- Application/Master Data matrix
- Integration Services/Master Data matrix

Diagrams:
- Master Data Lifecycle
- Conceptual Master Data Domains
- Logical Master Data Model
- Data Security?

MDM viewpoints: include diagrams such as
- Data volatility
- Strategic alignment
- Change processes
- Fit within Architecture

Processes (specific to master data management):
- Master Data Access Approval
- Master Data Definition
- Master Data Integration Services

External References:
- Industry standards (eg. eduPerson)
- Technology standards (ISO 11179, Security ISO/IEC 27002)

Table of amendments
Version number | Date | Short description of amendment