DATA GOVERNANCE AT UPMC
A Summary of UPMC's Data Governance Program Foundation, Roles, and Services
THE CHALLENGE

Data Governance is not new work to UPMC. Employees throughout our organization manage data assets every day. Historically, we have managed our data assets independently of a comprehensive enterprise strategy. Knowledge of where our data resides, how it is used, and what it means is not centrally documented or broadly available, and business ownership of our data is not clearly established.

To address these issues, UPMC has created a Data Governance program. Data Governance is an ongoing program, not a one-time project, designed to establish enterprise data ownership, improve the quality of our data, broadly share metadata, and establish policies and guidelines for proper access to and use of our valued data assets.

OUR MISSION AND VISION

To collect, change, store, move, consume, and release UPMC data assets efficiently, accurately, and legally.

UPMC has invested in new technologies that will permit us to capture and share the answers to common data questions such as:

- Where can I find the information I need?
- Is the data good?
- Does it mean what we all think it means?
- Did the data come from a trusted source?

This program's foundation is shared stewardship across UPMC's technical, operational, and clinical departments. Broadly sharing the knowledge of our data and what it means will develop smarter users and requesters of information.
BUILDING THE PROGRAM

UPMC's Data Governance program models the IBM Data Governance Unified Process. The foundation of this implementation strategy contains the following tasks:

1. Define the Business Problem
2. Obtain Executive Sponsorship
3. Conduct a Maturity Assessment
4. Build a Roadmap
5. Establish an Organizational Blueprint
6. Build the Data Dictionary
7. Understand the Data
8. Create a Metadata Repository
9. Define Metrics
10. Measure the Results
11. Govern Master Data
12. Govern Analytics
13. Manage Security and Privacy
14. Govern the Information Lifecycle

The third step of the IBM Data Governance Unified Process, Conduct a Maturity Assessment, helps to measure the progress of the program. A maturity assessment will be conducted annually to set goals to advance the maturity of the Data Governance program. Maturity levels range from 1 to 5, with 1 being the initial level where the program is in its infancy stage and not automated, up to level 5 where the program is fully automated and all of the necessary measurements and personnel are in place to validate the program.

The UPMC Data Governance Office used IBM's InfoGov Assessment to evaluate the Data Governance program after the first year of operation. This assessment contains nine essential competencies tracked over five levels of maturity. Forty questions are used to evaluate the following competencies:

- Classification and Metadata
- Data Architecture
- Data Integrity Management
- Data Risk and Compliance
- Organizational Structure
- Policy
- Stewardship
- Value Creation

PROGRAM MATURITY

1. Initial: Process unpredictable, poorly controlled, and REACTIVE
2. Managed: Process characterized for PROJECTS and is MANAGEABLE
3. Defined: Process characterized for ORGANIZATION and is PROACTIVE
4. Quantitatively Managed: Process QUANTITATIVELY measured and controlled
5. Optimized: Focus on CONTINUOUS process improvement

WHAT IS DATA GOVERNANCE?
The Data Governance Institute defines Data Governance as:

"A system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods."

The Master Data Management Institute defines Data Governance as:

"The formal orchestration of people, processes, and technology to enable an organization to leverage data as an enterprise asset."

While there are varied definitions of Data Governance, the common goal among Data Governance programs is to create a structure and context around an organization's data to:

- Deliver high-quality data
- Ensure proper access and usage of the data
- Establish standard definitions of the data
- Identify the best source of data
- Reduce unnecessary duplication of data and information
- Understand and promote the value of data
THE ROLES

UPMC's Data Governance program is designed to provide enterprise ownership and management of data, which is achieved through a network of Information Owners and Stewards across the organization. The program's organizational structure is tiered, with decision rights and accountabilities assigned to each level.

Enterprise Analytics Leadership: Senior leaders from Provider, Finance, Research, Strategic Planning, and Health Plan. This group sets priorities related to information and analytics and provides guidance to the Data Governance program.

Data Governance Council: A cross-functional, executive-level group representing business and technical stakeholders that makes policy decisions and defines the data management strategy for the organization.

Information Owner: A person accountable for the quality and use of a particular domain of data who knows what the data means, where the data is moved, and how the data is used.

Data Governance Office: A team dedicated to defining, deploying, and managing the Data Governance program and supporting technologies.

Data Steward: A person assigned by an Information Owner to a particular domain of data who is responsible for data definitions and data integrity.

Analytics Steward: A person representing a team that generates information for use across the organization who is responsible for documenting this use and for compliance with data governance policies and guidelines.

Application Steward: A person with expert knowledge of a specific application in operation at UPMC who is responsible for the quality of the data managed in that application and for compliance with data governance policies and guidelines.
UPMC DATA GOVERNANCE TECHNOLOGIES

UPMC uses several Informatica tools to perform Data Quality and Data Integrity validation, Metadata Management, and Master Data Management.

Master Data Management

The Informatica Master Data Management (MDM) product delivers consolidated and reliable business-critical data, also known as master data, to applications in operation throughout UPMC. Master data comprises the fundamental data entities upon which a business is based. UPMC master data examples include patients, providers, employees, departments, facilities, and item masters.

Master Data Management is the maintenance of reliable, trustworthy, accurate, non-duplicative master data that a business can rely on for effective decision making and efficient business operations. UPMC has selected Informatica's MDM Hub for deploying MDM solutions across the enterprise.

Informatica's MDM Hub uses a Trust Framework to determine and deliver the Gold Source of master data by:

1. Capturing all factors affecting the essential quality of each data source.
   a. Data age, change history, source type, syntax, etc.
2. Applying custom rules for data validation and matching.
3. Dynamically maintaining a single best result of the attributes using trust scores.

Gold Source records represent the best-known version of the truth as defined in the Trust Framework. Informatica MDM also provides a hierarchy manager tool for managing and viewing hierarchical relationships within the master data.
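The trust-score idea described above can be illustrated with a small, generic sketch. This is not Informatica's API or its actual scoring formula: the `SourceValue` class, the half-life decay, the validation rule, and the source names are all assumptions made for illustration. The sketch only shows the general pattern of validating candidate values and keeping, per attribute, the value with the highest trust score.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SourceValue:
    source: str          # hypothetical source system name
    value: str           # the attribute value reported by that source
    last_updated: date   # used to model "data age"
    source_trust: float  # hypothetical base trust for the source, 0.0 to 1.0


def trust_score(sv: SourceValue, today: date, half_life_days: float = 365.0) -> float:
    """Assumed scoring rule: base trust halves every `half_life_days` of age."""
    age_days = max((today - sv.last_updated).days, 0)
    return sv.source_trust * 0.5 ** (age_days / half_life_days)


def is_valid(attribute: str, value: str) -> bool:
    """Hypothetical validation rules: no blanks; ZIP codes must be 5 digits."""
    if not value.strip():
        return False
    if attribute == "zip":
        return len(value) == 5 and value.isdigit()
    return True


def gold_record(candidates: dict, today: date) -> dict:
    """For each attribute, keep the valid value with the highest trust score."""
    gold = {}
    for attribute, values in candidates.items():
        valid = [sv for sv in values if is_valid(attribute, sv.value)]
        if valid:
            best = max(valid, key=lambda sv: trust_score(sv, today))
            gold[attribute] = best.value
    return gold


candidates = {
    "zip": [
        SourceValue("billing", "15213", date(2013, 1, 10), 0.9),
        SourceValue("registration", "1521", date(2013, 5, 1), 0.8),  # fails validation
    ],
    "phone": [
        SourceValue("billing", "412-555-0100", date(2010, 1, 1), 0.9),       # old value
        SourceValue("registration", "412-555-0199", date(2013, 5, 1), 0.8),  # recent value
    ],
}

gold = gold_record(candidates, today=date(2013, 6, 1))
print(gold)
```

In this sketch the older billing phone number loses to the more recent registration value even though billing has the higher base trust, which is the kind of per-attribute decision a trust framework is meant to make consistently.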
Data Integrity

Informatica's Data Quality (IDQ) tools allow business and information technology (IT) teams to collaborate on data quality processes, which reduces dependence on IT resources. IDQ allows for the creation of data integrity rules that can be used across all forms of data integration, MDM, and data integrity projects. IDQ is open to all applications, allowing users to access any data source and deploy centralized data integrity rules to improve data integrity across all applications.

Informatica IDQ Analyst

This browser-based tool enables Information Owners and Stewards to participate in data integrity improvement without the need for IT intervention. Informatica IDQ Analyst supports data profiling and analysis and creates data integrity scorecards, with the flexibility to filter and drill down on specific records for better detection of problems. Specification, validation, configuration, and testing of data integrity rules are streamlined, improving collaboration with IT in sharing profiles and implementing data integrity rules. Informatica IDQ Analyst enables the business to engage the right people in improving data integrity.

Informatica IDQ Developer

This data integrity development environment enhances IT developer productivity. With Informatica's IDQ Developer, users can quickly discover and access data sources. The tool improves the steps of analyzing, profiling, validating, and cleansing data by combining data integrity rules with sophisticated data transformation logic, and by conducting midstream and comparative profiling to validate and debug logic as it is developed. Users can also configure data integrity services, connecting to data physically or virtually. Informatica IDQ Developer enables IT developers to reuse profiling and rule specifications across applications and projects.
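To make the notions of a reusable data integrity rule and a scorecard concrete, here is a minimal, tool-independent sketch. It does not use or represent Informatica IDQ; the rule names, the record fields (`mrn`, `birth_year`), and the scorecard layout are hypothetical and chosen only to show how centrally defined rules can be applied to any record set, with failing records kept for drill-down.

```python
from typing import Callable

Record = dict[str, str]
Rule = Callable[[Record], bool]

# Hypothetical, centrally defined rules that any project could reuse.
RULES: dict[str, Rule] = {
    "mrn_present": lambda r: bool(r.get("mrn", "").strip()),
    "birth_year_4_digits": lambda r: (
        r.get("birth_year", "").isdigit() and len(r.get("birth_year", "")) == 4
    ),
}


def scorecard(records: list[Record], rules: dict[str, Rule]) -> dict:
    """Return, per rule, the pass rate and the indexes of failing records."""
    result = {}
    for name, rule in rules.items():
        failures = [i for i, rec in enumerate(records) if not rule(rec)]
        passed = len(records) - len(failures)
        result[name] = {
            "pass_rate": passed / len(records) if records else 1.0,
            "failures": failures,  # kept so a steward can drill down
        }
    return result


records = [
    {"mrn": "A100", "birth_year": "1975"},
    {"mrn": "", "birth_year": "1980"},      # missing identifier
    {"mrn": "A102", "birth_year": "80"},    # malformed year
    {"mrn": "A103", "birth_year": "1992"},
]

card = scorecard(records, RULES)
for rule_name, stats in card.items():
    print(rule_name, stats)
```

Keeping the rules in one shared structure, separate from the data they check, mirrors the idea of defining a rule once and applying it across applications.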
METADATA MANAGER AND BUSINESS GLOSSARY

Metadata Manager and Business Glossary are key features of Informatica's PowerCenter Advanced Edition. Metadata Manager collects metadata from across data integration environments and provides a visual map of the data flows within that environment. The Business Glossary enables business users to define and annotate business terms that describe their business environment and link them to the underlying technical metadata, which provides a common vocabulary for the discussion of business terms and for business/IT collaboration.

The Business Glossary works in close association with Metadata Manager to enhance collaboration and boost productivity between business and IT. For enterprise data integration projects, they provide the visibility and control needed to manage change, reduce errors caused by change, and ensure data integrity.

Benefits of Informatica Metadata Manager and Business Glossary:

- Enhances business/IT collaboration with a common business vocabulary and built-in collaboration tools, improves the management of data integration change, and reduces the errors caused by change.
- Increases IT productivity by ensuring clear and error-free communication with business users, resulting in fewer iterations and faster project delivery.
- Makes it easy to scope projects and understand the impact of proposed data integration changes before they are implemented.
- Ensures regulatory compliance and provides a basis for data governance by establishing a standard enterprise business vocabulary, which helps avoid errors, and by providing a complete audit trail of data flows and transformations.
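The link between business terms, technical metadata, and data flows can be pictured as a small graph. The sketch below is a generic illustration, not the Metadata Manager or Business Glossary data model: the glossary entry, its definition text, the asset names, and the flow map are all hypothetical. It shows how linking a term to a technical asset, then walking the data flows downstream, yields the list of assets a proposed change could affect.

```python
from collections import deque

# Hypothetical glossary: business term -> definition and linked technical assets.
GLOSSARY = {
    "Patient": {
        "definition": "Illustrative definition text maintained by a Data Steward.",
        "assets": ["registration.patient"],
    },
}

# Hypothetical data flows: each asset maps to the assets it feeds.
FLOWS = {
    "registration.patient": ["staging.patient"],
    "staging.patient": ["warehouse.dim_patient"],
    "warehouse.dim_patient": ["reports.census", "reports.readmissions"],
    "billing.claim": ["warehouse.fact_claim"],
}


def downstream(asset: str, flows: dict) -> list:
    """Breadth-first walk returning every asset reachable from `asset`."""
    visited = {asset}
    ordered = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for target in flows.get(current, []):
            if target not in visited:
                visited.add(target)
                ordered.append(target)
                queue.append(target)
    return ordered


def impact_of_term(term: str, glossary: dict, flows: dict) -> list:
    """List the assets linked to a business term plus everything downstream."""
    impacted = []
    for asset in glossary[term]["assets"]:
        for name in [asset] + downstream(asset, flows):
            if name not in impacted:
                impacted.append(name)
    return impacted


impacted = impact_of_term("Patient", GLOSSARY, FLOWS)
print(impacted)
```

A change to the definition of "Patient" in this toy model would flag the registration table, its staging and warehouse copies, and both reports, while leaving the unrelated billing flow untouched.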
To learn more, visit Infonet.UPMC.com/DataGovernance.
DGO@upmc.edu

UPMC is an equal opportunity employer. UPMC policy prohibits discrimination or harassment on the basis of race, color, religion, ancestry, national origin, age, sex, genetics, sexual orientation, marital status, familial status, disability, veteran status, or any other legally protected group status. Further, UPMC will continue to support and promote equal employment opportunity, human dignity, and racial, ethnic, and cultural diversity. This policy applies to admissions, employment, and access to and treatment in UPMC programs and activities. This commitment is made by UPMC in accordance with federal, state, and/or local laws and regulations.

SYS409008 06/13
© 2013 UPMC