1 September 2013 Industry Models and Information Server Data Models, Metadata Management and Data Governance Gary Thompson (gary.n.thompson@ie.ibm.com) Information Management
Disclaimer
All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.
IBM, the IBM logo, ibm.com, Information Management, WebSphere and Rational are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml
Introduction
The role of Business and Technical metadata in Data Governance
Metadata integration to explore both business and technical metadata
IBM InfoSphere Information Server for Metadata Management
Use IBM Banking Data Warehouse Models as an example of Metadata
Design process using metadata management from requirements gathering to implemented database
Logical and Physical Data Modeling in InfoSphere Data Architect
Discuss the benefits and challenges of using such tools as part of Data Governance initiatives and processes
Good metadata requires a combination of people, process and technology
Metadata as part of Data Governance

Outcomes
Value Creation: The process by which data assets are qualified and quantified to enable the business to maximize the value created by data assets.
Data Risk Management: The methodology by which data risks are identified, qualified, quantified, avoided, accepted, mitigated, or transferred out.

Require

Enablers
Policy: Policy is the written articulation of desired organizational behavior.
Organization: Describes the level of mutual responsibility between business and IT, and recognition of the fiduciary responsibility to govern data at different levels of management.
Stewardship: Stewardship is a quality control discipline designed to ensure custodial care of data for asset enhancement, risk mitigation, and organizational control.

Enhance

Core Disciplines
Data Quality: Methods to measure, improve, and certify the quality and integrity of production, test, and archival data.
Information Lifecycle Mgt: A systemic policy-based approach to information collection, use, retention, and deletion.
Security & Privacy: Describes the policies, practices and controls used by an organization to mitigate risk and protect data assets.

Support

Supporting Disciplines
Data Architecture: The architectural design of structured and unstructured data systems and applications that enable data availability and distribution to appropriate users.
Classification / Metadata: The methods and tools used to create common semantic definitions for business and IT terms, data models, types, and repositories. Metadata that bridge human and computer understanding.
Audit and Reporting: The organizational processes for monitoring and measuring the data value, risks, and efficacy of governance.
Metadata Primer
Literally, data about data that describes your company's information from both a business and a technical perspective

Business Metadata (B)
Rules, Definitions, Terminology, Glossaries, Algorithms and Lineage using business language
Audience: Business Users

Technical Metadata (T)
Defines Source and Target systems, their Table and Field structures and attributes, Derivations and Dependencies
Audience: Specific Tool Users (BI, ETL, Profiling, Modeling)

Operational Metadata (O)
Information about application runs: their frequency, record counts, component-by-component analysis and other statistics
Audience: Operations, Management

IBM Confidential Information
Communicating and Recording Information (Conceptual)
Language: Vocabulary
Documented Entities and Relationships: Logical Data Model
Structured Tables and Foreign Keys: Physical Data Model
Implemented Physical Information Storage: Implemented Database
Communicating and Recording Information (Requirement to Implementation)
Vocabulary: The Account Number identifies savings, checking and brokerage Accounts. Account Number is generally unique within a country, but not between countries.
Logical Data Model: Arrangement Id is Primary Key of Arrangement Entity. Account Number is an attribute of Arrangement.
Physical Data Model: AR_ID is Primary Key on Table AR.
Implemented Database: AR_ID is Primary Key on Partitioned Table AR in DB67.
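The step from vocabulary to implemented table can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database, so it does not show partitioning or the DB67 environment named on the slide. Only the table AR and the column AR_ID come from the slide; the column names CNTRY_CD and AC_NBR and the sample values are assumptions made for this example.

```python
import sqlite3

# Only AR and AR_ID come from the slide. CNTRY_CD and AC_NBR are
# hypothetical column names invented for this sketch.
DDL = """
CREATE TABLE AR (
    AR_ID    INTEGER PRIMARY KEY,  -- Arrangement Id, primary key of the entity
    CNTRY_CD TEXT NOT NULL,        -- country in which the account is held
    AC_NBR   TEXT NOT NULL,        -- Account Number, the business term
    UNIQUE (CNTRY_CD, AC_NBR)      -- unique within a country, not between countries
)
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database holding the AR table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    return conn


def add_arrangement(conn, ar_id, country, account_number) -> bool:
    """Insert one arrangement; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO AR (AR_ID, CNTRY_CD, AC_NBR) VALUES (?, ?, ?)",
                (ar_id, country, account_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_database()
results = [
    add_arrangement(conn, 1, "IE", "12345678"),
    add_arrangement(conn, 2, "US", "12345678"),  # same number, different country
    add_arrangement(conn, 3, "IE", "12345678"),  # same number, same country
]
print(results)
```

The business rule from the vocabulary level survives as the UNIQUE constraint, while the surrogate key AR_ID carries the identity defined in the logical model.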
Example of How Metadata assists Data Governance Questions
What is the business meaning of this database table?
Example of How Metadata assists Data Governance Questions
How is this data used in Reports?
Common Metadata in IBM InfoSphere Information Server
Focus today is on the Common Metadata component of Information Server
A number of products will be used to work with the metadata
Diagram labels: Metadata Workbench, Business Glossary, Value, Data Architect, Industry Models, Banking Data Warehouse
IBM Industry Data Models: comprehensive and industry-specific
Banking, Insurance, Financial Markets, Retail, Healthcare, Telecom
Basel II/III, Dodd Frank, FATCA, COREP, Solvency II, CRM, Profitability, IAS/IFRS, Customer Churn, Network Mgmnt, Supply Chain Mgmnt, Market Basket Analysis, HEDIS, HIPAA
Models contain business content representing industry concepts, in a form which can be used in I.T. projects.

IBM Industry Data Models: Vocabulary & Requirements Models, Analysis Models, Design Models
Vocabulary / Requirements Models represent industry concepts and management reporting requirements in plain business language.
Analysis Models represent industry concepts and inter-relationships, in a form which can be used by I.T.
Design Models represent industry concepts in specific technology formats, such as database designs.

Multiple tooling options available to assist with deployment
Diagram labels: Project Acceleration, Technical, Information Integration & Governance, Data Warehouse, Data Marts, Operational Data Store, Big Data
Glossary and Data Modelling Process
The simple process has 5 main steps:
1. Term Management: Managing Terms in Business Glossary (Glossary)
2. Logical: Logical modelling of data (Logical Data Model)
3. Physical: Physical modelling of Database in IDA leading to the creation of DDL (Physical Data Model)
4. DDL Implementation: Deployment of DDL to database (Database)
5. Metadata Integration: Information Server Metadata Integration and Test (Information Server)
Analytical Requirements Map to Facts and Dimensions
Analytical Requirement for Customer Credit Risk Profile (full attribute set not shown in this presentation)
Diagram labels: Dimension, Fact Entity
Licensed Materials - Property of IBM
Convert to Physical Model and generate DDL
The logical model can be converted to the physical model
InfoSphere Data Architect (IDA) has an integrated database conversion utility
Quickly generate & optimize a database-specific model from the requirements scope
A refined scope means less data to extract, transform and load
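The idea of a logical-to-physical conversion can be sketched in a few lines of Python. This is not how InfoSphere Data Architect works internally and it uses none of its APIs; it is a toy illustration under stated assumptions. The abbreviation list, the type mapping and the helper names (physical_name, generate_ddl) are all hypothetical, chosen so that the Arrangement entity from the earlier slide yields table AR with key AR_ID.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    name: str                  # business-facing attribute name
    data_type: str             # logical type: "identifier" or "text" in this sketch
    primary_key: bool = False


@dataclass
class Entity:
    name: str
    attributes: list


# Hypothetical naming standard (word -> abbreviation) and type mapping.
ABBREVIATIONS = {"Arrangement": "AR", "Id": "ID", "Account": "AC", "Number": "NBR"}
TYPE_MAP = {"identifier": "BIGINT", "text": "VARCHAR(64)"}


def physical_name(logical_name: str) -> str:
    """Abbreviate each word and join with underscores."""
    return "_".join(ABBREVIATIONS.get(w, w.upper()) for w in logical_name.split())


def generate_ddl(entity: Entity) -> str:
    """Render one logical entity as a CREATE TABLE statement."""
    lines = [
        f"    {physical_name(a.name)} {TYPE_MAP[a.data_type]} NOT NULL"
        for a in entity.attributes
    ]
    keys = [physical_name(a.name) for a in entity.attributes if a.primary_key]
    if keys:
        lines.append(f"    PRIMARY KEY ({', '.join(keys)})")
    return f"CREATE TABLE {physical_name(entity.name)} (\n" + ",\n".join(lines) + "\n);"


arrangement = Entity(
    "Arrangement",
    [
        Attribute("Arrangement Id", "identifier", primary_key=True),
        Attribute("Account Number", "text"),
    ],
)
ddl = generate_ddl(arrangement)
print(ddl)
```

Keeping the naming standard and type mapping as data rather than code mirrors the point of the slide: the same logical model can be regenerated for a different target database by swapping the mappings, and a smaller scope simply means fewer entities passed to the generator.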
Metadata Integration
Information Server Metadata
Industry Data Model: Business Glossary, Logical Data Model, Physical Data Model
Database: Database
Other: ETL Job, Reports

Information Server represents each of these information assets as metadata
The Industry Data Model includes Business Glossary for Analytical Requirements and IDA for all data modelling
Data Governance method and tools must facilitate the integration of this metadata with database, ETL and Reporting Tools
Metadata Integration
Information Server Metadata
Models: Business Glossary (Analytical Requirements), Logical Data Model (Industry LDM), Physical Data Model (BDW PDM)
Database: Data Resource (Netezza)
Other: ETL Job (DataStage Job), BI Object (Cognos Package)

The number of interactions between Business and Technical metadata grows rapidly
Highlights the need for Tooling, Process and People
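Once assets are linked, the two governance questions asked earlier (business meaning of a table, usage in reports) become graph traversals. The sketch below is a minimal illustration in plain Python, not the Metadata Workbench API. The asset names follow the slide; the specific links between them, the "Netezza table" node name and the helper functions are assumptions made for the example.

```python
from collections import deque

# Each pair reads "left asset is implemented by, or feeds, right asset".
# The links are an assumed example, not taken from a real repository.
EDGES = [
    ("Analytical Requirement", "Industry LDM"),
    ("Industry LDM", "BDW PDM"),
    ("BDW PDM", "Netezza table"),
    ("DataStage Job", "Netezza table"),
    ("Netezza table", "Cognos Package"),
]


def neighbours(asset, reverse=False):
    """Direct successors of an asset, or direct predecessors if reverse."""
    if reverse:
        return [src for src, dst in EDGES if dst == asset]
    return [dst for src, dst in EDGES if src == asset]


def trace(asset, reverse=False):
    """Breadth-first walk returning every asset reachable from the start."""
    seen, queue, order = {asset}, deque([asset]), []
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current, reverse):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# "What is the business meaning of this table?" -> walk upstream.
upstream = trace("Netezza table", reverse=True)
# "How is this data used in reports?" -> walk downstream.
downstream = trace("Netezza table")
print(upstream)
print(downstream)
```

Even with six assets the number of possible links is already large, which is the slide's argument for tooling: maintaining such a graph by hand does not scale.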
Metadata focused roles when implementing Data Solutions
Project Role: Focus
Solution Architect: Solution design and review
Project Manager: Planning, reporting, standards compliance
Business Analysts: Business requirements and data governance
Report Developer: Implementing reporting requirements with available and correct data
ETL Developer: Build and unit test of ETL components
Data Modeler: Data analysis, data models and database guidelines
DBA: Database implementation
Metadata Specialist: Configuration and integration of metadata
Tester: Data integration and metadata testing
Metadata must be part of the Solution Lifecycle
Business Analyst, ETL Developer, Business User, Data Modeller, DBA