A Comprehensive Approach to Master Data Management Testing



Similar documents
MDM and Data Warehousing Complement Each Other

Course Outline. Module 1: Introduction to Data Warehousing

Course 10777A: Implementing a Data Warehouse with Microsoft SQL Server 2012

Implementing a Data Warehouse with Microsoft SQL Server 2012

Implementing a Data Warehouse with Microsoft SQL Server 2012

Data Warehouse and Business Intelligence Testing: Challenges, Best Practices & the Solution

Implementing a Data Warehouse with Microsoft SQL Server

COURSE 20463C: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER

Implementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777

Implementing a Data Warehouse with Microsoft SQL Server MOC 20463

COURSE OUTLINE MOC 20463: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER

Data Warehouse / MIS Testing: Corporate Information Factory

Implementing a Data Warehouse with Microsoft SQL Server 2014

Quality Assurance - Karthik

Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days

James Serra Data Warehouse/BI/MDM Architect JamesSerra.com

Microsoft. Course 20463C: Implementing a Data Warehouse with Microsoft SQL Server

ETL-EXTRACT, TRANSFORM & LOAD TESTING

Implementing a Data Warehouse with Microsoft SQL Server

Implementing a Data Warehouse with Microsoft SQL Server 2012

Implementing a Data Warehouse with Microsoft SQL Server

Effective Testing & Quality Assurance in Data Migration Projects. Agile & Accountable Methodology

JOURNAL OF OBJECT TECHNOLOGY

POLAR IT SERVICES. Business Intelligence Project Methodology

SQL Server 2012 Business Intelligence Boot Camp

Course Outline: Course: Implementing a Data Warehouse with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning

Service Oriented Data Management

Data Quality Assessment. Approach

Course Outline. Business Analysis & SAP BI (SAP Business Information Warehouse)

Course 20463:Implementing a Data Warehouse with Microsoft SQL Server

Implementing a Data Warehouse with Microsoft SQL Server

Implementing a SQL Data Warehouse 2016

Agile Business Intelligence Data Lake Architecture

East Asia Network Sdn Bhd

Top Five Reasons Not to Master Your Data in SAP ERP. White Paper

Integrating MDM and Business Intelligence

THOMAS RAVN PRACTICE DIRECTOR An Effective Approach to Master Data Management. March 4 th 2010, Reykjavik

Beta: Implementing a Data Warehouse with Microsoft SQL Server 2012

Levels of Software Testing. Functional Testing

Master Data Management and Data Warehousing. Zahra Mansoori

A McKnight Associates, Inc. White Paper: Effective Data Warehouse Organizational Roles and Responsibilities

Implementing a Data Warehouse with Microsoft SQL Server 2012 (70-463)

Making Business Intelligence Easy. Whitepaper Measuring data quality for successful Master Data Management

High-Volume Data Warehousing in Centerprise. Product Datasheet

FSW QA Testing Levels Definitions

Building a Data Quality Scorecard for Operational Data Governance

Keywords : Data Warehouse, Data Warehouse Testing, Lifecycle based Testing

Frequently Asked Questions

INFORMATION TECHNOLOGY STANDARD

Oracle Insurance Policy Administration System Quality Assurance Testing Methodology. An Oracle White Paper August 2008

Master Data Management Architecture

TEST PLAN Issue Date: <dd/mm/yyyy> Revision Date: <dd/mm/yyyy>

An RCG White Paper The Data Governance Maturity Model

For Sales Kathy Hall

QA Tools (QTP, QC/ALM), ETL Testing, Selenium, Mobile, Unix, SQL, SOAP UI

EAI vs. ETL: Drawing Boundaries for Data Integration

BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS

Testing Big data is one of the biggest

How To Test For Performance

Keywords: Data Warehouse, Data Warehouse testing, Lifecycle based testing, performance testing.

By Makesh Kannaiyan 8/27/2011 1

A Service-oriented Architecture for Business Intelligence

The IBM Cognos Platform

ORACLE ENTERPRISE DATA QUALITY PRODUCT FAMILY

Bringing Value to the Organization with Performance Testing

Data Warehouse (DW) Maturity Assessment Questionnaire

Instant Data Warehousing with SAP data

Data Warehousing and OLAP Technology for Knowledge Discovery

Foundations of Business Intelligence: Databases and Information Management

SQL Server Master Data Services A Point of View

Data Warehouse Testing

Busting 7 Myths about Master Data Management

Copyrighted , Address :- EH1-Infotech, SCF 69, Top Floor, Phase 3B-2, Sector 60, Mohali (Chandigarh),

Master Data Management Decisions Made by the Data Governance Organization. A Whitepaper by First San Francisco Partners

Bringing agility to Business Intelligence Metadata as key to Agile Data Warehousing. 1 P a g e.

Integrating SAP and non-sap data for comprehensive Business Intelligence

Metadata Repositories in Health Care. Discussion Paper

HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT

SAP Data Services and SAP Information Steward Document Version: 4.2 Support Package 7 ( ) PUBLIC. Master Guide

Exploring the Synergistic Relationships Between BPC, BW and HANA

Datawarehouse testing using MiniDBs in IT Industry Narendra Parihar Anandam Sarcar

Luncheon Webinar Series May 13, 2013

Techniques and Tools for Rich Internet Applications Testing

Software Testing. Knowledge Base. Rajat Kumar Bal. Introduction

<Insert Picture Here> Master Data Management

Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya

Using SAP Master Data Technologies to Enable Key Business Capabilities in Johnson & Johnson Consumer

ROADMAP TO DEFINE A BACKUP STRATEGY FOR SAP APPLICATIONS Helps you to analyze and define a robust backup strategy

Rational Reporting. Module 3: IBM Rational Insight and IBM Cognos Data Manager

DATA GOVERNANCE AND DATA QUALITY

Data Vault and The Truth about the Enterprise Data Warehouse

State of Louisiana Department of Revenue. Development/implementation of LDR s First Data Mart RFP Official Responses to Written Inquiries

Software testing. Objectives

Business Intelligence In SAP Environments

4.13 System Testing. Section 4 Bidder's Products, Methodology, and Approach to the Project System Training

Knowledgent White Paper Series. Developing an MDM Strategy WHITE PAPER. Key Components for Success


A Comprehensive Approach to Master Data Management Testing

Abstract

Testing plays an important role in the SDLC of any software product, and it is vital in data warehousing projects because of the criticality of the data that is made available to end users. MDM data warehouse testing has not yet received substantial attention. It differs from generic software testing in that its focus is data and information, whereas generic software testing focuses on program code. In this paper I introduce the testing activities for a data warehouse built using the technology Master Data Management, commonly known as MDM, along with the What & How of those testing activities.

How is MDM Testing different from Generic Testing?

Data warehouse testing involves huge data volumes, unlike generic testing, and this significantly impacts performance and productivity. In generic system testing the testable combinations of scenarios are limited, whereas in MDM data warehouse testing the valid scenarios are unlimited, so the system is not completely testable. Data validation is one of the main goals of MDM data warehouse testing because of the significance of the data delivered to end users. Unlike generic testing, MDM data warehouse testing continues after the system release: regression testing is an integral part of it, because it is very difficult to anticipate future requirements and thus the errors that can be encountered in the live system.

Before getting into the What & How of testing MDM, I will briefly explain what MDM is and why it is needed.

What is MDM?

MDM comprises a set of rules, criteria, procedures and tools which define and manage the data of the organisation. MDM is used to analyse information across different source systems in an organisation, resolve data discrepancies, and derive master data for end users. The resulting records are also referred to as Golden Records or True Records.

Why MDM?

- It provides a single source for consistent and accurate master data.
- It reduces overall data maintenance costs by preventing multiple processing in different systems.
- It ensures data consistency and accuracy, which reduces the error-processing costs caused by inconsistent master data.
- It can effectively manage master data in companies with heterogeneous system landscapes containing both SAP and non-SAP systems.
- It offers automated or scheduled processes for data import, creation, update and distribution using workflow.
- It provides rich master data content management for catalogue and web publication (including PDFs/images).
- It supports record-, attribute- and field-level role-based security.
- One MDM server can store multiple master data repositories.
Business Purpose

MDM is used to build a master data hub to analyse information across different source systems, resolve data discrepancies and derive master data. It builds an integration framework so that the master data can be shared across the organisation; the final records generated are referred to as Golden Records/True Records. In the solution described here, IBM Master Data Management Server is used to store client data, with IBM Initiate MDS performing the identity resolution of records. Once duplicate parties have been identified in MDS, a soft link is created in MDM, and client-specific survivorship rules are used to generate the virtual golden record on the fly.
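The on-the-fly golden record described above can be sketched as follows. This is an illustrative simplification only, not the IBM MDM Server implementation; the source names, ranks and attributes are assumptions.

```python
# Hypothetical sketch of on-the-fly golden-record generation: duplicate
# party records are soft-linked by the match process, and survivorship
# picks each attribute from the most trusted source. Source ranks and
# attribute names are illustrative, not the product's actual logic.

def golden_record(records, source_rank):
    """Merge linked duplicate records attribute by attribute.

    For each attribute, the value from the most trusted source
    (lowest rank number) wins; missing values are skipped.
    """
    merged = {}
    # Visit the most trusted sources first
    for rec in sorted(records, key=lambda r: source_rank[r["source"]]):
        for attr, value in rec.items():
            if attr == "source" or value is None:
                continue
            merged.setdefault(attr, value)  # first (best-ranked) value wins
    return merged

# Example: the same client mastered in two systems
ranks = {"CRM": 1, "BILLING": 2}
duplicates = [
    {"source": "BILLING", "name": "J. Smith", "phone": "555-0100"},
    {"source": "CRM", "name": "John Smith", "phone": None},
]
print(golden_record(duplicates, ranks))
# {'name': 'John Smith', 'phone': '555-0100'}
```

Because the merge is computed at read time, correcting a source rank or unlinking a duplicate immediately changes the golden record, which is why link/unlink scenarios must be retested whenever survivorship rules change.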

Testing Activities

Accurate advance test planning is one of the major keys to the success of a system: the earlier an error is detected in the SDLC, the lower the cost of correcting it. From an organisational point of view, several roles are involved in testing a system:

- Analysts: responsible for the conceptual schema, which the testers use to understand user requirements.
- Designers: responsible for the logical schema of the data repositories and the data flows, which are tested for robustness and competence.
- Testers: responsible for developing and executing test plans and test cases.
- Developers: responsible for unit testing.
- Database Administrators: responsible for stress and performance testing, and for setting up test environments.
- Users: responsible for performing functional testing on the GUI.

The testing activities are divided into two parts below: what is tested, and how it is tested.

What is tested?

Testing data quality is the core of MDM testing. MDM data warehouse projects mainly involve checking the correctness of the data loaded by the ETL procedures and accessed by the front-end tools. However, the complexity of MDM data warehouse projects means that testing the design quality is equally significant. The following items are tested:

- Conceptual design: describes the facts, measures and hierarchies of the DataMart from an implementation-independent point of view.
- Logical design: describes the arrangement of the data repository at the core of the DataMart.
- ETL procedures: the complex procedures used to feed the data from the sources.
- Database: the repository where the data is stored.
- Front end: the end-user applications used for analysing results and generating reports.

How is it tested?

The following tests are carried out in MDM data warehouse testing:

- Functional test: verifies that the business requirements are fully and correctly met by the item.
- Usability test: users interact with the item to verify that it is easily usable and understandable.
- Performance test: checks the item's performance under typical workload conditions.
- Stress test: checks how well the item performs under peak and heavy data loads.
- Recovery test: checks how well the item recovers from crashes, hardware failures and similar problems.
- Security test: checks that the data is secure and that the intended functionality is maintained.
- Regression test: checks that the item still functions correctly after a change has occurred.
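For the ETL and database items, many of the functional and regression checks reduce to automated data-validation assertions. The sketch below, with hypothetical record structures and a hypothetical key column, reconciles a source extract against the loaded target, which is the kind of check a regression suite would rerun after every change.

```python
# Illustrative data-validation checks of the kind run in functional and
# regression testing: reconcile record counts, detect duplicate keys,
# and find records dropped between source and target. The key column
# name is an assumption for this example.

def validate_load(source_rows, target_rows, key="client_id"):
    """Return a list of human-readable validation failures (empty = pass)."""
    failures = []
    if len(source_rows) != len(target_rows):
        failures.append(
            f"row count mismatch: source={len(source_rows)} target={len(target_rows)}"
        )
    target_keys = [row[key] for row in target_rows]
    if len(target_keys) != len(set(target_keys)):
        failures.append("duplicate keys in target")
    missing = {row[key] for row in source_rows} - set(target_keys)
    if missing:
        failures.append(f"keys missing from target: {sorted(missing)}")
    return failures

src = [{"client_id": 1}, {"client_id": 2}]
tgt = [{"client_id": 1}]
print(validate_load(src, tgt))
# ['row count mismatch: source=2 target=1', 'keys missing from target: [2]']
```

Because such checks are cheap to rerun, they are well suited to the post-release regression testing that distinguishes MDM data warehouse testing from generic testing.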

What Vs. How in Testing?

The matrix below shows which test types apply to which items. The Conceptual and Logical designs belong to the Analysis & Design phase, while ETL, Database and Frontend belong to Implementation.

              Analysis & Design       Implementation
              Conceptual   Logical    ETL    Database   Frontend
Functional    Yes          Yes        Yes    -          Yes
Usability     Yes          Yes        -      -          Yes
Performance   -            Yes        Yes    Yes        Yes
Stress        -            -          Yes    Yes        Yes
Recovery      -            -          Yes    Yes        -
Security      -            -          Yes    Yes        Yes
Regression    Yes          Yes        Yes    Yes        Yes

Test Coverage

Testing can minimise the probability of a system fault but cannot remove it completely, so measuring the coverage of the tests is required to assess overall system consistency. The first thing needed to measure test coverage is an appropriate definition of the coverage criteria. Different coverage criteria, such as statement coverage, decision coverage and path coverage, are well established in the scope of code testing. The choice of one criterion or another deeply affects the length and cost of the tests, as well as the achievable coverage, so the coverage criteria are chosen by trading off test effectiveness against efficiency. Examples of coverage criteria that we propose for some of the testing activities described above are given in the table below; the expected coverage is expressed with reference to the coverage criterion.

Testing Activity         Coverage Criteria                          Measurement                                Expected Coverage
Fact Test                All expressed information needs must       Percentage of queries in the workload      Partial
                         be tested                                  supported by the conceptual schema
Conformity Test          All data mart dimensions must be tested    Bus matrix sparseness                      Total
Conceptual Schema Test   All facts, dimensions and measures         Conceptual metrics                         Total
                         must be tested
ETL Unit Test            All decision points must be tested         Correct loading of the test data sets      Total
ETL Forced-Error Test    All errors specified by users must         Correct loading of the faulty data sets    Total
                         be tested
Frontend Unit Test       At least one group set must be tested      Correct analysis of a real data set        Total
                         for each attribute

Timeline for Testing

From an organisational point of view, the three main phases of testing are:

- Creating a Test Strategy: the test strategy describes the tests that must be executed and their expected impact on the system requirements.
- Preparing Test Scripts: test scripts enable the execution of the test strategy by detailing the testing steps together with their expected results. The reference databases for testing should be prepared during this phase, and a comprehensive set of workloads should be well defined.
- Executing Test Scripts: a test execution log tracks each test along with its results.

Below is a case study of a Health Care project: the MDM Client Intelligence Program.

Situation

The company's client information is stored in multiple proprietary source applications. Because the same client information is stored across different source systems, this leads to the following issues:

1. Inconsistent and inaccurate data.
2. High maintenance costs due to multiple processing.
3. High costs due to inconsistent data.
4. Duplication of data.
5. Data discrepancy issues.

Information Source: Project MDM Client Intelligence Program (CIP).

Solution Implemented

The MDM Server solution for CIP consists of the integration of IBM Initiate MDS and IBM MDM Server with the hub connector. The solution provides organisation-wide master data that has enabled the company to have a single trusted view of all clients, and to share that single trusted view across multiple applications and systems. It is a foundation for building a person/contract/product master data hub in the future.

Approach

For the initial load, data from the source systems is loaded into the Client Intelligence Program MDM Server using the Rails process, with DataStage moving the data from the different source applications into the SDS, CCD and MDM tables. The required format conversion from the source system model to the SDS, CCD and MDM Server table formats is done by the ETL team. This process also covers address cleansing, business rules and survivorship rules before the data is loaded into the temp tables in MDM.
All the composite transactions are used for the initial load, and the integrity of the initially loaded data is verified before attempting the delta load. All the sources included in MDM CIP are assigned a ranking for the different attributes used to store client information. The customised survivorship rules work on the basis of the source rank and the last update date. Only the MDM Server provides a last update date, and a last update date is not meaningful during the initial load, so for the initial load the order in which the source systems are loaded determines survivorship.
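A minimal sketch of such a customised survivorship rule, under the assumptions above (source rank first, last update date as tie-breaker), could look like this. The source names, ranks and field names are illustrative, not the project's actual implementation.

```python
# Hypothetical survivorship rule: for a given attribute, the value from
# the highest-ranked (lowest rank number) source survives, and ties are
# broken by the most recent last-update date. Names and ranks are
# assumptions for illustration only.

from datetime import date

def surviving_value(candidates, source_rank):
    """Pick one attribute value: best source rank wins, newest update breaks ties."""
    return min(
        candidates,
        key=lambda c: (source_rank[c["source"]], -c["last_update"].toordinal()),
    )["value"]

ranks = {"MDM": 1, "LEGACY": 2}
phones = [
    {"source": "LEGACY", "value": "555-0100", "last_update": date(2013, 5, 1)},
    {"source": "MDM", "value": "555-0199", "last_update": date(2012, 1, 1)},
]
print(surviving_value(phones, ranks))
# 555-0199 (the MDM source outranks LEGACY despite the older date)
```

Survivorship rules testing then consists of asserting, for each combination of source rank and update date, that the value surfaced in Physical MDM matches the rule's expected winner.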

Testing Scope

The following types of tests are performed in the MDM CIP project:

1. Create/Update Testing: new records are created and existing records are updated to verify that the records are loaded correctly and reflected from the source to Physical MDM, taking into account the different rules such as address cleansing, business rules and survivorship rules.
2. Survivorship Rules Testing: the rules are tested to show that they have been applied correctly to the records available in the MDM temp tables and in Physical MDM.
3. Link/Unlink Testing: the survivorship rules are tested when records are linked or unlinked.
4. MDS Initiate UI Testing: the Virtual MDM UI application is tested to verify the UI and to check that a user can successfully add and update records.
5. Data Stewardship UI Testing: Physical MDM is tested to verify that the records are available in Physical MDM with all the rules applied to them.
6. Data Mapping Testing: data is verified in the different databases (SDS, CCD and MDM) to confirm that it has been loaded correctly.
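The Data Mapping test above can be automated along these lines: a minimal sketch, assuming each stage of the pipeline can expose the set of record identifiers it holds, that reports at which stage a record went missing.

```python
# Illustrative automation of the Data Mapping test: verify that every
# record present in the first stage (SDS) also reached the later stages
# (CCD, MDM), and report the identifiers missing from each stage.
# The stage names follow the project; the id sets are hypothetical.

def trace_records(stages):
    """Given {stage_name: set_of_record_ids} in load order, report missing ids.

    Every id present in the first stage is expected in all later stages;
    the result maps each deficient stage to its sorted missing ids.
    """
    names = list(stages)
    expected = stages[names[0]]
    report = {}
    for name in names[1:]:
        missing = sorted(expected - stages[name])
        if missing:
            report[name] = missing
    return report

pipeline = {
    "SDS": {101, 102, 103},
    "CCD": {101, 102, 103},
    "MDM": {101, 103},
}
print(trace_records(pipeline))
# {'MDM': [102]} -- record 102 reached CCD but was dropped before MDM
```

An empty report means the mapping test passes for record completeness; field-level mapping checks would compare attribute values between stages in the same fashion.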