Ensure Accuracy in Data Transformation with Data Testing Framework (DTF)
AUTHOR: Sourav Das Gupta

Contents
Abstract
Need for a Data Testing Framework
DTF Overview
What is DTF?
Execution Steps
Building Blocks of DTF
Software Requirements
Hardware Requirements
Benefits Offered by DTF
Differentiators
Conclusion
References
Abbreviations and Acronyms
About the Author
About L&T Infotech

L&T Infotech Proprietary

Abstract

Information stored in a data warehouse is critical to organizations for decision making and predictive analysis. The huge volume of data loaded into a data warehouse makes exhaustive manual comparison of data impractical. Existing quality tools are either manual or otherwise limited, and do not cover all aspects of data warehouse testing. A holistic solution is therefore required to test high-volume applications built on Data Warehouse (DW) or Business Intelligence (BI) architecture.

L&T Infotech's Data Testing Framework (DTF) is a comprehensive, easy-to-use tool designed to test high-volume, data-centric DW/BI applications. This paper discusses the need for such a framework, along with its features, execution steps, and software and hardware requirements. The benefits offered by DTF are briefly discussed towards the end of the paper.

Need for a Data Testing Framework

Testing plays a high-stakes role in helping businesses make insightful, intelligent decisions from available information. Given the growing complexity of the Data Warehousing and Business Intelligence space in the IT industry, L&T Infotech has developed a cost-effective solution to address the following challenges faced by clients:

- Unavailability of comprehensive testing tools
- Varied skill sets required to understand various file formats
- Voluminous data from heterogeneous sources
- 100% manual data validation is not feasible
- Manual comparison of data is tedious and error-prone

The Data Testing Framework is a testing framework that integrates easily with users' needs for different types of data validation. It enables users to compare and validate data across various types of data sources and databases.

DTF Overview

DTF is L&T Infotech's open-source data validation and comparison framework that allows a user to perform data-centric testing. Its simple User Interface (UI) enables users to configure the tool to their testing needs, and the framework provides detailed results for each test case, enabling faster analysis of test results.

What is DTF?

DTF has been developed by synthesizing years of experience in database testing. It can be used to compare data from two different data feeds after data migration or reconciliation. The source and target data feeds can be a database table, a database query, a flat file, or a CSV, PSV or Excel file. DTF has a proven track record of comparing high volumes of data and supports the leading databases in the market.

DTF can be configured to perform the following types of comparisons:

- File to file
- File to database table
- Database to database
- Query output to file
- Database table to database table
- Database table to query output
- Database table to fixed-length file
- Database table to XML
- Database table to stored procedure output

DTF provides a user-friendly UI for testers from non-technical backgrounds, allowing them to configure the tool to operate in different modes for different types of comparisons.

Execution Steps

Common test scenarios required for data conversion testing can be broadly classified into the following categories:

- Table/schema validation (including verification of indexes, stored procedures and triggers)
- Count and data validation
- Data character set conversion
- File processing (where the source is a file)
- Batch job and business rule validation
- Interface testing
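DTF's internals are not published, so the following Python sketch is purely illustrative: the function names and result keys are this author's assumptions, not DTF's actual API. It shows the essence of a file-to-file comparison, keying each feed on a primary-key column and deriving the mismatched and extra records that such a comparator reports.

```python
import csv
from typing import Dict

def load_rows(path: str, key_field: str, delimiter: str = ",") -> Dict[str, dict]:
    """Read a delimited file (CSV by default, '|' for PSV) keyed on the primary-key column."""
    with open(path, newline="") as fh:
        reader = csv.DictReader(fh, delimiter=delimiter)
        return {row[key_field]: row for row in reader}

def compare_feeds(source: Dict[str, dict], target: Dict[str, dict]) -> dict:
    """Return keys present in only one feed, plus keys whose rows differ."""
    return {
        "extra_in_source": sorted(source.keys() - target.keys()),
        "extra_in_target": sorted(target.keys() - source.keys()),
        # Rows present in both feeds but with differing column values
        "mismatched": sorted(k for k in source.keys() & target.keys()
                             if source[k] != target[k]),
    }
```

A real comparator must also handle feeds too large for memory and keys that repeat; the set-difference idea stays the same.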

Figure 1: DTF Process

The process for data testing using DTF is as follows:

1. Analyze - Study the data models of the source and target databases to understand the conversion process. If the source is a flat file, analyze the file's structure and its mapping with the target database.

2. Data Mapping - The mapping between the source and target databases and tables needs to be configured in DTF. If there are no schema changes, mapping the source and target at database level is enough. There may be scenarios where the data of one source table is distributed across multiple target tables, or the data of multiple source tables is merged into one target table. In such cases, the source-to-target mapping must be configured in DTF at column level.

3. Test Case Creation - Test cases for various data comparison and validation scenarios can be created in DTF using the configured data mappings. DTF also lets the user create test suites and execute multiple test cases in a single framework run.

4. Execute & Report - A DTF test case or test suite can be executed with different run-time execution options, including:

- Trim data before comparison
- Ignore case in comparison
- Database schema comparison
- Full database comparison
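The column-level mapping of step 2, where one source table's columns are distributed across multiple target tables, can be pictured as a small lookup table. All names below (tables, columns, the `map_row` helper) are hypothetical, chosen only to illustrate the idea; DTF's actual mapping configuration is not published.

```python
# Hypothetical column-level mapping: (source table, source column) -> (target table, target column).
MAPPING = {
    ("src_customer", "cust_name"): ("tgt_customer", "name"),
    ("src_customer", "cust_city"): ("tgt_address", "city"),
}

def map_row(source_table: str, row: dict) -> dict:
    """Distribute one source row across target tables according to MAPPING."""
    out: dict = {}
    for col, value in row.items():
        target = MAPPING.get((source_table, col))
        if target is None:
            continue  # columns with no mapping entry are skipped
        tgt_table, tgt_col = target
        out.setdefault(tgt_table, {})[tgt_col] = value
    return out
```

With such a table in place, a comparator knows exactly which target cell each source value should land in, which is what makes a one-to-many conversion testable.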

Once the execution is complete, a detailed report is generated covering:

- Summary report
- Mismatched records
- Extra records in source
- Extra records in target

All reports are generated as spreadsheets, which are detailed and convenient to analyze.

Building Blocks of DTF

DTF comprises the following three blocks:

Figure 2: DTF Building Blocks

1. DTF Util Manager - Responsible for reading and writing data to files and databases, and for any data conversion required by the internal DTF logic. It ensures that the source and target data are in the same format before the data goes to the DTF Compare Engine. It implements the logic for all activities other than the actual data comparison and report generation.
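The Util Manager's job of putting both feeds into the same format before comparison can be sketched as a cell-level normalization step; the "Trim Data" and "Ignore case" execution options mentioned earlier map naturally onto it. This is an assumption about how such normalization could work, not DTF's actual implementation.

```python
def normalize(value, trim: bool = True, ignore_case: bool = False) -> str:
    """Coerce a cell to text so values from different feeds compare consistently.

    trim        -- corresponds to a 'Trim Data before Comparison' option
    ignore_case -- corresponds to an 'Ignore case in Comparison' option
    """
    text = "" if value is None else str(value)  # NULL and missing cells compare as empty
    if trim:
        text = text.strip()
    if ignore_case:
        text = text.lower()
    return text
```

Normalizing before comparing avoids spurious mismatches such as `" ABC "` in a fixed-length file versus `"abc"` in a database column.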

2. DTF Compare Engine - Responsible for the actual comparison of source and target data. If the data volume is large, it divides the data into chunks of a predefined size and compares them. Chunk formation and data comparison run in parallel for faster comparison. The engine passes the details of each comparison run to the DTF Report Manager.

3. DTF Report Manager - Responsible for generating DTF reports from the comparison results supplied by the DTF Compare Engine. It generates reports in Excel format, in two categories: summary reports and detailed reports. It uses the comparison execution time as a reference, creating a folder with that name to store the reports for each execution.

In addition to the three primary blocks, DTF has building blocks representing the different data feeds:

- Excel files
- Flat files
- Database tables
- Database queries

The Excel Config file block represents the configuration input Excel files; a user typically lists the parameters for the source-to-destination comparison in these configuration files. The DTF Report block represents the DTF summary and detailed reports generated after a comparison run.

Software Requirements

- JRE 1.6
- Microsoft Office
- Windows operating system

Hardware Requirements

- 1 GB RAM or greater
- 3 GHz CPU
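The Compare Engine's chunk-and-parallelize strategy can be sketched with standard-library threading. This is a minimal illustration of the idea, assuming positional row pairing and in-memory data; the function names and chunk sizes are invented for the example and do not describe DTF's code.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

def chunks(rows, size):
    """Yield successive fixed-size blocks from an iterable."""
    it = iter(rows)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block

def compare_chunk(pairs):
    """Return the indices of (source, target) row pairs that differ."""
    return [i for i, (src, tgt) in pairs if src != tgt]

def parallel_compare(source_rows, target_rows, chunk_size=1000, workers=4):
    """Pair rows positionally, split into chunks, and compare the chunks in parallel."""
    indexed = list(enumerate(zip(source_rows, target_rows)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(compare_chunk, chunks(indexed, chunk_size))
    # Flatten per-chunk results into one sorted list of mismatch positions.
    return sorted(i for block in results for i in block)
```

Chunking bounds memory use per worker, and running chunks concurrently is what lets a comparison of a high-volume feed finish in a fraction of the serial time.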

Benefits Offered by DTF

- Very cost-effective, as it is built on open-source tools
- Detailed reports help in identifying problems
- Reduced test execution effort
- Reusable across different data warehousing projects
- Low maintenance, owing to the modular structure of the framework
- Works with different types of data feeds
- Easier result analysis through Excel sheets

Differentiators

- Simple test script creation and execution
- Increased tester productivity with improved quality of testing
- Cost savings of 30%
- Compressed testing cycle

Conclusion

DTF, an open-source-based framework that supports the leading databases currently available in the market, creates detailed reports that help organizations identify defects and take corrective action. Enterprises are thus able to achieve cost and effort savings with enhanced test coverage through automation, and accurate, timely information is readily available to support informed decisions.

References

No external references.

Abbreviations and Acronyms

DTF - Data Testing Framework
DW - Data Warehouse
BI - Business Intelligence
UI - User Interface
CSV - Comma-Separated Values
PSV - Pipe-Separated Values

About the Author

Sourav Das Gupta, L&T Infotech

Sourav Das Gupta is an ISTQB-certified tester with over 6 years' experience in the testing domain. He is currently associated with L&T Infotech as a Test Lead. He has experience in the implementation of QA processes, end-to-end testing for Business Intelligence (BI) applications, database testing and web-based application testing.

About L&T Infotech

Larsen & Toubro Infotech (L&T Infotech), one of the fastest-growing IT services companies, is part of the USD 11.7 billion L&T Group, India's Best Managed Company, with a presence in engineering, manufacturing and financial services. It is ranked by NASSCOM as the 8th largest Indian software and services exporter, and 7th in the DATAQUEST-IDC Top 20 IT Best Employers Survey 2010. It offers comprehensive, end-to-end technology solutions and services in banking & financial services, energy & petrochemicals, insurance, manufacturing (automotive, consumer packaged goods/retail, industrial products) and product engineering services (telecom). Its horizon is filled with the promise of new and cutting-edge offerings in the technology space, including the launch of an end-to-end cloud computing adoption toolkit and cloud advisory consulting services, a new service providing enterprise mobility solutions to clients, and the launch of a smart access platform. This is in addition to offerings in media & entertainment and life sciences & healthcare. Business solutions are also offered in SAP, Oracle, infrastructure management, testing, consulting, domain services, business intelligence/data warehousing, legacy modernization, applications outsourcing, architecture consulting, enterprise integration, SOA, systems integration, PLM and software as a service. In addition to delivering solutions to the vertical industries it serves, L&T Infotech maintains its own intellectual property, the flagship IP being its wealth management platform, Unitrax.
L&T Infotech's unique brand differentiation is Business-to-IT Connect, which enables the Company to convert the business knowledge it acquires into a winning edge for clients, leading to faster time to market. For more information, visit us at www.lntinfotech.com or email us at info@lntinfotech.com.