YOUR SUCCESS IS OUR FOCUS
Whitepaper
Published on: January 2009
Author: BIBA PRACTICE
© 2009 Hexaware Technologies. All rights reserved.
Table of Contents
1. Introduction
2. Data Warehouse - Typical pain points
3. Hexaware Solution
4. DWH Testing - Why is it different?
5. DWH Testing - What needs to be tested?
6. DWH Testing - Customer benefits
1. Introduction

Data Warehouses (DW) are integral to any organisation that wants greater insight into its business. Many organisations have invested considerable time, money and effort in building the right information warehouses using modern tools and technologies. Building or enhancing a Data Warehouse is a complex project that requires meticulous planning and execution with the right resources and tools. Like any IT project, a DW project follows a formal process, and within that execution model a comprehensive testing phase is a key success factor.

DW testing is essential for reliable Business Intelligence (BI). DW systems rely on data drawn from multiple source systems, yet general software testing approaches do not focus on data. DW has no genuine testing tradition, and that is one of the reasons why BI projects fail.

2. Data Warehouse - Typical pain points

- Poor data quality, which alienates users from using the DW for information
- Unavailability of data from the right sources
- Poor data architecture
- Performance problems with the Data Warehouse/BI environment
- Slow-running reports and analysis queries
- Deviation of report formats during migration/upgrade
- Validation of reports during BI platform upgrade/migration
- Not knowing the real cause of a problem, which could be:
  - Your network
  - The application server or the application
  - The database design
  - The infrastructure it's running on
  - Poor data from your source systems

3. Hexaware Solution

To address these issues and more, Hexaware offers a focused solution that combines its DWH/BI and testing knowledge. Hexaware's DWH/BI Testing addresses the challenge of keeping your Data Warehouse at the peak performance and scalability levels demanded by information users. The solution helps develop an effective way to predict DWH behaviour and performance under realistic stress conditions. When problems or bottlenecks occur, it helps you quickly diagnose and fix root causes.
Hexaware's proven methodology for Data Quality Improvement uses Six Sigma techniques. Hexaware's Six Sigma Data Quality Framework can be applied to reporting systems and ETL processes to identify root causes such as duplicate records, incomplete data, inconsistent entries and unexpected entries. Hexaware's ACE (Automated Comparison across Environments) is a report comparison tool that helps compare data and formatting across reports.

4. DWH Testing - Why is it different?

Testing fundamentals and core principles do not change for a DWH/BI application or environment; however, the approach, the resources needed and the methods for comprehensive testing differ from normal testing practice. Testing a web or screen-based application involves checking values at the output level and needs no programming support, so a typical black-box testing approach is effective. Testing a Data Warehouse application, by contrast, needs programming (scripting) skills and design review skills because there are no screens to test. A thorough knowledge of database design is mandatory to review and verify DWH designs, since the design can affect the performance of the DWH and of its downstream reporting and other applications.
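Because there are no screens to test, checks of this kind are written as scripts. As a minimal sketch (not Hexaware's framework), the data quality problems named in Section 3, that is duplicate records, incomplete data and unexpected entries, could be counted with a few SQL queries driven from Python. The `customers` table, its columns, the allowed country codes and the helper names `profile_table` and `build_demo_db` are all assumptions made for illustration.

```python
import sqlite3


def profile_table(conn, table, key_column, required_columns, allowed_values):
    """Count duplicate keys, missing values and unexpected entries in one table.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted test configuration, never from user input.
    """
    # Duplicates: every row beyond the first for a repeated key value.
    duplicates = conn.execute(
        f"SELECT COALESCE(SUM(n - 1), 0) FROM ("
        f"SELECT COUNT(*) AS n FROM {table} "
        f"GROUP BY {key_column} HAVING COUNT(*) > 1)"
    ).fetchone()[0]

    # Completeness: required columns that are NULL or blank.
    missing = {
        col: conn.execute(
            f"SELECT COUNT(*) FROM {table} "
            f"WHERE {col} IS NULL OR TRIM({col}) = ''"
        ).fetchone()[0]
        for col in required_columns
    }

    # Unexpected entries: non-NULL values outside the agreed code list.
    unexpected = {}
    for col, allowed in allowed_values.items():
        marks = ", ".join("?" for _ in allowed)
        unexpected[col] = conn.execute(
            f"SELECT COUNT(*) FROM {table} "
            f"WHERE {col} IS NOT NULL AND {col} NOT IN ({marks})",
            list(allowed),
        ).fetchone()[0]

    return {"duplicates": duplicates, "missing": missing, "unexpected": unexpected}


def build_demo_db():
    """Create an in-memory table with deliberately flawed, made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER, name TEXT, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1, "Asha", "IN"),
            (2, "Ben", "US"),
            (2, "Ben", "US"),   # duplicate key
            (3, None, "UK"),    # missing name
            (4, "Dee", "XX"),   # country code outside the allowed list
            (5, "", "IN"),      # blank name
        ],
    )
    return conn


if __name__ == "__main__":
    report = profile_table(
        build_demo_db(), "customers", "customer_id",
        required_columns=["name"],
        allowed_values={"country": ["IN", "US", "UK"]},
    )
    print(report)
```

In practice the same function would be pointed at a staging or warehouse connection, and any non-zero count would be reported as a test failure.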
5. DWH Testing - What needs to be tested?

A DWH project can be viewed as a simple sequence of data transformations, changes and aggregations through a set of processes. But this simple chain of data movement complicates testing. For every transformation of a dataset, testing must ensure that the transformation is right by including the transformation logic in the test scripts. With no front-end screens, there is no readily available user interface to inspect and validate visually, so most test scripts have to be created as back-end scripts (say, SQL queries). DWH testing is therefore more intensive and more programmatic than regular application testing, and creating test scripts requires extensive domain knowledge and familiarity with DWH concepts.

A typical DWH implementation has three core modules:

- ETL (Extraction, Transformation and Loading, in some cases with data quality checks built in)
- Data Warehouse (multiple databases such as ODS, EDW, Data Marts, etc.)
- Reporting and Analysis packs

These three modules are interlinked through the organisation's networks, and a single implementation can use multiple technology products from multiple vendors.
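The requirement that test scripts include the transformation logic can be sketched as follows. The test restates a mapping rule in SQL, independently of the ETL code, and compares the rows it expects with the rows actually loaded. The tables `src_orders` and `fact_orders`, the mapping rule itself and the helper names are hypothetical; the sketch uses Python's built-in `sqlite3` module only so that it runs without a warehouse.

```python
import sqlite3

# Hypothetical mapping rule, restated independently of the ETL code:
#   total_cents = quantity * unit_price_cents
#   status_code = 1 for 'OPEN', 2 for 'CLOSED', 0 otherwise
EXPECTED_SQL = """
SELECT order_id,
       quantity * unit_price_cents AS total_cents,
       CASE status WHEN 'OPEN' THEN 1 WHEN 'CLOSED' THEN 2 ELSE 0 END AS status_code
FROM src_orders
"""
ACTUAL_SQL = "SELECT order_id, total_cents, status_code FROM fact_orders"


def transformation_mismatches(conn):
    """Return (rows expected but not loaded, rows loaded but not expected)."""
    expected_only = sorted(
        conn.execute(f"{EXPECTED_SQL} EXCEPT {ACTUAL_SQL}").fetchall()
    )
    actual_only = sorted(
        conn.execute(f"{ACTUAL_SQL} EXCEPT {EXPECTED_SQL}").fetchall()
    )
    return expected_only, actual_only


def build_demo_db():
    """Create made-up source and target tables with one wrongly mapped row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE src_orders "
        "(order_id INTEGER, quantity INTEGER, unit_price_cents INTEGER, status TEXT)"
    )
    conn.execute(
        "CREATE TABLE fact_orders "
        "(order_id INTEGER, total_cents INTEGER, status_code INTEGER)"
    )
    conn.executemany(
        "INSERT INTO src_orders VALUES (?, ?, ?, ?)",
        [(1, 2, 500, "OPEN"), (2, 1, 1999, "CLOSED"), (3, 3, 100, "OPEN")],
    )
    conn.executemany(
        "INSERT INTO fact_orders VALUES (?, ?, ?)",
        # Order 2 was loaded with status_code 1 instead of 2.
        [(1, 1000, 1), (2, 1999, 1), (3, 300, 1)],
    )
    return conn


if __name__ == "__main__":
    expected_only, actual_only = transformation_mismatches(build_demo_db())
    print("expected but not loaded:", expected_only)
    print("loaded but not expected:", actual_only)
```

Running the comparison in both directions matters: one direction catches rows that were lost or transformed incorrectly, the other catches rows that appear in the target without a valid source.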
[Figure: typical DWH implementation. Operational systems (ERP, CRM, flat files) feed ETL (Extraction, Transformation, Loading) into the Data Warehouse (metadata, summary data, raw data), which serves OLAP analysis, reporting and data mining. The figure lists the testing focus for each layer.]

Data Integration Layer (Source Data)
- Data format
- Availability and missing data
- Business data
- Transformation
- Basic data cleansing
- Data frequency
- Full vs incremental load verification
- Multi source collaboration

Data Warehouse Layer (Target Data)
- Loss of data
- Incomplete data
- Inaccurate formats and data
- Aggregate data validations
- DB design reviews
- Dimensional models
- Master data validation
- Functional complexity
- Performance

Reporting Layer (Presentation Data)
- Access control verifications
- User access privileges
- Report variable calculation checks
- Functionality reviews
- Web portal checks
- Dashboard values and presentation
- System configuration checks
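Two of the checks named in the figure above, loss of data and aggregate data validations, can be sketched as a simple reconciliation between a source table and its warehouse copy. This is a minimal illustration, not a complete test suite. The tables `src_sales` and `dw_sales`, their columns and the helper names `reconcile` and `build_demo_db` are assumptions.

```python
import sqlite3


def reconcile(conn, source, target, key, measure):
    """Compare row counts and totals, and list source keys missing from target.

    Identifiers are interpolated into the SQL text and must come from
    trusted test configuration.
    """
    def count_and_total(table):
        return conn.execute(
            f"SELECT COUNT(*), COALESCE(SUM({measure}), 0) FROM {table}"
        ).fetchone()

    src_count, src_total = count_and_total(source)
    tgt_count, tgt_total = count_and_total(target)

    # Loss of data: keys present in the source with no match in the target.
    missing_keys = [
        row[0]
        for row in conn.execute(
            f"SELECT s.{key} FROM {source} AS s "
            f"LEFT JOIN {target} AS t ON s.{key} = t.{key} "
            f"WHERE t.{key} IS NULL ORDER BY s.{key}"
        )
    ]
    return {
        "count_diff": src_count - tgt_count,
        "total_diff": src_total - tgt_total,
        "missing_keys": missing_keys,
    }


def build_demo_db():
    """Create made-up source and target tables where one sale was not loaded."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src_sales (sale_id INTEGER, amount_cents INTEGER)")
    conn.execute("CREATE TABLE dw_sales (sale_id INTEGER, amount_cents INTEGER)")
    conn.executemany(
        "INSERT INTO src_sales VALUES (?, ?)",
        [(1, 10000), (2, 25000), (3, 7500), (4, 30000)],
    )
    conn.executemany(
        "INSERT INTO dw_sales VALUES (?, ?)",
        [(1, 10000), (2, 25000), (4, 30000)],  # sale 3 is missing
    )
    return conn


if __name__ == "__main__":
    print(reconcile(build_demo_db(), "src_sales", "dw_sales", "sale_id", "amount_cents"))
```

Amounts are held as integer cents here so that totals compare exactly; with floating-point measures a tolerance would be needed.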
Key data-oriented tests in a typical DW testing assignment include:

- Comparison between source and target data sets (tables)
- Review of record counts and totals (for numeric values) across all tables and fields
- Review and verification of all mapping documents and related business rules and algorithms
- Review of load strategy (incremental vs. full)
- Handling clean-up of source systems and resetting data fields for a new load
- Dimensional data validation to check data integrity
- Data quality validation, which might involve field-by-field value checks to ensure completeness of the data load
- Data model and architecture review to check whether the architecture is fine-tuned for maximum performance
- Network checks to ensure performance of web portals and web reporting
- Report/dashboard variable computation checks to ensure consistent and correct reporting
- BI testing in an integrated environment, covering inbound and outbound interfaces, migrations/conversions and performance testing

6. DWH Testing - Customer benefits

The key customer benefit is end-user confidence in the data, which would dramatically improve adoption of BI in the organisation. DW testing also helps to implement a DW/BI project quickly, reducing project time and improving ROI.
To learn more, visit http:///wp-bi.htm

Address
1095 Cranbury South River Road, Suite 10, Jamesburg, NJ 08831.
Main: 609-409-6950
Fax: 609-409-6910

Safe Harbor
Certain statements in this whitepaper concerning our future growth prospects are forward-looking statements, which involve a number of risks and uncertainties that could cause actual results to differ materially from those in such forward-looking statements. The risks and uncertainties relating to these statements include, but are not limited to, risks and uncertainties regarding fluctuations in earnings, our ability to manage growth, intense competition in IT services including those factors which may affect our cost advantage, wage increases in India, our ability to attract and retain highly skilled professionals, time and cost overruns on fixed-price, fixed-time frame contracts, client concentration, restrictions on immigration, our ability to manage our international operations, reduced demand for technology in our key focus areas, disruptions in telecommunication networks, our ability to successfully complete and integrate potential acquisitions, liability for damages on our service contracts, the success of the companies in which Hexaware has made strategic investments, withdrawal of governmental fiscal incentives, political instability, legal restrictions on raising capital or acquiring companies outside India, unauthorized use of our intellectual property, and general economic conditions affecting our industry.