Data-Warehouse & Big Data Testing at The End of the Food Chain




Data-Warehouse & Big Data Testing at The End of the Food Chain
Thomas Abinger, Georg Fischer, September 18th, 2014
Copyright 2014, Tricentis GmbH. All Rights Reserved.

Agenda
DWH vs. Big Data: Differentiation between Data Warehouses and Big Data
Automated Testing in DWH-Projects: Set-Up Test and use Tosca iq
Big Data Testing: Big Data and Automated Test
Summary: Wrap up

Data Warehouses vs. Big Data
Data Warehouse | Big Data
Large Data Volume | Operational Database with a Huge Volume of Data
Data Updates and Data Archiving in Defined Intervals from Online DB | Online Data
Line Oriented | Column Oriented / File Based
SQL | NoSQL non-relational DBs or JSON
Structured Data | Structured and Unstructured Data
Central Architecture | Distributed (HDFS) Architecture

DWH Testing - Overview
[Diagram. Sources: Primary Systems, Big Data. ETL Stages: Extract, Transform, Load. BI Stages: Consolidation, Aggregation, Reporting. Output: Reports. Data layers: Stage 0, Stage ..., Core DWH, Stage ..., Stage n.]

Customer Survey (scale: "I don't agree" to "I agree")
Poor quality of data delivered to the DWH
Limited regression testing of data processing along the DWH/BI
Business departments are highly involved in manual testing of reports

DWH Testing: complex SQL Queries
[Diagram. Sources: Primary Systems, Big Data. ETL Stages: Extract, Transform, Load. BI Stages: Consolidation, Aggregation, Reporting. Output: Reports. A check SQL query runs against each layer from Stage 0 to Stage n.]
Hyper complex
Slow! Limited number of verifications possible
Who can understand and maintain this?

Data-Profiling
Test Attribute (concerns the logic in Stage n)
Product Category: Candy, Frozen Food, Beer, ...
Store: Stadthalle, Airport, Central Station
Business Rules (concern the attributes)
No Frozen Food at the Airport
Profile | Product Category | Store | Revenue | Tolerance of Deviation
Profile 1 | Beer | Stadthalle | 100.000 EUR | +/-10.000 EUR
Profile 2 | Frozen Food (FF) | Airport | 0 EUR | 0 EUR
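The profile idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Tosca implements it: the `Profile` class, the `check_profile` function and the "actual" revenues are hypothetical, and 100.000 EUR is read as one hundred thousand euros. Only the two profiles and their tolerances come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Expected revenue for one attribute combination (hypothetical structure)."""
    name: str
    product_category: str
    store: str
    expected_revenue: float  # EUR
    tolerance: float         # allowed absolute deviation in EUR


def check_profile(profile: Profile, actual_revenue: float) -> bool:
    """Return True if the actual revenue lies within the tolerated deviation."""
    return abs(actual_revenue - profile.expected_revenue) <= profile.tolerance


# Profiles as given on the slide.
PROFILES = [
    Profile("Profile 1", "Beer", "Stadthalle", 100_000.0, 10_000.0),
    Profile("Profile 2", "Frozen Food", "Airport", 0.0, 0.0),
]

# Hypothetical aggregated revenues, as they might be read from Stage n.
actual = {
    ("Beer", "Stadthalle"): 104_500.0,
    ("Frozen Food", "Airport"): 250.0,  # violates "No Frozen Food at the Airport"
}

results = {
    p.name: check_profile(p, actual.get((p.product_category, p.store), 0.0))
    for p in PROFILES
}
print(results)
```

With these made-up figures, Profile 1 stays inside its tolerance, while Profile 2 fails because any frozen-food revenue at the airport breaks the business rule.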

Landscaping
[Photo: Großglockner, Alps. View from south-west: 1=Glocknerwand, 2=Untere Glocknerscharte, 3=Teufelshorn (left) / Glocknerhorn (right), 4=Teischnitzkees, 5=Großglockner, 6=Kleinglockner, 7=Stüdlgrat, 8=Ködnitzkees, 9=Adlersruhe]

Landscaping
[Photo: Großglockner, Alps. View from south-west: 1=Glocknerwand, 2=Untere Glocknerscharte, 3=Teufelshorn (left) / Glocknerhorn (right), 5=Großglockner, 6=Kleinglockner]

Landscaping: Example Billa (Reference)
[Chart: Revenue by Product Group / Store, May 2014. Revenue axis from 0 k to 600 k EUR. Product groups: Cosmetic, Beer, Baked Goods, Fruit. Legend bands in steps of 100 k EUR from 0 k to 600 k EUR.]

Landscaping: Example Billa
[Chart: Revenue by Product Group / Store, May 2015. Revenue axis from 0 k to 600 k EUR. Product groups: Cosmetic, Beer, Baked Goods, Fruit. Legend bands in steps of 100 k EUR from 0 k to 600 k EUR.]
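One way to read the two landscaping charts is as a comparison of revenue bands between a reference period and the current one. The following Python sketch shows that reading under stated assumptions: the function names, the store names and all revenue figures are hypothetical, and only the 100 k EUR band width and the product groups are taken from the charts.

```python
BAND_WIDTH = 100_000  # EUR, matching the 100 k steps in the chart legend


def band(revenue: float) -> int:
    """Map a revenue to its band index: 0 for 0-100 k, 1 for 100-200 k, ..."""
    return int(revenue // BAND_WIDTH)


def landscape(revenues: dict) -> dict:
    """Replace every revenue value by its band index."""
    return {cell: band(value) for cell, value in revenues.items()}


def changed_cells(reference: dict, current: dict) -> dict:
    """Return the cells whose band differs between reference and current data."""
    ref, cur = landscape(reference), landscape(current)
    return {
        cell: (ref.get(cell), cur.get(cell))
        for cell in sorted(ref.keys() | cur.keys())
        if ref.get(cell) != cur.get(cell)
    }


# Hypothetical revenues per (product group, store); not taken from the slides.
reference = {
    ("Beer", "Store A"): 450_000,
    ("Fruit", "Store A"): 120_000,
    ("Cosmetic", "Store B"): 80_000,
}
current = {
    ("Beer", "Store A"): 470_000,
    ("Fruit", "Store A"): 310_000,
    ("Cosmetic", "Store B"): 95_000,
}

diff = changed_cells(reference, current)
print(diff)
```

Small movements inside a band are ignored in this sketch. Only cells that jump to another band (here the hypothetical Fruit cell) show up as a change in the landscape.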

Process Quality: Testing of Business Rules
DWH challenges are exactly the same in testing
Business rules have grown over time: who knows them all?
No contact person available
Data consistency can be tested across the stages

Checks for DWH Testing
Vital Check
Basic checks such as number of data sets and other parameters
Key and join tests
Tool support: Tosca DB Engine, predefined building blocks in Tosca TestCase Design
Delivery Check
Column and dependency checks (business logic)
Tool support: Tosca DB Engine, predefined building blocks in Tosca TestCase Design
Checks with Tosca iq
Speed optimized
Memory optimization for queries
Variant records are shown
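The vital checks named above (record counts, key tests, join tests) can be illustrated with plain SQL. The sketch below does not use Tosca; it uses Python's built-in `sqlite3` module as a stand-in database, and every table name, column name and row is a hypothetical example.

```python
import sqlite3

# In-memory database with two hypothetical layers: Stage 0 and the Core DWH.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stage0_sales (sale_id INTEGER, store_id INTEGER, revenue REAL);
    CREATE TABLE core_sales   (sale_id INTEGER, store_id INTEGER, revenue REAL);
    CREATE TABLE core_store   (store_id INTEGER, name TEXT);
    INSERT INTO stage0_sales VALUES (1, 10, 5.0), (2, 10, 7.5), (3, 20, 1.0);
    INSERT INTO core_sales   VALUES (1, 10, 5.0), (2, 10, 7.5), (3, 20, 1.0);
    INSERT INTO core_store   VALUES (10, 'Stadthalle'), (20, 'Airport');
""")


def scalar(sql: str) -> int:
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


def row_counts_match() -> bool:
    """Basic check: the number of data sets is the same in both layers."""
    return (scalar("SELECT COUNT(*) FROM stage0_sales")
            == scalar("SELECT COUNT(*) FROM core_sales"))


def duplicate_keys() -> int:
    """Key test: count sale_id values that occur more than once."""
    return scalar(
        "SELECT COUNT(*) FROM (SELECT sale_id FROM core_sales "
        "GROUP BY sale_id HAVING COUNT(*) > 1)"
    )


def orphan_rows() -> int:
    """Join test: count sales rows without a matching store."""
    return scalar(
        "SELECT COUNT(*) FROM core_sales s "
        "LEFT JOIN core_store d ON d.store_id = s.store_id "
        "WHERE d.store_id IS NULL"
    )


print(row_counts_match(), duplicate_keys(), orphan_rows())
```

Each check is a short, single-purpose query, which keeps it easier to understand and maintain than one large check query per stage.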

TOSCA iq: Operating principle
[Diagram: Profiles (test cases TC1 to TC10) are passed through TOSCA iq to Physical Queries. Individual test cases are marked "Causes Error"; callout for TC9: Record set ID xyz123456.]

Big Data
"Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it..." [Dan Ariely; Facebook posting; January 6th, 2013]

Big Data Market Potential
McKinsey: Multi-Million USD in the following areas:
Source: Big data: The next frontier for innovation, competition and productivity, McKinsey Global Institute, October 2011

Big Data: Example Use Cases
Sport Tracker (GPS, Pulse Rate, Blood Pressure)
Medical Data Logger (Elderly Care at Home)
Connected Cars

Big Data feeding Data Warehouses
[Diagram: Big Data consists of Unstructured Data (potential starting point of Big Data analysis) and Structured Data (potential starting point or intermediate stage of Big Data analysis). The results feed Other Use Cases and the DWH.]

Global data generated per year (Exabyte)
Year | Exabyte
2005 | 130
2010 | 1227
2012 | 2837
2015 | 8591
2020 | 40026
Source: Statista 06/2014

Structured and unstructured data
90% of the global data is unstructured: Pictures, Music, Videos, Social Media Content

Leading questions for testing
Which types of data are processed in the context of Big Data?
Unstructured data is used indirectly, not directly.
The analysis starts with structured data that is generated in an interpretation step.
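The interpretation step mentioned on this slide can be pictured as a function that turns free text into a structured record, which is then testable with classic functional tests. The Python sketch below is purely illustrative: the message format, the regular expression and the field names are assumptions, loosely modelled on the sport tracker use case.

```python
import re
from typing import Optional

# Hypothetical pattern for free-text tracker messages such as
# "Morning run, pulse 142 bpm at 47.0745, 12.6939".
PATTERN = re.compile(
    r"pulse\s+(?P<pulse>\d+)\s*bpm.*?"
    r"(?P<lat>-?\d+\.\d+)\s*,\s*(?P<lon>-?\d+\.\d+)",
    re.IGNORECASE,
)


def interpret(message: str) -> Optional[dict]:
    """Turn an unstructured message into a structured record, or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return {
        "pulse_bpm": int(match["pulse"]),
        "lat": float(match["lat"]),
        "lon": float(match["lon"]),
    }


record = interpret("Morning run, pulse 142 bpm at 47.0745, 12.6939")
print(record)
print(interpret("no data today"))
```

Because the function has a defined input and output, its behaviour can be verified with ordinary test cases before the structured records enter any further analysis.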

Focus Big Data Testing
Data Warehouse | Big Data
Large Data Volume | Operational Database with a Huge Volume of Data
Data Updates and Data Archiving in Defined Intervals from Online DB | Online Data
Line Oriented | Column Oriented / File Based
SQL + Tosca iq | NoSQL non-relational DBs or JSON
Structured Data | Structured and Unstructured Data
Central Architecture | Distributed (HDFS) Architecture

Summary
Data Warehouse
"The End of the Food Chain": Data Quality as an additional Risk Factor
Data and Process Quality can be tested using Profiling and Data Landscaping
Big Data
New Technologies, promising Future
Classical functional Testing to analyze the Data
Source for Data Warehouse: Classic Methods for Monitoring the Data Quality

Thank You! Now it's your turn: Questions & Answers