Test Data Management Concepts

BizDataX is an Ekobit brand
Executive Summary

Test Data Management (TDM), as part of the quality assurance (QA) process, is more than ever in focus among IT organizations as they realize the benefits of a structured TDM approach: shorter testing cycles, lower operating and storage costs, better test quality and better test case coverage. Furthermore, TDM plays an important role in maintaining compliance with data protection policies.

What organizations look for is an end-to-end TDM solution encompassing data protection policy management, sensitive data discovery, powerful data masking, synthetic data generation and data subsetting capabilities, an enterprise-scale runtime environment, and extensive reporting and integration capabilities designed for highly complex scenarios.

The BizDataX Test Data Management solution meets these requirements by implementing a number of innovative concepts in tools designed to support different TDM user roles. Business stakeholders use BizDataX Portal to manage policies, discover sensitive data within data stores and obtain reports. Solution implementers use BizDataX Designer to implement policy-based data transformation rules in a visual workflow editor, and the operations team uses BizDataX Runtime to create test data sets on demand.

The BizDataX solution fits seamlessly into a testing process of any complexity, increasing its quality and efficiency while enabling significant cost savings.
Introduction

Test Data Management (TDM) is an integral part of the testing process. It supports the process in all of its phases and enables fast provisioning of test data at the lowest possible cost while staying compliant with industry and data privacy regulations. TDM also contributes to other disciplines such as data governance, knowledge management, testing environment management and business intelligence, helping to increase process efficiency and overall data quality and to minimize risks.

Here are the key concepts of a well-thought-out test data management solution:

- Near real data
- Data Protection Policy Management
- Sensitive Data Discovery
- Powerful and flexible test data rules designer
- Support for key test data generation concepts, such as data masking, data subsetting and synthetic data generation
- Fast test data provisioning
- Support for a variety of source and target data repositories
- Support for data analysis
- Test data project configuration management
- Support for third party test management and test automation tools
- Enterprise features (scalability, role-based security, ...)
- Documenting and archiving capabilities

The goal of this document is to help you better understand each of these key concepts and to show how the BizDataX Test Data Management solution implements them.

Figure: BizDataX Portal
Near real data

The concept of near real, or realistic enough, data means that test data should resemble, in terms of data quality and semantics, the data in the production databases (or the data expected in real-world scenarios) as closely as possible. This is a key concept in a test process because it ensures that results obtained in the controlled test environment carry over to real-world scenarios.

BizDataX employs many data generation and data masking techniques and algorithms that enable the generation of near real data. These include lists of replacement values, national identification number generators, generators of financial values, generators of phone numbers and emails, and many others.

Figure: BizDataX workflow example
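To make the idea of near real data concrete, here is a minimal sketch of a generator for credit card numbers that pass the standard Luhn checksum, so that an application validating card numbers accepts the synthetic values. This is an illustration only and not BizDataX code; the function names, the default prefix and the length are assumptions made for this example.

```python
import random


def luhn_check_digit(partial: str) -> int:
    """Return the Luhn check digit for a digit string that lacks one."""
    total = 0
    # Walk right to left; the rightmost digit of the partial number
    # sits next to the check digit and is therefore doubled.
    for i, ch in enumerate(reversed(partial)):
        digit = int(ch)
        if i % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def is_luhn_valid(number: str) -> bool:
    """Check a full number (including its check digit) against Luhn."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


def generate_card_number(rng: random.Random, prefix: str = "4", length: int = 16) -> str:
    """Generate a synthetic, Luhn-valid number (hypothetical helper)."""
    body_length = length - len(prefix) - 1
    body = prefix + "".join(str(rng.randint(0, 9)) for _ in range(body_length))
    return body + str(luhn_check_digit(body))


if __name__ == "__main__":
    print(generate_card_number(random.Random(42)))
```

Passing a seeded `random.Random` keeps the generated values reproducible between runs, which is convenient when the same test data set has to be recreated.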
Data Protection Policy Management

Using production data to generate test data is a common and often preferred method of test data provisioning. Along with its many benefits, it raises regulatory data protection concerns that need to be taken into account. Efficient policy management supports setting up policies with the data transformation rules that must be applied to sensitive production data in order to comply with data protection and corporate regulations.

An important part of the BizDataX Portal is the Policy Management module, where users can set up an arbitrary number of policies and assign data transformation and masking rules. Stored policy items can be used as input for the Sensitive Data Discovery module and for compliance reports.

Figure: BizDataX Portal, Policy Management
Sensitive Data Discovery

In highly complex IT landscapes with many different data stores, it takes significant effort to find all data that is sensitive in terms of data privacy or business importance. Sensitive data can be hidden anywhere in the data stores: in well-named and not so well-named database columns, in remarks and info fields, and in structured or unstructured content. A powerful Sensitive Data Discovery module is therefore a great help. Discovered sensitive data is handled in the implementation phase as defined by the data protection policies.

The BizDataX Sensitive Data Discovery module supports several search methods: searching the metadata dictionary, searching within the production data using data samples or regular expressions, and searching for sensitive data foreign key patterns.

Figure: BizDataX Portal, Sensitive Data Discovery
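Two of the search methods named above, matching column names against a metadata dictionary and matching sampled values against regular expressions, can be sketched as follows. This is a simplified illustration and not the BizDataX implementation; the patterns, the category names and the `min_hit_ratio` threshold are assumptions chosen for the example.

```python
import re

# Hypothetical metadata dictionary: category -> column-name pattern.
COLUMN_NAME_PATTERNS = {
    "email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "phone": re.compile(r"phone|mobile", re.IGNORECASE),
    "name": re.compile(r"(first|last|sur)[-_]?name", re.IGNORECASE),
}

# Hypothetical value patterns applied to sampled production data.
VALUE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def discover(columns: dict[str, list[str]], min_hit_ratio: float = 0.5) -> dict[str, set[str]]:
    """Map each suspicious column to the sensitive categories found in it.

    `columns` maps a column name to a sample of its values. A value
    pattern counts as a finding when at least `min_hit_ratio` of the
    sampled values match it.
    """
    findings: dict[str, set[str]] = {}
    for column, samples in columns.items():
        categories = set()
        # Method 1: the column name itself hints at sensitive content.
        for category, pattern in COLUMN_NAME_PATTERNS.items():
            if pattern.search(column):
                categories.add(category)
        # Method 2: the sampled values look sensitive, whatever the name.
        for category, pattern in VALUE_PATTERNS.items():
            hits = sum(1 for value in samples if pattern.search(value))
            if samples and hits / len(samples) >= min_hit_ratio:
                categories.add(category)
        if categories:
            findings[column] = categories
    return findings


if __name__ == "__main__":
    sample = {
        "CUST_EMAIL": ["ana@example.com", "ivo@example.org"],
        "REMARKS": ["call back", "send to marko@example.com", "contact: iva@example.net"],
        "AMOUNT": ["10.50", "99.00"],
    }
    print(discover(sample))
```

The value-based check is what catches sensitive data in poorly named columns such as free-text remarks, where the column name alone gives no hint.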
Fast test data provisioning

Depending on the business scenario, test data may be needed on a daily or even hourly basis. In a complex business application environment, demand for relevant, just-in-time test data will be very high. This can incur significant costs for hardware and for the trained staff responsible for providing test data to testing teams.

BizDataX is designed from the ground up to enable fast design, deployment and execution of test data generation workflows. You can create complex test data scenarios using BizDataX Designer, deploy them in BizDataX Runtime and generate billions of test data records, all within hours. Once created, test data generation jobs can be executed an arbitrary number of times.

Special care is taken to speed up job execution. The BizDataX Runtime engine analyzes declarative rules to determine the optimal execution plan, generates statements, and achieves the best possible performance through parallel execution and paging of large record sets.

Figure: BizDataX Designer and Runtime (diagram labels: Designer, Management Console, Runtime hosts, flat file, XML, test databases)
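The combination of paging and parallel execution mentioned above can be illustrated with a small sketch. It does not reflect how the BizDataX Runtime engine is actually built; the function names, the page size and the trivial uppercase "masking rule" are assumptions that stand in for real work.

```python
from concurrent.futures import ThreadPoolExecutor


def pages(total_rows: int, page_size: int):
    """Yield (offset, limit) pairs that cover all rows without overlap."""
    for offset in range(0, total_rows, page_size):
        yield offset, min(page_size, total_rows - offset)


def process_page(source: list[str], offset: int, limit: int) -> list[str]:
    """Stand-in for applying a masking rule to one page of records."""
    return [value.upper() for value in source[offset:offset + limit]]


def run_job(source: list[str], page_size: int = 3, workers: int = 4) -> list[str]:
    """Process the record set page by page on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(process_page, source, offset, limit)
            for offset, limit in pages(len(source), page_size)
        ]
        result: list[str] = []
        # Collect in submission order so the original row order is kept.
        for future in futures:
            result.extend(future.result())
    return result


if __name__ == "__main__":
    print(run_job([f"row{i}" for i in range(10)]))
```

Paging bounds the memory needed per unit of work, and independent pages are what make parallel execution possible in the first place.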
Powerful and flexible test data rules designer

Every TDM solution needs an intuitive and efficient way to design rules for test data generation. This is where most solutions today fail. To cover all real-world scenarios, a rules design engine cannot be limited to a fixed set of predefined rules for a specific industry, nor can it rely only on scripting languages or, even worse (because of training costs), on its own proprietary language as an extensibility point. Both user interface (UI) and user experience (UX) also play a very important role when choosing the right solution.

BizDataX Designer integrates with the Microsoft Visual Studio environment and Workflow Editor. Wizards and visual cues support the process of defining the rules. Drag-and-drop helps set up the common parameters, and the properties window is there to help tweak the details. The rules are designed visually using domain-specific terminology, so one doesn't have to think about tables, views, raw SQL, loops, cursors, transactions and the like. BizDataX naturally extends to the .NET platform, offering full support for programming languages such as C# and JavaScript.

By leveraging Microsoft's world-class development platform, BizDataX speeds up the development of test data generation workflows, shortens the learning curve and cuts the overall cost of test data management.

Figure: BizDataX Designer
Support for key test data generation concepts like data masking, data subsetting and synthetic data generation

BizDataX enables provisioning of test data by combining feature-rich data masking, data subsetting and synthetic data generation capabilities. Whether masking real data, generating data from scratch or combining the two, the system uses built-in:

- Real data
- Masked data
- Lists of replacement values: person names with country/region and gender attributes, places, postal codes, streets, banks
- National identification number generators (SSN, AHV, OIB, ...)
- Generators of financial values: credit card numbers, account numbers, IBANs
- Generators of phone numbers and emails
- Data shuffling engine
- Templates with placeholders used to populate free text fields
- Formulas to shift dates
- Conditional constructs to handle special cases
- Distributions to generate data with certain statistical properties
- and more

Many day-to-day testing processes work well with very small subsets of originally huge sets of records. Smaller databases lower the investment in hardware and software licenses needed to build a parallel infrastructure.
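Several of the building blocks listed above (replacement lists, date shifting, shuffling and templates with placeholders) can be sketched in a few lines. The snippet is a generic illustration rather than BizDataX functionality; the function names and the sample replacement list are hypothetical.

```python
import hashlib
import random
from datetime import date, timedelta

# Hypothetical list of replacement values.
REPLACEMENT_NAMES = ["Ana", "Ivan", "Marta", "Luka", "Petra"]


def replace_from_list(value: str, replacements: list[str]) -> str:
    """Pick a replacement deterministically: same input, same output."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(replacements)
    return replacements[index]


def shift_date(original: date, days: int) -> date:
    """Shift a date by a fixed number of days (negative moves it back)."""
    return original + timedelta(days=days)


def shuffle_column(values: list[str], rng: random.Random) -> list[str]:
    """Return the same values in a different order, leaving the input intact."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    return shuffled


def fill_template(template: str, **fields: str) -> str:
    """Populate a free-text field from a template with placeholders."""
    return template.format(**fields)


if __name__ == "__main__":
    rng = random.Random(7)
    print(replace_from_list("John", REPLACEMENT_NAMES))
    print(shift_date(date(2020, 2, 28), 2))
    print(shuffle_column(["Zagreb", "Split", "Rijeka"], rng))
    print(fill_template("Customer {name} called on {day}.", name="Ana", day="Monday"))
```

Shuffling keeps the overall value distribution of a column while breaking the link between a value and its original record, whereas list replacement removes the original values altogether.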
Support for a variety of source and target data repositories

In many TDM scenarios, access to production data is needed to generate test data. Production data is typically stored in many different data storage systems, and accessing data across them with a home-grown, script-based approach can be very challenging. Support for different data repositories is a must for a good TDM solution.

BizDataX can connect to a wide variety of data sources, including relational and legacy databases. It can also connect to flat files, Excel files, MS SQL Analysis Services projects and XML. In addition to connecting directly to different data storage systems, BizDataX offers the option of transferring data to an intermediate database in order to separate the core test data generation process from ETL. The resulting test database can be created in a variety of database formats. Additionally, BizDataX preserves referential data integrity across database and system boundaries.

Figure: BizDataX workflow example with access to Oracle, DB2 and SQL Server databases
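One common way to keep references intact when keys are masked in several systems is deterministic, keyed pseudonymization: the same input always yields the same masked value, so joins still match after masking. The sketch below shows that general technique and is not a description of how BizDataX does it; the secret, the column names and the two in-memory "systems" are assumptions.

```python
import hashlib
import hmac


def mask_key(value: str, secret: bytes) -> str:
    """Map a key to a stable pseudonym using HMAC-SHA256."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


def mask_rows(rows: list[dict], key_column: str, secret: bytes) -> list[dict]:
    """Return copies of the rows with the key column replaced by its pseudonym."""
    return [{**row, key_column: mask_key(row[key_column], secret)} for row in rows]


if __name__ == "__main__":
    secret = b"demo-secret"  # hypothetical; a real secret would be stored securely

    # Two tables that live in different (hypothetical) systems.
    customers = [{"customer_id": "C-1001"}, {"customer_id": "C-1002"}]
    orders = [
        {"order_id": 1, "customer_id": "C-1001"},
        {"order_id": 2, "customer_id": "C-1001"},
        {"order_id": 3, "customer_id": "C-1002"},
    ]

    masked_customers = mask_rows(customers, "customer_id", secret)
    masked_orders = mask_rows(orders, "customer_id", secret)

    # Every masked order still points at an existing masked customer.
    known_ids = {row["customer_id"] for row in masked_customers}
    print(all(row["customer_id"] in known_ids for row in masked_orders))
```

Because the mapping depends only on the value and the secret, the two systems can be masked independently, at different times, and still agree on every key.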
Support for data analysis

A TDM solution must be able to analyze production data in order to define test data generation rules accurately according to the business rules. Thorough data analysis is the foundation of a successful and efficient test process. It enables identification of the relevant data needed to complete test cases, thus saving time and increasing the quality of the process. It also helps optimize test data volumes for easier database management and lower hardware costs.

BizDataX can connect to a wide variety of data sources, importing their schemas and enabling production metadata analysis. BizDataX also enables the definition of criteria for identifying and grouping equivalent records and supports equivalence partitioning. Data record groups can be analyzed to identify special cases and achieve 100% test case coverage. The criteria are then used by the sampling and subsetting engine to extract the minimal relevant subset of the original data, or by generators to create synthetic data that targets specific test scenarios.

Figure: Supporting equivalence partitioning
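Equivalence partitioning as described above can be sketched as grouping records by a criterion and then taking a few representatives from each group. This is a generic illustration, not the BizDataX subsetting engine; the `account_class` criterion and the record fields are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Hashable


def partition(records: list[dict], criterion: Callable[[dict], Hashable]) -> dict:
    """Group records into equivalence classes defined by the criterion."""
    groups = defaultdict(list)
    for record in records:
        groups[criterion(record)].append(record)
    return dict(groups)


def minimal_subset(groups: dict, per_group: int = 1) -> list[dict]:
    """Take up to `per_group` representatives from every equivalence class."""
    return [record for members in groups.values() for record in members[:per_group]]


def account_class(record: dict) -> tuple:
    """Hypothetical criterion: account type combined with a balance band."""
    if record["balance"] < 0:
        band = "negative"
    elif record["balance"] == 0:
        band = "zero"
    else:
        band = "positive"
    return (record["type"], band)


if __name__ == "__main__":
    accounts = [
        {"id": 1, "type": "savings", "balance": 100},
        {"id": 2, "type": "savings", "balance": 250},
        {"id": 3, "type": "checking", "balance": -40},
        {"id": 4, "type": "checking", "balance": 0},
        {"id": 5, "type": "checking", "balance": 75},
        {"id": 6, "type": "savings", "balance": 900},
    ]
    groups = partition(accounts, account_class)
    # Classes with a single member are candidates for special-case review.
    print({key: len(members) for key, members in groups.items()})
    print([record["id"] for record in minimal_subset(groups)])
```

One representative per class is the smallest subset that still exercises every class the criterion distinguishes; raising `per_group` trades database size for more variety within each class.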
Test data project configuration management

As the application and the underlying database change, the test data generation rules also need to be updated from time to time. A good TDM solution should enable versioning of test data rules in order to support application upgrades, transparency and reusability, especially when it comes to testing older versions of the same application (which, of course, happens a lot).

BizDataX integrates into the software development lifecycle. With BizDataX, you can check in changes to test data projects just as you would check in changes to the application's source code, so test data rules are always up to date and changes are tracked in your source control system. Versioning of test data rules works with standard technologies such as Microsoft Team Foundation Server, Git, Subversion and many other source code management solutions.

Figure: Test data project tree and test data job execution history
Support for third party test management and test automation tools

Organizations report that they spend between 50 and 75 out of every 100 minutes of manual test execution time on finding and preparing appropriate test data. Yet they are still unable to achieve stable test automation due to a lack of reliable test data control. Test management and test automation solutions can benefit greatly from integration with TDM solutions because, used together efficiently, they can significantly shorten test execution times.

In addition to generating test data, BizDataX can be configured to label data for test cases and to provide data for test automation tools such as Microsoft Test Manager, HP Unified Functional Testing, Tricentis Tosca and imbus TestBench.

Figure: BizDataX integration scenarios (diagram labels: test data, data labeled for test cases, other TM/TA tools)
Enterprise features (scalability, role-based security, ...)

Processing terabytes of data in an enterprise environment requires features such as a role-based security model, the ability to scale test data generation jobs across several hosts and job execution status logging, to name a few. These features come at a price and are not available by default in every TDM solution on the market.

As mentioned in the Fast test data provisioning section, BizDataX Runtime is an enterprise-level application that supports role-based security and other enterprise features and fits well within a complex IT landscape. BizDataX Runtime can be installed on a single system or on multiple systems, virtual or physical, and managed centrally with industry-standard tools such as Microsoft Management Console.

Figure: BizDataX Runtime
Documenting, logging and archiving capabilities

In order to comply with data privacy regulations and corporate policies, and to support transparency and traceability, a TDM solution must support documentation of all steps within a test data project. This documentation should enable external and internal auditors to easily validate policy rules, sensitive data findings, the workflows used to generate test data and the logs of test data generation runs. Once the test data project is over, it should be archived so that it can later be restored to generate historical test data.

BizDataX supports documenting each and every step within a BizDataX project. Examples of reports include:

- The Requirements document provides information on all project-related policies.
- The Sensitive Data Discovery document lists all sensitive data found within the data stores, with implementation hints.
- The Job Execution report shows detailed information about BizDataX job execution status.
- The Workflow Implementation document consists of images of the workflow with implementation details and annotations.

BizDataX test data projects can be versioned, archived and restored in order to recreate historical test data.

Figure: BizDataX documenting options
BizDataX Professional Services

It is the people and the tools that make the difference!

Our Professional Services team will help you get the most out of the test data management process and the BizDataX solution. They will help you set up your test data environment and assess your test data according to your needs, adapt BizDataX to fit your test data usage scenarios and implement new test data algorithms for your specific requirements (e.g. industry-specific data masking rules). They will also train your test professionals to design, deploy and execute BizDataX test data generation jobs on their own.

The professional services portfolio includes:

- Test Data Assessment
- Proof of Concept
- Custom Workflow and Algorithms Implementation
- BizDataX Solution Installation
- BizDataX Workshops
BizDataX is an end-to-end Test Data Management solution designed to enable fast and cost-effective test data provisioning in highly complex IT systems.

The BizDataX promise is to:

- deliver near real test data just in time
- comply with data privacy regulations
- find sensitive data in your data stores
- support implementation of all test data scenarios using visual design tools
- integrate seamlessly with common test management and test automation tools
- increase test process efficiency while lowering costs

Call us: +41 76 579 16 41, +385 1 6312 635
Email: info@bizdatax.com

Ekobit d.o.o.
Koturaška 69
10000 Zagreb, Croatia, EU
www.ekobit.com

BizDataX sales partner, DACH region
aminodata GmbH
Gartenstrasse 23
5400 Baden, Switzerland
www.aminodata.com