IM02: How to manage your Test Data on zEnterprise. 18-20 September 2012, IBM Forum Brussels





Trademarks
This presentation contains trademarked IBM products and technologies. Refer to the following Web site: http://www.ibm.com/legal/copytrade.shtml

Agenda
- Data Governance: focus on test data creation and masking
- Obtaining test data: some common practices
- Obtaining test data: best practices
- InfoSphere Optim Test Data Management / Data Privacy
- Q & A

A Smarter Planet harnesses today's information explosion for business benefit, creating a need for better Information Governance
- Instrumented, Interconnected, Intelligent
- Streamlining processes to manage business growth with consistency
- Ensuring compliance with policies, laws and regulations
- Controlling costs and optimizing infrastructure

Success requires governance across the Information Supply Chain
[Diagram: the Information Supply Chain. Labels: Transactional & Collaborative Applications; External Information Sources; Streaming Information; Manage (Master Data, Data, Content, Big Data); Integrate; Analyze (Data Warehouses, Cubes, Content Analytics, Business Analytics Applications); Govern (Information Governance: Quality, Lifecycle, Security & Privacy, Standards)]

Requirements for managing data across its lifecycle
Information Governance Core Disciplines: Lifecycle Management
- Discover & Define: discover where data resides; classify & define data and relationships; define policies
- Develop & Test: develop & test database structures/code; create & refresh test data; capture & replay production workloads
- Optimize & Archive: enhance performance; manage data growth; report & retrieve archived data
- Consolidate & Retire: integrate into single data source; move only the needed information; enable compliance with retention & e-discovery

Organizations continue to be challenged with building quality applications
- Increasing Costs: defects are caught late in the cycle
- Increasing Risk: mandatory to protect data and comply with regulations
- Time to Market: lack of realistic test data and inadequate environments

Organizations continue to be challenged with building quality applications
Increasing Costs
- $300 billion: annual costs of software-related downtime (d)
- 32%: low success rate for software projects (e)
Increasing Risk
- 45,000+: number of sensitive records exposed to a third party during testing (c)
- 70% of companies use actual customer data to test applications (a)
Time to Market
- 37%: satisfied with speed of software development (f)
- 30-50%: time testing teams spend on setting up test environments, instead of testing (b)
Sources
a. The Ponemon Institute, "The Insecurity of Test Data: The Unseen Crisis"
b. NIST, Planning Report, "The Economic Impacts of Inadequate Infrastructure for Software Testing"
c. Federal Aviation Administration: exposes unprotected test data to a third party, http://fcw.com/articles/2009/02/10/faa-data-breach.aspx
d. The Standish Group, "Comparative Economic Normalization Technology Study", CHAOS Chronicles v12.3.9, June 30, 2008
e. The Standish Group, Chaos Report, April 2009
f. Forrester Research, "Corporate Software Development Fails To Satisfy On Speed Or Quality", 2005

Vulnerable non-production environments at risk
Most ignore security in non-production environments
- 70% of organizations surveyed use live customer data in non-production environments (testing, Q/A, development). Source: Database Trends and Applications, "Ensuring Protection for Sensitive Test Data"
- $194 per record: cost of a data breach. Source: The Ponemon Institute, "2012 Cost of Data Breach Study"
- 50% of organizations surveyed have no way of knowing if data used in test was compromised. Source: The Ponemon Institute, "The Insecurity of Test Data: The Unseen Crisis"
- 52% of surveyed organizations outsource development. Source: The Ponemon Institute, "The Insecurity of Test Data: The Unseen Crisis"

Impact Across the Enterprise: the benefits of TDM strategy
CIO
- Align application performance to business processes
- Ensure business continuity
- Respond quickly and accurately to audit and discovery requests
- Leverage existing investments in applications, databases and storage
- Reduce resource requirements for key IT operations
Business
- Profit from superior application performance and availability
- Provision resources to meet priority business needs
- Automate data retention to support compliance initiatives
- Eliminate budget variances
IT
- Streamline application and database upgrades
- Speed disaster recovery
- Simplify database administration
- Reclaim underutilized capacity
- Protect data privacy, integrity, and security

Test data creation is often accomplished through cloning
[Diagram: Production Database -> Clone -> Test Database]
Positives
- Simple to do
- Requires little knowledge of the data model or infrastructure
- Creates an exact duplication of production
Negatives
- Uses significant storage: much more than the team needs, and often done once and not for each team member
- Data is production ready and therefore a privacy risk
- Takes significant amounts of time to create
- No way to compare to original after test is complete
- Cannot span multiple data sources/applications
- Developer/Tester downtime when sharing data accessibility

The Data Multiplier Effect
- Production: 500 GB
- Development: 500 GB
- Test: 500 GB
- User Acceptance: 500 GB
- Backup: 500 GB
- Disaster Recovery: 500 GB
- Total: 3000 GB
Actual data burden = size of production database + all replicated clones
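The arithmetic behind the multiplier can be checked with a few lines of Python. This is only a worked example using the sizes shown on the slide; the dictionary layout and variable names are illustrative.

```python
# Environment sizes in GB, as listed on the slide (one production database plus five full copies).
copies_gb = {
    "Production": 500,
    "Development": 500,
    "Test": 500,
    "User Acceptance": 500,
    "Backup": 500,
    "Disaster Recovery": 500,
}

production_gb = copies_gb["Production"]
# Every environment other than production is a replicated clone.
clones_gb = sum(size for name, size in copies_gb.items() if name != "Production")

# Actual data burden = size of production database + all replicated clones.
total_gb = production_gb + clones_gb
multiplier = total_gb / production_gb

print(f"Total data burden: {total_gb} GB ({multiplier:.0f}x production)")
```

Each additional full clone adds another 500 GB, so the burden grows linearly with the number of environments.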

Generating synthetic test data
[Diagram: Test Database]
Positives
- Safe
Negatives
- Resource-intensive: huge commitment from the DBA; requires deep knowledge of the database schema
- Tedious: DBAs must intentionally include errors to ensure a robust testing process. Created data does not always reflect the integrity of the original data.
- Time-consuming: the process is slower and can be error-prone

Test data creation by writing SQL
[Diagram: Production Database -> SQL -> Test Database]
Positives
- ?
Negatives
- Write and maintain SQL: complex and subject to change
- Referential integrity? Right data?
- Expensive, dedicated staff
- Cannot span multiple data sources/applications
- Developer/Tester downtime when sharing data accessibility

Test Data Management Concepts
Test Data Management (TDM) refers to the need to manage data used in various pre-production environments and is a vital part of Application Quality & Delivery.
- Extract production data into referentially intact data subsets to be used to support application data in other environments
- De-identify (mask) extracted production data to protect privacy
- Compare before and after images of test data
- Speed application quality and delivery

Test Data Management Best Practices
[Diagram: Production is subset and privatized into test environments. Environment labels: Unit, Function, Integration, Performance, UAT, Production. Size labels: 50 GB, 50 GB, 2 TB, 3 TB, 4 TB.]
- Subset
- Privatize
- Inspect/Browse & Seed Test Cases
- Run Test
- Compare Before and After Results (compare process: Source 1 against Source 2)
- Production errors: Correct Production Errors, Refresh Test Data, and run the test again
- No errors: Promote to Production!

The Test Data Management Process
[Diagram: Gold Copy -> TDM Processes -> Subset and Privatized Test Data, inside a secured lock-down environment]
TDM processes and their inputs:
- Extract & Subset: Subset Criteria
- Convert & Mask: Masking Rules
- Compare & Audit: Source & Target
- Load & Distribute: DB List & Auth

Refresh test data
[Diagram: testers and developers refresh their own test environments]
- Test Database: 50 GB
- Training Database: 75 GB
- Dev Database: 25 GB

Example process
Speed delivery, reduce costs and improve quality while reducing risk and increasing compliance with Test Data Management
Without Test Data Management
- Tester submits request for test data; the request sits in queue for days
- DBA creates test data; takes several days to create
- Tester uses test data in testing and requests a data refresh; the refresh sits in queue for days and takes several days to create
With Test Data Management
- DBA creates or refreshes test data; takes hours to create
- Tester uses test data in testing and refreshes test data

Agile development relies on agile testing; agile testing relies on continuous access to test data
- Organization: an agile test organization and testers need continuous access to test data
- Process: with agile development you have continuous integration and delivery and have to test often
- Technology: test data management software needs to support the agile method by streamlining access to test data and having insight into test data

Enterprise Data Governance for System z
Goals
- Reduce risk from security breaches
- Comply with regulatory compliance requirements
- Protect sensitive customer data and employee data
Data Governance disciplines and products
- Manage Data Lifecycle (Data Retention, Data Retirement): Optim Data Growth Solution. Archive inactive data and reduce the amount of data exposed and requiring protection.
- Secure (Prevent Access, Restrict Access, Monitor Access): DB2/RACF Security, Tivoli zSecure Audit
- Audit (Audit Privileges, Audit Users, Audit Access): Guardium for z
- Protect & Privacy (Mask Data, Encrypt Data): Data Encryption for IMS / DB2, Optim TDM and DP
IBM is the only solution provider with an end-to-end comprehensive solution

IBM InfoSphere Optim supports the heterogeneous enterprise
[Diagram labels: Discover; Manage Test Data; Capture & Replay; Archive; Partner-delivered Solutions]
A single, scalable, heterogeneous information lifecycle management solution provides a central point to deploy policies to extract, archive, subset, and protect application data records from creation to deletion

OPTIM Server / Repository
[Architecture diagram]
- Data sources: IMS (native access), VSAM / SEQ files (native access), DB2 (DB2 access). Example data: Orders, Products, Customers, Payments, Employee, Payroll
- Optim Directory
- Workstation (ISPF)
- Server: Repository Services, Data Access Services, Subsetting Services, Archiving Services, Data Privacy Services, Open Data Management, Security
- Storage: Metadata, Data, Index, Artifacts; storage-independent Extract & Archive files; archive access via ODBC/JDBC
ISPF workbench: software running under z/OS, used to design, test and deploy projects to the OPTIM Server. The ISPF workbench software enables both online and batch (JCL) execution.

OPTIM Server / Repository
[Architecture diagram as on the previous slide, with callouts]
- IMS native access: utilizes IMS-provided drivers to access data. Metadata is captured via copybook imports (COBOL or PL/1).
- VSAM native access: utilizes VSAM native access to access data. Metadata is captured via copybook imports (COBOL or PL/1).
- DB2 access: utilizes SQL to access data. Provides a process to capture metadata from the DB2 catalog.
- Repository Services: store and retrieve metadata information, project information, and the archive catalog in the Optim Directory
- Data Access Services: access source or destination databases via drivers specific to each file type
- Subsetting Services: extract and restore relationally intact Business Objects across multiple DB2 databases, IMS and VSAM (e.g. Orders from DB2, Customers from VSAM, and Credit Cards from IMS)
- Archiving Services: store and retrieve, restore, delete, and compress data, metadata and artifacts (e.g. external documents as BLOBs)
- Data Privacy Services: consistently and predictably mask and propagate data related to Business Objects for the purpose of test data management with data compliance
- Open Data Management (ODM): enable relational access to archived data via ODBC/JDBC and SQL-92. Can be used in conjunction with the remote database services of ODM to integrate with Enterprise Data Access for Business Intelligence.
- Security: provide functional and object security to separate product and data access by role and responsibilities using RACF

Optim captures the complete Business Object
- Business Object: represents an application data record (payment, invoice, customer)
- Referentially-intact subset of data across related tables and applications; includes metadata, DDL, and reference + transaction data
- Benefit: referential integrity. Ensure data is captured and masked consistently.
- Application view: application-level business rules for data relationships
- DBA view: referentially-intact subset of data; related LUW files or documents
- Complete data, RI preserved; OS independent; DB independent; ODBC accessible
- Federated access to data and metadata across IMS, DB2 and VSAM

Test Data Management
IBM InfoSphere Optim Test Data Management Solution: create right-size production-like environments for application testing
[Diagram: 2 TB Production or Production Clone -> Subset, Mask, Compare, Refresh -> 25 GB Unit Test, 100 GB Integration Test, 25 GB Development, 50 GB Training]
InfoSphere Optim TDM supports data on distributed platforms (LUW) and z/OS, with out-of-the-box subset support for packaged ERP/CRM applications as well as others.
Requirements
- Create referentially intact, right-sized test databases
- Automate test result comparisons to identify hidden errors
- Protect confidential data used in test, training & development
- Shorten iterative testing cycles and accelerate time to market
Benefits
- Deploy new functionality more quickly and with improved quality
- Easily refresh & maintain test environments
- Protect sensitive information from misuse & fraud with data masking
- Accelerate delivery of test data through refresh

InfoSphere Optim Test Data Management: standard methodology
Optim fits with your testing methodology
- Steps: Extract, Subset, Privatize, Load, Edit, Compare
- Automate all or part of the process
[Diagram: Production -> extract / privatize production data -> Extract file -> Load test database -> Edit data -> Test application -> Compare results -> Results? Unsuccessful: Refresh and Retest. Success: Deploy Application.]
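The "Compare results" step can be pictured as a diff between a before image and an after image of the test data. The following is a minimal Python sketch under the assumption that each image is a dictionary keyed by primary key; the function name compare_images and the sample rows are hypothetical and do not represent Optim's Compare process.

```python
def compare_images(before, after):
    """Compare two {primary_key: row} snapshots and classify the differences."""
    inserted = {k: after[k] for k in after.keys() - before.keys()}
    deleted = {k: before[k] for k in before.keys() - after.keys()}
    updated = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"inserted": inserted, "deleted": deleted, "updated": updated}


# Hypothetical before/after images of a small ORDERS table: {order_id: (customer, amount)}.
before = {1: ("CUST-A", 100), 2: ("CUST-B", 50)}
after = {1: ("CUST-A", 120), 3: ("CUST-C", 10)}

diff = compare_images(before, after)
for kind, rows in diff.items():
    print(kind, rows)
```

An empty result in all three categories would mean the test run left the data unchanged; unexpected entries point to changes the test case did not intend.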

Enterprise Testing Solution with Rational and InfoSphere Optim
Building better quality applications
- Comprehensive software quality process to minimize cost and shorten development cycles
- Manage test labs
- Create realistic test environments from production data
- Ensure protection of sensitive data
- Manage unit, functional and performance testing and quality test cases
- Streamline your test data management processes and deliver your project sooner and with fewer defects
[Diagram: Design & Manage Test Campaign -> Initiate Data Extract Scripts -> Subset & Mask Production Data for Testing -> Refresh Masked Test Data -> Browse & Edit Test Data -> Execute Automated Test Routines -> Compare Before & After Data -> Fail: repeat, otherwise Go Production! Several steps are labeled InfoSphere Optim.]

The process: Access Definition
Entries of type "Legacy" are VSAM files, sequential files or IMS segments

Types of Relationships
- DB2: relationship defined in DB2
- OPT: relationship defined with Optim

Selection Criteria
Only records where State = GA

Extract Process
[Diagram: PRODDB (CUSTOMERS, ORDERS, DETAILS) -> EXTRACT (Point & Shoot) -> Extract File and Process Report]
- Extract from source tables
- Extract data and/or object definitions
- Use BROWSE to verify extracted data
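To illustrate what a referentially intact extract means, here is a minimal Python sketch using the standard sqlite3 module. It applies the selection criterion State = GA to CUSTOMERS and follows the relationships down to ORDERS and DETAILS. The schema, column names, sample rows, and the extract_subset function are assumptions made for this illustration; it is not how Optim implements the Extract process.

```python
import sqlite3


def extract_subset(conn, state):
    """Return selected customers plus only the orders and details that belong to them."""
    cur = conn.cursor()
    customers = cur.execute(
        "SELECT cust_id, name, state FROM customers WHERE state = ? ORDER BY cust_id",
        (state,),
    ).fetchall()
    orders = cur.execute(
        "SELECT o.order_id, o.cust_id FROM orders o "
        "JOIN customers c ON c.cust_id = o.cust_id "
        "WHERE c.state = ? ORDER BY o.order_id",
        (state,),
    ).fetchall()
    details = cur.execute(
        "SELECT d.detail_id, d.order_id, d.item FROM details d "
        "JOIN orders o ON o.order_id = d.order_id "
        "JOIN customers c ON c.cust_id = o.cust_id "
        "WHERE c.state = ? ORDER BY d.detail_id",
        (state,),
    ).fetchall()
    return {"customers": customers, "orders": orders, "details": details}


# Hypothetical stand-in for PRODDB, held in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, name TEXT, state TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                     cust_id INTEGER REFERENCES customers(cust_id));
CREATE TABLE details (detail_id INTEGER PRIMARY KEY,
                      order_id INTEGER REFERENCES orders(order_id), item TEXT);
INSERT INTO customers VALUES (1, 'Customer A', 'GA'), (2, 'Customer B', 'TX');
INSERT INTO orders VALUES (10, 1), (11, 1), (20, 2);
INSERT INTO details VALUES (100, 10, 'bolt'), (101, 11, 'nut'), (200, 20, 'gear');
""")

subset = extract_subset(conn, "GA")
for table, rows in subset.items():
    print(table, rows)
```

Every extracted order points at an extracted customer and every detail row at an extracted order, so the subset can be loaded into a test database without orphaned rows.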

The Extract Report

Browse extract file with join

Insert Process: Populate Destination Tables
Table Map
- Table names need not match
- Change qualifier and/or table name
- Can be saved in PST Directory

Insert Process: Populate Destination Tables
Column Map
- Map unlike column names
- Transform/mask sensitive data
- Datatype conversions
- Column-level date aging
- Literals
- Special Registers
- Expressions
- Default Values
- User exits
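The roles of a table map and a column map can be sketched in Python as two dictionaries applied while populating the destination. All table names, column names, and the apply_maps function below are hypothetical; the sketch shows the idea of mapping and transforming during insert, not Optim's map syntax.

```python
from datetime import date, timedelta

# Hypothetical table map: source table name -> destination table name (qualifier and name differ).
table_map = {"PROD.CUSTOMERS": "TEST.CUST"}

# Hypothetical column map: destination column -> function of the source row.
column_map = {
    "CUST_NO": lambda row: row["CUSTOMER_ID"],                  # unlike column names
    "NAME": lambda row: "TEST CUSTOMER",                        # literal replaces a sensitive value
    "BALANCE": lambda row: float(row["BALANCE"]),               # datatype conversion (text to number)
    "SIGNUP": lambda row: row["SIGNUP"] + timedelta(days=365),  # date aging by 365 days
    "REGION": lambda row: row.get("REGION") or "UNKNOWN",       # default value when source is empty
}


def apply_maps(source_table, rows):
    """Resolve the destination table and build destination rows column by column."""
    dest_table = table_map[source_table]
    dest_rows = [{col: fn(row) for col, fn in column_map.items()} for row in rows]
    return dest_table, dest_rows


source_rows = [
    {"CUSTOMER_ID": 7, "NAME": "Erica Schafer", "BALANCE": "12.50",
     "SIGNUP": date(2010, 1, 15), "REGION": None},
]
dest_table, dest_rows = apply_maps("PROD.CUSTOMERS", source_rows)
print(dest_table, dest_rows)
```

Keeping the maps as data rather than code is what allows them to be saved and reused across inserts.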

Edit / Browse: Traditional vs. Relational Tools
Single Table Editors
- One table/view at a time
- No edit of related data from multiple tables
The Relational Editor
- Simultaneous browse/edit of related data from multiple tables
[Screens: single-table editors show CUSTOMER, ORDERS and DETAILS separately; the relational editor shows CUSTOMERS, ORDERS and DETAILS together]

Editing Data
Edit data to:
- Insert rows
- Delete rows
- Update rows

Relationally Joined Data
- Browse or edit related rows
- Scrolling a higher-level table automatically synchronizes all lower-joined tables

Commit/Restore
- Commits are automatically made to the database when you move your pointer to a different row
- Each instance of a commit counts as an undo level
- Restore changes to a row, table or fetch set

Backing Out Changes - Row Level
- Undo: removes the last change made to the current row
- Undo (row list): brings up a row list and lets you select how far back you want to restore the current row
- Undo All: removes all changes made to the current row since the last fetch
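The three row-level options can be modeled as operations on a per-row version history, where each commit adds one undo level. The class below is a hypothetical Python sketch of that idea; the names RowHistory, commit, undo, undo_to and undo_all are invented for illustration and do not describe the editor's internals.

```python
class RowHistory:
    """Keeps every committed version of one row so changes can be backed out."""

    def __init__(self, fetched):
        # versions[0] is the row as it was at the last fetch.
        self.versions = [dict(fetched)]

    @property
    def current(self):
        return self.versions[-1]

    def commit(self, **changes):
        # Each commit counts as one undo level.
        self.versions.append({**self.current, **changes})

    def undo(self):
        # Remove the last change made to the row, if there is one.
        if len(self.versions) > 1:
            self.versions.pop()

    def undo_to(self, level):
        # Restore the row to a chosen undo level (0 = as fetched).
        del self.versions[level + 1:]

    def undo_all(self):
        # Remove all changes made since the last fetch.
        del self.versions[1:]


row = RowHistory({"cust_id": 1, "city": "Austin", "state": "TX"})
row.commit(city="Elgin")
row.commit(state="IL")
row.commit(city="Chicago")

row.undo()
after_undo = dict(row.current)      # back to city Elgin, state IL

row.undo_to(1)
after_undo_to = dict(row.current)   # back to the first commit only

row.undo_all()
after_undo_all = dict(row.current)  # back to the fetched row
```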

Challenges of Enterprise Data Privacy
- Multi-platforms
- Relational database applications in the enterprise
- Complex data model
- Multiple databases
- Legacy data components
- Interconnected applications
- Distributed work teams
- Employees and contractors
- Global 24 x 7 operations

What is data masking?
- Definition: a method for creating a structurally similar but inauthentic version of an organization's data. The purpose is to protect the actual data while having a functional substitute for occasions when the real data is not required.
- Requirement: effective data masking requires data to be altered in such a way that the actual values cannot be determined or re-engineered, while functional appearance is maintained.
- Other terms used: obfuscation, scrambling, data de-identification
- Commonly masked data types: name, address, telephone, SSN/national identity number, credit card number
- Methods:
  - Static masking: extracts rows from production databases, obfuscating data values that ultimately get stored in the columns of the test databases
  - Dynamic masking: masks specific data elements on the fly without touching applications or the physical production data store

Statically mask data in non-production databases

Field        Original           Statically masked
Patient No   123456             112233
SSN          333-22-4444        123-45-6789
Name         Erica Schafer      Amanda Winters
Address      12 Murray Court    40 Bayberry Drive
City         Austin             Elgin
State        TX                 IL
Zip          78704              60123

- Mask data in non-production databases such as test and development
- Improve security of non-production environments
- Facilitate faster testing processes with accurate test data
- Support referential integrity
- Mask custom and packaged ERP/CRM applications

Optim Data Privacy Solution
[Diagram: Production (VSAM, IMS, DB2) -> contextual, application-aware, persistent data masking -> Test (DB2, IMS, VSAM)]
- Substitute confidential information with fictionalized data
- Deploy multiple masking algorithms
- Provide consistency across environments and iterations
- Enable off-shore testing
- Protect private data in non-production environments

Consistent mapping across the enterprise
[Diagram: Client Billing Application with data in IMS and DB2]

Source   Original SSN   Masked SSN
IMS      157342266      134235489
IMS      132009824      323457245
DB2      157342266      134235489
DB2      132009824      323457245

Data is masked, and masked fields are consistent across sources.
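One common way to obtain this kind of consistency is to derive the masked value deterministically from the original value and a secret key, so that the same input always yields the same output regardless of which data source it came from. The Python sketch below uses a keyed hash (HMAC-SHA256) for this. It is an illustration of the principle only: the slides do not say which algorithm Optim uses, and the key and the function name mask_ssn are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be managed outside the code.
SECRET_KEY = b"demo-key-not-for-production"


def mask_ssn(ssn: str) -> str:
    """Map an SSN to a 9-digit replacement that is identical for identical inputs."""
    digest = hmac.new(SECRET_KEY, ssn.encode("utf-8"), hashlib.sha256).hexdigest()
    # Reduce the keyed hash to nine decimal digits, keeping leading zeros.
    return f"{int(digest, 16) % 10**9:09d}"


ims_ssns = ["157342266", "132009824"]
db2_ssns = ["157342266", "132009824"]

masked_ims = [mask_ssn(s) for s in ims_ssns]
masked_db2 = [mask_ssn(s) for s in db2_ssns]

# The same source value is masked identically in both data stores.
print(masked_ims == masked_db2)
```

Reducing the hash to nine digits means two different SSNs could occasionally map to the same masked value; a lookup table of assigned values would be needed where uniqueness matters.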

De-identify test data
[Diagram: Production Data -> Extract and Convert -> Masked Test Data]
When to mask:
- During the Extract process, or
- In a standalone Convert process, or
- During the Insert/Load process
Transform or replace sensitive data using:
- Standard mapping rules: Literals, Special Registers, Expressions, Default Values, Look-up tables
- Complex mapping rules: User exits

Optim Data Privacy in Application Testing
[Diagram: production tables CUSTOMERS, ORDERS, DETAILS -> Extract File -> transform / mask sensitive data -> Create and INSERT/UPDATE into NewDB and TESTDB (CUST, ORD, DETL), or Load Files -> LOAD into QADB (CUST, ORD, DETL)]
- Extract a relationally intact subset from production database(s)
- Extract data and/or object definitions
- Define a new set of test tables
- Apply masking during population process
- Extract file may be reused but contains un-masked data
- Good practice for testing masks

Optim Data Privacy in Application Testing
[Diagram: production tables CUSTOMERS, ORDERS, DETAILS -> Extract File -> transform / mask sensitive data -> Masked Extract File -> Create and INSERT/UPDATE into NewDB and TESTDB (CUST, ORD, DETL), or Load Files -> LOAD into QADB (CUST, ORD, DETL)]
- Extract a relationally intact subset from production database(s)
- Extract data and/or object definitions in pre-masked file
- Use pre-masked Extract file to create new set of tables
- Convert pre-masked extract file data into second masked extract file
- Share masked extract file to be reused for population step
- Good practice for testing masks using COMPARE

Optim Data Privacy in Application Testing
[Diagram: only users authorized to see private data access the production database(s). A relationally intact subset (CUSTOMERS, ORDERS, DETAILS) is extracted and transformed/masked in the same step into an Extract File, which is then used to INSERT/UPDATE into TESTDB (CUST, ORD, DETL) or to build Load Files that LOAD into QADB (CUST, ORD, DETL).]
- Most secure approach
- Extract data only
- Convert during extract
- The extract file already contains masked data
- Can be shared with testers to reuse

Transformation Techniques
- String literal values
- Character substrings
- Random or sequential numbers
- Arithmetic expressions
- Concatenated expressions
- Date aging
- Lookup values
- Intelligence
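A few of these techniques are small enough to show directly. The Python sketch below illustrates date aging, character substrings, sequential numbers and concatenation; the function names and the padding character are assumptions for illustration and do not correspond to any Optim function.

```python
from datetime import date, timedelta
from itertools import count

def age_date(value, days):
    """Date aging: shift a date by a fixed number of days."""
    return value + timedelta(days=days)

def keep_prefix(value, length, fill="X"):
    """Character substring: keep the first `length` characters, pad the rest."""
    return value[:length] + fill * max(len(value) - length, 0)

def sequential_ids(start):
    """Sequential numbers: an endless iterator starting at `start`."""
    return count(start)

ids = sequential_ids(10000)
aged = age_date(date(2004, 6, 20), 365)
# Concatenated expression: masked name prefix plus a sequential number.
label = keep_prefix("Bennett", 1) + "-" + str(next(ids))
```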

IBM InfoSphere Optim Data Masking
De-identify sensitive information with realistic but fictional data
[Figure: JASON MICHAELS and ROBERT SMITH. Personal identifiable information is masked with realistic but fictional data.]
Requirements
- Protect confidential data used in test, training & development systems
- Mask data on screen in applications
- Implement proven data masking techniques
- Support compliance with privacy regulations
- Solution supports custom & packaged ERP applications
Benefits
- Protect sensitive information from misuse and fraud
- Prevent data breaches and associated fines
- Achieve better information governance

Contextually accurate masked data facilitates business processes
Goals:
- Satisfy privacy regulations
- Reduce risk of data breaches
- Maintain value of test data
Masking techniques:
- String literal values
- Character substrings & concatenation
- Random or sequential numbers
- Arithmetic expressions
- Lookup values
- Business data types (CCN, NID)
- Generic mask
- Dates
- User defined

Patient Information (the same record shown in two panels, before and after masking)
Field        First panel        Second panel
Patient No.  112233             123456
SSN          123-45-6789        333-22-4444
Name         Amanda Winters     Erica Schafer
Address      40 Bayberry Drive  12 Murray Court
City         Elgin              Austin
State        IL                 TX
Zip          60123              78704
Data is masked with contextually correct data to preserve integrity of data.

Personal Info Table
Original                         Masked
PersNbr  FirstName  LastName     PersNbr  FirstName  LastName
08054    Alice      Bennett      10000    Jeanne     Renoir
19101    Carl       Davis        10001    Claude     Monet
27645    Elliot     Flynn        10002    Pablo      Picasso
Referential integrity is maintained with key propagation.

Event Table
Original                           Masked
PersNbr  FstNEvtOwn  LstNEvtOwn    PersNbr  FstNEvtOwn  LstNEvtOwn
27645    Elliot      Flynn         10002    Pablo       Picasso
27645    Elliot      Flynn         10002    Pablo       Picasso

Street Address/City/State/Zip Code Data Sets

1) Client is a bank that wishes to mask its assets by location.

Original table
Total Assets    Customers  Street           City       State  Zip Code
$534,674,233    54,999     12 Buttercup Ln  Cleveland  OH     44101
$8,777,733,811  105,333    6767 Rte 10 S    Princeton  NJ     08594

2) Optim provides corresponding Street Address/City/State/Zip Codes for masking.

Address Lookup Table
Street           City         State  Zip Code
288 Helm St      Milwaukee    WI     53201
12 Roden Dr      Los Angeles  CA     90001
3526 Diamond Rd  Seattle      WA     98101
12 Street Road   Las Vegas    NV     89101
2 Applegarth Ln  Brunswick    ME     04011

3) Leverage Multiple Column Replacement. The entire address row can be masked with a valid Coding Accuracy Support System (CASS) address using the enhanced random lookup function.

New Table with Masked Data
Total Assets    Customers  Street           City       State  Zip Code
$534,674,233    54,999     3526 Diamond Rd  Seattle    WA     98101
$8,777,733,811  105,333    21 Street Rd     Las Vegas  NV     89101
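Multiple column replacement means that street, city, state and zip code are taken together from one lookup row, so the masked address stays internally consistent. A minimal Python sketch of that behaviour follows, using the lookup rows from the slide; `mask_address` and the lowercase column names are assumptions for illustration, not the Optim random lookup function.

```python
import random

# Lookup rows as shown on the slide.
ADDRESS_LOOKUP = [
    ("288 Helm St", "Milwaukee", "WI", "53201"),
    ("12 Roden Dr", "Los Angeles", "CA", "90001"),
    ("3526 Diamond Rd", "Seattle", "WA", "98101"),
    ("12 Street Road", "Las Vegas", "NV", "89101"),
    ("2 Applegarth Ln", "Brunswick", "ME", "04011"),
]
ADDRESS_COLUMNS = ("street", "city", "state", "zip")

def mask_address(row, rng):
    """Replace all four address columns with one randomly chosen lookup row."""
    replacement = rng.choice(ADDRESS_LOOKUP)
    masked = dict(row)
    masked.update(zip(ADDRESS_COLUMNS, replacement))
    return masked

bank_row = {"total_assets": "$534,674,233", "customers": "54,999",
            "street": "12 Buttercup Ln", "city": "Cleveland",
            "state": "OH", "zip": "44101"}
masked_bank_row = mask_address(bank_row, random.Random(0))
```

Picking each column independently could pair Seattle with a Las Vegas zip code; drawing one whole row avoids that.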

First Names and Last Names Data Sets

1) Client is a university that wishes to mask the first and last name fields in its admissions database.

Production Database
First Name  Last Name  GPA  High School  Advisor  State
John        Smith      3.2  Princeton    Johnson  NJ
Kate        Jones      2.7  Albany       Kline    NY

2) Optim now has a first name lookup table with over 5,000 male/female names and a last name lookup table with over 80,000 names.

First Name Lookup Table: Bob, Danielle, Dave, Stacey, Paul
Last Name Lookup Table: Newton, Nelson, Kline, Howell, Reese

3) Use the lookup tables to randomly replace first and last names.

Test Database
First Name  Last Name  GPA  High School  Advisor  State
Dave        Nelson     3.2  Princeton    Johnson  NJ
Stacey      Reese      2.7  Albany       Kline    NY

Data Privacy Transformation Library Functions
- TRANS SSN: Generates a valid and unique U.S. Social Security Number (SSN). By default, it algorithmically generates a consistently altered destination SSN based on the source SSN. It can also generate a random SSN when the source data has no SSN value or when the source SSN does not need to be transformed consistently.
- TRANS CCN: Generates a valid and unique credit card number (CCN). By default, it randomizes the entire string; it can also randomize only parts of the number (for example, preserving the card type).
- TRANS EML: Generates a random e-mail address. An e-mail address consists of two parts, a user name followed by a domain name, separated by @. For example, user@domain.com.
[Figure: JASON MICHAELS and ROBERT SMITH]
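To make "valid" and "consistently altered" concrete for a credit card number, the Python sketch below keeps a leading prefix (so the card type is preserved), derives the remaining digits from a keyed hash of the source value, and recomputes the Luhn check digit. This is an illustration of the idea only, not the algorithm behind TRANS CCN; `mask_ccn`, its `secret` parameter and the six-digit prefix are assumptions, and the sketch does not guarantee uniqueness.

```python
import hashlib

def luhn_check_digit(payload):
    """Compute the Luhn check digit for a string of digits."""
    total = 0
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:  # doubled once the check digit is appended
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return str((10 - total % 10) % 10)

def is_luhn_valid(number):
    """True when the last digit matches the Luhn check digit of the rest."""
    return luhn_check_digit(number[:-1]) == number[-1]

def mask_ccn(source, keep_prefix=6, secret=b"demo-key"):
    """Same input always yields the same masked, Luhn-valid number."""
    digest = hashlib.sha256(secret + source.encode()).hexdigest()
    body_length = len(source) - keep_prefix - 1
    body = "".join(str(int(h, 16) % 10) for h in digest[:body_length])
    payload = source[:keep_prefix] + body
    return payload + luhn_check_digit(payload)

masked_ccn = mask_ccn("4111111111111111")
```

Because the output depends only on the source value and the secret, the same card number masks to the same result in every table, which is what keeps joins working after masking.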

Without Key Propagation

Original Data
Customers Table
Cust ID  Name           Street
08054    Alice Bennett  2 Park Blvd
19101    Carl Davis     258 Main
27645    Elliot Flynn   96 Avenue
Orders Table
Cust ID  Item #   Order Date
27645    80-2382  20 June 2004
27645    86-4538  10 October 2005

Masked Without Key Propagation
Customers Table
Cust ID  Name           Street
10000    Auguste Smith  Mars23
10001    Claude Jones   Venus24
10002    Pablo Adams    Saturn25
Orders Table (now these are orphans!)
Cust ID  Item #   Order Date
27645    80-2382  20 June 2004
27645    86-4538  10 October 2005

Masking with Key Propagation

Original Data
Customers Table
Cust ID  Name           Street
08054    Alice Bennett  2 Park Blvd
19101    Carl Davis     258 Main
27645    Elliot Flynn   96 Avenue
Orders Table
Cust ID  Item #   Order Date
27645    80-2382  20 June 2004
27645    86-4538  10 October 2005

De-Identified Data (referential integrity is maintained)
Customers Table
Cust ID  Name           Street
10000    Auguste Smith  Mars23
10001    Claude Jones   Venus24
10002    Pablo Adams    Saturn25
Orders Table
Cust ID  Item #   Order Date
10002    80-2382  20 June 2004
10002    86-4538  10 October 2005
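Key propagation amounts to building one old-to-new key mapping on the parent table and applying the same mapping to every child table. The Python sketch below reproduces the slide's example with that approach; `build_key_map` and `propagate` are hypothetical helper names, and the sequential numbering from 10000 is taken from the slide's masked values rather than from any documented Optim behaviour.

```python
from itertools import count

def build_key_map(parent_rows, key, start):
    """Assign each parent key a new sequential value, in row order."""
    new_ids = count(start)
    return {row[key]: str(next(new_ids)) for row in parent_rows}

def propagate(rows, key, key_map):
    """Rewrite the key column of any table using the shared mapping."""
    return [{**row, key: key_map[row[key]]} for row in rows]

customers = [
    {"cust_id": "08054", "name": "Alice Bennett"},
    {"cust_id": "19101", "name": "Carl Davis"},
    {"cust_id": "27645", "name": "Elliot Flynn"},
]
orders = [
    {"cust_id": "27645", "item": "80-2382"},
    {"cust_id": "27645", "item": "86-4538"},
]

key_map = build_key_map(customers, "cust_id", 10000)
masked_customers = propagate(customers, "cust_id", key_map)
masked_orders = propagate(orders, "cust_id", key_map)
```

Masking the Customers table alone, without applying `key_map` to Orders, leaves the orders pointing at 27645, which no longer exists among the masked customers.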

Using Custom Masking Exits
- Apply complex data transformation algorithms and populate the resulting value to the destination column
- Selectively include or exclude rows and apply logic to the masking process
- Valuable where the desired transformation is beyond the scope of the supplied Column Map functions
- Example: generate a value for CUST_ID based on customer location, average account balance, and volume of transaction activity
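The CUST_ID example can be pictured as a function that sees the whole row, decides whether the row is kept, and returns the value for the destination column. The Python sketch below is a loose illustration of that contract; the exit signature, the column names, the banding thresholds and the ID layout are all invented for the example and do not reflect the actual Optim exit interface.

```python
def cust_id_exit(row):
    """Hypothetical exit: return the new CUST_ID, or None to exclude the row."""
    if row["txn_count"] == 0:
        return None  # assumed rule: leave inactive customers out of the test set
    balance_band = "H" if row["avg_balance"] >= 10000 else "L"
    volume_band = "H" if row["txn_count"] >= 100 else "L"
    return f'{row["state"]}-{balance_band}{volume_band}-{row["seq"]:05d}'

def run_exit(rows, exit_fn):
    """Apply the exit to each row, dropping rows the exit rejects."""
    out = []
    for row in rows:
        new_id = exit_fn(row)
        if new_id is not None:
            out.append({**row, "cust_id": new_id})
    return out

exit_input = [
    {"cust_id": "08054", "state": "NJ", "avg_balance": 25000, "txn_count": 140, "seq": 1},
    {"cust_id": "19101", "state": "NY", "avg_balance": 900, "txn_count": 0, "seq": 2},
    {"cust_id": "27645", "state": "TX", "avg_balance": 4000, "txn_count": 12, "seq": 3},
]
exit_output = run_exit(exit_input, cust_id_exit)
```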

Agenda
- Data Governance: focus on test data creation and masking
- Obtaining test data: some common practices
- Obtaining test data: best practices
- InfoSphere Optim Test Data Management / Data Privacy
- Q & A

Participate in the System z Expert and Superhero contest!
Fill in your answer to the question below on the scorecard and deposit your card in the box!
Which IBM software is key for delivering a completely masked test data subset?
A. Combination of answers B & D
B. InfoSphere Optim Test Data Management
C. InfoSphere Guardium
D. InfoSphere Optim Data Privacy

More information on zenterprise
- IBM zenterprise / System z Redbooks Portal: http://www.redbooks.ibm.com/portals/systemz
- IBM zenterprise Announcement Landing Page: ibm.com/systems/zenterprise196
- IBM zenterprise HW Landing Page: ibm.com/systems/zenterprise196
- IBM zenterprise Events Landing Page: ibm.com/systems/breakthrough
- IBM Software: ibm.com/software/os/systemz/announcements
- IBM System Storage: ibm.com/systems/storage/product/z.html
- IBM Global Financing: ibm.com/financing/us/lifecycle/acquire/zenterprise/
- Global Technology Services: http://www.ibm.com/services/us/index.wss/offerfamily/gts/a1027714

Thank You (English), Bedankt (Nederlands), Merci (French), Gracias! (Spanish), Obrigado (Brazilian Portuguese), Danke (German)