SAP Data Services Hacks: Auto Generating Data Migration Jobs
Shobhit Acharya, Session # 3507


Learning Points
Improve data migration efficiency using SAP Data Services by implementing a few custom approaches that speed up the extraction and load of source data. These approaches cover:
- Programmatically generating data migration jobs to replace labor-intensive, monotonous job design
- Using the XML import mechanism to create job templates
- Using datastore configurations to ingest multiple instances of identical source databases

Agenda
- Introductions & Overview
- Data Migration & SAP EIM tools
- Data Services: Designer & Workbench
- Information Steward
- Review a baseline framework
- The use case for DS Workbench
- Alternative solutions for efficiency
- Understanding the DS Job XML structure
- Developing job generation programs for automation
- Datastore configurations
- When to use these programs
- Demo

Data Migration
Assess, Extract, Consolidate, Cleanse, Load, Reconcile
Data migrations at enterprise scale need a focus on migrating data efficiently, quickly, repeatedly and assuredly. The goal is not just to move and convert data; it's to ensure that the data is of high quality and supports the business processes and the target system's operational needs.

Data Migration
"80% of organizations will underestimate the costs related to the data acquisition tasks by an average of 50 percent." (Gartner)

SAP Data Services & Information Steward Overview

SAP Data Services
One solution that provides:
- Data Integration
- Data Quality
- Text Data Processing
One server to execute all capabilities
One design environment to manage all development
One administration console to monitor all functions

SAP Information Steward
- Business: measure and compare against information governance rules and standards
- IT: share data quality metrics and problems with the business

Case Study: Review a baseline Data Migration Framework
[Diagram. Tools: SAP BODS, SAP IS, IVM, BoA, BoA Staging, BOBJ (Reporting). Data stores: Source, Legacy / Staging, Transformed, Relevant, Target, Load, Staging Area, Target System.]
Numbered steps in the diagram:
1. Extraction
2. Ingestion into Stage
3. Initial Data Profiling
4. Transform to common structure
5. Initial Health Check
6. Auto De-dup & Cleanse
7. Apply Relevancy Rules
8. Secondary Health Check
9. Facilitated Cleansing
10. Consolidate cleansed data
11. Reference Data validation
12. Fix Reference Data issues
13. Pre-load sign off
14. Load to Target System
15. Reconciliation
16. Post load sign off

Case Study: Ingestion Scope
- 25+ distinct source systems
- Multiple source product versions
- Need separate job streams
- Need separate job control
- Over 400 databases to ingest
- Hadoop as staging
- Multiple waves of migration
[Diagram. SAP BODS, Legacy / Staging. Steps: 1 Source Extraction, 2 Ingestion into Stage, 3 Initial Data Profiling.]

The use case for DS Workbench
- Quick to build data replication projects
- Data flow design
- Additional customizations in DS Designer
- Additional functionality added progressively with each release

The use case for DS Workbench: Data Replication Design

The use case for DS Workbench: Data flows

The use case for DS Workbench: Goodies
- Monitor performance
- View data

The use case for DS Workbench: And where it falls short
- No big data sources or targets
- Little or no workflow customization
- A small list of supported transforms
- Additional work could be required in DS Designer for job control and customizations

Alternative solutions
Generate your own jobs in XML:
- Dataflow = Source Tables -> Query Transforms -> Target (including HDFS)
- Workflows
- Custom script stages
- Jobs
- Datastores and Configurations
- Flat file formats
Import the generated XML as DS Designer jobs, workflows and dataflows
Configure a datastore for multiple deployments of the same source product database

Data Services job export in XML
Understanding the structure of a simple job
Example: 1 dataflow in a job. The export is 350 lines of XML. Let's look closer.

Data Services job export in XML
Understanding the structure of a simple job
Legend: *Variables* (repeat for each)
Elements, in export order, with the values to substitute:
- DIDatabaseDatastore, DIAttributes, DSConfigurations: <odbc_data_source>*datastore*</odbc_data_source>
- DITable, DIProperties: name="*table_name*" owner="*dbowner*" datastore="*datastore*"
- DIColumn: list all columns and column properties, *COLUMN_NAME*
- DIDataflow: name="*DATAFLOW*"
- DITransforms, DIAttributes: array sizes, static parameters
- DIDatabaseTableSource: datastorename="*datastore*" tablename="*table_name*"
- DIOutputView: name="*table_name*"
- DIFileTarget: formatname="*datastore*_*table_name*" filename="*table_name*"
- DIAttributes: HDFS file location/path + static parameters
- DIQuery, DIAttribute: name="*table_name*" value="*table_name*"
- DISchema, DIElement: list all columns and column properties, *COLUMN_NAME*
- DISelect, DIProjection, DIExpression, DIFrom: column="*column_name*"
- DIFlatFileDatastore: name="*DATASTORE*_*TABLE_NAME*"
- DISchema, DIElement: list all columns and column properties, *COLUMN_NAME*
- DIAttributes: datastore input and output file store attributes, 1 per job
- DIUIOptions: name="*datastore*_*table_name*" value="*table_name*"
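The substitution idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the ATLGEN code: the element names and the datastorename, tablename, formatname and filename attributes come from the slide, while the nesting, the `DF_` prefix, the `name` attribute on DIElement and the `expr` attribute on DIExpression are assumptions. Compare the output against a real export from a sandbox repository before importing anything.

```python
import xml.etree.ElementTree as ET


def build_dataflow(datastore: str, table: str, columns: list[str]) -> ET.Element:
    """Build a skeleton DIDataflow element for one source table.

    Assumed layout: one table source, one query transform, one file target.
    """
    # "DF_" is a hypothetical naming prefix, not taken from the slides.
    dataflow = ET.Element("DIDataflow", name=f"DF_{datastore}_{table}")
    transforms = ET.SubElement(dataflow, "DITransforms")

    # Source table reader.
    ET.SubElement(transforms, "DIDatabaseTableSource",
                  datastorename=datastore, tablename=table)

    # Query transform: output schema plus a one-to-one column projection.
    query = ET.SubElement(transforms, "DIQuery")
    schema = ET.SubElement(query, "DISchema", name=table)
    projection = ET.SubElement(ET.SubElement(query, "DISelect"), "DIProjection")
    for column in columns:
        ET.SubElement(schema, "DIElement", name=column)  # assumed attribute
        ET.SubElement(projection, "DIExpression", expr=f"{table}.{column}")  # assumed attribute

    # Flat file target; format name follows the *datastore*_*table_name* pattern.
    ET.SubElement(transforms, "DIFileTarget",
                  formatname=f"{datastore}_{table}", filename=table)
    return dataflow


if __name__ == "__main__":
    element = build_dataflow("DS_SRC", "CUSTOMER", ["CUST_ID", "CUST_NAME"])
    print(ET.tostring(element, encoding="unicode"))
```

Building the tree with an XML library rather than string concatenation keeps attribute values escaped correctly when table or column names contain special characters.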

Generating the XML programmatically
Understanding what you need
- Programmers for your code: SQL programming skills, or Java/Python/.NET
- Data Services sandbox repository
- Source database table and column definitions. Oracle: all_tab_columns; SQL Server: information_schema; Progress DB: sysprogress.syscolumns_full
- DB schema for code and column definitions
- True Grit
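As a rough sketch of the column-definition step, the lookup below returns a metadata query per source type. The catalog views are the ones named on the slide; the selected columns are standard for Oracle and SQL Server, the Progress query deliberately selects everything because its column names are not given here, and the `metadata_query` helper and the bind-parameter styles are assumptions for illustration.

```python
# Metadata queries keyed by source type. Bind markers (:owner, ?) depend on the
# database driver in use and are assumptions here.
METADATA_QUERIES = {
    "oracle": (
        "SELECT owner, table_name, column_name, data_type, data_length, column_id "
        "FROM all_tab_columns WHERE owner = :owner "
        "ORDER BY table_name, column_id"
    ),
    "sqlserver": (
        "SELECT table_schema, table_name, column_name, data_type, "
        "character_maximum_length, ordinal_position "
        "FROM information_schema.columns WHERE table_schema = ? "
        "ORDER BY table_name, ordinal_position"
    ),
    # Column names are not listed in the slides, so select everything.
    "progress": "SELECT * FROM sysprogress.syscolumns_full",
}


def metadata_query(source_type: str) -> str:
    """Return the column-definition query for a source type (case-insensitive)."""
    key = source_type.strip().lower()
    if key not in METADATA_QUERIES:
        raise ValueError(f"No metadata query defined for source type: {source_type!r}")
    return METADATA_QUERIES[key]


if __name__ == "__main__":
    for name in ("Oracle", "SQLServer", "Progress"):
        print(name, "->", metadata_query(name))
```

The rows returned by such a query are what drive the per-table and per-column substitutions in the generated XML.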

Generating the XML programmatically
Applying the understanding for complex designs
Auto generated, imported XML
This example:
- 1 Source Datastore
- 20+ Configurations
- 400+ Tables
- 400+ HDFS formats
- 400+ Dataflows
- 400+ Workflows
- 1 Job

Generating the XML programmatically
Applying the understanding for complex designs
Auto generated: Workflow, Scripts, Dataflow, Datastores
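The one-to-one relationship between tables, HDFS formats, dataflows and workflows in this example can be sketched as a naming plan. The format name follows the *datastore*_*table_name* pattern from the XML structure slide; the `DF_`, `WF_` and `JOB_` prefixes and the `plan_objects` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableObjects:
    """Names of the objects generated for one source table."""
    table: str
    file_format: str
    dataflow: str
    workflow: str


def plan_objects(datastore: str, tables: list[str]) -> tuple[str, list[TableObjects]]:
    """Return the single job name and one set of object names per table."""
    job_name = f"JOB_{datastore}"  # hypothetical prefix; one job for all tables
    plan = [
        TableObjects(
            table=table,
            file_format=f"{datastore}_{table}",  # pattern from the slide
            dataflow=f"DF_{datastore}_{table}",  # hypothetical prefix
            workflow=f"WF_{datastore}_{table}",  # hypothetical prefix
        )
        for table in tables
    ]
    return job_name, plan


if __name__ == "__main__":
    job, objects = plan_objects("DS_SRC", ["CUSTOMER", "ITEM", "ORDER_HDR"])
    print(job, "with", len(objects), "dataflows and", len(objects), "workflows")
```

Deriving every name from the datastore and table name keeps the generated objects traceable back to their source table once hundreds of them are imported.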

Data Services Datastore Configurations
- Contain alternate connection parameters for the datastore
- Typically used for promotions to new environments
- Could be leveraged to run template jobs on multiple databases with identical schemas (e.g. QAD Progress databases)
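A minimal sketch of generating one configuration per database instance follows. DSConfigurations and odbc_data_source appear in the export structure shown earlier; the DSConfiguration child element, its `name` and `default` attributes, and the sample configuration and DSN names are assumptions to be checked against a real datastore export.

```python
import xml.etree.ElementTree as ET


def build_configurations(dsn_by_config: dict[str, str]) -> ET.Element:
    """Build a DSConfigurations element with one entry per source database.

    dsn_by_config maps a configuration name to its ODBC data source name.
    The first entry is marked as the default configuration (assumed attribute).
    """
    root = ET.Element("DSConfigurations")
    for index, (config_name, dsn) in enumerate(dsn_by_config.items()):
        config = ET.SubElement(
            root,
            "DSConfiguration",  # assumed element name
            name=config_name,
            default="true" if index == 0 else "false",
        )
        ET.SubElement(config, "odbc_data_source").text = dsn
    return root


if __name__ == "__main__":
    # Hypothetical plant databases that share one schema.
    configs = build_configurations({
        "PLANT_001": "DSN_PLANT_001",
        "PLANT_002": "DSN_PLANT_002",
    })
    print(ET.tostring(configs, encoding="unicode"))
```

With the configurations in place, the same template job can be pointed at each database by switching the active configuration rather than by duplicating dataflows.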

Best practices
- Always isolate any custom XML imports into a sandbox repository
- Use datastore configurations to maximum effect
- Pre-import: export (to XML) the intended source and destination datastores without any tables included
- Post-import: re-import these datastores to override the generated datastores

Generating the xml programmatically When could you need this?

Return on Investment
Initial assessment at JCI for developing the custom programs needed (codename: ATLGEN)
- Effort invested in ATLGEN development: 1 person-week
- Typical pre-use effort: 2 weeks per source
- Potential post-use efficiency: 10x per source
- Number of sources: 25-50+ distinct sources
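The figures above can be turned into a back-of-the-envelope comparison. This sketch assumes that "10x per source" means the per-source effort drops to one tenth, and that all figures are in the same unit of weeks of effort; both readings are assumptions, not statements from the session.

```python
def migration_effort_weeks(
    sources: int,
    weeks_per_source: float = 2.0,   # typical pre-use effort per source
    speedup: float = 10.0,           # assumed meaning of "10x per source"
    tool_build_weeks: float = 1.0,   # effort invested in the generator
) -> tuple[float, float]:
    """Return (manual effort, effort with generated jobs) in weeks."""
    manual = sources * weeks_per_source
    generated = tool_build_weeks + manual / speedup
    return manual, generated


if __name__ == "__main__":
    for count in (25, 50):
        manual, generated = migration_effort_weeks(count)
        print(f"{count} sources: manual {manual} weeks, generated {generated} weeks")
    # 25 sources: manual 50.0 weeks, generated 6.0 weeks
    # 50 sources: manual 100.0 weeks, generated 11.0 weeks
```

Under these assumptions the one-week investment is recovered on the first source, since generating a single source saves about 1.8 weeks.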

Live demo

