Using Pentaho Data Integration (PDI) with Oracle
Nabil Juwale, Al Lopez
RMOUG Training Days, February 11-13, 2013
About DBAK
- Oracle Database, Technology and E-Business Suite applications
- Co-founded in 2005
- Colorado owned and operated
- Average 15 years of Oracle expertise
- Awards
  - Top 250 Private Companies, 2011 and 2012, CoBIZ Magazine
  - Fastest-Growing Private Companies 2012 Finalist, Denver Business Journal
  - Colorado Companies to Watch 2012 Finalist, Colorado Office of Economic Development and International Trade
  - Emerging Business of the Year, 2008, South Metro Denver Chamber of Commerce
- 100+ clients
- 200+ implementations, upgrades, conversions, support projects
- Oracle Gold Partner
- Oracle Accelerate Solution
- Hosted Financials
- Specialized
  - Oracle Database
  - Oracle Enterprise Manager
DBAK 2013
What is Pentaho?
- Data integration and business analytics tool
- ETL tool similar to Informatica, but supported by a large user community that has contributed capabilities not normally available in commercial tools
- Offers broad connectivity:
  - Spreadsheets and CSV files
  - DBMSs (Oracle, DB2, MySQL, SQL Server)
  - Direct enterprise applications (SAP)
  - Indirect enterprise applications (Oracle EBS, PeopleSoft, etc.)
  - Cloud-based and SaaS applications (e.g. Salesforce, Amazon Web Services)
Pentaho Data Integration (PDI)
- Pentaho Data Integration Community Edition (PDI CE), also known as Kettle CE, is open source
- An Enterprise Edition is also available, which offers:
  - Tech support
  - Enhanced editions
  - Warranty
- Powerful ETL tool
- Pure Java tool with Windows and Unix/Linux versions available
- After downloading, run spoon.sh (Unix/Linux) or spoon.bat (Windows) to open the UI
Jobs and Transformations
- Kettle uses jobs (.kjb files) and transformations (.ktr files) for ETL
- A transformation can consist of several steps
- A job can consist of several transformations and other steps
Scheduling Jobs
You can use either of the following:
- Scheduling options within the PDI tool
  - Spoon Online (creates and runs jobs, transformations and schedules)
  - Kitchen is a program that executes jobs designed in Spoon and stored as XML or in a database repository. Jobs are usually scheduled in batch mode to run automatically at regular intervals.
  - Pan is a program that executes transformations designed in Spoon and stored as XML or in a database repository. Transformations are usually scheduled in batch mode to run automatically at regular intervals.
- Windows Scheduler or Linux Crontab
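The split between Kitchen (jobs) and Pan (transformations) can be illustrated with a small dry-run helper. This is only a sketch: the function name `pdi_command` and the file `example.ktr` are hypothetical, the default install path is the one that appears elsewhere in this deck, and the helper prints the command line rather than running it.

```shell
#!/bin/sh
# Install location taken from the cron example in this deck; override as needed.
PDI_HOME="${PDI_HOME:-/d01/pdi_interface/pdi/data_integration}"

# pdi_command (hypothetical helper): print the command line that would run
# the given PDI file. Jobs (.kjb) go to kitchen.sh, transformations (.ktr)
# go to pan.sh. Nothing is executed here.
pdi_command() {
    file="$1"
    case "$file" in
        *.kjb) runner="kitchen.sh" ;;
        *.ktr) runner="pan.sh" ;;
        *) echo "unsupported file type: $file" >&2; return 1 ;;
    esac
    printf '%s/%s -file=%s -level=Basic\n' "$PDI_HOME" "$runner" "$file"
}

# Dry run: show the commands for a job and for a (hypothetical) transformation.
pdi_command /d01/pdi_interface/pdi/arint/r12_customers_daily.kjb
pdi_command /d01/pdi_interface/pdi/arint/example.ktr
```

Printing instead of executing keeps the sketch safe to try on a machine without PDI installed; the printed line can be pasted into a scheduler entry once it looks right.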
Scheduling Jobs
- You can use Windows Scheduler or Linux Crontab
- We used crontab as shown below (runs daily at 02:01 and mails the output):

    01 2 * * * /d01/pdi_interface/custcron.sh 2>&1 | mail -s "Customer Cronjob Output" njuwale@dbak.com

- custcron.sh contents:

    . ~/.bash_profile
    cd /d01/pdi_interface/pdi/data_integration/
    ./kitchen.sh -file=/d01/pdi_interface/pdi/arint/r12_customers_daily.kjb -level=detailed -logfile=t.txt
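Because the cron entry mails whatever the script prints, it can help to include Kitchen's exit status in that output. The sketch below is an assumption-laden illustration, not the script used on this project: `describe_kitchen_status` and `run_job` are hypothetical names, `run_job` is a stand-in for the real kitchen.sh call, and the status meanings follow the Kitchen documentation.

```shell
#!/bin/sh
# describe_kitchen_status (hypothetical helper): turn a Kitchen exit status
# into a short message suitable for the cron mail.
describe_kitchen_status() {
    case "$1" in
        0) echo "job ran without a problem" ;;
        1) echo "errors occurred during processing" ;;
        2) echo "unexpected error during loading or running of the job" ;;
        7) echo "job could not be loaded from XML or the repository" ;;
        8) echo "error loading steps or plugins" ;;
        9) echo "command line usage was printed" ;;
        *) echo "unrecognized status $1" ;;
    esac
}

# Stand-in for the real kitchen.sh invocation so the sketch runs anywhere.
# It simulates a job that finished with processing errors.
run_job() { return 1; }

# Capture the status without aborting if the shell runs with errexit.
status=0
run_job || status=$?
echo "Kitchen exit status $status: $(describe_kitchen_status "$status")"
```

In a real wrapper, `run_job` would be replaced by the kitchen.sh command line, and the script could end with `exit "$status"` so that other tooling also sees the failure.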
Sample Job
Sample Transformation
Database Connection
E-Mailing Within a Job
Client Case Study
- Client: publishing company using Oracle EBS 11i
- Due to a change in business model, the business needed to re-implement on Oracle EBS 12.1.3
- The re-implementation required:
  - Multiple master/transactional data conversions from 11i to R12
  - Multiple live interfaces from the front-end AR system
- We used PDI to:
  - Extract data from the legacy AR system (including EBS 11i)
  - Transform/adjust several legacy fields
  - Populate interface tables; standard Oracle APIs were then used to move data from the interface tables to the Oracle AR base tables
- Data covered: Customers, Banks, Bank Sites, Invoices, Payments, etc.
Demo
Questions?
Contact
Alfredo (Al) Lopez: alopez@dbak.com
Nabil Juwale: njuwale@dbak.com
720.475.8600
www.dbak.com