Oracle Data Integrator Overview and Demo
Maria Mundrova, ETL Developer/DW Consultant
BGOUG Spring Conference, 2009

Presentation Agenda
- Oracle Data Integrator
  - What is ODI?
  - ELT Approach
  - Architecture
- Oracle Data Integration
  - The Three Rights
  - Scenarios
- Build ETL Process
  - Interfaces, Sources/Targets, Mappings, Filters, Transformations
- Oracle Data Profiling
- Oracle Data Quality
- ETL Platforms Overview
- DW Best Practices for Oracle
- ODI Case Studies
- Q & A
Oracle Data Integrator

What is ODI?
- Data integration tool
- ELT for Data Warehouse
- Data movement, migration and consolidation
- Transformation from multiple sources to heterogeneous targets
- Data Profiling and Data Quality
- Provides Data Services to Oracle SOA Suite

ELT Approach
- Data transformations run on the source or target database
- Leverages the power of the database
- No need for a separate ETL server
- Describe what the process does, not how it does it

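A minimal sketch of the ELT idea, under loud assumptions: an in-memory SQLite database stands in for the target database, and the table names `stg_orders` and `dw_orders` and the sample rows are invented for illustration. Rows are loaded unchanged into a staging table, and the transformation then runs as one set-based SQL statement inside the database rather than on a separate ETL server.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the target database.
conn = sqlite3.connect(":memory:")

# Extract + Load: copy source rows unchanged into a staging table on the target.
source_rows = [("1", " alice ", "120.50"), ("2", "BOB", "80.00"), ("3", "carol", None)]
conn.execute("CREATE TABLE stg_orders (id TEXT, customer TEXT, amount TEXT)")
conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", source_rows)

# Transform: one set-based statement executed by the database engine itself.
conn.execute("CREATE TABLE dw_orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.execute(
    """
    INSERT INTO dw_orders (id, customer, amount)
    SELECT CAST(id AS INTEGER), UPPER(TRIM(customer)), CAST(amount AS REAL)
    FROM stg_orders
    WHERE amount IS NOT NULL
    """
)

result = conn.execute("SELECT id, customer, amount FROM dw_orders ORDER BY id").fetchall()
print(result)
```

The Python process only ships rows and submits SQL; type casting, trimming and filtering all happen in the database engine.
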
Architecture: graphical modules, runtime components, repository
- Designer: reverse-engineer, develop projects, release scenarios
- Operator: operate production, monitor sessions
- Topology Manager: define the IS infrastructure
- Security Manager: manage user privileges
- Any web browser: browse metadata lineage, operate production
- Installation: any platform that supports Java 1.5, including Windows, Solaris, Linux, HP-UX, pSeries; file system space approx. 200 MB
- Repository: Oracle, DB2, any ISO-92 RDBMS; approx. 500 MB Master (security, topology) + 500 MB Work (models, projects, execution)
- Scheduler Agent: handles schedules, orchestrates sessions
- Metadata Navigator: web access to the repository

Oracle Data Integration
- Data integration is often referred to as ETL
- Get the right data in the right place at the right time
- Scenarios: ELT for Data Warehouse, MDM

Build ETL Process
We need to:
- Create and reverse-engineer models
- Create Project, Folders, Packages, Procedures
- Import Knowledge Modules: R(everse), L(oad), I(ntegrate), J(ournalize), C(heck), S(ervice)
- Create Interface:
  - Define targets
  - Define sources, filters, transformations
  - Define mappings between source and target
  - Define flow control
- Execute

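To illustrate how a declared interface (target, source, mappings, filters) can be turned into executable SQL, here is a hedged sketch. The `Interface` class and `generate_sql` function are hypothetical names invented for this example and are not part of ODI; they only play a role roughly comparable to a code template applied to a declarative description.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    """Hypothetical declarative description of one load; not an ODI class."""
    target: str
    source: str
    mappings: dict                      # target column -> source expression
    filters: list = field(default_factory=list)


def generate_sql(itf: Interface) -> str:
    """Turn the description into a single INSERT ... SELECT statement."""
    columns = ", ".join(itf.mappings)
    expressions = ", ".join(itf.mappings.values())
    sql = f"INSERT INTO {itf.target} ({columns}) SELECT {expressions} FROM {itf.source}"
    if itf.filters:
        sql += " WHERE " + " AND ".join(itf.filters)
    return sql


# Table and column names below are made up for the example.
itf = Interface(
    target="dw_customers",
    source="src_customers",
    mappings={"cust_id": "cust_id", "cust_name": "UPPER(cust_name)"},
    filters=["cust_id IS NOT NULL"],
)
statement = generate_sql(itf)
print(statement)
```

The developer states what should land in the target; how the statement is assembled is left to the generator.
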
Expressions, Joins, Sources, Targets, Mapping

Oracle Data Profiling
- Investigate data
- Examine dependencies
- Create joins
- Check data compliance (patterns)
- Define business rules
- Assess data through metrics

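As a rough idea of what "assess data through metrics" and "check data compliance (patterns)" can mean in practice, the sketch below computes a few column-level metrics in plain Python. The function name `profile_column`, the phone-number pattern and the sample values are assumptions made for illustration; this is not how Oracle Data Profiling is implemented.

```python
import re


def profile_column(values, pattern):
    """Return simple metrics for one column: row count, nulls, distinct values,
    and the share of non-null values that fully match the regex pattern."""
    non_null = [v for v in values if v is not None]
    matching = [v for v in non_null if re.fullmatch(pattern, v)]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "pattern_compliance": len(matching) / len(non_null) if non_null else 0.0,
    }


# Made-up sample column; the expected format is NNN-NNNN.
phones = ["555-0101", "555-0102", None, "5550103", "555-0101"]
metrics = profile_column(phones, r"\d{3}-\d{4}")
print(metrics)
```
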
Oracle Data Quality
- Data cleansing
- Data enrichment
- Data standardization
- Repair and correct fields, values, records
- Data validation
Stages:
- Understanding the data
- Managing data quality issues (tactical)
- Addressing data quality issues (strategic)
- Enriching the data

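A small, hedged sketch of standardization, repair and validation on a single record. The `GENDER_CODES` lookup, the `cleanse` function and the field names are hypothetical and chosen only to make the three ideas concrete; a real data quality tool works from maintained reference data and rules.

```python
# Hypothetical lookup of accepted spellings; a real project would maintain its own.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def cleanse(record):
    """Return a standardized copy of the record and a list of validation issues."""
    issues = []
    cleaned = dict(record)
    # Standardization: collapse whitespace and unify capitalization.
    cleaned["name"] = " ".join(record["name"].split()).title()
    # Repair: map known variants to one code; flag anything unknown.
    gender = record["gender"].strip().lower()
    cleaned["gender"] = GENDER_CODES.get(gender)
    if cleaned["gender"] is None:
        issues.append("unknown gender: " + record["gender"])
    return cleaned, issues


cleaned, issues = cleanse({"name": "  maria   PETROVA ", "gender": "Female"})
bad, bad_issues = cleanse({"name": "ivan ivanov", "gender": "x"})
print(cleaned, issues)
print(bad, bad_issues)
```
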
ETL Platforms Overview
- Oracle Data Integration Enterprise Edition
- Informatica
- Ab Initio
- IBM DataStage
- Microsoft DTS

DW Best Practices for Oracle: Bitmap index
- Use bitmap indexes on low-cardinality columns
- Bitmap indexes are best suited for DSS workloads regardless of cardinality

    SELECT t.cust_id, t.cust_gender, t.cust_marital_status, t.cust_income_level
    FROM customers t;

    SELECT COUNT(*)
    FROM customers
    WHERE cust_marital_status = 'married'
      AND cust_gender = 'm'
      AND cust_income_level IN ('F: 110,000-129,000', 'I: 170,000-189,000');

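The reason a bitmap index answers this kind of COUNT query cheaply can be sketched in a few lines. This is a conceptual model only, not Oracle's storage format: each distinct column value gets one Python integer used as a bit string, with bit i set when row i holds that value. The sample rows and the shortened income codes are invented.

```python
from collections import defaultdict


def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int whose bit i is set
    when row i holds that value."""
    index = defaultdict(int)
    for i, row in enumerate(rows):
        index[row[column]] |= 1 << i
    return index


# Made-up rows; income levels are abbreviated to a single letter.
customers = [
    {"gender": "m", "marital": "married", "income": "F"},
    {"gender": "f", "marital": "married", "income": "F"},
    {"gender": "m", "marital": "single", "income": "I"},
    {"gender": "m", "marital": "married", "income": "I"},
    {"gender": "m", "marital": "married", "income": "A"},
]
gender = build_bitmap_index(customers, "gender")
marital = build_bitmap_index(customers, "marital")
income = build_bitmap_index(customers, "income")

# AND across columns, OR within the IN list, then count the set bits.
hits = marital["married"] & gender["m"] & (income["F"] | income["I"])
count = bin(hits).count("1")
print(count)
```

The WHERE clause becomes a handful of bitwise operations, and the table rows themselves are never touched to produce the count.
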
DW Best Practices for Oracle: Bitmap join index
- Bitmap join index on fact table sales for cust_gender

    CREATE BITMAP INDEX sales_cust_gender_bjindx
    ON sales(customers.cust_gender)
    FROM sales, customers
    WHERE sales.cust_id = customers.cust_id
    LOCAL NOLOGGING COMPUTE STATISTICS;

- Join result used to create the bitmaps stored in the bitmap join index

    SELECT sales.time_id, customers.cust_gender, sales.amount_sold
    FROM sales, customers
    WHERE sales.cust_id = customers.cust_id;

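Conceptually, a bitmap join index precomputes the join once and stores, per dimension value, which fact rows match. The sketch below models that idea with made-up data and plain Python integers as bitmaps; it is an illustration of the principle, not of Oracle internals.

```python
# Dimension rows keyed by cust_id, and fact rows that reference them (made-up data).
customers = {1: "M", 2: "F", 3: "M"}
sales = [
    {"cust_id": 1, "amount_sold": 10.0},
    {"cust_id": 2, "amount_sold": 25.0},
    {"cust_id": 3, "amount_sold": 5.0},
    {"cust_id": 1, "amount_sold": 40.0},
]

# Build time: perform the join once and record, per gender, which fact rows match.
bitmap_join_index = {}
for pos, sale in enumerate(sales):
    gender = customers[sale["cust_id"]]
    bitmap_join_index[gender] = bitmap_join_index.get(gender, 0) | (1 << pos)


def total_sold(gender):
    """Sum amount_sold for one gender using only the index and the fact rows."""
    bits = bitmap_join_index.get(gender, 0)
    return sum(s["amount_sold"] for pos, s in enumerate(sales) if (bits >> pos) & 1)


print(total_sold("M"))
```

At query time the dimension table is no longer consulted; the join cost was paid when the index was built.
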
DW Best Practices for Oracle: Exchange partition
- Load technique for large tables
- No physical data movement; only pointers are reset
Steps:
1. Create partitioned table TEST (destination)
2. Create temp table TEST_TEMP
3. Load records into TEST_TEMP
4. Add local PK to the partitioned table TEST
5. Add PK to the offline TEST_TEMP table
6. Gather optimizer statistics on TEST_TEMP
7. Swap the offline table into the partition

DW Best Practices for Oracle: Exchange partition

    -- Step 1: partitioned destination table (partition key = primary key)
    CREATE TABLE test (
      id          NUMBER(12,6),
      description VARCHAR2(10),
      data        VARCHAR2(100))
    PARTITION BY RANGE (id) (
      PARTITION test_partition VALUES LESS THAN (MAXVALUE));

    -- Step 2: offline table with the same column layout
    CREATE TABLE test_temp (
      id          NUMBER(12,6),
      description VARCHAR2(10),
      data        VARCHAR2(100));

    -- Step 3: direct-path load of 10,000 generated rows
    INSERT /*+ append ordered full(s1) use_nl(s2) */ INTO test_temp
    SELECT TRUNC((ROWNUM-1)/500,6), TO_CHAR(ROWNUM), RPAD('X',100,'X')
    FROM all_tables s1, all_tables s2
    WHERE ROWNUM <= 10000;

    -- Step 4: local primary key on the partitioned table
    ALTER TABLE test ADD CONSTRAINT pk_test PRIMARY KEY (id)
      USING INDEX (CREATE INDEX pk_test ON test(id) NOLOGGING LOCAL);

    -- Step 5: matching primary key on the offline table
    ALTER TABLE test_temp ADD CONSTRAINT pk_test_temp PRIMARY KEY (id)
      USING INDEX (CREATE INDEX pk_test_temp ON test_temp(id) NOLOGGING);

    -- Step 6: optimizer statistics on the offline table and its index
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(USER, 'TEST_TEMP', cascade => TRUE);
    END;
    /

    -- Step 7: swap the offline table into the partition
    ALTER TABLE test EXCHANGE PARTITION test_partition
      WITH TABLE test_temp INCLUDING INDEXES WITHOUT VALIDATION;

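Why the exchange is fast can be shown with a toy model. In the sketch below, `PartitionedTable` and `StagingTable` are hypothetical classes in which a partition is just a reference to a list of rows; exchanging a partition swaps two references and copies nothing. This is an analogy for the pointer swap, not a description of Oracle's data dictionary.

```python
class StagingTable:
    """Toy stand-in for the offline table: one list of rows."""
    def __init__(self, rows):
        self.rows = rows


class PartitionedTable:
    """Toy model: each partition name points at a separate list of rows."""
    def __init__(self, partition_names):
        self.partitions = {name: [] for name in partition_names}

    def exchange_partition(self, name, staging):
        """Swap the partition's row storage with the staging table's storage.
        Only the two references change; no row is copied."""
        self.partitions[name], staging.rows = staging.rows, self.partitions[name]


test = PartitionedTable(["test_partition"])
test_temp = StagingTable([(i, str(i)) for i in range(10_000)])
loaded_rows = test_temp.rows          # keep a handle to check identity later

test.exchange_partition("test_partition", test_temp)
print(len(test.partitions["test_partition"]), len(test_temp.rows))
```

After the swap the partition holds the very same list object that was loaded offline, and the staging table holds the formerly empty partition, so the cost does not depend on the number of rows.
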
ODI Case Studies
Customers: Raiffeisen International; InterBank, Netherlands; Nestle Nespresso
- Replace existing ETL tool, which did not provide the scalability to populate data warehouses in 12 countries
- Improve ETL performance
- Reduce hand-coded ETL jobs
- Improved DWH processing performance by 40%
- Increased development productivity
- Loading process reduced from 8 hours to 2 hours
- Increased market share because of correct risk-based client profiling
- Supply a data warehouse from a consolidated operational database of all countries
- Use the horsepower of the source and/or target RDBMS for transformation
- Replaced weekly insertion with daily recording of changes for all clients
- Improved targeted campaigns and activities

Practical demo
- Work with Designer, Operator, Repository, Security Manager
- Transform and load SH schema
- Investigate with Data Profiling

Conclusion
- Automates ETL process
- Provides Data Quality
- New tool for Oracle
- Recommend it
- ODI Needus trilogy

Thank you!
Q & A