Incremental Replication from SQL Server to Oracle
Kane McKenzie, Liberty University

Our Challenge
- Efficiently replicate data from a SQL Server based CRM database to an Oracle based data warehouse.

Business Drivers
- Issues with reporting against production
  - Impact on the production CRM
- Benefits of data integration
  - Analyze CRM data alongside other replicated data

Business Requirements
- Technology
- Sustainable
- Affordable
- Fresh

Environment Details
- Live since February 2013
- Source: SQL Server based CRM database
- Target: data warehouse
- Refreshed every 30 minutes (each refresh takes 4 minutes)
- 77 tables replicated incrementally (30 GB)
- Busiest table: 350K changes/day
- Largest table: 61 million rows
- Configuring a new table for replication takes approximately 1 hour

Challenges
- Incremental change detection
- Heterogeneous replication
- Available tools
- Costs of technology
- Extract, transform, load

High Level Architecture
[Diagram labels: Source SQL Server (Windows), ODBC, Gateway Server, Target DB, SQL Server Gateway]

Aerial Architecture
[Diagram: SQL Server CRM holds the Source Table, CDC Table, and View; the ODBC Gateway connects it to the Data Warehouse, which holds the Staging Table and Target Table; cron schedules the extraction.]

Architecture Explored: SQL Server
- Source table: the table identified for replication to the data warehouse.
- CDC table: the SQL Server change data capture (CDC) table used for incremental change detection in the source table.
- Extraction view: a standard SQL Server view against the underlying table, used to handle the conversions the Gateway needs because of differences between SQL Server and Oracle.

Architecture Explored: Connectivity
- The Linux cron tool schedules the jobs that extract incremental changes every 30 minutes.
- The generic ODBC Gateway enables Oracle to talk to SQL Server and pull data.
- A standard ODBC driver enables Heterogeneous Services to connect to SQL Server.

Architecture Explored: Oracle
- Standard INSERT ... SELECT SQL copies the incremental changes for each table to a staging table in the data warehouse.
- Oracle Streams captures and interprets the changes applied to the change table, converting each change row into an UPDATE, INSERT, or DELETE operation.
- Oracle Streams applies Logical Change Records (LCRs), created from updates to the staging table, to the target table.
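
The apply step described on this slide can be sketched in Python. This is a minimal in-memory model, not the Oracle apply process: the function name `apply_changes`, the dict-based record shape, and the `id` key column are hypothetical names chosen for illustration, and INSERT and UPDATE are both treated as an upsert to keep the sketch short.

```python
def apply_changes(target, changes, key="id"):
    """Apply typed change records, in order, to `target` (a dict keyed by primary key).

    Each change is a dict with a "command_type" of INSERT, UPDATE, or DELETE
    and a "values" dict holding the row. This record shape is an assumption
    made for the sketch.
    """
    for change in changes:
        row = change["values"]
        pk = row[key]
        command = change["command_type"]
        if command == "DELETE":
            # Remove the row if present; deleting a missing row is a no-op here.
            target.pop(pk, None)
        elif command in ("INSERT", "UPDATE"):
            # Simplification: both commands store the latest row image.
            target[pk] = dict(row)
        else:
            raise ValueError(f"unsupported command type: {command!r}")
    return target
```

Applying the changes in the order they were captured matters: an INSERT followed by a DELETE of the same key must leave the row absent, which is why the sketch walks the list sequentially.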

Details: Change Data Capture
Enable CDC on the database, then on each source table:

    EXEC sys.sp_cdc_enable_db
    GO

    EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name   = N'ContactBase',
        @role_name     = NULL;
    GO

Details: Heterogeneous Services
- Heterogeneous Services consists of an Oracle Database Server Home, a running listener, and one or more Gateways.
- We use only the free, included, default ODBC Gateway (free per Support Note 232482.1).
- Configuration needs two pieces: (1) an init file and (2) a tnsnames entry.

initcrmprod.ora:

    HS_FDS_CONNECT_INFO = crmprod
    HS_FDS_TRACE_LEVEL = off

Listener entry (shown on the slide under the tnsnames.ora label; SID_LIST_LISTENER is a listener.ora parameter):

    SID_LIST_LISTENER =
      (SID_LIST =
        (SID_DESC =
          (PROGRAM = dg4odbc)
          (SID_NAME = crmprod)
          (ORACLE_HOME = <path to home>)
        )
      )

Note: HS_FDS_CONNECT_INFO is the key to the ODBC connection name.

Details: Streams
- Changes from SQL Server are INSERTed into the staging table.
- A handler converts those INSERT operations to the appropriate type, based on what actually happened in the SQL Server world.
[Diagram flow: Staging Table, CAPTURE, APPLY, Target Table]

Handler

    FUNCTION crm_handler(evt IN SYS.ANYDATA) RETURN SYS.ANYDATA IS
      ...
    BEGIN
      ...
      -- Set the LCR command type based on the "operation" column
      -- value from the MSSQL CDC columns

      -- DELETE operation
      IF oper = 1 THEN
        -- Change the command type in the LCR to a DELETE
        lcr.set_command_type('delete');
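
The slide shows only the DELETE branch of the handler. A minimal Python sketch of the full conversion follows. It is an illustration, not the PL/SQL handler: the record shape, the function name `rewrite_change`, and the column names `OPERATION` and `START_LSN` inside the record are assumptions, as is the choice to drop those CDC metadata columns before the record reaches the target. The operation codes follow SQL Server CDC, where 1 is a delete, 2 an insert, 3 the row image before an update, and 4 the row image after an update.

```python
# SQL Server CDC operation codes that are replicated (the extract filters on 1, 2, 4).
COMMAND_BY_OPERATION = {1: "DELETE", 2: "INSERT", 4: "UPDATE"}

# Assumed names of the CDC metadata columns carried in each staged row.
METADATA_COLUMNS = ("START_LSN", "OPERATION")


def rewrite_change(lcr):
    """Return a copy of a staged INSERT record, retyped by its OPERATION value."""
    operation = lcr["values"]["OPERATION"]
    if operation not in COMMAND_BY_OPERATION:
        # Code 3 (pre-update image) or anything unknown should never be staged.
        raise ValueError(f"unexpected CDC operation code: {operation!r}")
    # Assumption: the target table has no CDC metadata columns, so drop them.
    values = {k: v for k, v in lcr["values"].items() if k not in METADATA_COLUMNS}
    return {
        "command_type": COMMAND_BY_OPERATION[operation],
        "object": lcr["object"],
        "values": values,
    }
```

Returning a new record instead of mutating the input keeps the staged row available for logging or retry if the apply fails.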
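
Details: Miscellaneous
The extraction job run by cron, shown here for one table:

    DECLARE
      max_tsn VARCHAR2(38);
    BEGIN
      -- Highest LSN already staged. If the staging table is empty this is
      -- NULL, and the comparison below then selects no rows.
      SELECT MAX(START_LSN) INTO max_tsn FROM CRMSTAGE.LEADBASE;

      -- Pull only newer change rows through the gateway database link.
      -- Operations 1, 2, 4 are delete, insert, and post-update images.
      INSERT INTO CRMSTAGE.LEADBASE
        SELECT "dbo"."ludw_leadbase_ct".*
        FROM "dbo"."ludw_leadbase_ct"@ho.world
        WHERE OPERATION IN (1,2,4)
          AND START_LSN > max_tsn;
    END;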
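
The high-water-mark pattern in this job can be tried out locally with Python's built-in sqlite3 module. The sketch below is a stand-in, not the production job: the table names `cdc_leadbase` and `stage_leadbase`, their columns, and the short fixed-width LSN strings are all hypothetical, and both tables live in one SQLite database instead of being joined through a gateway link. It also substitutes an empty string for a NULL high-water mark so that the first run against an empty staging table copies every change row, which is an addition to what the slide shows.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with a change table and an empty staging table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cdc_leadbase (start_lsn TEXT, operation INTEGER, lead_id INTEGER, name TEXT);
        CREATE TABLE stage_leadbase (start_lsn TEXT, operation INTEGER, lead_id INTEGER, name TEXT);
        INSERT INTO cdc_leadbase VALUES
            ('0001', 2, 1, 'Ann'),   -- insert
            ('0002', 3, 1, 'Ann'),   -- pre-update image, never staged
            ('0002', 4, 1, 'Anne'),  -- post-update image
            ('0003', 1, 1, 'Anne');  -- delete
        """
    )
    return conn


def pull_increment(conn):
    """Copy change rows newer than the staged high-water mark; return the row count."""
    cur = conn.cursor()
    # MAX over an empty table is NULL, so fall back to '' to match all rows.
    (high_water,) = cur.execute(
        "SELECT COALESCE(MAX(start_lsn), '') FROM stage_leadbase"
    ).fetchone()
    # Fixed-width LSN strings compare correctly as text.
    cur.execute(
        """
        INSERT INTO stage_leadbase (start_lsn, operation, lead_id, name)
        SELECT start_lsn, operation, lead_id, name
        FROM cdc_leadbase
        WHERE operation IN (1, 2, 4) AND start_lsn > ?
        """,
        (high_water,),
    )
    conn.commit()
    return cur.rowcount
```

Calling `pull_increment` twice in a row on the demo database copies three rows and then none, since the second call finds nothing above the staged maximum LSN.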

Aerial Architecture Review
[Diagram: same components as the Aerial Architecture slide.]

Relevant Documents and Notes

Document Name                                              Part Number
Concepts and Administration                                E17069-07
Replication Administrator's Guide                          E10705-09
Database Gateway Installation and Configuration Guide      E12013-07
Database Heterogeneous Connectivity User's Guide           E11050-01
Database Gateway for ODBC User's Guide                     E12070-03

Support Note                                               Note Number
Master Note for Gateway Products                           1083703.1
Gateway Configuration                                      1266572.1
ODBC Gateway License                                       232482.1

Questions & Contact
Kane McKenzie, Liberty University
KaneMcKenzie@me.com
kmckenzie20@liberty.edu
909-276-5263 (Google Voice)