
An Oracle White Paper
Updated August 2010

Oracle GoldenGate for Linux, UNIX, and Windows

Contents

Executive Overview
Introduction
Server Environment
  ODBC
  Disk
GoldenGate Environment
  Comments
  GoldenGate Macros
  GoldenGate Manager
  GoldenGate Capture
  GoldenGate Data Pump
  TCPFLUSHBYTES
  GoldenGate Trails
  GoldenGate Apply
  Load Balancing GoldenGate Processes

Executive Overview

This document presents generic best practices for Oracle GoldenGate installed in Linux, UNIX, or Windows environments. Database-specific best practices are intentionally omitted; they are discussed in the individual database best practices documents.

Introduction

Oracle GoldenGate is inherently complex to architect. The following sections provide generic configuration settings for areas that are commonly missed or improperly configured by novice users. The target audience of this document has existing knowledge of GoldenGate and is adept in its configuration.

Server Environment

There are several items in the server environment that should be considered when running GoldenGate. Best practices for ODBC authentication and disk configuration are presented below.

ODBC

For installations using ODBC connectivity to the database, perform user authentication at the ODBC level instead of in the GoldenGate Capture or Apply processes. This precludes storing database logon IDs and passwords in the GoldenGate parameter files, which could be a security issue. In the GoldenGate parameter file, all that is required to access the database is the option SOURCEDB <system dsn>.

Linux and UNIX

To configure user authentication for ODBC on Linux or UNIX servers, add the following to the odbc.ini file:

LogonUser = server_username
LogonAuth = password_for_logonuser

Windows

In the Windows environment, when configuring the ODBC System DSN, select Windows Authentication where applicable (SQL Server, etc.) or provide the username and password if Windows Authentication is not supported (Teradata, etc.).

Disk

Internal vs. RAID Disk

If GoldenGate is not being installed on a cluster server, use internal disks. Testing has shown that checkpointing can be delayed by up to 2 seconds depending upon the RAID architecture. If RAID disks are required, RAID1+0 is preferred over RAID5 because of the overhead that RAID5 writes require.

RAID1+0 Explained

RAID1 is data mirroring. Two copies of the data are held on two physical disks, and the data is always identical. RAID1 has a performance advantage, as reads can come from either disk, and it is simple to implement. RAID0 is simply data striped over several disks. This gives a performance advantage, as it is possible to read parts of a file in parallel. However, not only is there no data protection, it is actually less reliable than a single disk, as all the data is lost if a single disk in the array stripe fails. RAID1+0 is a combination of RAID1 mirroring and data striping. This means it has very good performance and high reliability, so it is ideal for mission-critical database applications.

RAID5 Explained

With RAID5, data is written in blocks onto data disks, and parity is generated and rotated around the data disks. This provides good general performance and is reasonably cheap to implement. The problem with RAID5 is write overhead. If a block of data on a RAID5 disk is updated, all the unchanged data blocks from the RAID stripe must be read from disk and a new parity calculated before the new data block and new parity block can be written out. This means a RAID5 write operation requires four IOs. The performance impact is usually masked by a large subsystem cache.

RAID5 has a potential for data loss on hardware errors and poor performance on random writes. RAID5 will not perform well unless there is a large amount of cache. However, RAID5 is fine on large enterprise-class disk subsystems, as they all have large, gigabyte-size caches and force all write IOs to be written to cache, thus guaranteeing performance and data integrity. For RAID5 configurations, a smaller stripe size is more efficient for a heavy random write workload, while a larger block size works better for sequential writes. A smaller number of disks in an array will perform better but carries a bigger parity overhead. Typical configurations are 3+1 (25% parity) and 7+1 (12.5% parity).

Disk Space

Sufficient disk space should be allocated to GoldenGate on the source server to hold extract trails through the worst-case scenario: a target server outage. The amount is customer dependent; however, space for seven days of trail data may be used as a general rule of thumb for disaster recovery purposes. For example, if Capture writes roughly 20 GB of trail data per day, provision at least 140 GB for the trail file system.

NFS Mounts

Unless IO buffering is set to zero (0), NFS mounts should not be used by any GoldenGate disk input or output process. The danger occurs when one process registers the end of a trail file or transaction log and moves on to the next in sequence, yet after this event data in the NFS IO buffer is flushed to disk. The net result is skipped data, and this cannot be compensated for with the EOFDELAY parameter.

GoldenGate Environment

Comments

Comments should always be included in GoldenGate parameter files. Comments aid in troubleshooting issues and in documenting configurations. In GoldenGate, comments can be denoted by the word COMMENT but are more commonly expressed by two dashes (--) preceding any text. Comments should (1) identify modifications to the files (who, when, what), (2) provide an explanation for various settings, and (3) provide additional information about the process and configuration.

GoldenGate Macros

GoldenGate Macros are a series of commands, parameters, or data conversion functions that may be shared among multiple GoldenGate components. The best use of macros is to create a macro library, which is a series of commonly used macros stored in a shareable location. To set up a GoldenGate macro library, create a subdirectory named dirmac under the main GoldenGate installation directory. Macro library files stored in this location are edit files with the suffix .mac.

Macro Library Contents

In its simplest form, the macro library contains database connection information used in Capture and Apply. Removing this information from the parameter files adds an additional layer of security in the server environment, as the database access information cannot be viewed in the parameter or report files. Table 1 presents a sample macro file containing database connect information.

Table 1. dbconnect.mac - Sample database connect macros

MACRO #odbc_connect_dsn
BEGIN
SOURCEDB MyDSNconnect
END;

MACRO #odbc_connect_clearpwd
BEGIN
SOURCEDB MyDSNconnect, USERID lpenton, PASSWORD lpenton
END;

MACRO #odbc_connect_encrypt
BEGIN
SOURCEDB MyDSNconnect, USERID lpenton, PASSWORD AACAAAAAAAAAAAKAVDCCTJNGFALEWEVECDIGAEMCQFFBZHVC, ENCRYPTKEY DEFAULT
END;

MACRO #dbconnect_clearpwd
BEGIN
USERID lpenton, PASSWORD lpenton
END;

As shown above, the sample macro library file, dbconnect.mac, contains the following macros:

1. #odbc_connect_dsn: the database connect string when user authentication is performed by ODBC.
2. #odbc_connect_clearpwd: the database connect string when user authentication is not performed by ODBC.
3. #odbc_connect_encrypt: the database connect string when user authentication is not performed by ODBC and an encrypted password, using the default GoldenGate encryption key, is supplied.
4. #dbconnect_clearpwd: the database connect string for databases where GoldenGate does not use ODBC as the access method (e.g., Oracle).

NOTE: Clear passwords should never be used in a customer environment. All passwords must be encrypted via the GGSCI command ENCRYPT PASSWORD, with default-level encryption used at a minimum.

MACRO denotes the macro name. BEGIN and END denote the starting and ending points of the macro definition. All statements between BEGIN and END comprise the macro body.

Loading the Macro Library

Macro files are loaded and processed only when GoldenGate components (Manager, Capture, Data Pump, or Apply) are started. To load a macro file, add text similar to the following to the respective parameter file:

NOLIST
include ./dirmac/dbconnect.mac
LIST

NOLIST specifies that, at start time, the process is not to log whatever follows this statement into its report file. The include ./dirmac/dbconnect.mac statement directs the process to read the contents of the file and include them as part of its runtime options. LIST specifies that the process is to log whatever follows this statement into its report file. It is a best practice to never list the macro file contents.

If the example above were included in a Capture parameter file, then when the GGSCI command START EXTRACT <extract name> is executed, the macro library file dbconnect.mac is opened and read as part of the process's runtime environment. If this file does not exist, the process will abend with a nonexistent-file error.

Macro Execution

To reference a macro that is part of our example library file, add a line such as the following to the parameter file:

#odbc_connect_dsn ()

If the START EXTRACT command above had been issued, then during process startup this line is logically replaced by the macro body contents (SOURCEDB MyDSNconnect in this case) and a database logon is attempted using the connect method specified. A complete sketch combining these pieces appears after the Manager overview below.

GoldenGate Manager

Manager is the GoldenGate parent process and is responsible for the management of GoldenGate processes, resources, the user interface, and the reporting of thresholds and errors. Even though the default settings for Manager will suffice in most instances, there are several settings that should be reviewed and modified for a well-configured and smoothly running GoldenGate environment.
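Before turning to the individual Manager parameters, the following ties the macro material together: a minimal sketch of a Capture parameter file that loads the library and invokes a connect macro. The group, trail, and table names are hypothetical, and the file is a sketch rather than a complete production configuration.

-- exta.prm: minimal Capture parameter file (illustrative only)
EXTRACT exta
NOLIST
include ./dirmac/dbconnect.mac
LIST
-- Expands to SOURCEDB MyDSNconnect at startup
#odbc_connect_dsn ()
EXTTRAIL ./dirdat/ea
TABLE myschema.orders;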

PORT

The default Manager listener port is 7809. Because this is a well-documented and publicized port number, it may not be the best setting in a customer environment, especially if the server is on a nonsecure network. The customer's network administrator should assign GoldenGate Manager a nondefault port number.

DYNAMICPORTLIST

When Manager receives a connect request from Capture, Apply, or GGSCI, the default behavior is to use any available free port for data exchange. In a production environment this may not be desired; therefore, the customer's network administrator should assign a series or range of ports for exclusive use by GoldenGate processes. DYNAMICPORTLIST may be configured as:

1. A series of ports: DYNAMICPORTLIST 15301, 15302, 15303, 15380, 15420
2. A range of ports: DYNAMICPORTLIST 12010-12250
3. A range of ports plus individual ports: DYNAMICPORTLIST 12010-12020, 15420, 15303

A maximum of 256 ports may be specified.

PURGEOLDEXTRACTS

Use PURGEOLDEXTRACTS in the Manager parameter file to purge trail files when GoldenGate has finished processing them. As a Manager parameter, PURGEOLDEXTRACTS provides trail management in a centralized fashion and takes multiple processes into account. To control the purging, follow these rules:

1. Specify USECHECKPOINTS to purge when all processes are finished with a file, as indicated by checkpoints. Basing the purge on checkpoints ensures that no file is deleted until all processes are finished with it. USECHECKPOINTS considers the checkpoints of both Extract and Replicat before purging.
2. Use the MINKEEP rules to set a minimum amount of time to keep an unmodified file:
   a. Use MINKEEPHOURS or MINKEEPDAYS to keep a file for <n> hours or days.
   b. Use MINKEEPFILES to keep at least <n> files, including the active file. The default is 1.
3. Use only one of the MINKEEP options. If more than one is used, GoldenGate selects one of them as follows:
   a. If both MINKEEPHOURS and MINKEEPDAYS are specified, only the last one is accepted, and the other is ignored.
   b. If either MINKEEPHOURS or MINKEEPDAYS is used with MINKEEPFILES, then MINKEEPHOURS or MINKEEPDAYS is accepted, and MINKEEPFILES is ignored.

Set MINKEEPDAYS on the source server to keep <n> days of local extract trails on disk to facilitate disaster recovery of the source database. The number of days is determined by the customer.

BOOTDELAYMINUTES

This is a Windows-only parameter and must be the first entry in the Manager parameter file. BOOTDELAYMINUTES specifies the amount of time Manager is to wait after the server has booted. This delay allows other server components (RAID disks, database, network, etc.) to start up and become active before Manager begins processing.
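Collecting the Manager parameters discussed so far, a sketch of a Manager parameter file might look like the following. The port numbers and the five-minute boot delay are illustrative assumptions for a Windows installation, not recommendations.

BOOTDELAYMINUTES 5
PORT 15300
DYNAMICPORTLIST 15301-15320
-- Purge trails only when all processes are finished with them, and
-- keep seven days of trail data on disk for disaster recovery
PURGEOLDEXTRACTS ./dirdat/*, USECHECKPOINTS, MINKEEPDAYS 7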

AUTOSTART

AUTOSTART is used to start GoldenGate Capture or Apply processes as soon as Manager's startup process completes.

LAGREPORT

LAGREPORT denotes the interval at which Manager checks Capture and Apply processes for lag. This setting is customer dependent; however, a setting of LAGREPORTHOURS 1 should be sufficient in most environments.

LAGCRITICAL

LAGCRITICAL denotes the threshold at which lag becomes unacceptable; when it is crossed, Manager writes a message to the GoldenGate error log. This setting is customer dependent; however, the recommended settings below will suffice for most installations:

Environment          Setting
Disaster Recovery    LAGCRITICALMINUTES 5
Decision Support     LAGCRITICALMINUTES 5
Reporting            LAGCRITICALHOURS 1

Windows Cluster Considerations

Ensure that GoldenGate Manager is installed as a service on each node in the cluster. The GoldenGate Manager resource in the SQL Server Cluster Group must be moved to each node and GoldenGate Manager installed as a service. To do so without moving all the resources in the SQL Server Cluster Group, do the following:

1. Create a new, temporary cluster group.
2. Ensure that the GGS Manager resource is NOT checked to Affect the Group on failover.
3. Ensure that the disk resource (W:) for GoldenGate is NOT checked to Affect the Group on failover, if GoldenGate is the only resource using this disk resource.
4. Stop all Extracts and Pumps from GGSCI with: STOP ER *
5. Take GGS Manager offline from Cluster Administrator.
6. Delete any dependencies for the GGS Manager resource.
7. Move the GGS Manager resource from the SQL Server Cluster Group to the newly created cluster group.
8. Move the disk resource to the newly created cluster group.
9. Move the newly created group to the new node.
10. Log in to the new node and browse to the GoldenGate install directory.
11. Start GGSCI.EXE and type: SHELL INSTALL ADDSERVICE ADDEVENTS
12. Open the Windows Services applet and set the Log On account for the GGSMGR service that was just created, if using other than the default Local System account.
13. Ensure that the Log On account for the GGSMGR service is a member of the new node's local Administrators group.
14. Ensure that system DSNs have been created on the new node exactly as they were created on the primary node.
15. Bring the GGS Manager resource online through Cluster Administrator.
16. Verify in GGSCI that all Extracts and Pumps have started. They should start automatically when the GGS Manager resource comes online, but this may take a few seconds and may require a manual start with: START ER *
17. Stop Extracts and Pumps in GGSCI with: STOP ER *
18. Move the new cluster group with the GGS Manager and disk resources back to the primary node.
19. Move both resources in the new cluster group back to the SQL Server Cluster Group and reset the dependencies for the GGS Manager resource (SQL Server and drive W: resource dependencies).
20. Delete the temporary cluster group created in Step 1.
21. Bring the GGS Manager resource back online, and verify through GGSCI that all Extracts and Pumps are running.

GoldenGate Capture

GoldenGate Change Data Capture retrieves transactional data from the source database. Below are some generic best practice guidelines for Capture:

1. As stated in the GoldenGate Macros section above, use GoldenGate Macros to configure database access.
2. Do not configure Capture to transmit data over TCP/IP; Capture must store change data locally to EXTTRAILs.
   a. The data transmission causes Capture to slow down.
   b. Capture should only do one thing: get change data.
   c. If connectivity to the target server fails, Capture abends. Process restart and catch-up place undue stress on the server and database.
3. Do not have a number as the last character of a Capture group name.
   a. By default, GoldenGate stores 10 report files in the dirrpt directory for each component and appends a number (0 through 9) to the group name (e.g., myext0.rpt). Because these files are aged by renaming them (0 to 1, 1 to 2, etc.), having a number as the last character in the group name causes confusion when attempting to locate noncurrent report files.

Activity Reporting

Activity reporting is imperative in maintaining a well-configured GoldenGate environment. This function not only provides valuable information to the end user, it also provides a means for determining how to load balance Capture and Apply processes. At a minimum, activity reporting should be performed on a daily basis. The following parameters cause Capture to report activity per table daily at 1 minute after midnight:

STATOPTIONS RESETREPORTSTATS
REPORT AT 00:01
REPORTROLLOVER AT 00:01
REPORTCOUNT EVERY 1 HOUR, RATE

As shown above:

1. STATOPTIONS RESETREPORTSTATS controls whether statistics generated by the REPORT parameter are reset when a new process report is created. The default of NORESETREPORTSTATS continues the statistics from one report to another as the report rolls over based on the REPORTROLLOVER parameter.
2. REPORT AT 00:01 causes Capture to generate per-table statistics daily at 1 minute after midnight and record those statistics in the current report file.
3. REPORTROLLOVER AT 00:01 causes Capture to create a new report file daily at 1 minute after midnight. Old reports are renamed in the format <group name><n>.rpt, where <group name> is the name of the Extract or Replicat group and <n> is a number that is incremented by one whenever a new file is created, for example: myext0.rpt, myext1.rpt, myext2.rpt, and so forth.
4. REPORTCOUNT EVERY 1 HOUR, RATE writes to the report file, every hour, a count of the transaction records processed since the Capture process started. RATE reports the number of operations per second and the change in rate as a measurement of performance. The rate statistic is the total number of records divided by the total time elapsed since the process started. The delta statistic is the number of records since the last report divided by the time since the last report.
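Outside the scheduled rollover, the same statistics can be pulled on demand from GGSCI, which is useful when verifying a new reporting configuration. A sketch, with a hypothetical group name (the REPORTRATE option expresses rates per hour, minute, or second):

STATS EXTRACT extora, TOTAL, DAILY, REPORTRATE SEC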

TRANSMEMORY

TRANSMEMORY controls the amount of memory and temporary disk space available for caching uncommitted transaction data. Because GoldenGate sends only committed transactions to the target database, it requires sufficient system memory to store transaction data on the source system until either a commit or a rollback indicator is received. Transactions are added to the memory pool specified by RAM, and each is flushed to disk when TRANSRAM is reached. An initial amount of memory is allocated to each transaction based on INITTRANSRAM and is increased by the amount specified by RAMINCREMENT as needed, up to the maximum set with TRANSRAM. The value for TRANSRAM should be evenly divisible by the sum of (INITTRANSRAM + RAMINCREMENT). The setting for TRANSMEMORY is customer dependent, based on the unique workload. For OLTP environments the default settings should be sufficient, as transactions tend to be very short-lived; in other environments, however, long-running transactions may exceed the default 500 KB setting. If Capture exceeds allocated TRANSMEMORY, the process will abend. The current settings are written to the Capture report file.

LOBMEMORY

LOBMEMORY controls the amount of memory and temporary disk space available for caching transactions that contain LOBs. LOBMEMORY enables you to tune GoldenGate's cache size for LOB transactions and to define a temporary location on disk for storing data that exceeds the size of the cache. Options are available for defining the total cache size, the per-transaction memory size, the initial and incremental memory allocation, and disk storage space. For OLTP transactions the default 200 MB RAM setting should suffice; however, computations will need to be run for large data warehouses to ensure LOBMEMORY is set properly.

GoldenGate Data Pump

A GoldenGate Extract Data Pump retrieves data from a local extract trail and transmits the records over TCP/IP to the target server. Below are some generic best practice guidelines for Data Pumps, followed by a short parameter-file sketch:

1. Do not have a number as the last character of a Data Pump group name.
   a. By default, GoldenGate stores 10 report files in the dirrpt directory for each component and appends a number (0 through 9) to the group name (e.g., mypmp0.rpt). Because these files are aged by renaming them (0 to 1, 1 to 2, etc.), having a number as the last character in the group name causes confusion when attempting to locate noncurrent report files.
2. Do not evaluate the data.
   a. Data Pumps function most efficiently in PASSTHRU mode. Evaluation of the data for column mapping or data transformation requires more resources and causes processing delays.
3. Read once, write many.
   a. One of the benefits of using Data Pumps is their ability to write to multiple remote locations. This is useful for load balancing Apply processes on the target server when there is no discernible lag evident for the Data Pump, or when the same data needs to be delivered to multiple locations.
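Following the guidelines above, a minimal sketch of a Data Pump parameter file might look like the following: PASSTHRU mode, no data evaluation, reading one local trail and writing the same data to two remote locations. The host, port, group, and trail names are hypothetical; the COMPRESS, TCPBUFSIZE, and TCPFLUSHBYTES options on RMTHOST are discussed in the next section.

-- pmpa.prm: sample Data Pump parameter file (illustrative only)
-- Created in GGSCI with, for example:
--   ADD EXTRACT pmpa, EXTTRAILSOURCE ./dirdat/ea
--   ADD RMTTRAIL ./dirdat/ra, EXTRACT pmpa, MEGABYTES 2000
EXTRACT pmpa
PASSTHRU
RMTHOST target1, MGRPORT 15300, COMPRESS, TCPBUFSIZE 875000, TCPFLUSHBYTES 875000
RMTTRAIL ./dirdat/ra
TABLE myschema.*;
RMTHOST target2, MGRPORT 15300, COMPRESS, TCPBUFSIZE 875000, TCPFLUSHBYTES 875000
RMTTRAIL ./dirdat/rb
TABLE myschema.*;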

RMTHOST Options

COMPRESS

COMPRESS should be set anytime data is being transmitted over TCP/IP. It compresses outgoing blocks of records to reduce bandwidth requirements, and GoldenGate decompresses the data before writing it to the remote trail. COMPRESS typically results in compression ratios of at least 4:1 and sometimes better.

Encryption

Data transmitted via public networks should be encrypted. Transactions containing sensitive data (credit card numbers, social security numbers, healthcare information, etc.) must be encrypted before transmission over unsecure networks. The customer security administrator should make the determination as to whether encryption is required; however, any data that contains financial information (account numbers, etc.), personal information (social security number, driver's license number, address, etc.), or health care information must be encrypted. If there are any doubts, encrypt!

TCPBUFSIZE

TCPBUFSIZE controls the size of the TCP socket buffer, in bytes. Increasing the size of the buffer allows larger packets to be sent to the target system. The actual size of the buffer depends on the TCP stack implementation and the network. The default is 30,000 bytes, but modern network configurations usually support higher values. Valid values are from 1000 to 200000000 (two hundred million) bytes. To compute the proper TCPBUFSIZE setting:

1. Use ping to determine the average round-trip time to the target server. PING has several options that may be specified:
   -n <count>: the number of echo requests to send. Defaults to 4.
   -l <size>: the size of the ping packet. Defaults to 32 bytes.
   If you know the average transaction size for the data captured, use that value for the ping packet size; otherwise, use the default.
2. Multiply the average round-trip time by the network bandwidth. For example, if ping returns an average round-trip time of 64 ms (0.064 seconds, rounded up to 0.07) and the network bandwidth is 100 Mb/s, the result is 0.07 * 100 = 7 Mb.
3. Network bandwidth is in bits per second, so convert the result from step 2 into bytes by dividing by 8: 7/8 = 0.875 MB per second.
4. TCPBUFSIZE should therefore be set to 875000 in this instance.

On UNIX/Linux systems, the ifconfig command on the target system will also provide a suitable value for TCPBUFSIZE via the TCP receive space. On the target system, issue the following command:

ifconfig -a

Example output:

eth0 Link encap:Ethernet HWaddr 00:0C:29:89:3A:0A
     inet addr:192.168.105.166 Bcast:192.168.105.255 Mask:255.255.255.0
     UP BROADCAST RUNNING MULTICAST TCP Receive Space:125000 Metric:1
     RX packets:23981 errors:0 dropped:0 overruns:0 frame:0
     TX packets:5853 errors:0 dropped:0 overruns:0 carrier:0
     collisions:0 txqueuelen:1000
     RX bytes:2636515 (2.5 MiB) TX bytes:5215936 (4.9 MiB)
     Interrupt:185 Base address:0x2024

Note that the TCP receive space is 125000; therefore, for the shortest send wait time, set TCPBUFSIZE to 125000 as well. The wait time can be determined with the GGSCI command:

SEND <extract pump>, GETTCPSTATS

Example output:

Sending GETTCPSTATS request to EXTRACT PHER...
RMTTRAIL .\dirdat\ph000000, RBA 717 OK
Session Index 0
Stats started 2008/06/20 16:06:07.149000 0:00:11.234000
Local address 192.168.105.151:23822 Remote address 192.168.105.151:41502
Inbound Msgs 9 Bytes 66, 6 bytes/second
Outbound Msgs 10 Bytes 1197, 108 bytes/second
Recvs 18
Sends 10
Avg bytes per recv 3, per msg 7
Avg bytes per send 119, per msg 119
Recv Wait Time 125000, per msg 13888, per recv 6944
Send Wait Time 0, per msg 0, per send 0

The lower the send wait time, the better the pump performs over the network. The customer network administrator can also assist in determining an optimal value.

TCPFLUSHBYTES

TCPFLUSHBYTES controls the size of the buffer, in bytes, that collects data that is ready to be sent across the network. When this value is reached, the data is flushed to the target. Set TCPFLUSHBYTES to the same value as TCPBUFSIZE.

GoldenGate Trails

GoldenGate Trails are used to store records retrieved by Change Data Capture (EXTTRAIL) or transmitted over TCP/IP to a target server by an Extract Data Pump (RMTTRAIL). A single GoldenGate instance supports up to 100,000 trails ranging in size from 10 MB (the default size) to 2000 MB, sequentially numbered from 000000 to 999999. Below are some generic best practice guidelines for GoldenGate Trails:

1. Do not use numeric characters in the trail identifier.
   a. Trails are identified by two characters assigned via the GGSCI command ADD EXTTRAIL or ADD RMTTRAIL. Using numeric characters (e.g., a1) could cause confusion when attempting to locate trails for troubleshooting purposes, as there could be a trail file named dirdat/a1111111 on disk. To eliminate this confusion, use characters a through z (and, on Linux/UNIX servers, uppercase A through Z) only. If you need more than 52 unique trails, use a directory other than ./dirdat as the repository.
2. Do not use the default trail size.
   a. 10 MB is really too small for modern production environments. File creation is a very expensive activity, so size the trails to hold 24 hours of data. If the customer is generating more than 2 GB of data per day, size both local and remote trails for the maximum.
3. Manage the trails via Manager.
   a. Have Manager housekeeping tasks delete trails when they are no longer required.
   b. On the source server, keep a few days (up to 7) of local trails on disk as an added security measure in the event of a catastrophic source database failure.

GoldenGate Apply

GoldenGate Change Data Apply executes SQL statements on the target database based upon transactional data captured from the source database. Below are some generic best practice guidelines for Apply:

1. As stated in the GoldenGate Macros section above, use GoldenGate Macros to configure database access.
2. Do not have a number as the last character of an Apply group name.
   a. By default, GoldenGate stores 10 report files in the dirrpt directory for each component and appends a number (0 through 9) to the group name (e.g., myrep0.rpt). Because these files are aged by renaming them (0 to 1, 1 to 2, etc.), having a number as the last character in the group name causes confusion when attempting to locate noncurrent report files.
3. Always discard.
   a. Always configure DISCARDFILE in Apply parameter files. In the event of a data integrity issue, the Apply process logs the offending data record to this file.
   b. Use .dsc as the suffix for Apply discard files; this makes them easier to identify.
   c. Put discard files in a dedicated subdirectory. Create a directory dirdsc for discard file storage. This aids in troubleshooting, as the files are stored in a dedicated location.
4. Never use HANDLECOLLISIONS for real-time data Apply.
   a. HANDLECOLLISIONS should only be active during the catch-up phase after database instantiation.
   b. If HANDLECOLLISIONS is active, END RUNTIME or END <timestamp> must be active as well.

Activity Reporting

Activity reporting is imperative in maintaining a well-configured GoldenGate environment. This function not only provides valuable information to the end user, it also provides a means for determining how to load balance Apply processes. At a minimum, activity reporting should be performed on a daily basis. The following parameters cause Apply to report activity per table daily at 1 minute after midnight:

STATOPTIONS RESETREPORTSTATS
REPORT AT 00:01
REPORTROLLOVER AT 00:01
REPORTCOUNT EVERY 1 HOUR, RATE

As shown above:

1. STATOPTIONS RESETREPORTSTATS controls whether statistics generated by the REPORT parameter are reset when a new process report is created. The default of NORESETREPORTSTATS continues the statistics from one report to another as the report rolls over based on the REPORTROLLOVER parameter.
2. REPORT AT 00:01 causes Apply to generate statistics daily at 1 minute after midnight and record those statistics in the current report file.
3. REPORTROLLOVER AT 00:01 causes Apply to create a new report file daily at 1 minute after midnight. Old reports are renamed in the format <group name><n>.rpt, where <group name> is the name of the Extract or Replicat group and <n> is a number that is incremented by one whenever a new file is created, for example: myrep0.rpt, myrep1.rpt, myrep2.rpt, and so forth.
4. REPORTCOUNT EVERY 1 HOUR, RATE writes to the report file, every hour, a count of the transaction records processed since the Apply process started. RATE reports the number of operations per second and the change in rate as a measurement of performance. The rate statistic is the total number of records divided by the total time elapsed since the process started. The delta statistic is the number of records since the last report divided by the time since the last report.

LOBMEMORY

LOBMEMORY controls the amount of memory and temporary disk space available for caching transactions that contain LOBs. LOBMEMORY enables you to tune GoldenGate's cache size for LOB transactions and to define a temporary location on disk for storing data that exceeds the size of the cache. Options are available for defining the total cache size, the per-transaction memory size, the initial and incremental memory allocation, and disk storage space. For OLTP transactions the default 200 MB RAM setting should suffice; however, computations will need to be run for large data warehouses to ensure LOBMEMORY is set properly.

BATCHSQL

BATCHSQL may be used to increase the throughput of Apply by grouping similar SQL statements into arrays and applying them at an accelerated rate. When BATCHSQL is enabled, Apply buffers and batches a multitude of statements and applies them in one database operation instead of applying each statement immediately to the target database. Operations containing the same table, operation type (insert, update, delete), and column list are grouped into a batch. Each type of SQL statement is prepared once, cached, and executed many times with different variables. The number of statements that are cached is controlled by the MAXSQLSTATEMENTS parameter. BATCHSQL is best used when the change data is less than 5000 bytes per row. BATCHSQL cannot process data that contains LOB or LONG data types, change data for rows greater than 25 KB in length, or cases where the target table has a primary key and one or more unique keys. If a transaction cannot be written to the database via BATCHSQL, Apply aborts the transaction, temporarily disables BATCHSQL processing, and then retries the transaction in normal mode. If this occurs frequently, you will need to weigh the benefit of BATCHSQL against the impact of this error handling functionality.

Load Balancing GoldenGate Processes

Should any GoldenGate component (Capture, Data Pump, or Apply) report lag that is unacceptable to the customer or exceeds the documented service level agreement (SLA), a performance audit will need to be conducted. Here, we'll discuss the basic concepts for performing a performance audit and a load balancing exercise. The Pareto principle states that, for many events, 80% of the effects come from 20% of the causes. Applying this principle to optimization means that we concentrate our efforts on the top 20% of resource consumers.

Step 1. Identify the Bottleneck

What component is reporting an unacceptable lag? Typically, lag will be evident in Apply because of the work that must be performed to maintain the target database. However, lag can also be evident for Change Data Capture if data is added to the database logs faster than it can be extracted. Data Pumps may report lag for the same reason, or if the network is too slow, or if the TCPBUFSIZE setting is too small. For Data Pump lag, refer to the RMTHOST Options section above and compute the recommended TCPBUFSIZE setting. The customer network administrator can provide information about network bandwidth and TCP settings for the server.
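Lag can be confirmed for each component directly from GGSCI before deeper analysis begins. A sketch of the commands, with hypothetical group names; broadly, INFO ALL shows lag as of the last checkpoint, while LAG and SEND ... GETLAG query the running process for a more current figure:

INFO ALL
LAG EXTRACT exta
SEND EXTRACT pmpa, GETLAG
LAG REPLICAT repa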

One methodology for determining where lag resides in the data transport mechanism is to enable a heartbeat table. For more information on heartbeat table usage and configuration, refer to the Best Practices Heartbeat Table document.

Step 2. Determine Database Activity

What are the busiest tables, the type of activity being performed, and the average record length of the data? If Capture statistics are being generated on a daily basis, we can use the GoldenGate report files to find the busiest tables. Ideally, we want to use 7 days' worth of data to compute a running average per day, per hour, per minute, and per second. To determine activity per transaction type for each table, use the following formulas (you may want to put this in a spreadsheet):

1. Daily activity: the raw numbers from each report file.
2. Hourly per day: daily/24
3. Per minute per day: hourly/60
4. Per second per day: per minute/60
5. Weekly average: sum(daily)/7
6. Hourly average: sum(hourly)/7
7. Per minute average: sum(per minute)/7
8. Per second average: sum(per second)/7

Sort the data to determine the most active tables per day, per hour, per second, weekly average, hourly average, and per second average. The busiest tables will be the ones where these six values intersect (they are within the top 20% of all tables in the list). Now that we know the busiest tables, we need to determine the average record length of the data captured. For this we'll use the EXTTRAILs on the source server and LOGDUMP. Ideally, we need several days of trail history to complete this step; however, we can use what's available. For each available EXTTRAIL, and for each table in our busiest-tables list, do the following:

1. Start LOGDUMP.
2. Open each trail: open ./dirdat/<xx><nnnnnn>
3. Set the detail level: detail data
4. Filter on the table: filter include filename <schema.table> (be sure to put the table name in upper case).
5. Make sure we're at the beginning of the trail: pos 0
6. Get the count: count

Logdump will return information similar to this:

Scanned 10000 records, RBA 62732472, 2007/06/28 11:08:15.000.000
LogTrail C:\GGS\ora92\dirdat\e9000000 has 6738 records
Total Data Bytes 19962835
Avg Bytes/Record 2962
Delete 352
Insert 4000
PK FieldComp 2386
Before Images 352
After Images 6386
Filtering matched 6738 records, suppressed 13476 records
Average of 5056 Transactions
Bytes/Trans ... 4012
Records/Trans ... 1
Files/Trans ... 1

LPENTON.T2_IDX Partition 4
Total Data Bytes 19962835
Avg Bytes/Record 2962
Delete 352
Insert 4000
PK FieldComp 2386
Before Images 352
After Images 6386

We now know the daily total data bytes captured by table and the daily average bytes per record captured. We can use the daily averages to compute per hour and per minute values as we did above when determining table activity.

Step 3. Configuration Changes

Change Data Capture

Before making configuration changes to Capture, we want to make sure that the bottleneck is due to database activity. Check the connectivity to the database to make sure there is not a network issue. If the bottleneck is related to workload, we can use the information gathered in Step 2 to split the database workload across two Capture processes. Create a new Change Data Capture process and divide the workload evenly across the two processes.

Data Pump

If the bottleneck is not related to TCP or network configuration, create a new Data Pump and split the workload evenly across the two processes. Multiple Data Pump processes may read from the same EXTTRAIL; however, when doing so, be sure to monitor disk activity to ensure contention at the file level does not occur, as that can impact both Capture and Data Pump.

Change Data Apply

Apply is where you'll see the most improvement from load balancing. Here we need to make some decisions:

1. Is BATCHSQL enabled? If not, activate BATCHSQL and check Apply performance.
2. Is the workload update or delete intensive? If so, what primary key or unique index is GoldenGate using, and is that key sufficient for accessing the row with a minimal number of IOs? If not, consider setting KEYCOLS in Apply, or adding a new unique index to the table for GoldenGate access along with KEYCOLS.
3. What type of connectivity do we have to the target database? Do we have enough bandwidth?

If none of the above apply, we can use the FILTER option to distribute the workload across multiple Apply processes. In Apply, FILTER is activated as part of the MAP statement, and the syntax is:

MAP <source table>, TARGET <target table>, FILTER (@RANGE (<n>, <n>));

Using the information gained in Step 2, set up multiple Apply processes to distribute the workload for the busiest tables, as sketched below. The number of Apply processes will depend upon the workload and type of activity. When setting up multiple Apply processes to read from a single trail, monitor disk activity to ensure contention at the file level does not occur.

WARNING: @RANGE cannot be used on tables where primary key updates may be executed. Doing so will cause Apply failures and/or database out-of-sync conditions.
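As a sketch of the @RANGE distribution just described, two Apply processes might split a busy table as follows. The group, schema, and file names are hypothetical, the database connection reuses the macro library from earlier, and the table is assumed to receive no primary key updates (see the WARNING above).

-- repa1.prm: Apply process 1 of 2 (illustrative only)
REPLICAT repa1
NOLIST
include ./dirmac/dbconnect.mac
LIST
#odbc_connect_dsn ()
-- Assumes identical source and target table structures
ASSUMETARGETDEFS
DISCARDFILE ./dirdsc/repa1.dsc, APPEND, MEGABYTES 100
BATCHSQL
MAP myschema.orders, TARGET myschema.orders, FILTER (@RANGE (1, 2));

The second process, repa2, is identical except for its DISCARDFILE name and FILTER (@RANGE (2, 2)), so each process applies half of the rows based on a hash of the key values.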

Oracle GoldenGate for Linux, UNIX, and Windows
Updated August 2010

Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.

Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
oracle.com

Copyright © 2010 Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. 0109