IBM PowerHA SystemMirror for i Performance Information
Version: 1.0
Last Updated: April 9, 2012

Table of Contents

1 Introduction
2 Geographic Mirroring
  2.1 General Performance Recommendations
    2.1.1 Source and Target Comparison
    2.1.2 CPU considerations
    2.1.3 Memory considerations
    2.1.4 Disk Subsystem
    2.1.5 System disk pool considerations
    2.1.6 Communications Lines
    2.1.7 Communication Transport Speeds
  2.2 Run-time Environment
    2.2.1 Delivery and Mode
    2.2.2 Sizing for optimum performance
    2.2.3 Monitoring the run-time environment
  2.3 Synchronization
    2.3.1 Partial and Full synchronizations
    2.3.2 Tracking Space
    2.3.3 Monitoring synchronization
    2.3.4 Calculating Full Synchronization Time
    2.3.5 Synchronization Priority
    2.3.6 Managing Contention between run-time and synchronization
3 Metro Mirror and Global Mirror
4 FlashCopy
  4.1 DS8000 FlashCopy SE
  4.2 SVC/V7000 Thin Provisioning

1 Introduction

The primary focus of this document is to give recommendations for achieving the best performance possible with the various PowerHA SystemMirror technologies. The technologies covered are geographic mirroring, Metro Mirror and Global Mirror, and FlashCopy.

2 Geographic Mirroring

With geographic mirroring, IBM i does the replication, so it is very important to consider performance when planning to implement a geographic mirroring solution. While asynchronous geographic mirroring allows more flexibility regarding the distance between systems, there are still implications to undersizing the source, the target, or the communications line between the two.

There are two separate aspects to consider when sizing a geographic mirroring environment. During normal run-time of the production environment, geographic mirroring adds some overhead as the IBM i operating system sends disk writes to the target system. The second aspect is the overhead and time required for synchronization, when the target IASP is reconnected to the source IASP and changes are pushed from the source to the target to make the two equivalent again.

2.1 General Performance Recommendations

2.1.1 Source and Target Comparison

Geographic mirroring consumes resources on both the source and the target system. Especially for synchronous geographic mirroring, the best performance is seen when the source and target systems are fairly equivalent in CPU, memory, and disk subsystem.

2.1.2 CPU considerations

Extra CPU and memory overhead is required for geographic mirroring on both the source and target system. There must be sufficient excess CPU capacity to handle this overhead, but there is no formula to calculate it exactly because it depends on many factors in the environment and the configuration. As a general rule, the source and target partitions used to run geographic mirroring need more than a partial processor. In a minimal CPU configuration, you can potentially see 5-20% CPU overhead while running geographic mirroring. The processor on the target system should be roughly equivalent to the processor on the source system. Undersizing the target system can affect run-time performance and may also be unacceptable in the event of a switchover or failover, when production is running on the target system.

2.1.3 Memory considerations

Geographic mirroring also requires extra memory in the machine pool. For optimal performance of geographic mirroring, particularly during synchronization, increase the machine pool size by at least the amount given by the following formula and then use WRKSHRPOOL to set the machine pool size:

Extra machine pool size = 300 MB + (0.3 MB * number of disk arms in the IASP)

This extra machine pool storage is required on all nodes in the cluster resource group (CRG). It is important for the synchronization process on the target node, as well as when a switchover or failover occurs.

NOTE: The machine pool storage size must be large enough before starting a resynchronization. Once the synchronization has started, memory added afterward will not be used, and the synchronization could take longer.

If the system value QPFRADJ is equal to 2 or 3, the system might change the storage pools automatically as needed. To prevent the performance adjuster function from reducing the machine pool size, set the machine pool minimum size (MINPCT parameter) to the calculated amount (the current size plus the extra size for geographic mirroring from the formula) by using the Work with Shared Storage Pools (WRKSHRPOOL) command or the Change Shared Storage Pool (CHGSHRPOOL) command.
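
The machine pool adjustment is a simple linear formula, so it is easy to check or script. The following is a minimal Python sketch of the calculation above; the 120-arm example is made up for illustration.

```python
def extra_machine_pool_mb(iasp_disk_arms: int) -> float:
    """Extra machine pool storage (MB) for geographic mirroring.

    From the formula above: 300 MB + (0.3 MB * number of disk arms in the IASP).
    """
    return 300 + 0.3 * iasp_disk_arms


if __name__ == "__main__":
    arms = 120  # hypothetical IASP with 120 disk arms
    print(f"Increase the machine pool by at least {extra_machine_pool_mb(arms):.0f} MB "
          "on every node in the CRG, and set MINPCT so the adjuster cannot shrink it.")
```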

2.1.4 Disk Subsystem

Disk unit and IOA performance can affect overall geographic mirroring performance. The disk subsystem on the target side should be equivalent to that on the source side. It does not need to be identical, but it should have around the same number of arms with the same performance characteristics, as well as equivalent IOA performance, on both sides. IOA cache has been found to affect geographic mirroring performance; performance will be best with a large amount of IOA cache available on both the source and target systems. When possible, the disks assigned to the IASP should be placed on a separate IO adapter from the SYSBAS disks to reduce any possible contention.

2.1.5 System disk pool considerations

As with any system disk configuration, the number of disk units available to the application can have a significant effect on its performance. Putting additional workload on a limited number of disk units might result in longer disk waits and ultimately longer response times for the application. This is particularly important for temporary storage in a system configured with independent disk pools, because all temporary storage is written to the SYSBAS disk pool. You must also remember that the operating system and basic functions run in the SYSBAS disk pool. As a starting point, use the guidelines shown in the following table.

Disk arms in IASP    Arms for SYSBAS (divide the IASP arms by)
Less than 20         3
20-40                4
Greater than 50      5

For example, if the IASP contains 10 drives, then SYSBAS should have at least 3 arms. If the IASP contains 50 drives, then SYSBAS should have at least 10. Monitor the percent busy of the SYSBAS disk arms in your environment to ensure that you have the appropriate number of arms; if utilization grows to over 40%, more arms should be added.
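
The SYSBAS guideline above is simply a divisor chosen by the number of IASP arms. Here is a small sketch of that rule; the boundary between the 20-40 and "greater than 50" ranges is resolved to match the worked example (50 IASP arms gives 10 SYSBAS arms), which is an assumption rather than something the table states explicitly.

```python
def min_sysbas_arms(iasp_arms: int) -> int:
    """Suggested minimum number of SYSBAS disk arms for a given IASP size."""
    if iasp_arms < 20:
        divisor = 3
    elif iasp_arms < 50:   # assumption: treat the 40-50 gap like the 20-40 range
        divisor = 4
    else:
        divisor = 5
    return max(1, iasp_arms // divisor)


# Examples from the text: 10 IASP arms -> at least 3, 50 IASP arms -> at least 10.
print(min_sysbas_arms(10), min_sysbas_arms(50))
```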

2.1.6 Communications Lines

When you implement a PowerHA solution using geographic mirroring, plan for adequate communication bandwidth so that the communication bandwidth does not become a performance bottleneck. Geographic mirroring can be used over virtually any distance; however, only you can determine the latency that is acceptable for your application. The type of networking equipment, the quality of service, the distance between nodes, and the number and characteristics of the data ports used can all affect communications latency. As a result, these become additional factors that can impact geographic mirroring performance. To ensure better performance and availability, the following is recommended:

- To provide consistent response time, geographic mirroring should have its own redundant communication lines. Without dedicated communication lines, there might be contention with other services or applications that use the same line.
- Geographic mirroring supports up to four communication lines (data port lines), and the cluster heartbeat can be configured for up to two lines. A round-robin approach is used to send the data across the lines. For best performance, multiple data port lines should therefore have close to equivalent performance characteristics; if one slow line is added, it will gate the sending of data to that line's speed.
- Geographic mirroring replication should also run on a separate line from the cluster heartbeat line (the line associated with each node in the cluster). If the same line is used, heartbeating could fail during periods of heavy geographic mirroring traffic, causing a false partition.
- From a high availability point of view, it is recommended to use different interfaces and routers connected to different network subnets for the four data ports that can be defined for geographic mirroring. It is better to install the Ethernet adapters in different expansion towers, using different hardware buses. Also, if you use multiport IO adapters, use different ports to connect the routers.
- If your configuration is such that multiple applications or services require the use of the same communication line, some of these problems can be alleviated by implementing quality of service (QoS) through the TCP/IP functions of IBM i. The IBM i QoS solution enables policies to request network priority and bandwidth for TCP/IP applications throughout the network.
- Ensure that the throughput for each connection matches. The speed and connection type should be the same for all connections between system pairs; if throughput differs, performance will be gated by the slowest connection. For example, a customer may have 1 Gb Ethernet from their servers to the attached switches, but if the site-to-site connection uses a DS-3, they are limited to 44.736 Mbps between sites. Physical capacity is not throughput capacity. Older 10 Mb and 100 Mb Ethernet connections use earlier implementations of Carrier Sense Multiple Access with Collision Detection (CSMA/CD); plan on no more than 30-35% throughput. As the network becomes more saturated, there are more collisions, causing more retransmissions, and this becomes the limitation on data throughput rather than the rated speed of the line. With newer implementations of 10 Mb and 100 Mb Ethernet, data throughput can vary from 20% to 70%, again depending on network saturation.
- Ensure that your connections take an appropriate route. You want to understand whether it is a circuit-switched connection (like a T-1), and whether the connection goes directly from point A to point B or is routed through other switching offices.
- Size the communication bandwidth for resynchronization and normal production running in parallel. In a disaster situation, you may have switched over to your target system and be running production there. When the original source system comes back online, it must be resynchronized, and that full synchronization takes place alongside normal run-time changes. The resynchronization could cause application performance degradation if the communications pipe is saturated.

2.1.7 Communication Transport Speeds

Just how fast is a T1 line? A data T1 transfers information at about 1.544 megabits per second, which translates to 0.193 MBps of theoretical throughput. The absolute best that you can hope to get out of a T1 line is 70% effective throughput, and most network specialists say to plan for 30%. Therefore, the best that a T1 line can transfer is 0.135 MBps. If you have 2 TB of data to initially synch up, that synch would take well over 80 days. As you can see, most systems need more than a T1 line to achieve effective geographic mirroring throughput.

T3 lines are a common aggregation of 28 T1 circuits that yield 44.736 Mbps of total network bandwidth, or 5.5 MBps, with a best effective throughput of 70% (3.9 MBps) and a planning number of 2 MBps. The OC (optical carrier, fiber-optic-based broadband network) speeds provide more bandwidth to achieve higher throughput rates.

The following table provides other communication line speeds.

Type                 Raw speed (Mbps)  Raw speed (MBps)  30% planning (MBps)  GB/hour during synch
T1                   1.544             0.193             0.06                 0.22
DS3/T3               44.736            5.5               2                    7.2
OC-1                 51.840            6.5               2.1                  7.6
OC-3                 155.52            19.44             6                    21.6
OC-9                 455.56            56.94             18                   64.8
OC-12                622.08            77.76             24                   86.2
OC-18                933.12            116.64            35                   126
OC-24                1244              155.5             47                   169
OC-36                1866              233.25            70                   252
OC-48                2488              311               93                   335
OC-192               9953              1244.12           373                  1342
1 Gb Ethernet local  1000              125               38                   225
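
The planning numbers in the table above follow from the raw line rates: divide megabits by 8 to get MBps, take roughly 30% of that as the planning rate, and multiply by 3.6 to express it as GB per hour of synchronization. The following sketch shows that arithmetic; it uses exact values, so the results differ slightly from the rounded figures in the table.

```python
def line_planning_numbers(raw_mbps: float, planning_factor: float = 0.30):
    """Convert a raw line rate in Mbps into the planning figures used above.

    Returns (raw MBps, planning MBps at the given factor, GB/hour during synch).
    """
    raw_mb_per_sec = raw_mbps / 8.0                   # 8 bits per byte
    planning_mb_per_sec = raw_mb_per_sec * planning_factor
    gb_per_hour = planning_mb_per_sec * 3600 / 1000   # MB/s -> GB/hour
    return raw_mb_per_sec, planning_mb_per_sec, gb_per_hour


for name, mbps in [("T1", 1.544), ("DS3/T3", 44.736), ("OC-3", 155.52)]:
    raw, plan, gbh = line_planning_numbers(mbps)
    print(f"{name}: {raw:.3f} MBps raw, {plan:.2f} MBps planning, {gbh:.1f} GB/hour")
```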

2.2 Run-time Environment

2.2.1 Delivery and Mode

When configuring geographic mirroring, there are two main parameters that affect geographic mirroring run-time performance. The DELIVERY parameter affects the performance of disk writes to the IASP. With synchronous delivery, a disk write does not complete until the affected page of storage has also been received on the target system. Asynchronous delivery allows the disk write on the source to complete once the write has been cached; the actual sending of the disk write to the target system happens outside the scope of the write on the source.

For synchronous delivery, there is also a synchronous or asynchronous MODE. Synchronous mode ensures that the write has arrived in the disk cache on the target (essentially on disk at that point) before returning. Asynchronous mode only ensures that the write is in memory on the target.

Synchronous delivery with synchronous mode guarantees equivalent copies of the IASP on source and target while geographic mirroring is active. It also provides the added protection of a crash-consistent copy of the data in the case of a target system failure, since all writes will have been received into the disk subsystem. Synchronous delivery with asynchronous mode may be beneficial for customers running with a significantly slower disk subsystem on their target system, because it allows the disk write on the source to complete without waiting for completion on the target. This delivery and mode combination still guarantees equivalent data on the source and target IASPs in the case of a failure of the source system.

With synchronous delivery, it is very important to have the communications bandwidth available to support the number of disk writes at all peak periods throughout the day or night. The overhead of sending the data to the target is added to the time for each disk write to complete, which could significantly affect production performance. Even with a very fast line, if the distance between the source and target is too great, production performance will suffer. For this reason, asynchronous delivery for geographic mirroring was introduced in release 7.1. Asynchronous delivery is best for environments where the source and target are separated by too great a distance for acceptable synchronous response times, or for scenarios where the bandwidth cannot support the peak write rate.

2.2.2 Sizing for optimum performance

For best run-time performance, it is important to know the write volume within the IASP. Only writes are considered because they are the only I/O transferred to the target system. If the IASP has not yet been defined, the write volume in SYSBAS can be used as a rough estimate, with the understanding that this may result in excess communications capacity. Both the peak and the average megabytes per second written should be collected, preferably over short intervals such as 5 minutes.

For synchronous delivery, the bandwidth of the communications line(s) must be able to keep up with the peak write volume. If it cannot keep up, the writes will begin to stack up and production performance will suffer. For asynchronous delivery, the bandwidth of the lines must still keep up with at least the average write volume. Since writes on the source are not waiting, some queuing is acceptable, but if the line cannot handle the average write volume, geographic mirroring will fall further and further behind. It is also important to examine the variance of the write rate over time. If there is a large variance between peak and average, it may be advisable to size more toward the peak; undersizing in this case would affect the recovery point objective if the source system fails during a period of peak write rate.

To determine the megabytes of writes per second for each interval, run the performance tools during a representative and peak period. From the resulting QAPMDISK file, use these fields:

DSBLKW - number of blocks written: a block is one sector on the disk unit. PD (11,0).
INTSEC - elapsed interval seconds: the number of seconds since the last sample interval. PD (7,0).

Then take the following steps:

1. Calculate the disk blocks written per second: the disk blocks written per interval divided by the number of seconds in the interval (QAPMDISK.DSBLKW / QAPMDISK.INTSEC).
2. Convert disk blocks to bytes: multiply by 520 to get the number of bytes.
3. Divide by a million to get megabytes per second.
4. If using mirrored disks, divide by 2 to get the geographic mirroring traffic.

The formula to calculate the amount of traffic, expressed as megabytes written per second, is as follows (the final division by 2 applies only when the disks are mirrored):

((QAPMDISK.DSBLKW / QAPMDISK.INTSEC) * 520) / 1000000 / 2

For example, if you determine that the amount of traffic is 5 MBps and you want to use geographic mirroring, then you need a pipe that can accommodate 5 MBps of data being transferred. If you are configuring two lines as data ports, then you need 2.5 MBps per line. From the table earlier, a DS3/T3 allows 5.5 MBps of theoretical throughput, with a best-practice planning number of 2 MBps at 30% utilization. An OC-3 line allows 19.44 MBps of theoretical throughput, with about 6 MBps at 30% utilization. You could start with two DS3 lines initially, but may need to upgrade to two OC-3 lines to allow for growth.
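
The four steps above collapse into a single expression. The sketch below applies it to a handful of QAPMDISK intervals to get both the average and peak write rates; the interval values shown are made up for illustration.

```python
def write_mb_per_sec(dsblkw: int, intsec: int, mirrored: bool = False) -> float:
    """MB/s of writes for one QAPMDISK interval.

    ((DSBLKW / INTSEC) * 520 bytes per block) / 1,000,000, halved when the
    disks are mirrored so each logical write is counted only once.
    """
    rate = (dsblkw / intsec) * 520 / 1_000_000
    return rate / 2 if mirrored else rate


# Hypothetical 5-minute (300 second) intervals: (DSBLKW, INTSEC)
intervals = [(1_800_000, 300), (3_200_000, 300), (900_000, 300)]
rates = [write_mb_per_sec(blocks, secs) for blocks, secs in intervals]
print(f"average {sum(rates) / len(rates):.1f} MBps, peak {max(rates):.1f} MBps")
```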

2.2.3 Monitoring the run-time environment

When using asynchronous delivery, it may be useful to determine whether geographic mirroring is keeping up with disk writes. In the DSPASPSSN command output on the source system, the Total data in transit field gives the amount of data, in megabytes, that has been sent to the target system but not yet acknowledged as received. This field is shown only when the transmission delivery is *ASYNCH and the state is ACTIVE.

2.3 Synchronization

2.3.1 Partial and Full synchronizations

When you suspend mirroring for any planned activities or maintenance, changes made on the production copy of the independent disk pool are not transmitted to the mirror copy. So, when geographic mirroring is resumed, synchronization is required between the production and mirror copies. If geographic mirroring is suspended without tracking, a full synchronization occurs, which can be a lengthy process. If geographic mirroring is suspended with the tracking option, PowerHA tracks changes up to the tracking space limit specified on the ASP session. When mirroring is resumed, the production and mirror copies are synchronized concurrently while geographic mirroring is running.

Tracking is available on both the source side and the target side. Target-side tracking greatly reduces the need for a full synchronization; usually a full synchronization is required only when either the source or target IASP does not vary off normally, such as after a crash or an abnormal vary-off. While a synchronization is taking place, the environment is not highly available. This makes it essential to calculate the time required to do a full synchronization, to understand whether the business can tolerate being exposed to an outage for that length of time.

2.3.2 Tracking Space

Tracking space is a reserved area within the IASP where, while geographic mirroring is not active, the system tracks the changed pages that need to be synchronized when mirroring is resumed. Tracking space is needed only when the target copy of the IASP is suspended, detached, or resuming. The changes themselves aren't contained within the tracking space, only a space-efficient indication of which pages have changed.

The amount of tracking space allocated can be defined by the user. The maximum is 1% of the total space within the IASP, and the CHGASPSSN command sets the percentage of that 1%. For example, setting the field to 10% means that the tracking space would be 10% of 1%, or 0.1% of the total IASP size. These parameters can be viewed using the DSPASPSSN command: Tracking Space Allocated is the percentage of the maximum (it would show 10% in the above example), and Tracking Space Used is the percentage of the available tracking space being used. If Tracking Space Used reaches 100%, no more changes can be tracked, and a full synchronization will be required.
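
Because the tracking space setting is a percentage of a percentage, it is easy to misjudge how much real space it represents. Here is a quick sketch of the arithmetic described above; the 900 GB IASP size is reused from the synchronization example later in this document.

```python
def tracking_space_gb(iasp_size_gb: float, allocated_pct: float) -> float:
    """Actual tracking space in GB.

    The maximum tracking space is 1% of the IASP; allocated_pct (the value set
    with CHGASPSSN) is the percentage of that maximum which is reserved.
    """
    return iasp_size_gb * 0.01 * (allocated_pct / 100.0)


# Example from the text: 10% of the 1% maximum is 0.1% of the IASP,
# so a 900 GB IASP gets 0.9 GB of tracking space.
print(tracking_space_gb(900, 10))
```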

2.3.3 Monitoring synchronization

To track how much data is left to be synchronized, the DSPASPSSN command can be used on the source system. On the second screen, there are fields for Total data out of synch and Percent complete. These fields display the megabytes of data that need to be resynchronized and how far the synchronization has progressed, and both are updated as the synchronization runs. Each time a synchronization starts or is resumed, these fields are reset. In the case of a resume, the percent complete resets to 0, but you should also see a reduced total data out of synch.

2.3.4 Calculating Full Synchronization Time

To determine the time needed for a full synchronization, divide the total space utilized in the IASP by the effective communications capability of the chosen communication lines. For example, if the IASP size is 900 GB and you are using local 1 Gb Ethernet switches (roughly 225 GB/hour effective from the table earlier), the synchronization takes around 4 hours. However, if you are using two T3/DS3 lines, each with an effective throughput of 7.2 GB/hour, it would take around 63 hours to do the synchronization. This is calculated by dividing the size of the IASP by the effective GB/hour, that is, 900 GB divided by 14.4 GB/hour.

In most cases, the size of the data is used in the calculation, not the size of the IASP. An exception is an *NWSSTG object in an IASP. An *NWSSTG object is treated as one file, so the size of the *NWSSTG is used instead of the amount of data within the *NWSSTG file. To compute the full synchronization time for an *NWSSTG in an IASP, divide the size of the network storage space of the IBM i hosted partition by the effective speed of the communications mechanism. For example, if the network storage space hosting IBM i was set up as 600 GB, it would take about 42 hours to do the full synchronization using two DS3 lines. To improve the synchronization time, a compression device can be used.
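
The synchronization-time estimates above are a straight division by the effective GB/hour of the configured lines. Here is a minimal sketch reproducing the 900 GB and 600 GB examples over two DS3 lines.

```python
def full_sync_hours(data_gb: float, lines: int, gb_per_hour_per_line: float) -> float:
    """Estimated full synchronization time in hours."""
    return data_gb / (lines * gb_per_hour_per_line)


# 900 GB IASP over two DS3 lines at 7.2 GB/hour each -> 62.5, around 63 hours.
print(f"{full_sync_hours(900, 2, 7.2):.1f} hours")
# 600 GB *NWSSTG object over the same two lines -> 41.7, around 42 hours.
print(f"{full_sync_hours(600, 2, 7.2):.1f} hours")
```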

2.3.5 Synchronization Priority

The synchronization priority setting (low, medium, or high) determines the amount of resources allocated to synchronization. Lower settings throttle synchronization, which leaves more resources available for non-synchronization work.

2.3.6 Managing Contention between run-time and synchronization

Ideally, synchronization runs best when the system is quiet. However, most businesses cannot support that amount of quiesced time, so synchronization will most likely be contending for system resources with the normal production workload as well as the normal geographic mirroring run-time workload. For the least effect on production work, a synchronization priority of low can be selected. However, this lengthens the amount of time required to complete the synchronization, which also lengthens the amount of time without a viable target.

3 Metro Mirror and Global Mirror

When using the Metro Mirror or Global Mirror technology within PowerHA, the overhead of replication is offloaded to the external storage device. However, the SAN infrastructure between the local and remote storage systems plays a critical role in the speed of replication. Specifically, for Metro Mirror, which is synchronous replication, if the SAN bandwidth is too small to handle the traffic, application write I/O response time will be affected. Use the following guidelines when calculating the bandwidth required for external storage replication:

1. Assume 10 bits per byte for network overhead.
2. If the compression ratio of the devices on the remote links is known, it can be applied.
3. Assume a maximum of 80% utilization of the network.
4. Apply a 10% uplift factor to the result to account for peaks when data is collected in 5-minute intervals, and a 20-25% uplift factor for 15-minute intervals.

The following is an example using these guidelines:

1. The highest reported write rate from the IBM i is 40 MBps.
2. Assume 10 bits per byte for network overhead: 40 MBps * 1.25 = 50 MBps.
3. Assume a maximum of 80% utilization of the network: 50 MBps * 1.25 = 62.5 MBps.
4. Apply a 10% uplift for 5-minute intervals: 62.5 MBps * 1.1 = 69 MBps.
5. The needed bandwidth is 69 MBps.

A Recovery Point Objective (RPO) estimation tool is available for IBM and IBM Business Partners. This tool provides a method for estimating the RPO in a DS8000 Global Mirror environment in relation to the available bandwidth and other environmental factors (see Techdocs document ID PRS3246).
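
The bandwidth guidelines above chain together into one calculation. The sketch below reproduces the 40 MBps example; the compression ratio defaults to 1 (no compression, as in the example), and the uplift factor is chosen from the collection interval as described in guideline 4.

```python
def replication_bandwidth_mbps(peak_write_mbps: float,
                               compression_ratio: float = 1.0,
                               interval_minutes: int = 5) -> float:
    """Required replication bandwidth (MBps) for external storage replication."""
    bw = peak_write_mbps * 1.25            # 10 bits per byte of network overhead
    bw /= compression_ratio                # known compression on the remote links
    bw /= 0.8                              # plan for at most 80% network utilization
    uplift = 1.10 if interval_minutes <= 5 else 1.25   # peak uplift factor
    return bw * uplift


print(f"{replication_bandwidth_mbps(40):.0f} MBps")   # 69 MBps, as in the example
```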

4 FlashCopy

FlashCopy is another technology integrated into PowerHA. FlashCopy is a very fast point-in-time copy done using external storage. The flashed copy can be attached to another partition or system and used for backups, queries, or other work.

With a FlashCopy space-efficient or thin-provisioned relationship, disk space is only allocated for the target when a write changes a sector on the source, or when a write is directed to the target. For this reason, FlashCopy SE requires less disk capacity than standard FlashCopy, which can help lower the amount of physical storage needed. FlashCopy SE is designed for temporary copies, so it is optimized for use cases where a small percentage of the source volume is updated during the life of the relationship. If much more than 20% of the source is expected to change, there may be a trade-off in terms of performance versus space efficiency. Also, the copy duration should generally not last longer than 24 hours unless the source and target volumes have little write activity.

4.1 DS8000 FlashCopy SE

The DS8000 FlashCopy SE repository is an object within an extent pool that provides the physical disk capacity reserved for space-efficient target volumes. When provisioning a repository, storage pool striping is automatically used with a multi-rank extent pool to balance the load across the available disks. FlashCopy SE is optimized to work with repository extent pools consisting of four RAID arrays; in general, we recommend that the repository extent pool contain between one and eight RAID arrays.

It is also important that adequate disk resources are configured to avoid creating a performance bottleneck. It is advisable to use the same disk speed or faster for the target repository as for the source volumes. We also recommend that the repository extent pool have as many disk drives as the source volumes.

After the repository is defined in the extent pool, it cannot be expanded, so planning is important to ensure that it is configured large enough. If the repository becomes full, the FlashCopy SE relationships will fail. After a relationship fails, the target becomes unavailable for reads or writes, but the source volumes are not affected.

You can estimate the physical space needed for a repository by using historical performance data for the source volumes, along with knowledge of the duration of the FlashCopy SE relationship. In general, each write to a source volume consumes one track of space on the repository (57 KB for CKD, 64 KB for FB). Thus, the following calculation can be used to come up with a reasonable estimate:

IO rate * (write% / 100) * ((100 - rewrite%) / 100) * track size * duration in seconds * ((100 + contingency%) / 100) = repository capacity estimate in KB

Because it is critical not to undersize the repository, a contingency factor of up to 50% is suggested.

You can monitor whether the repository has reached a threshold by using Simple Network Management Protocol (SNMP) traps. Notification can be set for any percentage of free repository space, with default notifications at 15% free and 0% free. Using the Lab Services Advanced Copy Services Toolkit, you can convert and send these messages to the QSYSOPR message queue.
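
The repository estimate above is a multiplication of rates and fractions, so it is straightforward to put into code. The sketch below uses the 64 KB FB track size; the I/O rate, write percentage, rewrite percentage, and duration are illustrative assumptions, not values from this document.

```python
def repository_capacity_gb(io_rate_per_sec: float, write_pct: float,
                           rewrite_pct: float, duration_hours: float,
                           track_kb: float = 64, contingency_pct: float = 50) -> float:
    """Estimated FlashCopy SE repository capacity in GB.

    IO rate * write fraction * first-write fraction * track size * duration in
    seconds, plus a contingency factor (up to 50% is suggested above).
    """
    seconds = duration_hours * 3600
    kb = (io_rate_per_sec * (write_pct / 100) * ((100 - rewrite_pct) / 100)
          * track_kb * seconds * ((100 + contingency_pct) / 100))
    return kb / 1_000_000   # KB -> GB


# Hypothetical workload: 1,000 IO/s, 30% writes, 70% of writes are rewrites,
# an 8-hour relationship, 64 KB FB tracks, 50% contingency -> about 249 GB.
print(f"{repository_capacity_gb(1000, 30, 70, 8):.0f} GB")
```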

4.2 SVC/V7000 Thin Provisioning

For SVC/V7000, when you use a fully allocated source with a thin-provisioned target, you need to disable the background copy and cleaning mode on the FlashCopy map by setting both the background copy rate and the cleaning rate to zero. If these features are left enabled, the thin-provisioned volume will either go offline or grow as large as the source.

You can select the grain size (32 KB, 64 KB, 128 KB, or 256 KB) for thin provisioning. The grain size that you select affects the maximum virtual capacity of the thin-provisioned volume; if you select a 32 KB grain size, the volume size cannot exceed 260,000 GB. The grain size cannot be changed after the thin-provisioned volume has been created. In general, smaller grain sizes save space and larger grain sizes produce better performance. For best performance, the grain size of the thin-provisioned volume should match the grain size of the FlashCopy mapping; however, if the grain sizes are different, the mapping still proceeds.

You can set the cache mode to readwrite for maximum performance when you create a thin-provisioned volume. Also, to prevent a thin-provisioned volume from running out of capacity and going offline, the autoexpand feature can be turned on.