VBLOCK SOLUTION FOR GREENPLUM

August 2011

Copyright 2011 VCE Company, LLC. All rights reserved.
Table of Contents

Introduction
    Goal
    Audience
    Scope
    Objectives
Greenplum Architecture Overview
Setup
    Installation
    Vblock Series 700 model MX Building Block Configuration Specifications for Greenplum
    Greenplum Design and Configuration Details
Compute: Unified Computing System (UCS)
    UCS Server Blade Provisioning
    Service Profile Template
    UCS Firmware
    UCS Network Configuration
    Greenplum Segment Server/ESX Server Provisioning
Symmetrix VMAX Storage Array
    The Symmetrix VMAX Architecture Overview
    Hardware List
    Disk Layout
    Front-end Storage Port Layout
RecoverPoint
    RecoverPoint VSAN Zoning
    RecoverPoint Consistency Groups
TimeFinder/Snap
Replication Manager
Test Results
    Read and Load Performance Tests: Objectives and Results
    RecoverPoint Disaster Recovery Tests: Objectives and Results
Conclusion
References
Introduction

Goal

The purpose of this document is to provide an architecture for hosting the Greenplum application on Vblock Infrastructure Platforms. Specifically, the Greenplum application is hosted on the Vblock Series 700 model MX shared infrastructure and is part of a multi-application environment. Vblock 700 delivers high-performance, large-scale virtualization across the data centers of large enterprise customers. It includes the Cisco Unified Computing System, EMC Symmetrix VMAX, and VMware vSphere 4, and can include flash technology to meet the high-performance demands of mission-critical applications.

The architecture provides a building block approach for hosting Greenplum applications. This approach is scalable and supports a dynamic workload in a cost-effective model. The Vblock 700 enables enterprises to meet their mobility, disaster recovery, security, and optimized data life cycle management requirements for hosting Greenplum along with other applications.

Audience

The target audience for this document includes technical engineering staff, managers, IT planners, administrators, and others involved in evaluating, managing, operating, or designing Greenplum Vblock platform deployments.

Scope

The project demonstrates the ability to:

- Run the Greenplum application on the Vblock platform
- Prove that Greenplum Data Warehousing (DW) is a viable solution for use on a Vblock platform

Objectives

The business objectives of the new Vblock Solution for Greenplum include advantages in the following areas:

- Provide a proven performance platform for a Greenplum and Vblock 700 architecture
- Establish a building-block scalable model with predictable performance growth
- Provide a showcase environment for a Greenplum and Vblock 700 solution
- Greenplum workload tests using massively parallel processing (MPP)
  o MPP/Row and MPP/Columnar Load queries
  o MPP/Row Sequential Read and Random Read queries
  o MPP/Columnar Sequential Read and Random Read queries
  o Mixed workloads
- Automated virtualization functionality
  o Scale out to a new ESX server infrastructure with six VMs per ESX host, for a total of forty-eight Greenplum Segment Server VMs
  o Monitor the I/O workload of the VMs
Greenplum Architecture Overview

Greenplum Database is a massively parallel processing (MPP) database server based on PostgreSQL open-source technology. MPP (also known as a shared-nothing architecture) refers to systems with two or more processors that cooperate to carry out an operation, each processor with its own memory, operating system, and disks. Greenplum leverages this high-performance system architecture to distribute the load of multi-terabyte data warehouses, and is able to use all of a system's resources in parallel to process a query.

Greenplum Database is essentially several PostgreSQL database instances acting together as one cohesive database management system (DBMS). It is based on PostgreSQL 8.2.9, and in most cases is very similar to PostgreSQL with respect to SQL support, features, configuration options, and end-user functionality. Database users interact with Greenplum Database as they would with a regular PostgreSQL DBMS. The internals of PostgreSQL have been modified or supplemented to support the parallel structure of Greenplum Database. For example, the system catalog, query planner, optimizer, query executor, and transaction manager components have been modified and enhanced to execute queries in parallel across all of the PostgreSQL database instances at once. The Greenplum interconnect (the networking layer) enables communication between the distinct PostgreSQL instances and allows the system to behave as one logical database.

Greenplum Database also includes features designed to optimize PostgreSQL for business intelligence (BI) workloads. For example, Greenplum has added parallel data loading (external tables), resource management, query optimizations, and storage enhancements that are not found in regular PostgreSQL. Many features and optimizations developed by Greenplum make their way back into the PostgreSQL community and are now part of standard PostgreSQL.

Figure 1. MPP Shared-nothing Architecture

For further Greenplum information, see the following:
http://powerlink.emc.com/km/live1/en_us/offering_technical/technical_documentation/300-011-538.pdf
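As an illustration of how the master and segment instances behave as one logical database, the following minimal Python sketch connects to the Greenplum master (port 5432, as listed later in Table 1) and lists the instances recorded in the gp_segment_configuration system catalog. The host name, database name, and credentials are placeholders, and the psycopg2 driver is one possible client library rather than one mandated by this solution.

# Minimal sketch: list the Greenplum master and segment instances that make up
# the logical database. Host, database, and credentials are placeholders; port
# 5432 matches the Greenplum connectivity port in Table 1.
import psycopg2

conn = psycopg2.connect(host="gpmaster.example.com", port=5432,
                        dbname="postgres", user="gpadmin", password="changeme")
cur = conn.cursor()
# gp_segment_configuration is the Greenplum system catalog that records every
# master and segment instance; content -1 identifies the master.
cur.execute("SELECT content, role, hostname, port "
            "FROM gp_segment_configuration ORDER BY content, role")
for content, role, hostname, port in cur.fetchall():
    label = "master" if content == -1 else "segment %d" % content
    print("%-10s role=%s host=%s port=%s" % (label, role, hostname, port))
cur.close()
conn.close()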
Setup

This section addresses the Greenplum setup.

Installation

The following link points to the Greenplum installation documentation.

http://powerlink.emc.com/km/live1/en_us/offering_technical/technical_documentation/300-011-541.pdf

Vblock Series 700 model MX Building Block Configuration Specifications for Greenplum

The 700MX building block configuration for Greenplum comprises forty-eight Greenplum Segment Server VMs running on eight blades and two storage engines. It supports a 4 GB/sec scan rate. More throughput can be achieved by adding an additional pair of VMAX engines and disks, or by adding more building blocks. See below for additional details.

Table 1 Building block configuration specifications for Greenplum

700MX Compute
- 8 x B200 M2 Blades with 96 GB memory per blade (8 blades for Greenplum Segment Servers) across two or more chassis

700MX Storage (Symmetrix VMAX storage)
- 2 engines
- 16 x 8 Gb/sec FC ports
- 192 x 300 GB FC drives (excluding hot spares), RAID 5 (3+1)

700MX Virtualization
- VMware vSphere ESX 4.0 U2 servers
- vCenter connects to another Windows 2008 R2 Enterprise Edition server running SQL Server 2005 Enterprise Edition, set up per VMware's installation guide
- VMware Distributed Virtual Switch

Greenplum application on Vblock platform
- Greenplum v4.006 software
- Greenplum utilities: GP PerfMon, psql, GP-Load, Greenplum Perl
- Greenplum connectivity: GP/port 5432, GP PerfMon/port 8888

Scan rate for the given building block
- 4 GB/sec. This scan rate is achieved by using 2 x VMAX engines, 16 front-end processors, and 192 x Fibre Channel (FC) 15K RPM disks. More throughput can be achieved by adding additional pairs of VMAX engines and disks.
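To make the building-block sizing concrete, the following sketch works out the approximate per-VM and per-blade share of the 4 GB/sec scan rate from Table 1, and how the aggregate figures grow if additional building blocks are added. The linear-scaling assumption illustrates the building-block model and is not a measured result.

# Rough sizing arithmetic for the Table 1 building block. Linear scaling with
# added building blocks is an illustrative assumption, not a measured result.
SCAN_RATE_GB_S = 4.0   # aggregate scan rate per building block
SEGMENT_VMS = 48       # Greenplum Segment Server VMs per building block
BLADES = 8             # B200 M2 blades hosting segment VMs per building block

def profile(blocks):
    scan_gb_s = SCAN_RATE_GB_S * blocks
    vms = SEGMENT_VMS * blocks
    return (blocks, scan_gb_s, vms,
            scan_gb_s * 1024 / vms,                # MB/sec per segment VM
            scan_gb_s * 1024 / (BLADES * blocks))  # MB/sec per blade

for blocks in (1, 2, 3):
    b, scan, vms, per_vm, per_blade = profile(blocks)
    print("%d block(s): %.0f GB/s total, %d VMs, ~%.0f MB/s per VM, ~%.0f MB/s per blade"
          % (b, scan, vms, per_vm, per_blade))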
Greenplum Design and Configuration Details

This section presents design and configuration details. Figures 2 and 3, below, illustrate the conceptual design through both the physical topology and the logical topology.

The physical topology, Figure 2, depicts the connectivity from the UCS to the SAN layer and LAN layer. In the SAN layer, a total of 16 x 8 Gb Fibre Channel (FC) connections were utilized from the UCS Fabric Interconnects (A and B) to the MDS SAN directors. The SAN directors are fabric A and fabric B. VSANs 30 and 130 are in director A, and VSANs 31 and 131 are in director B. VSANs 30 and 31 are backend VSANs consisting of the VMAX storage ports and the RecoverPoint Appliance HBA ports. VSANs 130 and 131 are the corresponding front-end VSANs consisting of the server HBA ports. In this case, the UCS blade servers are used as ESX servers. The front-end and backend VSANs are required by Cisco's SANTap to function as the write-splitter for the RecoverPoint Appliances.

In the LAN layer, a total of 16 x 10 Gb Ethernet port connections are used between the UCS Fabric Interconnects (A and B) and the Nexus 5020 access layer LAN switches, which in turn are connected to the Nexus 7000 switches as the aggregation layer.

The logical topology is explained below, before Figure 3.
Figure 2. Conceptual Design Physical Topology Diagram
See Figure 3, below. The logical topology depicts the configuration of the backend disks, ESX host LUNs, VMs, and Greenplum components. A total of 26 VMs were created on 10 ESX servers (the UCS blade servers). 24 VMs are used as data segment servers, and 2 VMs are used as master servers, one active and one standby node. The 24 segment VMs are evenly distributed across 8 ESX servers, 3 VMs per ESX server. The master servers (active and standby nodes) are on the ninth and tenth ESX blade servers. RecoverPoint was configured in local replication mode, called Continuous Data Protection (CDP).

Figure 3. Conceptual Design Logical Topology Diagram
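The placement in Figure 3 can be captured as a simple host-to-VM map, which is convenient when scripting provisioning or monitoring checks. The sketch below uses generic placeholder names and only encodes the counts described above: 3 segment VMs on each of 8 ESX blades, plus master and standby master VMs on the ninth and tenth blades.

# Sketch of the logical topology: 24 segment VMs spread 3-per-host across
# 8 ESX blades, plus master and standby master VMs on blades 9 and 10.
# Host and VM names here are generic placeholders, not the lab's names.
def build_placement():
    placement = {}
    segment = 1
    for host_num in range(1, 9):                 # blades 1-8 host segment VMs
        host = "esx%02d" % host_num
        placement[host] = ["gp_seg%02d" % (segment + i) for i in range(3)]
        segment += 3
    placement["esx09"] = ["gp_master"]           # active master/metadata VM
    placement["esx10"] = ["gp_standby_master"]   # standby master VM
    return placement

placement = build_placement()
total_vms = sum(len(v) for v in placement.values())
print(total_vms, "VMs on", len(placement), "ESX servers")   # 26 VMs on 10 hosts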
Compute: Unified Computing System (UCS)

UCS Server Blade Provisioning

The Cisco Unified Computing System (UCS) allows provisioning of physical servers using a template. In UCS terminology, a server is described by a Service Profile, and a Service Profile Template allows users to deploy one or more service profiles from a single definition. In this Greenplum on Vblock 700 configuration, a total of eight Service Profiles were used. The following Service Profile Template details are used for deploying Greenplum on Vblock 700.

Service Profile Template

A Service Profile Template has the following components:

1. vNIC Template: This template is used for the vNIC configuration of all Service Profiles deployed using a Service Profile Template. For Greenplum, disable the failover between vNICs.

2. Each vNIC has the following VLAN and fabric configuration:

   - vNIC 0 (vmnic0), Fabric A: Service Console/Management, vMotion Network, Greenplum Segment VLAN 1*, Public Network VLAN**
   - vNIC 1 (vmnic1), Fabric B: Service Console/Management, vMotion Network, Greenplum Segment VLAN 2*, Public Network VLAN**

   *Greenplum segment VLANs are private VLANs and completely isolated from each other. If the UCS is equipped with Palo cards, create one additional NIC in each fabric for the Greenplum segment network.

   **The Public Network VLAN accesses the Master Segment in the Greenplum setup.
3. vhba Template: This template is used for a vhba configuration for all service profiles deployed using a service profile template. For this setup, a total of eight fiber connections from each fabric Interconnect to the MDS Switches were used. Eight SAN Pin Groups also were created. This way, each Service Profile has a dedicated SAN Connection. The FC Adapter policy for VMware was used for all the vhbas. vhba-1 vhba-2 VSAN ID 201 202 Fabric ID A B 4. Boot Policy: Four Boot policies were created, which were named Boot from SAN (BFS). Each Boot policy points to front-end directors of the EMC Symmetrix VMAX storage array. Boot From SAN Policies (BFS) Policy 1: Policy 2: Policy 3: 2011 VCE Company, LLC. All rights reserved. 11
Policy 4:

5. MAC pools for each fabric: In this setup, there are two MAC pools, one for each fabric. UCS allows users to modify the last three octets of the MAC address pools.
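Because only the last three octets vary within a pool, it can help to generate and review candidate per-fabric MAC pools before entering them in UCS Manager, as in the sketch below. The 00:25:B5 prefix is the OUI commonly used for UCS MAC pools; the fabric identifier octets and pool size are placeholder choices.

# Sketch: plan non-overlapping per-fabric MAC pools in which only the last
# three octets vary, mirroring how UCS Manager MAC pools are defined.
# The prefix, fabric identifier octets, and pool size are placeholder choices.
def mac_pool(prefix, fabric_octet, count):
    """Yield MACs of the form <prefix>:<fabric_octet>:xx:xx."""
    for i in range(count):
        yield "%s:%02X:%02X:%02X" % (prefix, fabric_octet, i // 256, i % 256)

pool_a = list(mac_pool("00:25:B5", 0xA0, 8))   # candidate fabric A pool
pool_b = list(mac_pool("00:25:B5", 0xB0, 8))   # candidate fabric B pool
print("Fabric A:", ", ".join(pool_a))
print("Fabric B:", ", ".join(pool_b))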
6. World Wide Node and Port Name pools: In this setup, two WWPN pools were used, one for each fabric. UCS allows users to modify the last three octets of the worldwide port and node names. A Universal Unique Identifier (UUID) pool is also defined for the Service Profiles.

7. Service Profile Template and Service Profiles: The results achieved at the end of the whole process are shown in the following figure. Service Profiles gpesx101-105 and gpesx201-205 were used to host the Greenplum (segment server) virtual machines. Service Profiles gpesx107 and gpesx207 formed the management cluster for the Greenplum environment.
UCS Firmware

The latest firmware available at the time, 1.3(1c), was used for this deployment. Additionally, to enable the Palo interface cards, it is necessary to attach the firmware package to each service profile. This can be done in the service profile template, which then propagates the firmware to each service profile bound to that template. It is also necessary to update the BIOS to the latest version using a firmware package. The following shows a firmware package that contains Palo card (M81KR) firmware and BIOS updates.

To apply the update to all service profiles:

1. In the Service Profile template, go to Policies.
2. Select the firmware package as shown below:

UCS Network Configuration

1. Create uplink ports for Ethernet traffic.
2. Under networking, configure the LAN and SAN.

LAN Configuration

Ports 1/37-40 were selected as uplink ports for 10 Gb Ethernet traffic. The following screen captures show the port channel (LAN) configurations on the fabric interconnect.

Port channels used on the UCS
Port-channel configuration details for Fabric Interconnect A

Port-channel configuration details for Fabric Interconnect B

SAN Configuration on Fabric Interconnects

There are a total of eight Fibre Channel ports in use on the fabric interconnect. SAN Pin Groups are created to isolate SAN traffic. The following screen capture shows the SAN Pin Group configurations on the fabric interconnect.
Greenplum Segment Server/ESX Server Provisioning

VM servers are provisioned as follows:

- 3 x VMs are created per ESX server on a total of 8 x ESX servers. These are used as Greenplum segment servers.
- 1 x VM is created on a ninth ESX server. It is used as the Greenplum master/metadata server. This server also handles client requests.
- 1 x VM is created on a tenth ESX server. It is used as the Greenplum standby server.

Note: XFS-formatted devices on VMware ESX guest RDMs are recommended for GP data segments for the highest performance.

The table below provides the VM/ESX storage configuration details.

Table 2 VM/ESX storage configuration details
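Since the note above recommends XFS-formatted RDM devices for the Greenplum data segments, a quick sanity check on each segment VM is to confirm the file system type of the data mounts from /proc/mounts, as in the sketch below. The /data mount-point prefix is an assumed convention rather than a path defined by this solution.

# Sketch: on a segment server VM, confirm that the Greenplum data mounts are
# XFS, per the recommendation above. The "/data" prefix is an assumed
# convention; adjust it to wherever the RDM devices are mounted.
DATA_PREFIX = "/data"

def data_mounts(mounts_file="/proc/mounts"):
    results = []
    with open(mounts_file) as fh:
        for line in fh:
            device, mountpoint, fstype = line.split()[:3]
            if mountpoint.startswith(DATA_PREFIX):
                results.append((mountpoint, device, fstype))
    return results

for mountpoint, device, fstype in data_mounts():
    status = "OK" if fstype == "xfs" else "NOT XFS"
    print("%-20s %-20s %-6s %s" % (mountpoint, device, fstype, status))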
The following diagram depicts the graphical layout of a single blade, with an ESX instance and VMs (Greenplum data segment servers) with two LUNs (FC) per VM. The ESX environment is located on the 500 GB LUN, which holds the three VM instances.

Figure 4. VM/ESX/LUN Layout

The following shows the ESX servers in the vCenter server. In this setup, there are six Greenplum Segment Server (gpssx) VMs on a single ESX server, of which only three are used.
The following shows the distribution of the Greenplum Segment Server (gpssx) virtual machines on a single ESX server. The VM distribution on the remaining seven ESX servers is identical to gpesx101.gp.vce.

Other management virtual machines are hosted on the following ESX server. These include the vCenter server, the SQL Server database, and EMC Control Center.
Symmetrix VMAX Storage Array

The Symmetrix VMAX Architecture Overview

At the heart of the Symmetrix VMAX series storage array architecture is the scalable Virtual Matrix interconnect design. The Virtual Matrix is redundant and dual active, and supports all Global Memory references, messaging, and management operations, including internal discovery and initialization, path management, load balancing, failover, and fault isolation within the array.

The Symmetrix VMAX array comprises from one to eight VMAX Engines. Each VMAX Engine contains two integrated directors. Each director has two connections to the VMAX Matrix Interface Board Enclosure (MIBE) via the System Interface Board (SIB) ports. Since every director has two separate physical paths to every other director via the Virtual Matrix, this is a highly available interconnect with no single point of failure. This design eliminates the need for separate interconnects for data, control, messaging, environmental, and system test. A single highly available interconnect suffices for all communications between the directors, which reduces complexity.

Figure 5. Symmetrix VMAX Virtual Matrix Interconnect

The Symmetrix VMAX design is based on an individual Symmetrix VMAX engine with redundant CPU, memory, and connectivity on two directors for fault tolerance. Symmetrix VMAX engines connect to and scale out linearly through the Virtual Matrix Architecture, which allows resources to be shared within and across VMAX Engines. To meet growth requirements, additional VMAX Engines can be added non-disruptively for efficient and dynamic scaling of capacity and performance, while dramatically simplifying and automating the operational tasks that are critical to addressing infrastructure requirements and driving down cost in both virtual and physical deployments. The following figure illustrates the building block approach.
Figure 6. Symmetrix VMAX Engine Building Blocks

Hardware List

- Number of VMAX Engines: 2
- Global Memory (GB): 256
- Number of front-end 8 Gb/sec FC ports: 32
- Number of 300 GB (15K RPM) FC disks (excluding hot spares): 192

Disk Layout

- 48 x RAID 5 (3+1) RAID groups are created out of the total 192 x Fibre Channel (FC) disks
  o 24 x RAID groups are used as the RecoverPoint CDP source
  o 24 x RAID groups are used as the RecoverPoint CDP target
- Two 190 GB hypervolumes are created from each RAID group
- One concatenated metavolume is created from the two hypervolumes of each RAID group. This achieves I/O isolation at the disk level.
- Each metavolume is allocated to one of the 24 Greenplum segment VMs as an RDM disk
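The arithmetic behind this layout follows directly from the figures above; the short sketch below recomputes it so the relationship between disks, RAID groups, hypervolumes, metavolumes, and segment VMs is explicit.

# Recompute the backend disk layout from the figures stated above.
TOTAL_FC_DISKS = 192
DISKS_PER_RAID5_GROUP = 4      # RAID 5 (3+1)
HYPERS_PER_GROUP = 2
HYPER_SIZE_GB = 190
SEGMENT_VMS = 24

raid_groups = TOTAL_FC_DISKS // DISKS_PER_RAID5_GROUP        # 48 groups
cdp_source_groups = raid_groups // 2                          # 24 CDP sources
cdp_target_groups = raid_groups - cdp_source_groups           # 24 CDP targets
metavolume_gb = HYPERS_PER_GROUP * HYPER_SIZE_GB              # 380 GB each

print("RAID 5 (3+1) groups:", raid_groups)
print("CDP source/target groups:", cdp_source_groups, "/", cdp_target_groups)
print("Metavolume size:", metavolume_gb, "GB")
print("Metavolumes per segment VM:", cdp_source_groups // SEGMENT_VMS)
print("Source capacity presented to Greenplum: %.1f TB"
      % (cdp_source_groups * metavolume_gb / 1024.0))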
Figure 7. Greenplum Backend Disk Layout on VMAX

Front-end Storage Port Layout

- 2 x engines, 4 x directors
- 16 x front-end processors, with 2 x FC ports on each processor, for a total of 32 x 8 Gb/sec FC ports
- In this configuration, only 16 x FC ports are used: port 0 is taken from each processor. See the figure below.

Figure 8. Greenplum Front-end Storage Port Layout

VMAX Mask View (LUN masking): A total of 10 mask views are configured: 8 for the ESX hosts running Greenplum segment VMs, 1 for the ESX host running the Greenplum master VM server, and 1 for the ESX host running the standby VM server. Each ESX server HBA initiator accesses storage via 2 VMAX storage ports, for a total of 4 VMAX storage ports per ESX server (with dual HBAs).

For more information about Symmetrix VMAX, see the product documentation:
http://powerlink.emc.com/km/live1/en_us/offering_technical/technical_documentation/300-008-603.pdf?mtcs=ZXZlbnRUeXBlPUttQ2xpY2tDb250ZW50RXZlbnQsZG9jdW1lbnRJZD0wOTAxNDA2NjgwNTIyMzFkLG5hdmVOb2RlPVNvZndhcmVEb3dubG9hZHMtMg
RecoverPoint

RecoverPoint is EMC's leading out-of-band, block-level replication product for a heterogeneous server and storage environment. RecoverPoint continuous data protection (CDP) provides local synchronous replication between LUNs that reside in one or more arrays at the same site. RecoverPoint continuous remote replication (CRR) provides remote asynchronous replication between two sites for LUNs that reside in one or more arrays. Both RecoverPoint CDP and RecoverPoint CRR feature bi-directional replication and an any-point-in-time recovery capability, which allows the target LUNs to be rolled back to a previous point in time and used for read/write operations without affecting the ongoing replication or data protection. The bi-directional replication and any-point-in-time recovery capability can be enabled simultaneously with RecoverPoint concurrent local and remote (CLR) data protection.

RecoverPoint supports three types of write-splitting technologies for maximum flexibility.

Table 3 Splitter details

- Host-based: deployed in the I/O stack just above the multi-path software. Overhead: adds write traffic at the HBA; no other impact.
- Fabric-based: deployed in intelligent storage services hardware on a Brocade- or Cisco-based switch. Overhead: operates at wire speeds; no impact.
- CLARiiON-based: deployed in the FLARE operating system; active in both storage processors. Overhead: no impact.

In the Greenplum configuration, the Cisco SANTap service is used in a RecoverPoint CDP deployment. Cisco 18/5 MSMs (Multi Service Modules) are installed in the MDS 9513. Both the GP segment server data and the GP master metadata replicate locally on the VMAX for continuous data protection. See Figure 9, below.
Figure 9. RecoverPoint Sequence (data is split by the intelligent-fabric splitter and sent to the RecoverPoint appliance in one of three ways; writes are acknowledged back from the RecoverPoint appliance; the appliance writes data to the journal volume, along with a time stamp and application-specific bookmarks; write-order-consistent data is then distributed to the replica volumes)

RecoverPoint VSAN Zoning

A RecoverPoint with Cisco SANTap deployment requires placing different components into two VSANs:

- Front-end VSAN
- Backend VSAN

All I/O activity between the host and the storage is relayed by SANTap from the actual host port via the DVT (Data Virtual Target) in the front-end VSAN to the VI (Virtual Initiator) in the backend VSAN, and then to the actual storage port. This relay mechanism is completely transparent to the hosts. The following types of zones are required for each VSAN.

Zones in the backend VSAN

The backend VSAN contains the physical storage ports, the RecoverPoint Appliance (RPA) HBA ports, the CVTs (Control Virtual Targets, created by the SANTap service), and the AVTs (Appliance Virtual Targets, created by the RecoverPoint Appliance):

- Zone Type 1 - a zone that contains a member of the ESX server HBA virtual initiators and the corresponding physical storage ports. These zones are almost identical to the zones in the front-end VSAN that contain the host HBA port and DVTs. See the explanation below.
- Zone Type 2 - a zone that contains a member of the RPA HBA ports and the physical storage ports. This allows the RPAs to access the storage on the VMAX.
- Zone Type 3 - a zone that contains RPA HBA ports and CVTs. This allows the RPA to request that the CVT open a splitting session. The I/O is then copied to the RPA, allowing the RPA to replicate data to the target.
- Zone Type 4 - a zone that contains RPA HBA ports and the AVTs.
AVTs are used to mask the identity of the appliance (RPA), allowing it to appear as the host. This masking is necessary to allow the RPA to overcome SCSI reservation of storage ports by the hosts and to get the same view of the SAN that the hosts have.

Zones in the front-end VSAN

A zone is created between the host HBA ports, in this case the UCS blade server HBA ports, and the SANTap Data Virtual Targets (the DVTs).

Note: DVTs are created as the virtual storage port entities during SANTap configuration. Each physical storage port used in the backend VSAN needs a corresponding DVT created.

For more information, see the following zoning tables.

Table 4 RecoverPoint VSAN zoning table: Fabric A, VSAN 30 (BE VSAN)

- gpesx101_hba1_vmax: gpesx101_hba1, Vmax_8eA, Vmax_10eA
- gpesx102_hba1_vmax: gpesx102_hba1, Vmax_8fA, Vmax_10fA
- gpesx103_hba1_vmax: gpesx103_hba1, Vmax_8gA, Vmax_10gA
- gpesx104_hba1_vmax: gpesx104_hba1, Vmax_8hA, Vmax_10hA
- gpesx107_hba1_vmax: gpesx107_hba1, Vmax_8gA, Vmax_10gA
- gpesx201_hba1_vmax: gpesx201_hba1, Vmax_8eA, Vmax_10eA
- gpesx202_hba1_vmax: gpesx202_hba1, Vmax_8fA, Vmax_10fA
- gpesx203_hba1_vmax: gpesx203_hba1, Vmax_8gA, Vmax_10gA
- gpesx204_hba1_vmax: gpesx204_hba1, Vmax_8hA, Vmax_10hA
- gpesx207_hba1_vmax: gpesx207_hba1, Vmax_8gA, Vmax_10gA
- RPA1_HBA1_2_Vmax: RPA1_HBA1, RPA1_HBA2, Vmax_8eA, Vmax_10eA, Vmax_8fA, Vmax_10fA, Vmax_8gA, Vmax_10gA, Vmax_8hA, Vmax_10hA
- RPA8_HBA1_2_Vmax: RPA8_HBA1, RPA8_HBA2, Vmax_8eA, Vmax_10eA, Vmax_8fA, Vmax_10fA, Vmax_8gA, Vmax_10gA, Vmax_8hA, Vmax_10hA
- RPA_CVT_A: all above RPA HBA ports, all SANTap CVTs in Fabric A
- RPA_AVT_A: all above RPA HBA ports, all RPA AVTs
Table 5 RecoverPoint VSAN zoning table: Fabric B, VSAN 31 (BE VSAN)

- gpesx101_hba2_vmax: gpesx101_hba2, Vmax_9eA, Vmax_7eA
- gpesx102_hba2_vmax: gpesx102_hba2, Vmax_9fA, Vmax_7fA
- gpesx103_hba2_vmax: gpesx103_hba2, Vmax_9gA, Vmax_7gA
- gpesx104_hba2_vmax: gpesx104_hba2, Vmax_9hA, Vmax_7hA
- gpesx107_hba2_vmax: gpesx107_hba2, Vmax_9gA, Vmax_7gA
- gpesx201_hba2_vmax: gpesx201_hba2, Vmax_9eA, Vmax_7eA
- gpesx202_hba2_vmax: gpesx202_hba2, Vmax_9fA, Vmax_7fA
- gpesx203_hba2_vmax: gpesx203_hba2, Vmax_9gA, Vmax_7gA
- gpesx204_hba2_vmax: gpesx204_hba2, Vmax_9hA, Vmax_7hA
- gpesx207_hba2_vmax: gpesx207_hba2, Vmax_9gA, Vmax_7gA
- RPA1_HBA3_4_Vmax: RPA1_HBA3, RPA1_HBA4, Vmax_9eA, Vmax_7eA, Vmax_9fA, Vmax_7fA, Vmax_9gA, Vmax_7gA, Vmax_9hA, Vmax_7hA
- RPA8_HBA3_4_Vmax: RPA8_HBA3, RPA8_HBA4, Vmax_9eA, Vmax_7eA, Vmax_9fA, Vmax_7fA, Vmax_9gA, Vmax_7gA, Vmax_9hA, Vmax_7hA
- RPA_CVT_B: all above RPA HBA ports, all SANTap CVTs in Fabric B
- RPA_AVT_B: all above RPA HBA ports, all RPA AVTs

Table 6 RecoverPoint VSAN zoning table: Fabric A', VSAN 130 (FE VSAN)

(The Vmax_* members in this table are the corresponding DVTs.)

- gpesx101_hba1_vmax: gpesx101_hba1, Vmax_8eA, Vmax_10eA
- gpesx102_hba1_vmax: gpesx102_hba1, Vmax_8fA, Vmax_10fA
- gpesx103_hba1_vmax: gpesx103_hba1, Vmax_8gA, Vmax_10gA
- gpesx104_hba1_vmax: gpesx104_hba1, Vmax_8hA, Vmax_10hA
- gpesx107_hba1_vmax: gpesx107_hba1, Vmax_8gA, Vmax_10gA
- gpesx201_hba1_vmax: gpesx201_hba1, Vmax_8eA, Vmax_10eA
- gpesx202_hba1_vmax: gpesx202_hba1, Vmax_8fA, Vmax_10fA
- gpesx203_hba1_vmax: gpesx203_hba1, Vmax_8gA, Vmax_10gA
- gpesx204_hba1_vmax: gpesx204_hba1, Vmax_8hA, Vmax_10hA
- gpesx207_hba1_vmax: gpesx207_hba1, Vmax_8gA, Vmax_10gA
Table 7 RecoverPoint VSAN zoning table: Fabric B', VSAN 131 (FE VSAN)

(The Vmax_* members in this table are the corresponding DVTs.)

- gpesx101_hba2_vmax: gpesx101_hba2, Vmax_9eA, Vmax_7eA
- gpesx102_hba2_vmax: gpesx102_hba2, Vmax_9fA, Vmax_7fA
- gpesx103_hba2_vmax: gpesx103_hba2, Vmax_9gA, Vmax_7gA
- gpesx104_hba2_vmax: gpesx104_hba2, Vmax_9hA, Vmax_7hA
- gpesx107_hba2_vmax: gpesx107_hba2, Vmax_9gA, Vmax_7gA
- gpesx201_hba2_vmax: gpesx201_hba2, Vmax_9eA, Vmax_7eA
- gpesx202_hba2_vmax: gpesx202_hba2, Vmax_9fA, Vmax_7fA
- gpesx203_hba2_vmax: gpesx203_hba2, Vmax_9gA, Vmax_7gA
- gpesx204_hba2_vmax: gpesx204_hba2, Vmax_9hA, Vmax_7hA
- gpesx207_hba2_vmax: gpesx207_hba2, Vmax_9gA, Vmax_7gA

RecoverPoint Consistency Groups

RecoverPoint replicates data by using logical groups called Consistency Groups (CGs). Each Consistency Group contains one or more replication sets, and each replication set is a pairing between a replication source LUN and a target LUN. Because each Consistency Group is active on a particular RPA, a total of 8 CGs, each containing 3 replication sets, were created in order to utilize all 8 RPAs for optimal performance. A Group Set was created to contain all 8 CGs and provide replication consistency for the entire Greenplum environment. Data consistency is maintained at the Group Set level, which allows rapid, point-in-time recovery of the Greenplum environment. Below is the Consistency Group configuration table.
Table 8 RecoverPoint consistency group configuration table

For more information about RecoverPoint, see the RecoverPoint documentation set located at:
http://powerlink.emc.com/km/appmanager/km/securedesktop?_nfpb=true&_pagelabel=freeformlinks2&internalid=0b014066800f517e&_irrt=true&rnavid=PT-2%3A0b0140668037fed7
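To summarize the grouping logic described in the consistency group section above, the sketch below builds the 8-CG, 24-replication-set structure wrapped in a single group set. The LUN and group names are placeholders, not values from this build.

# Sketch of the RecoverPoint grouping: 8 consistency groups of 3 replication
# sets each (source/target LUN pairings), wrapped in one group set that covers
# all 24 segment LUN pairs. Names are placeholders, not values from the build.
CONSISTENCY_GROUPS = 8
REPLICATION_SETS_PER_CG = 3

group_set = {}
lun = 1
for cg in range(1, CONSISTENCY_GROUPS + 1):
    sets = []
    for _ in range(REPLICATION_SETS_PER_CG):
        sets.append({"source": "gp_seg_src_%02d" % lun,
                     "target": "gp_seg_tgt_%02d" % lun})
        lun += 1
    group_set["CG_%d" % cg] = sets

total_sets = sum(len(s) for s in group_set.values())
print(len(group_set), "consistency groups,", total_sets, "replication sets")  # 8, 24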
TimeFinder/Snap

TimeFinder provides local storage replication for increased application availability and faster data recovery. Leveraging the industry-leading high-end EMC Symmetrix system, TimeFinder offers unmatched deployment flexibility and massive scalability to meet any service level requirement. TimeFinder helps companies perform backups, load data warehouses, and easily provide data for application test and development without downtime.

TimeFinder/Snap provides the following:

- Storage-based information replication; no host cycles
- Snapshots create logical point-in-time images of a source volume
- Requires only a fraction of the source volume's capacity (~20-30%)
- Multiple snapshots can be created from a source volume and are available immediately
- Snapshots support both read and write processing

In the Greenplum Vblock platform, SATA disks are configured into a Snap pool for the snaps (a rough save-area sizing sketch appears at the end of the next section).

Figure 10. TimeFinder/Snap (production volume and production view; save area, snapshot view, and cache-based pointer map)

Replication Manager

Replication Manager (RM) is EMC's software that improves access to information by automating and managing disk-based replicas. Replication Manager is used to manage the TimeFinder/Snap operations. Key benefits are that it:

- Automates the creation, management, and use of EMC disk-based, point-in-time replicas
- Auto-discovers the environment
- Has the intelligence to orchestrate replicas with deep application awareness
- Is easy to use, with point-and-click controls, wizards, and user access
- Supports VMware ESX Server with Windows and Linux guest operating system environments and Virtual Machine File System (VMFS) datastores containing virtual machines, and:
  o Reduces backup windows
  o Minimizes or eliminates impact on the application
  o Improves Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO)
  o Enhances productivity
  o Offers data-warehouse refreshes
  o Provides decision support
  o Provides database-recovery checkpoints
  o Enables application development and testing
  o Enables fast restore
  o Enables application restart and business resumption
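As a planning aid, the sketch below applies the ~20-30 percent save-area guideline from the TimeFinder/Snap section above to the 24 source metavolumes (about 380 GB each) described in the disk layout, to estimate how much SATA snap pool capacity the snaps might require. These are rough estimates; actual requirements depend on the change rate of the source data.

# Rough TimeFinder/Snap save-area sizing: apply the ~20-30% guideline above to
# the 24 source metavolumes (~380 GB each) from the disk layout section.
# Planning estimates only; actual needs depend on source data change rates.
SOURCE_METAVOLUMES = 24
METAVOLUME_GB = 380

source_tb = SOURCE_METAVOLUMES * METAVOLUME_GB / 1024.0
for fraction in (0.20, 0.30):
    print("Save area at %2.0f%% of %.1f TB source: %.1f TB"
          % (fraction * 100, source_tb, source_tb * fraction))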
Test Results

This section presents the objectives and results for two different, complementary test sets:

- Tests of the read and load performance with the Greenplum database on the Vblock 700
- Tests of disaster recovery success with the RecoverPoint appliance

Read and Load Performance Tests: Objectives and Results

The following tests were performed with the Greenplum DB on a Vblock 700.

Table 9 Read and load performance test objectives

1. Test Greenplum read performance on a Vblock 700 with the specified building block: measure the read scan rate and query response time on a Vblock 700 with the building block previously described. Database size: 8 TB.
2. Test Greenplum load performance on a Vblock 700: measure the data load rate on a Vblock 700 with the building block previously described. Database size: 8 TB.

Table 10 Test 1: Read test summary and results

Test scenario: Run a stored procedure, measuring read scan query and response times, with two variations:
1. One sequential run with no other workload on the system.
2. One sequential run while additional workload is being run on the system.

Test results:
1. Run with no other workload on the system: 9.6 minutes for 15 million records.
2. Run while additional workload was being run on the system: 11.1 minutes for 15 million records. Concurrent workload completed during the run: Online = 1,100 queries, Production = 50 jobs, Ad hoc = 15 jobs.
Table 11 Test 2: Load test summary and results

Test scenario: Load one day of data (roughly 15 million records) into a monthly partition, with three variations:
1. Load an empty monthly partition with one day of data.
2. Load a half-full monthly partition with one day of data.
3. Load a full monthly partition with one day of data.

Test results (performance metrics were met):
1. Empty partition load = 11.33 minutes
2. Half-full partition load = 11.32 minutes
3. Full partition load = 11.17 minutes

RecoverPoint Disaster Recovery Tests: Objectives and Results

The RecoverPoint (RP) appliance replication capability is leveraged to perform site Disaster Recovery (DR) testing. The following test scenarios validate the DR solutions within the BI/DW solution stack proposed by EMC. The objectives and results for the four tests are summarized in the following tables.

Table 12 Objectives for DR site recovery tests using RecoverPoint replication

1. Verify the Point in Time (PIT) bookmark for the entire dataset: perform a local bookmark function test to verify that users have access to the database.
2. Verify that the Point in Time (PIT) image is the correct image: perform a PIT copy on production.
3. Verify that Snapshot consolidation works correctly: enable Snapshot consolidation.
4. Switch over the production DB to the secondary side: switch over to the target side.
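For context, the sketch below converts the read and load timings above into approximate record rates, using the roughly 15 million records quoted for both the read runs and a one-day load. These derived rates are illustrative only; they are not additional measured results.

# Convert the quoted read and load timings into approximate record rates.
# Uses the ~15 million records quoted for the read runs and a one-day load.
RECORDS = 15000000

timings_minutes = [
    ("read, no concurrent workload",    9.6),
    ("read, with concurrent workload",  11.1),
    ("load, empty partition",           11.33),
    ("load, half-full partition",       11.32),
    ("load, full partition",            11.17),
]
for name, minutes in timings_minutes:
    rate = RECORDS / (minutes * 60.0)
    print("%-32s %6.2f min  ~%7.0f records/sec" % (name, minutes, rate))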
Table 13 Test 1: Bookmark test summary and results

Test scenario: Follow a sequence of steps to place the database in and out of suspend mode while enabling the image on the secondary hosts, then verify that the database is open and accessible:
1. Place the database in gp suspend mode.
2. Create an RP bookmark (BM).
3. Take the database out of gp suspend mode.
4. Enable the image on the secondary hosts.
5. Mount the BM image.
6. Start the database.
7. Verify the PIT.

Test result: The database was open and DBAs were able to access the database.

Table 14 Test 2: PIT image test summary and results

Test scenario: Verify that the correct PIT image from before an insert of 35 million records into production can be recovered:
1. DBAs insert 35 million records on production.
2. A PIT image is copied.
3. DBAs are able to get the correct image from before the insert.

Test result: DBAs successfully copied the correct image on production.

Table 15 Test 3: Snapshot consolidation summary and results

Test scenario: Bookmark an image during Snapshot consolidation; the DBA restores the Snapshot consolidation bookmark and verifies the image.

Test result: The DBA successfully verified the Snapshot-consolidated image.
Table 16 Test 4: Switch production DB summary and results

Test scenario: Determine the latest point-in-time image and switch over from the production database to the target DB, so that the target DB becomes the primary:
1. Enable the latest point-in-time image and switch over the production DB to the target DB.

Test result: Successfully completed.
Conclusion

Our testing supports the benefits of the building block approach used for hosting Greenplum applications. Key results from the read and load performance tests illustrate the scalability of Greenplum on a Vblock 700 solution:

- The read test results show that the scan rate and query response time on the Vblock 700 are similar whether the stored procedure was run alone (9.6 minutes to read 15 million records) or with additional workload on the system (11.1 minutes to read 15 million records).
- The load test results show similar load performance whether the target was an empty partition (11.33 minutes), a half-full partition (11.32 minutes), or a full partition (11.17 minutes).

The key results from the RecoverPoint tests show the successful recovery and restoration of the database image and validate the disaster recovery solution included in the system.
References

For further Greenplum information, see the following:
http://powerlink.emc.com/km/live1/en_us/offering_technical/technical_documentation/300-011-538.pdf

Greenplum installation documentation:
http://powerlink.emc.com/km/live1/en_us/offering_technical/technical_documentation/300-011-541.pdf

Introduction to EMC RecoverPoint 3.3 New Features and Functions Applied Technology white paper:
http://www.emc.com/collateral/software/white-papers/h2781-emc-recoverpoint-3-new-features.pdf

RecoverPoint Guides (full set):
http://powerlink.emc.com/km/appmanager/km/securedesktop?_nfpb=true&_pagelabel=freeformlinks2&internalid=0b014066800f517e&_irrt=true&rnavid=PT-2%3A0b0140668037fed7

Symmetrix VMAX product guide:
http://powerlink.emc.com/km/live1/en_us/offering_technical/technical_documentation/300-008-603.pdf?mtcs=ZXZlbnRUeXBlPUttQ2xpY2tDb250ZW50RXZlbnQsZG9jdW1lbnRJZD0wOTAxNDA2NjgwNTIyMzFkLG5hdmVOb2RlPVNvZndhcmVEb3dubG9hZHMtMg

ABOUT VCE

VCE, the Virtual Computing Environment Company formed by Cisco and EMC with investments from VMware and Intel, accelerates the adoption of converged infrastructure and cloud-based computing models that dramatically reduce the cost of IT while improving time to market for our customers. VCE, through the Vblock platform, delivers the industry's first completely integrated IT offering with end-to-end vendor accountability. VCE prepackaged solutions are available through an extensive partner network, and cover horizontal applications, vertical industry offerings, and application development environments, allowing customers to focus on business innovation instead of integrating, validating, and managing IT infrastructure. For more information, go to www.vce.com.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." VCE MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright 2011 VCE Company, LLC. All rights reserved. Vblock and the VCE logo are registered trademarks or trademarks of VCE Company, LLC. and/or its affiliates in the United States or other countries. All other trademarks used herein are the property of their respective owners.