EMC Greenplum Data Computing Appliance and Data Integration Accelerator Getting Started Guide Version P/N: Rev: A01
The Data Computing Division of EMC
Copyright 2010 EMC Corporation. All rights reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS IS. EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners.
EMC DCA and DIA Getting Started Guide - Contents

Preface
  About This Guide
  Document Conventions
    Text Conventions
    Command Syntax Conventions
  Getting Support
    Product information
    Technical support
Chapter 1: About EMC Greenplum DCA and DIA
  About DCA and DIA
    Available DCA Configurations
    Available DIA Configurations
    Component Specifications
    Additional Solutions - Data Domain Backup and Recovery
  About Greenplum Database
    About the Master Hosts
    About the Segment Hosts
    About the Network Configuration
Chapter 2: Greenplum DCA / DIA Administration
  Troubleshooting and Diagnostic Tools
  Database and System Monitoring Tools
    ConnectEMC Dial Home Capability
    Greenplum Performance Monitor
    Greenplum Database System Catalogs
    Greenplum Database and SNMP Alerting
  General Database Maintenance Tasks
    Routine Vacuum and Analyze
    Routine Reindexing
    Managing Greenplum Database Log Files
Chapter 3: Connecting to Greenplum Database
  Establishing a Database Session
  Supported Client Applications
    Greenplum Database Client Applications
    pgAdmin III for Greenplum Database
    Database Application Interfaces
    Third-Party Client Tools
  Troubleshooting Connection Problems
Chapter 4: Next Steps
  Understanding the SQL Features of Greenplum Database
    Core SQL Conformance
    SQL 1992 Conformance
    SQL 1999 Conformance
    SQL 2003 Conformance
    SQL 2008 Conformance
    Greenplum and PostgreSQL Compatibility
  Providing User Access to Greenplum Database
  Creating Databases and Loading Data
Glossary
Preface

This guide is intended for database and system administrators who are new to the Greenplum Data Computing Appliance (Greenplum DCA), the Greenplum Data Integration Accelerator (Greenplum DIA), and Greenplum Database. It provides an overview of the appliance configuration, as well as general information about using and administering a Greenplum Database system.

- About This Guide
- Document Conventions
- Getting Support

About This Guide

This guide provides high-level information to help administrators get started with Greenplum Database. It is intended for system and database administrators responsible for managing a Greenplum Database system on the Greenplum Data Computing Appliance.

This guide assumes knowledge of Linux/UNIX system administration, database management systems, database administration, and structured query language (SQL).

This guide contains the following chapters and appendices:

- Chapter 1, About EMC Greenplum DCA and DIA, explains the architecture, components, and configuration of Greenplum Database on the Greenplum Data Computing Appliance.
- Chapter 2, Greenplum DCA / DIA Administration, describes the general database maintenance tasks and the tools available to diagnose, monitor, and troubleshoot a Greenplum Database system running on the Greenplum Data Computing Appliance.
- Chapter 3, Connecting to Greenplum Database, explains how to connect to Greenplum Database using various client programs.
- Chapter 4, Next Steps, explains the next steps to implementing your data warehouse requirements in Greenplum Database.
- Glossary defines Greenplum Database components and terminology.

Document Conventions

The following conventions are used throughout the Greenplum Database documentation to help you identify certain types of information.

- Text Conventions
- Command Syntax Conventions
Text Conventions

Table 0.1 Text Conventions

bold
  Usage: Button, menu, tab, page, and field names in GUI applications.
  Example: Click Cancel to exit the page without saving your changes.

italics
  Usage: New terms where they are defined. Database objects, such as schema, table, or column names.
  Examples: The master instance is the postgres process that accepts client connections. Catalog information for Greenplum Database resides in the pg_catalog schema.

monospace
  Usage: File names and path names. Programs and executables. Command names and syntax. Parameter names.
  Examples: Edit the postgresql.conf file. Use gpstart to start Greenplum Database.

monospace italics
  Usage: Variable information within file paths and file names. Variable information within command syntax.
  Examples: /home/gpadmin/config_file
            COPY tablename FROM 'filename'

monospace bold
  Usage: Used to call attention to a particular part of a command, parameter, or code snippet.
  Example: Change the host name, port, and database name in the JDBC connection URL: jdbc:postgresql://host:5432/mydb

UPPERCASE
  Usage: Environment variables. SQL commands. Keyboard keys.
  Examples: Make sure that the Java /bin directory is in your $PATH.
            SELECT * FROM my_table;
            Press CTRL+C to escape.
Command Syntax Conventions

Table 0.2 Command Syntax Conventions

{ }
  Usage: Within command syntax, curly braces group related command options. Do not type the curly braces.
  Example: FROM { 'filename' | STDIN }

[ ]
  Usage: Within command syntax, square brackets denote optional arguments. Do not type the brackets.
  Example: TRUNCATE [ TABLE ] name

...
  Usage: Within command syntax, an ellipsis denotes repetition of a command, variable, or option. Do not type the ellipsis.
  Example: DROP TABLE name [, ...]

|
  Usage: Within command syntax, the pipe symbol denotes an OR relationship. Do not type the pipe symbol.
  Example: VACUUM [ FULL | FREEZE ]

$ system_command
# root_system_command
=> gpdb_command
=# su_gpdb_command
  Usage: Denotes a command prompt. Do not type the prompt symbol. $ and # denote terminal command prompts. => and =# denote Greenplum Database interactive program command prompts (psql or gpssh, for example).
  Examples: $ createdb mydatabase
            # chown gpadmin -R /datadir
            => SELECT * FROM mytable;
            =# SELECT * FROM pg_database;

Getting Support

EMC support, product, and licensing information can be obtained as follows.

Product information

For documentation, release notes, software updates, or for information about EMC products, licensing, and service, go to the EMC Powerlink website (registration required) at:
Technical support

For technical support, go to Powerlink and choose Support. On the Support page, you will see several options, including one for making a service request. Note that to open a service request, you must have a valid support agreement. Please contact your EMC sales representative for details about obtaining a valid support agreement or with questions about your account.
Chapter 1: About EMC Greenplum DCA and DIA

Greenplum Data Computing Appliance (Greenplum DCA) is a self-contained data warehouse solution that integrates all of the database software, servers, and switches necessary to perform big data analytics.

EMC Greenplum Data Computing Appliance (DCA) is a turn-key, easy-to-install data warehouse solution that provides extreme query and loading performance for analyzing large data sets. The EMC Greenplum DCA integrates Greenplum Database software with compute, storage, and network components, delivered racked and ready for immediate data loading and query execution. The DCA is available in two configurations: balanced and capacity. The balanced system uses high-speed SAS drive technology, while the capacity system uses high-capacity SATA drive technology.

EMC Greenplum Data Integration Accelerator (DIA) is a fast, parallel data loading solution, built specifically to integrate with the DCA. The DIA comes pre-configured with Greenplum's gpfdist loading tool. The DIA supports health monitoring and ConnectEMC dial home notifications. For more information on the loading process, refer to the Greenplum Database Administrator Guide: Loading and Unloading Data.

EMC Greenplum Data Computing Appliance runs the Greenplum Database relational database management system (RDBMS) software. Greenplum Database utilizes the DCA components to perform its database operations and processing. See the following sections for a description of the DCA components and configurations.

- About DCA and DIA
- About Greenplum Database

About DCA and DIA

This section explains the hardware components and specifications of the Greenplum Data Computing Appliance and Data Integration Accelerator.

- Available DCA Configurations
- Available DIA Configurations
- Component Specifications

Available DCA Configurations

This release of the Greenplum DCA is available in four configurations, each with a capacity and a balanced model: the Greenplum GP10 and GP10C (quarter-rack configuration), the Greenplum GP100 and GP100C (half-rack configuration), the Greenplum GP1000 and GP1000C (full-rack configuration), and the Greenplum GP1000C plus one scale-out module (two-rack configuration).
The balanced configuration of DCA utilizes 600GB 15k SAS drives in the segment servers for a usable capacity of 36TB on a full-rack GP1000. The capacity configuration of DCA utilizes 2TB 7.2k SATA drives in the segment servers for a usable capacity of 124TB on a full-rack GP1000C.

Figure 1.1 GP10 Quarter-Rack Configuration

Figure 1.2 GP100 Half-Rack Configuration

Figure 1.3 GP1000 One-Rack Configuration

Figure 1.4 GP Scale-Out Module Two-Rack Configuration
Available DIA Configurations

This release of the Greenplum DIA comes in three configurations: the Greenplum DIA10 (quarter-rack), the Greenplum DIA100 (half-rack), and the Greenplum DIA1000 (full-rack). The DIA has a usable storage capacity of 71TB for a DIA10, 142TB for a DIA100, and 284TB for a DIA1000.

Figure 1.5 DIA10 Quarter-Rack Configuration

Figure 1.6 DIA100 Half-Rack Configuration

Figure 1.7 DIA1000 Full-Rack Configuration
Component Specifications

This section explains the specifications of the various server and networking components of the Greenplum DCA and DIA. Note that in the Greenplum Database product and documentation, physical servers are referred to as hosts.

Table 1.1 DCA/DIA Components

Master Host
  Quantity: All configurations = 2 (one primary and one standby). The Greenplum DIA contains no Master Hosts.

Segment/DIA Hosts
  Quantity: GP10/DIA10 = 4; GP100/DIA100 = 8; GP1000/DIA1000 = 16; GP = 32

Interconnect Switch
  Quantity: GP10/GP100/GP1000 = 2; GP = 4; DIA10/DIA100/DIA1000 = 2

Administration Switch
  Quantity: GP10/GP100/GP1000 = 1; GP = 2; DIA10/DIA100/DIA1000 = 1
Master Host Specifications

The following diagram shows an example of how a Greenplum Database master host is configured in the Greenplum DCA. Greenplum DCA has two master hosts (the primary master and a standby master). The Greenplum DIA contains no master hosts; however, the DCA's master hosts are used for management of the DIA.

Figure 1.8 Greenplum Database Master Host Configuration on the Greenplum DCA

Table 1.2 Master Host Server Specifications

Processor
  Specifications: Intel X GHz (6 core)
  Quantity: 2

Memory
  Specifications: DDR MHz
  Quantity: 48 GB

Dual-port Converged Network Adapter
  Specifications: 2 x 10 Gbps
  Quantity: 1

Quad-port Network Adapter
  Specifications: 4 x 1 Gbps
  Quantity: 1

RAID controller
  Specifications: Dual channel 6 Gb/s SAS
  Quantity: 1

Hard Disks
  Specifications: 600 GB 10 K RPM SAS (one RAID5 volume of 4+1 with 1 hot spare). The Master Host Server utilizes the same drives in balanced and capacity systems.
Segment/DIA Host Specifications

The following diagram shows an example of how a host is configured in the Greenplum DCA and DIA. Each segment host serves 6 Greenplum Database primary segment instances and 6 mirror segment instances. On the DIA, Greenplum recommends running two gpfdist loading processes per host; however, this value may change based on the actual environment.

Figure 1.9 Host Configuration on the Greenplum DCA / DIA

Table 1.3 Host Server Specifications

Processor
  Specifications: Intel X GHz (6 core)
  Quantity: 2

Memory
  Specifications: DDR MHz
  Quantity: 48 GB

Dual-port Converged Network Adapter
  Specifications: 2 x 10 Gbps
  Quantity: 1

Dual-port Network Adapter
  Specifications: 2 x 1 Gbps
  Quantity: 1

RAID controller
  Specifications: Dual channel 6 Gb/s SAS
  Quantity: 1

Hard Disks
  Specifications: Balanced System: 600 GB 15 K RPM SAS (two RAID5 volumes of 5+1 disks). Capacity System and DIA: 2TB 7.2K RPM SATA (two RAID5 volumes of 5+1 disks).
  Quantity: 12
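The host counts and per-host figures above lend themselves to a quick sanity check. The following is a minimal Python sketch, not an EMC tool: it multiplies out the segment instances per configuration and the disk capacity that remains after RAID5 parity. The function and constant names are illustrative assumptions, and a "5+1" volume is assumed to mean five disks of data plus one disk's worth of parity. The totals are not the usable capacities quoted earlier in this guide (36TB and 124TB for a full rack); those are smaller, presumably because mirror segments hold a second copy of the data and further overhead applies, which this sketch does not attempt to model.

```python
# Segment host counts per DCA configuration, from Table 1.1.
SEGMENT_HOSTS = {"GP10": 4, "GP100": 8, "GP1000": 16}

# Per-host figures from the segment host description and Table 1.3.
PRIMARIES_PER_HOST = 6
MIRRORS_PER_HOST = 6
RAID5_VOLUMES_PER_HOST = 2
DATA_DISKS_PER_VOLUME = 5  # assumption: "5+1" = five data disks plus parity


def segment_instances(hosts):
    """Return (primary, mirror) segment instance counts for a host count."""
    return hosts * PRIMARIES_PER_HOST, hosts * MIRRORS_PER_HOST


def post_raid_capacity_gb(hosts, disk_size_gb):
    """Capacity left after RAID5 parity, before mirroring or other overhead."""
    data_disks = RAID5_VOLUMES_PER_HOST * DATA_DISKS_PER_VOLUME
    return hosts * data_disks * disk_size_gb


for model, hosts in SEGMENT_HOSTS.items():
    primaries, mirrors = segment_instances(hosts)
    balanced_gb = post_raid_capacity_gb(hosts, 600)    # 600 GB SAS drives
    capacity_gb = post_raid_capacity_gb(hosts, 2000)   # 2 TB SATA drives
    print(f"{model}: {primaries} primaries, {mirrors} mirrors, "
          f"{balanced_gb / 1000:.0f} TB balanced / "
          f"{capacity_gb / 1000:.0f} TB capacity after RAID5")
```

For a full rack (16 segment hosts) this works out to 96 primary and 96 mirror segment instances, with 96 TB (balanced) or 320 TB (capacity) remaining after RAID5 parity.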
Network Component Specifications

Interconnect Switch
  Specifications: 24-port Converged Enhanced Ethernet (CEE), Fibre Channel over Ethernet (FCoE). 8 Fibre Channel Ports (future use).
  Quantity: 2

Admin Switch
  Specifications: 24-port 1 Gb Ethernet Layer 3
  Quantity: 1

Additional Solutions - Data Domain Backup and Recovery

EMC Data Domain deduplication storage systems dramatically reduce the amount of disk storage needed to retain and protect enterprise data. By identifying redundant data as it is being stored, Data Domain provides a storage footprint that is up to 30 times smaller, on average, than the original dataset. Backup data can then be efficiently replicated and retrieved over existing networks for streamlined disaster recovery and consolidated tape operations. This allows Data Domain appliances to integrate seamlessly into database architectures, maintaining existing backup strategies with no changes to scripts, backup processes, or system architecture.

Figure 1.10 Data Domain Backup Solution for DCA
About Greenplum Database

Greenplum Database is a massively parallel processing (MPP) database management system (DBMS). Greenplum Database 4.0 uses MPP as the backbone of its database architecture. MPP refers to a distributed system that has two or more individual servers, which carry out an operation in parallel. Each server has its own processor(s), memory, operating system, and storage. All servers communicate with each other over a common network. In this way, a single database system can effectively use the combined computational performance of all individual MPP servers to provide a powerful, scalable database system. Greenplum uses this high-performance system architecture to distribute the load of multi-terabyte data warehouses, and is able to use all of a system's resources in parallel to process a query.

Greenplum Database is based on PostgreSQL, and in most cases is very similar to PostgreSQL with regard to SQL support, features, configuration options, and end-user functionality. Database users interact with Greenplum Database as they would a regular PostgreSQL DBMS.

Greenplum Database is able to handle the storage and processing of large amounts of data by distributing the load across several servers or hosts. The master is the entry point to the Greenplum Database system. It is the database instance where clients connect and submit SQL statements. Greenplum DCA comes with two master hosts: one primary master and a standby master. The master coordinates the work across the other database instances in the system, the segments, which handle data processing and storage. Greenplum DCA comes with a configurable number of segment hosts. Each segment host serves 6 primary and 6 mirror Greenplum segment instances. The segments communicate with each other and with the master over the interconnect, which is the networking layer of Greenplum Database.

The DCA interconnect is configured on a private LAN and utilizes two high-speed network switches, offering each segment host 20 Gb non-blocking duplex bandwidth. The Greenplum primary and mirror segments are configured to use different interconnect switches in order to provide redundancy in the event of a single switch failure.
In addition to the interconnect switches, Greenplum DCA comes with an administration switch. Each master and segment server has a dedicated interface for remote system administration. This controller has its own processor, memory, battery, and network connection. This allows administrators to access the individual Greenplum DCA servers as if they were at the local console (terminal).

Figure 1.11 High-Level Greenplum Database Architecture

About the Master Hosts

The master is the entry point to the Greenplum Database system from the public LAN. It is the database process that accepts client connections and processes the SQL commands issued by the users of the system. Users connect to Greenplum Database through the master using PostgreSQL-compatible client programs such as psql or ODBC.

The master maintains the system catalog (a set of system tables that contain metadata about the Greenplum Database system itself); however, the master does not contain any user data. Data resides only on the segments. The master does the work of authenticating client connections, processing and planning the incoming SQL commands, distributing the workload between the segments, coordinating the results returned by each of the segments, and presenting the final results to the client program.
Master Redundancy - The Standby Master

Greenplum DCA also has a standby master host to serve as a backup in case the primary master becomes inoperable. The standby master is a warm standby, meaning failover is not automatic. If the primary master fails, an administrator can promote the standby master to be the active master for the Greenplum Database system.

The standby master is kept up to date by a transaction log replication process, which runs on the standby master host and keeps the data between the primary and standby master hosts synchronized. If the primary master fails, the log replication process is shut down, and the standby master can be activated in its place. Upon activation of the standby master, the replicated logs are used to reconstruct the state of the master host at the time of the last successfully committed transaction.

About the Segment Hosts

In Greenplum Database, the segments are where the database data is stored and where the majority of query processing takes place. User-defined tables and their indexes are distributed across the available segments in the Greenplum Database system, each segment containing a distinct portion of the data. Segment instances are the database server processes that serve segments. Users and administrators do not interact directly with the segments in a Greenplum Database system, but do so through the master.

Data Redundancy - Mirror Segments

Greenplum Database provides data redundancy by deploying mirror segments. Mirror segments allow database queries to fail over to a backup segment if the primary segment becomes unavailable. A mirror segment always resides on a different host than its corresponding primary segment.

A Greenplum Database system can remain operational if a segment host, network interface, or interconnect switch goes down, as long as all portions of data are available on the remaining active segments. During database operations, only the primary segment is active. Changes to a primary segment are copied over to its mirror using a file block replication process. Until a failure occurs on the primary segment, there is no live segment instance running on the mirror host, only the replication process.

Figure 1.12 Data Mirroring in Greenplum Database
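The two placement rules described so far (a mirror resides on a different host than its primary, and a primary and its mirror use different interconnect switches) can be expressed as a small consistency check. The Python sketch below is illustrative only: the `SegmentInstance` record, the host names (`sdw1`, `sdw2`), and the switch labels are hypothetical and do not correspond to any Greenplum catalog table or utility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentInstance:
    content_id: int  # which portion of the data this instance serves
    role: str        # "primary" or "mirror"
    host: str        # hypothetical host name
    switch: str      # hypothetical interconnect switch label


def placement_violations(instances):
    """Return human-readable problems found in a primary/mirror layout."""
    by_content = {}
    for inst in instances:
        by_content.setdefault(inst.content_id, {})[inst.role] = inst

    problems = []
    for content_id, pair in sorted(by_content.items()):
        primary, mirror = pair.get("primary"), pair.get("mirror")
        if primary is None or mirror is None:
            problems.append(f"content {content_id}: missing primary or mirror")
            continue
        # Rule 1: a mirror must live on a different host than its primary.
        if primary.host == mirror.host:
            problems.append(f"content {content_id}: mirror shares host {primary.host}")
        # Rule 2: primary and mirror must use different interconnect switches.
        if primary.switch == mirror.switch:
            problems.append(f"content {content_id}: mirror shares switch {primary.switch}")
    return problems


layout = [
    SegmentInstance(0, "primary", "sdw1", "switch-a"),
    SegmentInstance(0, "mirror", "sdw2", "switch-b"),
    SegmentInstance(1, "primary", "sdw2", "switch-a"),
    SegmentInstance(1, "mirror", "sdw2", "switch-a"),  # deliberately misplaced
]
for problem in placement_violations(layout):
    print(problem)
```

In this example, content 0 satisfies both rules, while content 1 is reported twice: once for sharing a host and once for sharing a switch. A layout that passes both checks is one in which the loss of a single host or a single interconnect switch still leaves one copy of every portion of the data reachable.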
In the event of a segment failure, the file replication process is stopped and the mirror segment is automatically brought up as the active segment instance. All database operations then continue using the mirror. While the mirror is active, it is also logging all transactional changes made to the database. When the failed segment is ready to be brought back online, administrators initiate a recovery process to bring it back into operation.

About the Network Configuration

The following diagram shows an example of how the network is configured in Greenplum GP1000 (full-rack configuration). The Greenplum Database interconnect and administration networks are configured on a private LAN. Outside access to Greenplum Database and to the Greenplum DCA systems goes through the master host.

Figure 1.13 Greenplum DCA Network Configuration

About the Greenplum Interconnect Networks

The interconnect is the networking layer of Greenplum Database. When a user connects to a database and issues a query, processes are created on each of the segments to handle the work of that query. The interconnect refers to the inter-process communication between the segments, as well as the network infrastructure on which this communication relies.
To maximize throughput, interconnect activity is load-balanced over two interconnect networks. To ensure redundancy, a primary segment and its corresponding mirror segment utilize different interconnect networks. With this configuration, Greenplum Database can continue its operations in the event of a single interconnect switch failure.

About the Greenplum DCA Administration Network

The administration network carries system management facilities and Greenplum administration utilities, so that they do not interfere with the network traffic related to database processing. Each master and segment host has one administration/iDRAC network interface.

About iDRAC

The Integrated Dell Remote Access Controller (iDRAC) is a built-in interface in Dell servers that provides out-of-band system management facilities. The controller has its own processor, memory, battery, network connection, and access to the system bus. Key features include power management, virtual media access, and remote console capabilities, all available through a supported web browser. This gives system administrators the ability to manage a machine as if they were sitting at the local console. For more information about iDRAC, see the iDRAC User Guide.
Chapter 2: Greenplum DCA / DIA Administration

This chapter describes the general database maintenance tasks and the tools available to diagnose, monitor, and troubleshoot a Greenplum Database system running on the Greenplum Data Computing Appliance. The Greenplum DIA supports ConnectEMC for dial home of hardware-related issues, as well as health monitoring through Greenplum Performance Monitor.

- Troubleshooting and Diagnostic Tools
- Database and System Monitoring Tools
- General Database Maintenance Tasks

Troubleshooting and Diagnostic Tools

Greenplum Database provides the following troubleshooting and diagnostic tools. More information on these tools can be found in the Greenplum Database Administrator Guide.

Table 2.1 Greenplum Database Diagnostic Tools

gpcheck
  A Greenplum command-line management utility that can be used to validate the configuration and operating system settings of the DCA hosts.

gpcheckperf
  A Greenplum command-line management utility that can be used to validate baseline hardware performance. If you are experiencing slower than expected response times, running this utility can help identify whether the issue is related to a hardware failure rather than a SQL workload or software problem.

gpstate
  A Greenplum command-line management utility that can be used to check the status and configuration of a running Greenplum Database system. It can be used to identify segment failures and the general health of a Greenplum Database system.

gp_toolkit schema
  An administrative schema that is installed into every database within Greenplum Database. It contains a number of helpful views and database functions to help administrators diagnose common problems, such as checking for tables that need routine maintenance. Administrators access this schema by connecting to any database and issuing SQL queries against the views and functions in this schema.

gpssh
  A Greenplum command-line management utility that allows administrators to run shell commands on multiple hosts at once using SSH (secure shell). This allows administrators to execute a command on every segment host at once without having to log in to each machine individually.

gplogfilter
  A Greenplum command-line management utility that can be used to search through Greenplum Database log files for specific entries. This can be useful in tracking down more information in the logs about certain database activity or errors.

Database and System Monitoring Tools

Greenplum Data Computing Appliance provides various tools to monitor the status of Greenplum Database as well as the hardware components that Greenplum Database is running on.

- ConnectEMC Dial Home Capability
- Greenplum Performance Monitor
- Greenplum Database System Catalogs
- Greenplum Database and SNMP Alerting

ConnectEMC Dial Home Capability

The EMC Greenplum Data Computing Appliance and Data Integration Accelerator support dial home functionality through the ConnectEMC software. ConnectEMC is a support utility that collects and sends event data (files indicating system errors and other information) from EMC products to EMC Global Services customer support. ConnectEMC sends DCA event files using the secure file transfer protocol (FTPS). If an EMC Secure Remote Support Gateway (ESRS) is used for connectivity, HTTPS or FTP are available protocols for sending alerts.

The ConnectEMC software is configured on the DCA master and standby master servers, and event data is sent out through the external connection (eth1) either to an ESRS Gateway server or directly to EMC. The DIA routes notifications through the DCA, so a dedicated connection for dial home is not required.
Dial Home Severity Levels

Alerts that arrive at EMC Global Services can have one of the following severity levels:

- WARNING: This indicates a condition that might require immediate attention. This severity will create a service request.
- ERROR: This indicates that an error occurred on the Greenplum DCA or DIA. System operation and/or performance is likely affected. This severity will create a service request.
- UNKNOWN: This severity level is associated with hosts and devices on the Greenplum DCA or DIA that are either disabled (due to hardware failure) or unreachable for some other reason. This severity will create a service request.
- INFO: An event with this severity level indicates that a previously reported error condition is now resolved. An event with this severity level is also used to provide information about the system that does not require any action. This severity will not create a service request. For example, Greenplum Database startup triggers an INFO alert.

The severity of an event determines whether a service request is created for EMC support to act on. The events listed in Table 2.2, ConnectEMC Events and Symptoms Codes, can generate multiple severity levels based on the error condition. For example, the failure of a segment server disk drive will generate Symptom Code 13 with a severity of ERROR. The ConnectEMC software will dial home to Global Services customer support, and a service request will be created. Upon successful replacement of the disk drive, Symptom Code 13 will be generated again, this time with a severity of INFO to indicate that the disk drive was replaced.

Note: Monitoring in the DCA and DIA is primarily focused on hardware-related events. The monitoring of Greenplum Database events is limited in this release and will be expanded in future versions.

ConnectEMC Event Alerts

The table below lists all the conditions that cause ConnectEMC to send event data alerts to EMC Global Services.

Table 2.2 ConnectEMC Events and Symptoms Codes

Code 1 - Host Status: A host or device on the Greenplum DCA/DIA is either down or not reachable.

Code 2 - Greenplum Database Status: An alert generated by Greenplum Database. The alert can indicate successful startup (severity=info) or critical database conditions. The following four error conditions will generate an alert:
  - Unknown transaction status. This may indicate that a table is corrupted.
  - Database recovery interruption. This indicates that Greenplum Database should be restored from backup.
  - Two-phase state file for transaction. This may indicate corruption of database files.
  - Panic conditions. This indicates that Greenplum Database shutdown is imminent. This condition can be caused by data corruption and/or system resource issues.
  Other database error conditions are not monitored in the current DCA release.

Code 3 - Power Supply Status: An issue with a power supply was detected.

Code 4 - Battery Status: An issue with a battery was detected.

Code 5 - Cooling Device Status: An issue with a fan was detected.

Code 6 - Processor Status: An issue with a processor was detected.

Code 7 - Cache Device Status: An issue with a CPU L1/L2/L3 cache was detected.

Code 8 - OS Memory Status: An issue with OS memory was detected.

Code 9 - Memory Device Status: An issue with a RAM device was detected.

Code 10 - Network Device Status: An issue with a network interface card was detected. The device might be disconnected or is no longer serviceable.

Code 11 - Controller Status: An issue with an IO controller was detected.

Code 11 - Controller Battery Status: An issue with an IO controller battery was detected.

Code 12 - Virtual Disk Status: An issue with virtual disk configuration was detected.

Code 12 - Virtual Disk Write Policy: An issue with virtual disk configuration was detected. A sub-optimal performance policy can also trigger an alert.

Code 12 - Virtual Disk Read Policy: An issue with virtual disk configuration was detected. A sub-optimal performance policy can also trigger an alert.

Code 12 - Virtual Disk State: An issue with virtual disk configuration was detected. Any sub-optimal state can trigger an alert.

Code 13 - Array Disk Status: An issue with a hard disk was detected.

Code 14 - Sensor Status: An issue with a network switch hardware component was detected.

Code 15 - SNMP Monitoring Status: An issue with SNMP configuration on a device was detected. This issue is preventing monitoring functions from occurring.

Greenplum Performance Monitor

Greenplum Performance Monitor allows administrators to collect query and system performance metrics from a running Greenplum Database system. Monitor data is stored within Greenplum Database. Greenplum Performance Monitor consists of data collection agents that run on the master host and each segment host.
The agents collect performance data about active queries and system utilization and send it to the Greenplum master at regular intervals. The data is stored in a dedicated database on the master (called gpperfmon), where it can be accessed using the Greenplum Performance Monitor Console (a web application) or using SQL queries. Greenplum Performance Monitor Console is a browser-based application where administrators can view active and historical query and system metrics stored in the gpperfmon database. By default, Greenplum Performance Monitor Console is installed on the Greenplum Database master host using HTTP port It can be accessed through a browser using a URL such as
To log in to Greenplum Performance Monitor Console, your Greenplum Database administrator must assign you a username and password (or see the Greenplum Performance Monitor Administrator Guide for instructions on granting access). The Dashboard, System Metrics, and Query Monitor tabs of Greenplum Performance Monitor Console show information about active and historical database and system workload. This information can help an administrator track system utilization and performance for specific queries and usage periods.

Figure 2.1 Performance Monitor Dashboard
The Greenplum Performance Monitor Console for Greenplum DCA has an additional Health tab, which shows the status of Greenplum DCA and DIA hardware components. For example, a failed network interface would show up as a server error. A failed fan would show as a warning.

Figure 2.2 Performance Monitor Health Tab

Greenplum Database System Catalogs
Greenplum Database stores metadata about the database system in special tables and views within the database called system catalogs. Database superusers can access the information in these catalogs using SQL commands. The following is a list of helpful system catalogs that administrators can query to check database activity and system status. For more information on these system catalogs, see the Greenplum Database Administrator Guide.

Table 2.3 Useful Greenplum Database System Catalogs
Catalog Name | Description
pg_resqueue_status | Shows status and activity for a workload management resource queue. It shows how many queries are waiting to run and how many queries are currently active in the system from a particular resource queue.
pg_stat_activity | Shows one row per master database process, showing the database name, process ID, user name, current query, query's waiting status, time at which the current query began execution, time at which the process was started, and client's address and port number.
pg_stat_last_operation | Shows the last time certain database operations were performed on a database object, for example, the last time a table was vacuumed.
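As a rough illustration of querying one of the catalogs in Table 2.3 from the command line, the sketch below wraps psql in a small shell function. The function name show_activity is hypothetical, and the column names are an assumption based on the PostgreSQL 8.2 layout of pg_stat_activity; verify them against your release before relying on them.

```shell
#!/bin/bash
# show_activity DBNAME
# Hypothetical helper: lists the sessions recorded in pg_stat_activity.
# Assumption: column names follow the PostgreSQL 8.2-style view
# (datname, procpid, usename, waiting, query_start, current_query).
show_activity() {
  local dbname=$1
  psql -d "$dbname" -c "SELECT datname, procpid, usename, waiting, query_start, current_query FROM pg_stat_activity ORDER BY query_start;"
}

# Example, run as a database superuser on the master host:
#   show_activity template1
```

The same pattern could be applied to pg_resqueue_status or pg_stat_last_operation by changing the query text.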
Greenplum Database and SNMP Alerting
The Greenplum Database system can be configured to trigger SNMP alerts or send notifications to system administrators whenever certain database events occur. These events can include fatal server errors, segment shutdown and recovery, and database system shutdown and restart. See the Greenplum Database Administrator Guide for instructions on enabling system alerts and notifications.

General Database Maintenance Tasks
Greenplum Database, like any database management system, requires that certain tasks be performed regularly to achieve optimum performance. The tasks discussed here are required, but they are repetitive and can easily be automated using standard UNIX tools such as cron scripts. It remains the database administrator's responsibility to set up appropriate scripts and to check that they execute successfully.
- Routine Vacuum and Analyze
- Routine Reindexing
- Managing Greenplum Database Log Files

Routine Vacuum and Analyze
Because of the multi-version concurrency control (MVCC) transaction model used in Greenplum Database, data rows that are deleted or updated still occupy physical space on disk even though they are not visible to any new transactions. If you have a database with many updates and deletes, you will generate many expired rows. Running the VACUUM SQL command will reclaim this disk space. The VACUUM command also collects table-level statistics such as number of rows and pages, so it is necessary to periodically run VACUUM on all tables.

Transaction ID Management
Greenplum's MVCC transaction semantics depend on being able to compare transaction ID (XID) numbers to determine visibility to other transactions.
But since transaction IDs have limited size, a Greenplum system that runs for a long time (more than 4 billion transactions) would suffer transaction ID wraparound: the XID counter wraps around to zero, and suddenly transactions that were in the past appear to be in the future, which means their outputs become invisible. To avoid this, it is necessary to run VACUUM on every table in every database at least once every two billion transactions. See the Greenplum Database Administrator Guide for more information.

System Catalog Maintenance
Numerous database updates with CREATE and DROP commands can cause the system catalog to grow to a size that affects system performance. For example, after a large number of DROP TABLE statements, overall system performance begins to degrade due to excessive data scanning during metadata operations on the catalog tables. Depending on your system, the performance loss may appear after anywhere from thousands to tens of thousands of DROP TABLE statements.
Greenplum recommends that you periodically run VACUUM on the system catalog to clear the space occupied by deleted objects. If numerous DROP statements are a part of regular database operations, it is safe and appropriate to run a system catalog maintenance procedure with VACUUM daily at off-peak hours. This can be done while the system is running and available. The following example script performs a VACUUM of the Greenplum Database system catalog:

#!/bin/bash
DBNAME="<database_name>"
VCOMMAND="VACUUM ANALYZE"
# Generate one VACUUM ANALYZE statement per pg_catalog table, then pipe the statements into a second psql session.
psql -tc "select '$VCOMMAND' || ' pg_catalog.' || relname || ';' from pg_class a, pg_namespace b where a.relnamespace=b.oid and b.nspname='pg_catalog' and a.relkind='r'" $DBNAME | psql -a $DBNAME

Vacuum and Analyze for Query Optimization
Greenplum Database uses a cost-based query planner that relies on database statistics. Accurate statistics allow the query planner to better estimate selectivity and the number of rows retrieved by a query operation in order to choose the most efficient query plan. The ANALYZE command collects column-level statistics needed by the query planner. Both VACUUM and ANALYZE operations can be run in the same command. For example:

=# VACUUM ANALYZE mytable;

Routine Reindexing
For B-tree indexes, a freshly constructed index is somewhat faster to access than one that has been updated many times, because logically adjacent pages are usually also physically adjacent in a newly built index. It might be worthwhile to reindex periodically to improve access speed. Also, if all but a few index keys on a page have been deleted, there will be wasted space on the index page. A reindex will reclaim that wasted space. In Greenplum Database it is often faster to drop an index (DROP INDEX) and then recreate it (CREATE INDEX) than it is to use the REINDEX command. Bitmap indexes are not updated when changes are made to the indexed column(s).
If you have updated a table that has a bitmap index, you must drop and recreate the index for it to remain current.

Managing Greenplum Database Log Files
- Database Server Log Files
- Management Utility Log Files
Database Server Log Files
Greenplum Database log output tends to be voluminous (especially at higher debug levels) and you do not need to save it indefinitely. Administrators need to rotate the log files periodically so that new log files are started and old ones are removed after a reasonable period of time. Greenplum Database has log file rotation enabled on the master and all segment instances. Daily log files are created in the pg_log directory of the master and each segment data directory using the naming convention gpdb-yyyy-mm-dd.log. Although log files are rolled over daily, they are not automatically truncated or deleted. Administrators will need to implement a script or program to periodically clean up old log files in the pg_log directory of the master and each segment instance.

Management Utility Log Files
Log files for the Greenplum Database management utilities are written to ~/gpAdminLogs by default (in the home directory of the gpadmin user). The naming convention for management log files is: <script_name>_<date>.log
Each time a utility is run, its log output is appended to that utility's daily log file. Administrators will need to implement a script or program to periodically clean up old log files in ~/gpAdminLogs.
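The cleanup described above could be sketched with a small shell function built on the standard find utility. This is only an illustration under stated assumptions: the function name purge_old_logs, the 30-day retention period, and the example paths are hypothetical, and the -maxdepth and -delete options assume GNU find.

```shell
#!/bin/bash
# purge_old_logs DIR DAYS
# Hypothetical helper: deletes regular files named *.log directly inside DIR
# whose modification time is more than DAYS days in the past.
# Other files in DIR are left untouched.
purge_old_logs() {
  local dir=$1 days=$2
  [ -d "$dir" ] || return 0    # nothing to do if the directory is absent
  find "$dir" -maxdepth 1 -type f -name '*.log' -mtime +"$days" -delete
}

# Example calls (hypothetical retention period and paths):
#   purge_old_logs "$MASTER_DATA_DIRECTORY/pg_log" 30
#   purge_old_logs "$HOME/gpAdminLogs" 30
```

A function like this only cleans the host it runs on, so the pg_log directories of the segment instances would need the same treatment on each segment host. It could be scheduled from cron alongside the other maintenance tasks.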
3. Connecting to Greenplum Database
This chapter describes how to connect to a Greenplum Database system running on the Greenplum Data Computing Appliance.
Note: Database users and administrators always connect to the Greenplum Database master.

Establishing a Database Session
Users can connect to Greenplum Database using a PostgreSQL-compatible client program, such as psql. Users and administrators always connect to Greenplum Database through the master - the segments cannot accept client connections. In order to establish a connection to the Greenplum Database master, you will need to know the following connection information and configure your client program accordingly.

Table 3.1 Client Connection Parameters
Connection Parameter | Description | Environment Variable
Database name | The name of the database to which you want to connect. For a newly initialized system, use the template1 database to connect for the first time to create your database. | $PGDATABASE
Host name | The host name of the Greenplum Database master. The default host is the local host. | $PGHOST
Port | The port number that the Greenplum Database master instance is running on. The default is | $PGPORT
User name | The database user (role) name to connect as. Every Greenplum Database system has one superuser account that is created automatically at initialization time. This account has the same name as the OS user who initialized the Greenplum system (gpadmin). | $PGUSER

Supported Client Applications
Users can connect to Greenplum Database using various client applications:
- A number of Greenplum Database Client Applications are provided with your Greenplum installation. The psql client application provides an interactive command-line interface to Greenplum Database.
- pgAdmin III for Greenplum Database is an enhanced version of the popular management tool pgAdmin III. Since version , the pgAdmin III client available from PostgreSQL Tools includes support for Greenplum-specific features. Installation packages are available for download from Greenplum Network and from the pgAdmin download site.
- Using standard Database Application Interfaces, such as ODBC and JDBC, users can create their own client applications that interface to Greenplum Database. Because Greenplum Database is based on PostgreSQL, it uses the standard PostgreSQL database drivers.
- Most Third-Party Client Tools that use standard database interfaces, such as ODBC and JDBC, can be configured to connect to Greenplum Database.

Greenplum Database Client Applications
Greenplum Database comes installed with a number of client applications located in $GPHOME/bin of your Greenplum Database master host installation. The following are the most commonly used client applications:

Table 3.2 Commonly Used Client Applications
Name | Usage
createdb | create a new database
createuser | define a new database role
dropdb | remove a database
dropuser | remove a role
psql | PostgreSQL interactive terminal
reindexdb | reindex a database
vacuumdb | vacuum and analyze a database

When using these client applications, you must connect to a database through the Greenplum master instance. You will need to know the name of your target database, the host name and port number of the master, and what database user name to connect as. This information can be provided on the command line using the options -d, -h, -p, and -U respectively.
If an argument is found that does not belong to any option, it will be interpreted as the database name first. All of these options have default values which will be used if the option is not specified. The default host is the local host. The default port number is The default user name is your OS system user name, as is the default database name. Note that OS user names and Greenplum Database user names are not necessarily the same. If the default values are not correct, you can save yourself some typing by setting the environment variables PGDATABASE, PGHOST, PGPORT, and PGUSER to the appropriate values.
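To make the fallback rules above concrete, here is a minimal shell sketch that prints which database, host, and user a client would target given the current environment. The function name show_connection_target is hypothetical. The port is deliberately left to the client's own default rather than hard-coded, and the database name is assumed to follow the resolved user name, which is how PostgreSQL's libpq is understood to behave.

```shell
#!/bin/bash
# show_connection_target
# Hypothetical helper: reports the connection settings implied by
# PGDATABASE, PGHOST, PGPORT and PGUSER, applying the documented defaults
# when a variable is unset.
show_connection_target() {
  local user="${PGUSER:-$(id -un)}"       # default user: OS user name
  local db="${PGDATABASE:-$user}"         # default database: same as the user name (assumption)
  local host="${PGHOST:-localhost}"       # default host: the local host
  local port="${PGPORT:-client default}"  # no port number is assumed here
  echo "database=$db host=$host port=$port user=$user"
}

# Example with hypothetical values:
#   export PGDATABASE=gpdatabase PGHOST=master_host PGUSER=gpadmin
#   show_connection_target
```

With the variables exported once in a shell profile, a plain psql invocation reaches the same target without repeating the -d, -h, and -U options.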
Connecting with psql
Depending on the default values used or the environment variables you have set, the following examples show how to access a database via psql:

$ psql -d gpdatabase -h master_host -p <port> -U gpadmin
$ psql gpdatabase
$ psql

If a user-defined database has not yet been created, you can access the system by connecting to the template1 database. For example:

$ psql template1

After connecting to a database, psql provides a prompt with the name of the database to which psql is currently connected, followed by the string => (or =# if you are the database superuser). For example:

gpdatabase=>

At the prompt, you may type in SQL commands. A SQL command must end with a ; (semicolon) in order to be sent to the server and executed. For example:

=> SELECT * FROM mytable;

Getting Help in psql
psql also has a number of meta-commands (backslash commands) that allow you to easily look up information in the Greenplum Database system catalogs. To see a list of all meta-commands, use \?. For example:

=> \?

To get help with SQL command syntax, use the \h meta-command. For example, to see a list of all available SQL commands:

=> \h

To see the syntax reference for a particular SQL command, follow the \h meta-command with the SQL command name. For example:

=> \h SELECT

Some other commonly used psql meta-commands are:

Table 3.3 Common psql Meta-Commands
Command | Description
\l | List all databases in the system.
\c <database_name> | Connect to the specified database.
\dn | List all schemas in the current database.
\dt | List all user-created tables in the current database.
\dtS | List all system catalog tables.
\d+ <object_name> | Show the definition of the specified database object (table, index, etc.).
\du | List all users (roles) in the system.
For more information on using the psql client application, see the Greenplum Database Administrator Guide.

pgAdmin III for Greenplum Database
pgAdmin III is an open source graphical user interface (GUI) for PostgreSQL, which is also compatible with Greenplum Database. As of version , the pgAdmin III client includes support for Greenplum-specific features. pgAdmin III for Greenplum Database supports the following Greenplum-specific features:
- External tables
- Append-only tables, including compressed append-only tables
- Table partitioning
- Resource queues
- Graphical EXPLAIN ANALYZE
- Greenplum server configuration parameters

Figure 3.1 Greenplum Options in pgAdmin III
Installing pgAdmin III for Greenplum Database
The installation package for pgAdmin III for Greenplum Database is available for download from the official pgAdmin III download site. Installation instructions are included in the installation package.

Documentation for pgAdmin III for Greenplum Database
For general help on the features of the graphical interface, select Help contents from the Help menu. For help with Greenplum-specific SQL support, select Greenplum Database Help from the Help menu. If you have an active internet connection, you will be directed to online Greenplum SQL reference documentation.

Database Application Interfaces
You may want to develop your own client applications that interface to Greenplum Database. PostgreSQL provides a number of database drivers for the most commonly used database application programming interfaces (APIs), which can also be used with Greenplum Database. These drivers are not packaged with the Greenplum Database base distribution. Each driver is an independent PostgreSQL development project and must be downloaded, installed and configured to connect to Greenplum Database. The following drivers are available:

Table 3.4 Greenplum Database Interfaces
API | Driver | Download Link
ODBC | pgodbc | The PostgreSQL ODBC driver is available in the Greenplum Database Connectivity package, which can be downloaded from Greenplum Network.
ODBC | DataDirect ODBC Driver for Greenplum Database | DataDirect offers an enterprise ODBC Driver for Greenplum Database. reenplum/index.html
JDBC | pgjdbc | Available in the Greenplum Database Connectivity package, which can be downloaded from Greenplum Network.
Perl DBI | pgperl |
Python DBI | pygresql |

Third-Party Client Tools
Most third-party extract-transform-load (ETL) and business intelligence (BI) tools use standard database interfaces, such as ODBC and JDBC, and can be configured to connect to Greenplum Database.
Greenplum has certified the following third-party client applications with Greenplum Database:
- Business Objects
- Microstrategy
- Informatica Power Center
- Microsoft SQL Server Integration Services (SSIS) and Reporting Services (SSRS)
- Ascential Datastage
- SAS
- Cognos
Greenplum Professional Services can assist users in configuring their chosen third-party tool for use with Greenplum Database.

Troubleshooting Connection Problems
A number of things can prevent a client application from successfully connecting to Greenplum Database. This section explains some of the common causes of connection problems and how to correct them.

Table 3.5 Common Connection Problems
Problem | Solution
No pg_hba.conf entry for host or user | In order for Greenplum Database to be able to accept remote client connections, you must configure your Greenplum Database master instance so that connections are allowed from the client hosts and database users that will be connecting to Greenplum Database. This is done by adding the appropriate entries to the pg_hba.conf configuration file (located in the master instance's data directory). For more detailed information, see the Greenplum Database Administrator Guide.
Greenplum Database is not running | If the Greenplum Database master instance is down, users will not be able to connect. You can verify that the Greenplum Database system is up by running the gpstate utility on the Greenplum master host.
Network problems | If users are connecting to the Greenplum master host from a remote client, network problems may be preventing a connection (for example, DNS host name resolution problems, the host system is down, etc.). To ensure that network problems are not the cause, try connecting to the Greenplum master host from the remote client host. For example: ping hostname
Too many clients already | By default, Greenplum Database is configured to allow a maximum of 25 concurrent user connections. A connection attempt that causes that limit to be exceeded will be refused. This limit is controlled by the max_connections parameter in the postgresql.conf configuration file of the Greenplum Database master. See the Greenplum Database Administrator Guide for more information on increasing the allowed connections.
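For the first problem in Table 3.5, the shape of a pg_hba.conf entry can be sketched as follows. The helper name add_hba_entry and all example values (database, role, network, and the md5 authentication method) are hypothetical. The sketch assumes the standard PostgreSQL record layout of connection type, database, user, client address in CIDR notation, and authentication method.

```shell
#!/bin/bash
# add_hba_entry FILE DATABASE USER CIDR METHOD
# Hypothetical helper: appends a "host" record to FILE unless an identical
# line is already present, so that repeated runs do not duplicate the entry.
add_hba_entry() {
  local file=$1 db=$2 user=$3 cidr=$4 method=$5
  local entry="host  $db  $user  $cidr  $method"
  grep -qxF "$entry" "$file" 2>/dev/null || echo "$entry" >> "$file"
}

# Example with hypothetical values, run as gpadmin on the master host:
#   add_hba_entry "$MASTER_DATA_DIRECTORY/pg_hba.conf" gpdatabase report_user 192.168.1.0/24 md5
```

The master has to re-read its configuration before a new entry takes effect; the Greenplum Database Administrator Guide describes the supported way to do that.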
4. Next Steps
This chapter explains the next steps to implementing your data warehouse requirements in Greenplum Database.

Understanding the SQL Features of Greenplum Database
No commercial database system is fully compliant with the SQL standard. Greenplum Database is almost fully compliant with the SQL 1992 standard, with most of the features from SQL 1999. Several features from SQL 2003 have also been implemented (most notably the SQL OLAP features). This section addresses the important conformance issues of Greenplum Database as they relate to the SQL standards. For a feature-by-feature list of Greenplum's support of the latest SQL standard, see the Greenplum Database Administrator Guide.
- Core SQL Conformance
- SQL 1992 Conformance
- SQL 1999 Conformance
- SQL 2003 Conformance
- SQL 2008 Conformance
- Greenplum and PostgreSQL Compatibility

Core SQL Conformance
In the process of building a parallel, shared-nothing database system and query optimizer, certain common SQL constructs are not currently implemented in Greenplum Database. The following SQL constructs are not supported:
1. UPDATE statements that update the distribution key columns of a hash-distributed Greenplum table. There is currently no way for the system to redistribute a row to a different segment when its hash value changes.
2. UPDATE and DELETE statements that require data to move from one segment to another. This restricts the use of joins in update and delete operations to hash-distributed tables that have the same distribution key column(s), and the join condition must specify equality on the distribution key column(s).
3. Correlated subqueries that Greenplum's parallel optimizer cannot internally rewrite into non-correlated joins. Most simple uses of correlated subqueries do work. Those that do not can be manually rewritten using outer joins.
4. Certain rare cases of multi-row subqueries that Greenplum's parallel optimizer cannot internally rewrite into equijoins.
5. Some set returning subqueries in EXISTS or NOT EXISTS clauses that Greenplum's parallel optimizer cannot rewrite into joins.
6. UNION ALL of joined tables with subqueries.
7. Set-returning functions in the FROM clause of a subquery.
8. Backwards scrolling cursors, including the use of FETCH PRIOR, FETCH FIRST, FETCH ABSOLUTE, and FETCH RELATIVE.
9. In CREATE TABLE statements (on hash-distributed tables): a UNIQUE or PRIMARY KEY clause must include all of (or a superset of) the distribution key columns. Because of this restriction, only one UNIQUE clause or PRIMARY KEY clause is allowed in a CREATE TABLE statement. UNIQUE or PRIMARY KEY clauses are not allowed on randomly-distributed tables.
10. CREATE UNIQUE INDEX statements that do not contain all of (or a superset of) the distribution key columns. CREATE UNIQUE INDEX is not allowed on randomly-distributed tables.
11. VOLATILE or STABLE functions cannot execute on the segments, and so are generally limited to being passed literal values as the arguments to their parameters.
12. Triggers are not supported since they typically rely on the use of VOLATILE functions.
13. Referential integrity constraints (foreign keys) are not enforced in Greenplum Database. However, users can declare foreign keys, and this information is kept in the system catalog.
14. Sequence manipulation functions CURRVAL and LASTVAL.
15. DELETE WHERE CURRENT OF and UPDATE WHERE CURRENT OF (positioned delete and positioned update operations).

SQL 1992 Conformance
The following features of SQL 1992 are not supported in Greenplum Database:
1. NATIONAL CHARACTER (NCHAR) and NATIONAL CHARACTER VARYING (NVARCHAR). Users can declare the NCHAR and NVARCHAR types; however, they are just synonyms for CHAR and VARCHAR in Greenplum Database.
2. CREATE ASSERTION statement.
3. INTERVAL literals are supported in Greenplum Database, but do not conform to the standard.
4. GET DIAGNOSTICS statement.
5. GRANT INSERT or UPDATE privileges on columns. Privileges can only be granted on tables in Greenplum Database.
6. GLOBAL TEMPORARY TABLEs and LOCAL TEMPORARY TABLEs. Greenplum TEMPORARY TABLEs do not conform to the SQL standard, but many commercial database systems have implemented temporary tables in the same way. Greenplum temporary tables are the same as VOLATILE TABLEs in Teradata.
7. UNIQUE predicate.
8. MATCH PARTIAL for referential integrity checks (most likely will not be implemented in Greenplum Database).

SQL 1999 Conformance
The following features of SQL 1999 are not supported in Greenplum Database:
1. Large Object data types: BLOB, CLOB, NCLOB. However, the BYTEA and TEXT columns can store very large amounts of data in Greenplum Database (hundreds of megabytes).
2. Recursive WITH clause or the WITH RECURSIVE clause (recursive queries). Non-recursive WITH clauses can easily be rewritten by moving the common table expression into the FROM clause as a derived table.
3. MODULE (SQL client modules).
4. CREATE PROCEDURE (SQL/PSM). This can be worked around in Greenplum Database by creating a FUNCTION that returns void, and invoking the function as follows: SELECT myfunc(args);
5. The PostgreSQL/Greenplum function definition language (PL/PGSQL) is a subset of Oracle's PL/SQL, rather than being compatible with the SQL/PSM function definition language. Greenplum Database also supports function definitions written in Python, Perl, and R.
6. BIT and BIT VARYING data types (intentionally omitted). These were deprecated in SQL 2003, and replaced in SQL
7. Greenplum supports identifiers up to 63 characters long. The SQL standard requires support for identifiers up to 128 characters long.
8. Prepared transactions (PREPARE TRANSACTION, COMMIT PREPARED, ROLLBACK PREPARED). This also means Greenplum does not support XA Transactions (2 phase commit coordination of database transactions with external transactions).
9. CHARACTER SET option on the definition of CHAR() or VARCHAR() columns.
10. Specification of CHARACTERS or OCTETS (BYTES) on the length of a CHAR() or VARCHAR() column. For example, VARCHAR(15 CHARACTERS) or VARCHAR(15 OCTETS) or VARCHAR(15 BYTES).
11. CURRENT_SCHEMA function.
12. CREATE DISTINCT TYPE statement. CREATE DOMAIN can be used as a work-around in Greenplum.
13. The explicit table construct.

SQL 2003 Conformance
The following features of SQL 2003 are not supported in Greenplum Database:
1. XML data type (PostgreSQL does support this).
2. MERGE statements.
3. IDENTITY columns and the associated GENERATED ALWAYS/GENERATED BY DEFAULT clause. The SERIAL or BIGSERIAL data types are very similar to INT or BIGINT GENERATED BY DEFAULT AS IDENTITY.
4. MULTISET modifiers on data types.
5. ROW data type.
6. Greenplum Database syntax for using sequences is non-standard. For example, nextval('seq') is used in Greenplum instead of the standard NEXT VALUE FOR seq.
7. GENERATED ALWAYS AS columns. Views can be used as a work-around.
8. The sample clause (TABLESAMPLE) on SELECT statements. The random() function can be used as a work-around to get random samples from tables.
9. NULLS FIRST/NULLS LAST clause on SELECT statements and subqueries (nulls are always last in Greenplum Database).
10. The partitioned join tables construct (PARTITION BY in a join).
11. GRANT SELECT privileges on columns. Privileges can only be granted on tables in Greenplum Database. Views can be used as a work-around.
12. For CREATE TABLE x (LIKE(y)) statements, Greenplum does not support the [INCLUDING | EXCLUDING] [DEFAULTS | CONSTRAINTS | INDEXES] clauses.
13. Greenplum array data types are almost SQL standard compliant with some exceptions. Generally customers should not encounter any problems using them.

SQL 2008 Conformance
The following features of SQL 2008 are not supported in Greenplum Database:
1. BINARY and VARBINARY data types. BYTEA can be used in place of VARBINARY in Greenplum Database.
2. FETCH FIRST or FETCH NEXT clause for SELECT, for example:
SELECT id, name FROM tab1 ORDER BY id OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;
Greenplum has LIMIT and LIMIT OFFSET clauses instead.
3. The ORDER BY clause is ignored in views and subqueries unless a LIMIT clause is also used. This is intentional, as the Greenplum optimizer cannot determine when it is safe to avoid the sort, causing an unexpected performance impact for such ORDER BY clauses. To work around, you can specify a really large LIMIT. For example:
SELECT * FROM mytable ORDER BY 1 LIMIT <large_number>;
4. The row subquery construct is not supported.
5. TRUNCATE TABLE does not accept the CONTINUE IDENTITY and RESTART IDENTITY clauses.

Greenplum and PostgreSQL Compatibility
Greenplum Database is based on PostgreSQL 8.2 with a few features added in from the 8.3 release. To support the distributed nature and typical workload of a Greenplum Database system, some SQL commands have been added or modified, and there are a few PostgreSQL features that are not supported. Greenplum has also added features not found in PostgreSQL, such as physical data distribution, parallel query optimization, external tables, resource queues for workload management and enhanced table partitioning. For full SQL syntax and references, see the Greenplum Database Administrator Guide.

Table 4.1 SQL Support in Greenplum Database
(columns: SQL Command; Supported in Greenplum; Modifications, Limitations, Exceptions)
ALTER AGGREGATE ALTER CONVERSION ALTER DATABASE ALTER DOMAIN
ALTER FILESPACE: Greenplum Database parallel tablespace feature - not in PostgreSQL
ALTER FUNCTION
ALTER GROUP: An alias for ALTER ROLE
ALTER INDEX ALTER LANGUAGE ALTER OPERATOR ALTER OPERATOR CLASS NO
ALTER RESOURCE QUEUE: Greenplum Database workload management feature - not in PostgreSQL.
ALTER ROLE: Greenplum Database Clauses: RESOURCE QUEUE queue_name | none
ALTER SCHEMA
ALTER SEQUENCE
ALTER TABLE: Unsupported Clauses / Options: CLUSTER ON; ENABLE/DISABLE TRIGGER. Greenplum Database Clauses: ADD | DROP | RENAME | SPLIT | EXCHANGE PARTITION; SET SUBPARTITION TEMPLATE; SET WITH (REORGANIZE=true | false); SET DISTRIBUTED BY
ALTER TABLESPACE ALTER TRIGGER ALTER TYPE NO
ALTER USER: An alias for ALTER ROLE
ANALYZE BEGIN CHECKPOINT CLOSE CLUSTER COMMENT COMMIT COMMIT PREPARED NO
COPY: Modified Clauses: ESCAPE [ AS ] 'escape' | 'OFF'. Greenplum Database Clauses: [LOG ERRORS INTO error_table] SEGMENT REJECT LIMIT count [ROWS | PERCENT]
CREATE AGGREGATE: Unsupported Clauses / Options: [, SORTOP = sort_operator ]. Greenplum Database Clauses: [, PREFUNC = prefunc ]. Limitations: The functions used to implement the aggregate must be IMMUTABLE functions.
CREATE CAST CREATE CONSTRAINT TRIGGER CREATE CONVERSION NO
CREATE DATABASE
CREATE DOMAIN
CREATE EXTERNAL TABLE
    Greenplum Database parallel ETL feature - not in PostgreSQL
CREATE FILESPACE
    Greenplum Database parallel tablespace feature - not in PostgreSQL
CREATE FUNCTION
    Limitations: Functions defined as STABLE or VOLATILE can be executed in Greenplum Database provided that they are executed on the master only. STABLE and VOLATILE functions cannot be used in statements that execute at the segment level.
CREATE GROUP
    An alias for CREATE ROLE
CREATE INDEX
    Greenplum Database Clauses:
        USING bitmap (bitmap indexes)
    Limitations: UNIQUE indexes are allowed only if they contain all of (or a superset of) the Greenplum distribution key columns. The CONCURRENTLY keyword is not supported in Greenplum.
CREATE LANGUAGE
CREATE OPERATOR
    Limitations: The function used to implement the operator must be an IMMUTABLE function.
CREATE OPERATOR CLASS
CREATE OPERATOR FAMILY
NO NO
CREATE RESOURCE QUEUE
    Greenplum Database workload management feature - not in PostgreSQL
CREATE ROLE
    Greenplum Database Clauses: RESOURCE QUEUE queue_name | none
CREATE RULE
CREATE SCHEMA
CREATE SEQUENCE
    Limitations: The lastval and currval functions are not supported. The setval function is only allowed in queries that do not operate on distributed data.
CREATE TABLE
    Unsupported Clauses / Options:
        [GLOBAL | LOCAL]
        REFERENCES
        FOREIGN KEY
        [DEFERRABLE | NOT DEFERRABLE]
    Limited Clauses: UNIQUE or PRIMARY KEY constraints are only allowed on hash-distributed tables (DISTRIBUTED BY), and the constraint columns must be the same as or a superset of the table's distribution key columns.
    Greenplum Database Clauses:
        DISTRIBUTED BY (column, [... ] ) | DISTRIBUTED RANDOMLY
        PARTITION BY type (column [,...]) ( partition_specification, [...] )
        WITH (appendonly=true [,compresslevel=value,blocksize=value] )
CREATE TABLE AS
    See CREATE TABLE
CREATE TABLESPACE
NO
    Greenplum Database Clauses: FILESPACE filespace_name
CREATE TRIGGER
NO
CREATE TYPE
    Limitations: The functions used to implement a new base type must be IMMUTABLE functions.
CREATE USER
    An alias for CREATE ROLE
CREATE VIEW
DEALLOCATE
DECLARE
    Unsupported Clauses / Options:
        SCROLL
        FOR UPDATE [ OF column [,...] ]
    Limitations: Cursors are non-updatable and cannot be backward-scrolled. Forward scrolling is supported.
DELETE
    Unsupported Clauses / Options:
        RETURNING
    Limitations:
        Joins must be on a common Greenplum distribution key (equijoins).
        Cannot use STABLE or VOLATILE functions in a DELETE statement if mirrors are enabled.
DROP AGGREGATE
DROP CAST
DROP CONVERSION
DROP DATABASE
DROP DOMAIN
DROP EXTERNAL TABLE
    Greenplum Database parallel ETL feature - not in PostgreSQL
DROP FILESPACE
    Greenplum Database parallel tablespace feature - not in PostgreSQL
DROP FUNCTION
DROP GROUP
    An alias for DROP ROLE
DROP INDEX
DROP LANGUAGE
DROP OPERATOR
DROP OPERATOR CLASS
DROP OWNED
NO NO
DROP RESOURCE QUEUE
    Greenplum Database workload management feature - not in PostgreSQL
DROP ROLE
DROP RULE
DROP SCHEMA
DROP SEQUENCE
DROP TABLE
DROP TABLESPACE
DROP TRIGGER
DROP TYPE
NO NO
DROP USER
    An alias for DROP ROLE
DROP VIEW
END
EXECUTE
EXPLAIN
FETCH
    Unsupported Clauses / Options:
        LAST
        PRIOR
        BACKWARD
        BACKWARD ALL
    Limitations: Cannot fetch rows in a nonsequential fashion; backward scan is not supported.
GRANT
INSERT
    Unsupported Clauses / Options:
        RETURNING
LISTEN
LOAD
LOCK
NO
MOVE
    See FETCH
NOTIFY
PREPARE
PREPARE TRANSACTION
REASSIGN OWNED
REINDEX
RELEASE SAVEPOINT
RESET
REVOKE
ROLLBACK
ROLLBACK PREPARED
ROLLBACK TO SAVEPOINT
SAVEPOINT
NO NO NO
SELECT
    Limitations:
        Limited use of VOLATILE and STABLE functions in FROM or WHERE clauses
        Limited use of correlated subquery expressions
        Text search (Tsearch2) is not supported
        FETCH FIRST or FETCH NEXT clauses not supported
    Greenplum Database Clauses (OLAP):
        [GROUP BY grouping_element [,...]]
        [WINDOW window_name AS (window_specification)]
        [FILTER (WHERE condition)] applied to an aggregate function in the SELECT list
SELECT INTO
    See SELECT
SET
SET CONSTRAINTS
NO
    In PostgreSQL, this only applies to foreign key constraints, which are currently not enforced in Greenplum Database.
SET ROLE
SET SESSION AUTHORIZATION
    Deprecated as of PostgreSQL - see SET ROLE.
SET TRANSACTION
SHOW
START TRANSACTION
TRUNCATE
UNLISTEN
NO
UPDATE
    Unsupported Clauses:
        RETURNING
    Limitations:
        SET not allowed for Greenplum distribution key columns.
        Joins must be on a common Greenplum distribution key (equijoins).
        Cannot use STABLE or VOLATILE functions in an UPDATE statement if mirrors are enabled.
VACUUM
    Limitations: VACUUM FULL is not recommended in Greenplum Database.
VALUES

Providing User Access to Greenplum Database

Greenplum Database manages database access permissions using the concept of roles. The concept of roles subsumes the concepts of users and groups. A role can be a database user, a group, or both. Roles can own database objects (for example, tables) and can assign privileges on those objects to other roles to control access to the objects. Roles can be members of other roles, so a member role can inherit the object privileges of its parent role.

Every Greenplum Database system contains a set of database roles (users and groups). Those roles are separate from the users and groups managed by the operating system on which the database process runs. However, for convenience you may want to maintain a relationship between operating system user names and Greenplum Database role names, since many of the client applications use the current operating system user name as the default.

In Greenplum Database, users log in and connect through the master instance, which then verifies their role and access privileges.

To bootstrap the Greenplum Database system, a freshly initialized system always contains one predefined superuser role. This role has the same name as the operating system user that initialized the Greenplum Database system. Customarily, this role is named gpadmin. To create more roles, you first have to connect as this initial role.

See the Greenplum Database Administrator Guide for more information on creating additional roles in Greenplum Database.

Creating Databases and Loading Data

After establishing your database connections, the next step is to begin creating databases and loading data. See the Greenplum Database Administrator Guide for more information about creating databases, schemas, tables, and other database objects in Greenplum Database and loading your data.
Glossary

A

append-only tables
An append-only (AO) table is a storage representation that allows only appending new rows to a table, but does not allow updating or deleting existing rows. This allows for more compact storage on disk because each row does not need to store the MVCC transaction visibility info. This saves 20 bytes per row. AO tables can also be compressed.

array
The set of physical devices (hosts, servers, network switches, etc.) used to house a Greenplum Database system.

B

bandwidth
Bandwidth is the maximum amount of information that can be transmitted along a channel, such as a network or I/O channel. This data transfer rate is usually measured in megabytes or gigabytes per second (MB/s or GB/s).

BI
Business Intelligence (BI) is a broad category of applications and technologies for gathering, storing, analyzing, and providing access to data with the goal of helping users make better business decisions.

C

catalog
See system catalog.

column-oriented table
Greenplum provides a choice of storage orientation models for a table: row or column. A column-oriented table stores its content on disk by column rather than by row. This storage model has performance advantages for certain types of queries. Only append-only tables can be column-oriented; heap tables are always row-oriented.
D

data directory
The data directory is the file system location on disk where database data is stored. The master data directory contains the global system catalog only; no user data is stored on the master. The data directory on the segment instances has user data for that segment plus a local copy of the system catalog. The data directory also contains several subdirectories, control files, and configuration files.

DCA
Data Computing Appliance. See Greenplum Data Computing Appliance.

distributed
Certain database objects in Greenplum Database, such as tables and indexes, are distributed. They are divided into equal parts and spread out among the segment instances based on a hashing algorithm. To the end-user and client software, however, a distributed object appears as a conventional database object.

distribution key
In a Greenplum table that uses hash distribution, one or more columns are used as the distribution key, meaning those columns are used to divide the data among all of the segments. The distribution key should be the primary key of the table or a unique column or set of columns.

distribution policy
The distribution policy determines how to divide the rows of a table among the Greenplum segments. Greenplum Database provides two types of distribution policy: hash distribution and random distribution.

DDL
Data Definition Language. A subset of SQL commands used for defining the structure of a database.

DML
Data Manipulation Language. SQL commands that store, manipulate, and retrieve data from tables. INSERT, UPDATE, DELETE, and SELECT are DML commands.

E

ELT
Extract, load, and transform (ELT) is a process in data warehousing that involves extracting data from outside data sources, loading the raw data into a high-performance database management system (such as Greenplum Database), and then performing the data transformations within the database itself.
ETL
Extract, transform, and load (ETL) is a process in data warehousing that involves extracting data from outside data sources, transforming it to meet the operational requirements of the data warehouse, and loading it into the target database.

G

gang
For each slice of the query plan there is at least one query executor worker process assigned. During query execution, each segment will have a number of processes working on the query in parallel. Related processes that are working on the same portion of the query plan on different segments are referred to as gangs.

Greenplum Database
Greenplum Database is the industry's first massively parallel processing (MPP) database server based on open-source technology. It is explicitly designed to support business intelligence (BI) applications and large, multi-terabyte data warehouses. Greenplum Database is based on PostgreSQL.

Greenplum Database system
An associated set of segment instances and a master instance running on an array, which can be composed of one or more hosts.

Greenplum Data Computing Appliance
Greenplum Data Computing Appliance (Greenplum DCA) is a self-contained data warehouse solution that integrates all of the database software, servers and switches necessary to perform big data analytics. Greenplum DCA is delivered racked and ready for immediate data loading and query execution.

Greenplum GP100
The model name of the Greenplum Data Computing Appliance half rack solution.

Greenplum GP1000
The model name of the Greenplum Data Computing Appliance full rack solution.

Greenplum instance
The process that serves a database. An instance of Greenplum Database is composed of a master instance and two or more segment instances; however, users and administrators always connect to the database via the master instance.

GP100
See Greenplum GP100.

GP1000
See Greenplum GP1000.
H

hash distribution
With hash distribution, one or more table columns are used as the distribution key for the table. The distribution key is used by a hashing algorithm to assign each row to a particular segment. Keys of the same value will always hash to the same segment.

heap tables
Whenever you create a table without specifying a storage structure, the default is a heap storage structure. In a heap structure, the table is an unordered collection of data that allows multiple copies or versions of a row. Heap tables have row-level versioning information and allow updates and deletes. See also append-only tables and multiversion concurrency control.

host
A host represents a physical machine or compute node in a Greenplum Database system. In Greenplum Database, one host is designated as the master. The other hosts in the system have one or more segments on them.

I

interconnect
The interconnect is the networking layer of Greenplum Database. When a user connects to a database and issues a query, processes are created on each of the segments to handle the work of that query. The interconnect refers to the inter-process communication between the segments and master, as well as the network infrastructure on which this communication relies.

I/O
Input/Output (I/O) refers to the transfer of data to and from a system or device using a communication channel.

J

JDBC
Java Database Connectivity is an application program interface (API) specification for connecting programs written in Java to data in a database management system (DBMS). The application program interface lets you encode access request statements in SQL that are then passed to the program that manages the database.
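The hash distribution entry above states that keys of the same value always hash to the same segment. The following is a minimal Python sketch of that idea, not Greenplum's implementation: the segment count, the use of CRC32 as the hash, and the names segment_for and distribute are all assumptions made for illustration.

```python
import zlib

NUM_SEGMENTS = 4  # assumed segment count, for illustration only


def segment_for(key: str, num_segments: int = NUM_SEGMENTS) -> int:
    """Map a distribution key value to a segment index.

    CRC32 is a stand-in hash; it is not the function Greenplum uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(rows, key_column, num_segments=NUM_SEGMENTS):
    """Group rows (dicts) by the segment their distribution key hashes to."""
    segments = {i: [] for i in range(num_segments)}
    for row in rows:
        target = segment_for(str(row[key_column]), num_segments)
        segments[target].append(row)
    return segments


rows = [
    {"customer_id": 101, "amount": 20},
    {"customer_id": 102, "amount": 35},
    {"customer_id": 101, "amount": 15},
]
placement = distribute(rows, "customer_id")
```

Because the segment is a pure function of the key value, both rows for customer 101 land in the same list, which is what lets rows with matching keys be processed together on one segment.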
M

master
The master is the entry point to a Greenplum Database system. It is the database listener process (postmaster) that accepts client connections and dispatches the SQL commands issued by the users of the system. The master is where the global system catalog resides. However, the master does not contain any user data. User data resides only on the segments. The master does the work of authenticating user connections, parsing and planning the incoming SQL commands, distributing the query plan to the segments for execution, coordinating the results returned by each of the segments, and presenting the final results to the user.

master instance
The database process that serves the Greenplum master. See master.

mirror
A mirror is a backup copy of a segment (or master) that is stored on a different host than the primary copy. Mirrors are useful for maintaining operations if a host in your Greenplum Database system fails. Mirroring is an optional feature of Greenplum Database. Mirror segments are evenly distributed among other hosts in the array. If a host that holds a primary segment fails, Greenplum Database will switch to the mirror or secondary host.

motion node
A motion node is a portion of a query execution plan that indicates data movement between the various database instances of Greenplum Database (segments and the master). Some operations, such as joins, require segments to send and receive tuples to one another in order to satisfy the operation. A motion node can also indicate data movement from the segments back up to the master.

MPP
Massively Parallel Processing.
multiversion concurrency control
Unlike traditional database systems which use locks for concurrency control, Greenplum Database (like PostgreSQL) maintains data consistency by using a multiversion model (multiversion concurrency control or MVCC). This means that while querying a database, each transaction sees a snapshot of data which protects the transaction from viewing inconsistent data that could be caused by (other) concurrent updates on the same data rows. This provides transaction isolation for each database session. MVCC, by eschewing explicit locking methodologies of traditional database systems, minimizes lock contention in order to allow for reasonable performance in multiuser environments. The main advantage to using the MVCC model of concurrency control rather than locking is that in MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading.

MVCC
See multiversion concurrency control.

O

ODBC
Open Database Connectivity, a standard database access method that makes it possible to access any data from any client application, regardless of which database management system (DBMS) is handling the data. ODBC manages this by inserting a middle layer, called a database driver, between a client application and the DBMS. The purpose of this layer is to translate the application's data queries into commands that the DBMS understands.

OLAP
Online Analytical Processing (OLAP) is a category of technologies for collecting, managing, processing and presenting multidimensional data for analysis and management. OLAP leverages existing data from a relational schema or data warehouse (data source) by placing key performance indicators (measures) into context (dimensions). As of release 3.1, OLAP functions are supported in Greenplum Database. In practice, OLAP functions allow application developers to compose analytic business queries more easily and more efficiently. For example, moving averages and moving sums can be calculated over various intervals; aggregations and ranks can be reset as selected column values change; and complex ratios can be expressed in simple terms.

OLTP
Online Transactional Processing (OLTP) is a mode of database processing involving single, small updates from end-point applications and real-time transactional systems.
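The OLAP entry above mentions moving averages calculated over an interval. As a rough illustration of what such a calculation produces, here is a small Python sketch; it is not Greenplum SQL, and the function name moving_average, the sample data, and the window of three rows are assumptions chosen for the example.

```python
from collections import deque


def moving_average(values, window):
    """Yield the average of each value and up to window - 1 preceding values."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    total = 0.0
    for value in values:
        if len(recent) == window:
            # The oldest value is about to fall out of the window.
            total -= recent[0]
        recent.append(value)
        total += value
        yield total / len(recent)


daily_sales = [10, 20, 30, 40]  # hypothetical sample data
averages = list(moving_average(daily_sales, window=3))
# averages is [10.0, 15.0, 20.0, 30.0]
```

Each output row depends only on the current row and a bounded number of preceding rows, which is the shape of computation a window specification describes in a query.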
P

partitioned tables
Partitioning is a way to logically divide the data in a table for better performance and easier maintenance. In Greenplum Database, partitioning is a procedure that creates multiple sub-tables (or child tables) from a single large table (or parent table). The primary purpose is to improve performance by scanning only the relevant data needed to satisfy a query. Note that partitioned tables are also distributed.

Perl DBI
Perl Database Interface (DBI) is an API for connecting programs written in Perl to database management systems (DBMS). Perl DBI is the most common database interface for the Perl programming language.

PostgreSQL
PostgreSQL is a SQL compliant, open source relational database management system (RDBMS). Greenplum Database uses a modified version of PostgreSQL as its underlying database server. For more information on PostgreSQL, see the PostgreSQL project website.

postgresql.conf
The server configuration file that configures various aspects of the database server. This configuration file is located in the data directory of the database instance. In Greenplum Database, the master and each segment instance each have their own postgresql.conf file.

postgres process
The postgres executable is the actual PostgreSQL server process that processes queries. The database listener postgres process (also known as the postmaster) creates other postgres subprocesses as needed to handle client connections.

postmaster
In releases prior to Greenplum Database 3.2 and PostgreSQL 8.2, the database listener process was called postmaster. The postmaster process was renamed to postgres process in Greenplum Database 3.2 and PostgreSQL 8.2; however, many users who are familiar with PostgreSQL still refer to the database listener process as the postmaster. In Greenplum Database, there is a postgres database listener process for the Greenplum master instance and each segment instance.

psql
This is the interactive terminal to PostgreSQL and Greenplum Database. You can use psql to access a database and issue SQL commands.
Q

QD
See query dispatcher.

QE
See query executor.

query dispatcher
The query dispatcher (QD) is a process that is initiated when users connect to the master and issue SQL commands. This process represents a user session and is responsible for sending the query plan to the segments and coordinating the results it gets back. The query dispatcher process spawns one or more query executor processes to assist in the execution of SQL commands.

query executor
A query executor process (QE) is associated with a query dispatcher (QD) process and operates on its behalf. Query executor processes run on the segment instances and execute their slice of the query plan on a segment.

query plan
A query plan is the set of operations that Greenplum Database will perform to produce the answer to a given query. Each node or step in the plan represents a database operation such as a table scan, join, aggregation or sort. Plans are read and executed from bottom to top. Greenplum Database supports an additional plan node type called a motion node. See also slice.

R

rack
A type of shelving to which computer components can be attached vertically, one on top of the other. Components are normally screwed into front-mounted, tapped metal strips with holes which are spaced so as to accommodate the height of devices of various U-sizes. Racks usually have their height denominated in U-units.

RAID
Redundant Array of Independent (or Inexpensive) Disks. RAID is a system of using multiple hard drives for sharing or replicating data among the drives. The benefit of RAID is increased data integrity, fault-tolerance and/or performance. Multiple hard drives are grouped and seen by the OS as one logical hard drive.

RAM
Random Access Memory. The main memory of a computer system used for storing programs and data. RAM provides temporary read/write storage while hard disks offer semi-permanent storage.
random distribution
With random distribution, table rows are sent to the segments as they come in, cycling across the segments in a round-robin fashion. Rows with columns having the same values will not necessarily be located on the same segment. Although a random distribution ensures even data distribution, there are performance advantages to choosing a hash distribution policy whenever possible.

S

segment
A segment represents a portion of data in a Greenplum database. User-defined tables and their indexes are distributed across the available number of segment instances in the Greenplum Database system. Each segment instance contains a distinct portion of the user data. A primary segment instance and its mirror both store the same segment of data.

segment instance
The segment instance is the database server process (postmaster) that serves segments. Users do not connect to segment instances directly, but through the master.

server
See host.

slice
In order to achieve maximum parallelism during query execution, Greenplum divides the work of the query plan into slices. A slice is a portion of the plan that can be worked on independently at the segment level. A query plan is sliced wherever a motion node occurs in the plan, one slice on each side of the motion. Plans that do not require data movement (such as catalog lookups on the master) are known as single-slice plans.

star schema
A relational database design often used in data warehousing. The star schema is organized around a central table (fact table) joined to a few smaller tables (dimension tables) using foreign key references. The fact table contains raw numeric items that represent relevant business facts (price, number of units sold, etc.).

system catalog
The system catalogs are the place where a relational database management system stores schema metadata, such as information about tables and columns, and internal bookkeeping information. The system catalog in Greenplum Database is the same as the PostgreSQL catalog with some additional tables to support the distributed nature of the Greenplum system and databases. In Greenplum Database, the master contains the global system catalog tables. The segments also maintain their own local copy of the system catalog.
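The random distribution entry above describes rows cycling across the segments in a round-robin fashion. A minimal Python sketch of that behavior follows; it is an illustration only, and the function name distribute_round_robin, the sample rows, and the segment count are assumptions rather than anything taken from Greenplum.

```python
from itertools import cycle


def distribute_round_robin(rows, num_segments):
    """Assign rows to segments in arrival order, cycling through the segments."""
    segments = [[] for _ in range(num_segments)]
    for row, target in zip(rows, cycle(range(num_segments))):
        segments[target].append(row)
    return segments


rows = ["a", "a", "b", "a", "b", "c"]  # hypothetical column values
segments = distribute_round_robin(rows, num_segments=3)
sizes = [len(held) for held in segments]
# sizes is [2, 2, 2]; the value "a" ends up on more than one segment
```

The sketch shows both properties named in the entry: every segment receives the same number of rows, and rows with equal values are not kept together, which is why a hash distribution policy can have performance advantages.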
T

tuple
A tuple is another name for a row or record in a relational database table.

W

WAL
Write-Ahead Logging (WAL) is a standard approach to transaction logging. WAL's central concept is that changes to data files (where tables and indexes reside) are logged before they are written to permanent storage. Data pages do not need to be flushed to disk on every transaction commit. In the event of a crash, data changes not yet applied to the database can be recovered from the log. A major benefit of using WAL is a significantly reduced number of disk writes.
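The WAL entry above can be pictured with a toy model: a change is recorded in a log first, and the data file is updated later, either in a batch or during recovery. The Python sketch below is a simplified illustration under that assumption; the class TinyWalStore and its methods are hypothetical and do not reflect the PostgreSQL or Greenplum on-disk format.

```python
class TinyWalStore:
    """Toy key-value store that logs each change before applying it."""

    def __init__(self):
        self.log = []        # stands in for the durable write-ahead log
        self.data_file = {}  # stands in for data pages on permanent storage
        self.applied = 0     # number of log records already applied

    def commit(self, key, value):
        # Only the log record is written at commit time.
        self.log.append((key, value))

    def checkpoint(self):
        # Apply all pending log records to the data file in one batch.
        for key, value in self.log[self.applied:]:
            self.data_file[key] = value
        self.applied = len(self.log)

    def recover(self):
        # After a crash, replay records that never reached the data file.
        self.checkpoint()


store = TinyWalStore()
store.commit("balance", 100)
store.commit("balance", 80)
before = dict(store.data_file)  # still empty: nothing flushed yet
store.recover()
```

Two commits cost two log appends and a single later update of the data file, which mirrors the reduced number of data page writes the entry describes.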
PRODUCT DOCUMENTATION. Greenplum Database. Version 4.3. Administrator Guide. Rev: A01. 2014 Pivotal Software, Inc.
PRODUCT DOCUMENTATION Greenplum Database Version 4.3 Administrator Guide Rev: A01 2014 Pivotal Software, Inc. Copyright 2014 Pivotal Software, Inc. All rights reserved. Pivotal Software, Inc. believes
Pivotal Greenplum Database
PRODUCT DOCUMENTATION Pivotal Greenplum Database Version 4.3 Rev: A06 2015 Pivotal Software, Inc. Copyright Notice Copyright Copyright 2015 Pivotal Software, Inc. All rights reserved. Pivotal Software,
Greenplum Database 4.2 System Administrator Guide. Rev: A15
Greenplum Database 4.2 System Administrator Guide Rev: A15 Copyright 2014 Pivotal Software, Inc. All rights reserved. Pivotal Software, Inc. believes the information in this publication is accurate as
Greenplum Performance Monitor 4.0 Administrator Guide
The Data Computing Division of EMC Greenplum Performance Monitor 4.0 Administrator Guide P/N: 300-011-542 Rev: A02 Copyright 2010 EMC Corporation. All rights reserved. EMC believes the information in this
Greenplum Database 4.2
Greenplum Database 4.2 Database Administrator Guide Rev: A01 Copyright 2012 EMC Corporation. All rights reserved. EMC believes the information in this publication is accurate as of its publication date.
PRODUCT DOCUMENTATION. Greenplum Database. Version 4.2. Getting Started. Rev: A01. 2014 GoPivotal, Inc.
PRODUCT DOCUMENTATION Greenplum Database Version 4.2 Getting Started Rev: A01 2014 GoPivotal, Inc. Copyright 2014 GoPivotal, Inc. All rights reserved. GoPivotal, Inc. believes the information in this publication
Greenplum Database (software-only environments): Greenplum Database (4.0 and higher supported, 4.2.1 or higher recommended)
P/N: 300-014-087 Rev: A01 Updated: April 3, 2012 Welcome to Command Center Command Center is a management tool for the Big Data Platform. Command Center monitors system performance metrics, system health,
EMC Data Domain Management Center
EMC Data Domain Management Center Version 1.1 Initial Configuration Guide 302-000-071 REV 04 Copyright 2012-2015 EMC Corporation. All rights reserved. Published in USA. Published June, 2015 EMC believes
EMC Backup and Recovery for Microsoft SQL Server 2008 Enabled by EMC Celerra Unified Storage
EMC Backup and Recovery for Microsoft SQL Server 2008 Enabled by EMC Celerra Unified Storage Applied Technology Abstract This white paper describes various backup and recovery solutions available for SQL
EMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution
EMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution Release 3.0 User Guide P/N 300-999-671 REV 02 Copyright 2007-2013 EMC Corporation. All rights reserved. Published in the USA.
Trend Micro Incorporated reserves the right to make changes to this document and to the products described herein without notice.
Trend Micro Incorporated reserves the right to make changes to this document and to the products described herein without notice. Before installing and using the software, please review the readme files,
Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.
Oracle BI EE Implementation on Netezza Prepared by SureShot Strategies, Inc. The goal of this paper is to give an insight to Netezza architecture and implementation experience to strategize Oracle BI EE
MONITORING EMC GREENPLUM DCA WITH NAGIOS
White Paper MONITORING EMC GREENPLUM DCA WITH NAGIOS EMC Greenplum Data Computing Appliance, EMC DCA Nagios Plug-In, Monitor DCA hardware components Monitor DCA database and Hadoop services View full DCA
BIGDATA GREENPLUM DBA INTRODUCTION COURSE OBJECTIVES COURSE SUMMARY HIGHLIGHTS OF GREENPLUM DBA AT IQ TECH
BIGDATA GREENPLUM DBA Meta-data: Outrun your competition with advanced knowledge in the area of BigData with IQ Technology s online training course on Greenplum DBA. A state-of-the-art course that is delivered
Using RAID Admin and Disk Utility
Using RAID Admin and Disk Utility Xserve RAID Includes instructions for creating RAID arrays and monitoring Xserve RAID systems K Apple Computer, Inc. 2003 Apple Computer, Inc. All rights reserved. Under
http://www.trendmicro.com/download
Trend Micro Incorporated reserves the right to make changes to this document and to the products described herein without notice. Before installing and using the software, please review the readme files,
Monitoring PostgreSQL database with Verax NMS
Monitoring PostgreSQL database with Verax NMS Table of contents Abstract... 3 1. Adding PostgreSQL database to device inventory... 4 2. Adding sensors for PostgreSQL database... 7 3. Adding performance
EMC RepliStor for Microsoft Windows ERROR MESSAGE AND CODE GUIDE P/N 300-002-826 REV A02
EMC RepliStor for Microsoft Windows ERROR MESSAGE AND CODE GUIDE P/N 300-002-826 REV A02 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com Copyright 2003-2005
EMC Data Protection Search
EMC Data Protection Search Version 1.0 Security Configuration Guide 302-001-611 REV 01 Copyright 2014-2015 EMC Corporation. All rights reserved. Published in USA. Published April 20, 2015 EMC believes
EMC NetWorker Module for Microsoft Applications Release 2.3. Application Guide P/N 300-011-105 REV A02
EMC NetWorker Module for Microsoft Applications Release 2.3 Application Guide P/N 300-011-105 REV A02 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com Copyright
How To Use A Microsoft Networker Module For Windows 8.2.2 (Windows) And Windows 8 (Windows 8) (Windows 7) (For Windows) (Powerbook) (Msa) (Program) (Network
EMC NetWorker Module for Microsoft Applications Release 2.3 Application Guide P/N 300-011-105 REV A03 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com Copyright
Using Attunity Replicate with Greenplum Database Using Attunity Replicate for data migration and Change Data Capture to the Greenplum Database
White Paper Using Attunity Replicate with Greenplum Database Using Attunity Replicate for data migration and Change Data Capture to the Greenplum Database Abstract This white paper explores the technology
Integrated Grid Solutions. and Greenplum
EMC Perspective Integrated Grid Solutions from SAS, EMC Isilon and Greenplum Introduction Intensifying competitive pressure and vast growth in the capabilities of analytic computing platforms are driving
IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE
White Paper IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE Abstract This white paper focuses on recovery of an IBM Tivoli Storage Manager (TSM) server and explores
IBM System Storage DS5020 Express
IBM DS5020 Express Manage growth, complexity, and risk with scalable, high-performance storage Highlights Mixed host interfaces support (Fibre Channel/iSCSI) enables SAN tiering Balanced performance well-suited
EMC Backup and Recovery for Microsoft SQL Server
EMC Backup and Recovery for Microsoft SQL Server Enabled by Quest LiteSpeed Copyright 2010 EMC Corporation. All rights reserved. Published February, 2010 EMC believes the information in this publication
Copyright 2012 Trend Micro Incorporated. All rights reserved.
Trend Micro Incorporated reserves the right to make changes to this document and to the products described herein without notice. Before installing and using the software, please review the readme files,
EMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution
EMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution Version 9.0 User Guide 302-001-755 REV 01 Copyright 2007-2015 EMC Corporation. All rights reserved. Published in USA. Published
Symantec NetBackup 5220
A single-vendor enterprise backup appliance that installs in minutes Data Sheet: Data Protection Overview Symantec NetBackup 5220 is a single-vendor enterprise backup appliance that installs in minutes, with expandable storage
EMC AVAMAR 6.0 AND DATA DOMAIN INTEGRATION GUIDE P/N 300-011-623 REV A02
EMC AVAMAR 6.0 AND DATA DOMAIN INTEGRATION GUIDE P/N 300-011-623 REV A02 EMC CORPORATION CORPORATE HEADQUARTERS: HOPKINTON, MA 01748-9103 1-508-435-1000 WWW.EMC.COM Copyright and Trademark Notices Copyright
Configuring Celerra for Security Information Management with Network Intelligence's enVision
Configuring Celerra for Security Information Management with Best Practices Planning Abstract appliance is used to monitor log information from any device on the network to determine how that device is
FileMaker Server 11. FileMaker Server Help
FileMaker Server 11 FileMaker Server Help 2010 FileMaker, Inc. All Rights Reserved. FileMaker, Inc. 5201 Patrick Henry Drive Santa Clara, California 95054 FileMaker is a trademark of FileMaker, Inc. registered
SAN TECHNICAL - DETAILS/ SPECIFICATIONS
SAN TECHNICAL - DETAILS/ SPECIFICATIONS Technical Details / Specifications for 25 -TB Usable capacity SAN Solution Item 1) SAN STORAGE HARDWARE : One No. S.N. Features Description Technical Compliance
Rebasoft Auditor Quick Start Guide
Copyright Rebasoft Limited: 2009-2011 1 Release 2.1, Rev. 1 Copyright Notice Copyright 2009-2011 Rebasoft Ltd. All rights reserved. REBASOFT Software, the Rebasoft logo, Rebasoft Auditor are registered
Using EonStor FC-host Storage Systems in VMware Infrastructure 3 and vsphere 4
Using EonStor FC-host Storage Systems in VMware Infrastructure 3 and vsphere 4 Application Note Abstract This application note explains the configure details of using Infortrend FC-host storage systems
EMC NetWorker. Licensing Guide. Release 8.0 P/N 300-013-596 REV A01
EMC NetWorker Release 8.0 Licensing Guide P/N 300-013-596 REV A01 Copyright (2011-2012) EMC Corporation. All rights reserved. Published in the USA. Published June, 2012 EMC believes the information in
Symantec Database Security and Audit 3100 Series Appliance. Getting Started Guide
Symantec Database Security and Audit 3100 Series Appliance Getting Started Guide Symantec Database Security and Audit 3100 Series Getting Started Guide The software described in this book is furnished
Veeam Cloud Connect. Version 8.0. Administrator Guide
Veeam Cloud Connect Version 8.0 Administrator Guide April, 2015 2015 Veeam Software. All rights reserved. All trademarks are the property of their respective owners. No part of this publication may be
Postgres Plus xdb Replication Server with Multi-Master User's Guide
Postgres Plus xdb Replication Server with Multi-Master User's Guide Postgres Plus xdb Replication Server with Multi-Master build 57 August 22, 2012, Version 5.0 by EnterpriseDB Corporation Copyright 2012
Using VMware VMotion with Oracle Database and EMC CLARiiON Storage Systems
Using VMware VMotion with Oracle Database and EMC CLARiiON Storage Systems Applied Technology Abstract By migrating VMware virtual machines from one physical environment to another, VMware VMotion can
Netezza PureData System Administration Course
Course Length: 2 days CEUs 1.2 AUDIENCE After completion of this course, you should be able to: Administer the IBM PDA/Netezza Install Netezza Client Software Use the Netezza System Interfaces Understand
ORACLE DATABASE 10G ENTERPRISE EDITION
ORACLE DATABASE 10G ENTERPRISE EDITION OVERVIEW Oracle Database 10g Enterprise Edition is ideal for enterprises that ENTERPRISE EDITION For enterprises of any size For databases up to 8 Exabytes in size.
Optimizing Business Continuity Management with NetIQ PlateSpin Protect and AppManager. Best Practices and Reference Architecture
Optimizing Business Continuity Management with NetIQ PlateSpin Protect and AppManager Best Practices and Reference Architecture WHITE PAPER Table of Contents Introduction.... 1 Why monitor PlateSpin Protect
SOFTWARE LICENSE LIMITED WARRANTY
CYBEROAM INSTALLATION GUIDE VERSION: 6.0.0.0 IMPORTANT NOTICE Elitecore has supplied this Information believing it to be accurate and reliable at the time of printing, but is presented without warranty
EMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution
EMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution Release 8.2 User Guide P/N 302-000-658 REV 01 Copyright 2007-2014 EMC Corporation. All rights reserved. Published in the USA.
Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V. Reference Architecture
Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V Copyright 2011 EMC Corporation. All rights reserved. Published February, 2011 EMC believes the information
EMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution
EMC NetWorker Module for Microsoft for Windows Bare Metal Recovery Solution Version 8.2 Service Pack 1 User Guide 302-001-235 REV 01 Copyright 2007-2015 EMC Corporation. All rights reserved. Published
EMC Documentum Content Services for SAP iviews for Related Content
EMC Documentum Content Services for SAP iviews for Related Content Version 6.0 Administration Guide P/N 300-005-446 Rev A01 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000
Installing Management Applications on VNX for File
EMC VNX Series Release 8.1 Installing Management Applications on VNX for File P/N 300-015-111 Rev 01 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com Copyright
EMC DOCUMENTUM xplore 1.1 DISASTER RECOVERY USING EMC NETWORKER
White Paper EMC DOCUMENTUM xplore 1.1 DISASTER RECOVERY USING EMC NETWORKER Abstract The objective of this white paper is to describe the architecture of and procedure for configuring EMC Documentum xplore
EMC NetWorker Module for Microsoft Exchange Server Release 5.1
EMC NetWorker Module for Microsoft Exchange Server Release 5.1 Installation Guide P/N 300-004-750 REV A02 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com Copyright
Active-Active and High Availability
Active-Active and High Availability Advanced Design and Setup Guide Perceptive Content Version: 7.0.x Written by: Product Knowledge, R&D Date: July 2015 2015 Perceptive Software. All rights reserved. Lexmark
Server Installation Guide ZENworks Patch Management 6.4 SP2
Server Installation Guide ZENworks Patch Management 6.4 SP2 02_016N 6.4SP2 Server Installation Guide - 2 - Notices Version Information ZENworks Patch Management Server Installation Guide - ZENworks Patch
EMC SYNCPLICITY FILE SYNC AND SHARE SOLUTION
EMC SYNCPLICITY FILE SYNC AND SHARE SOLUTION Automated file synchronization Flexible, cloud-based administration Secure, on-premises storage EMC Solutions January 2015 Copyright 2014 EMC Corporation. All
HP Intelligent Management Center v7.1 Virtualization Monitor Administrator Guide
HP Intelligent Management Center v7.1 Virtualization Monitor Administrator Guide Abstract This guide describes the Virtualization Monitor (vmon), an add-on service module of the HP Intelligent Management
FileMaker Server 7. Administrator's Guide. For Windows and Mac OS
FileMaker Server 7 Administrator's Guide For Windows and Mac OS 1994-2004, FileMaker, Inc. All Rights Reserved. FileMaker, Inc. 5201 Patrick Henry Drive Santa Clara, California 95054 FileMaker is a trademark
EMC Virtual Infrastructure for Microsoft Applications Data Center Solution
EMC Virtual Infrastructure for Microsoft Applications Data Center Solution Enabled by EMC Symmetrix V-Max and Reference Architecture EMC Global Solutions Copyright and Trademark Information Copyright 2009
EMC Virtual Infrastructure for SAP Enabled by EMC Symmetrix with Auto-provisioning Groups, Symmetrix Management Console, and VMware vcenter Converter
EMC Virtual Infrastructure for SAP Enabled by EMC Symmetrix with Auto-provisioning Groups, VMware vcenter Converter A Detailed Review EMC Information Infrastructure Solutions Abstract This white paper
VCE Vision Intelligent Operations Version 2.5 Technical Overview
Revision history www.vce.com VCE Vision Intelligent Operations Version 2.5 Technical Document revision 2.0 March 2014 2014 VCE Company, 1 LLC. Revision history VCE Vision Intelligent Operations Version
EMC NetWorker. Server Disaster Recovery and Availability Best Practices Guide. Release 8.0 Service Pack 1 P/N 300-999-723 REV 01
EMC NetWorker Release 8.0 Service Pack 1 Server Disaster Recovery and Availability Best Practices Guide P/N 300-999-723 REV 01 Copyright 1990-2012 EMC Corporation. All rights reserved. Published in the
Database Administration
Unified CCE, page 1 Historical Data, page 2 Tool, page 3 Database Sizing Estimator Tool, page 11 Administration & Data Server with Historical Data Server Setup, page 14 Database Size Monitoring, page 15
EMC NetWorker VSS Client for Microsoft Windows Server 2003 First Edition
EMC NetWorker VSS Client for Microsoft Windows Server 2003 First Edition Installation Guide P/N 300-003-994 REV A01 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com
EMC DATA DOMAIN OPERATING SYSTEM
ESSENTIALS HIGH-SPEED, SCALABLE DEDUPLICATION Up to 58.7 TB/hr performance Reduces protection storage requirements by 10 to 30x CPU-centric scalability DATA INVULNERABILITY ARCHITECTURE Inline write/read
FileMaker Server 14. FileMaker Server Help
FileMaker Server 14 FileMaker Server Help 2007 2015 FileMaker, Inc. All Rights Reserved. FileMaker, Inc. 5201 Patrick Henry Drive Santa Clara, California 95054 FileMaker and FileMaker Go are trademarks
OBIEE 11g Analytics Using EMC Greenplum Database
White Paper OBIEE 11g Analytics Using EMC Greenplum Database - An Integration guide for OBIEE 11g Windows Users Abstract This white paper explains how OBIEE Analytics Business Intelligence Tool can be
MEDIAROOM. Products Hosting Infrastructure Documentation. Introduction. Hosting Facility Overview
MEDIAROOM Products Hosting Infrastructure Documentation Introduction The purpose of this document is to provide an overview of the hosting infrastructure used for our line of hosted Web products and provide
Installing and Using the vnios Trial
Installing and Using the vnios Trial The vnios Trial is a software package designed for efficient evaluation of the Infoblox vnios appliance platform. Providing the complete suite of DNS, DHCP and IPAM
RSA Security Analytics. S4 Broker Setup Guide
RSA Security Analytics S4 Broker Setup Guide Copyright 2010-2013 RSA, the Security Division of EMC. All rights reserved. Trademarks RSA, the RSA Logo and EMC are either registered trademarks or trademarks
Working with the Cognos BI Server Using the Greenplum Database
White Paper Working with the Cognos BI Server Using the Greenplum Database Interoperability and Connectivity Configuration for AIX Users Abstract This white paper explains how the Cognos BI Server running
Integration Guide. EMC Data Domain and Silver Peak VXOA 4.4.10 Integration Guide
Integration Guide EMC Data Domain and Silver Peak VXOA 4.4.10 Integration Guide August 2013 Copyright 2013 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate
StreamServe Persuasion SP5 Microsoft SQL Server
StreamServe Persuasion SP5 Microsoft SQL Server Database Guidelines Rev A StreamServe Persuasion SP5 Microsoft SQL Server Database Guidelines Rev A 2001-2011 STREAMSERVE, INC. ALL RIGHTS RESERVED United
SAN Conceptual and Design Basics
TECHNICAL NOTE VMware Infrastructure 3 SAN Conceptual and Design Basics VMware ESX Server can be used in conjunction with a SAN (storage area network), a specialized high speed network that connects computer
Hillstone StoneOS User Manual Hillstone Unified Intelligence Firewall Installation Manual
Hillstone StoneOS User Manual Hillstone Unified Intelligence Firewall Installation Manual www.hillstonenet.com Preface Conventions Content This document follows the conventions below: CLI Tip: provides
Understanding EMC Avamar with EMC Data Protection Advisor
Understanding EMC Avamar with EMC Data Protection Advisor Applied Technology Abstract EMC Data Protection Advisor provides a comprehensive set of features that reduce the complexity of managing data protection
DD160 and DD620 Hardware Overview
DD160 and DD620 Hardware Overview Data Domain, Inc. 2421 Mission College Boulevard, Santa Clara, CA 95054 866-WE-DDUPE; 408-980-4800 775-0206-0001 Revision A March 21, 2011 Copyright 2011 EMC Corporation.
Postgres Enterprise Manager Installation Guide
Postgres Enterprise Manager Installation Guide January 22, 2016 Postgres Enterprise Manager Installation Guide, Version 6.0.0 by EnterpriseDB Corporation Copyright 2013-2016 EnterpriseDB Corporation. All
Copyright 2013 Trend Micro Incorporated. All rights reserved.
Trend Micro Incorporated reserves the right to make changes to this document and to the products described herein without notice. Before installing and using the software, please review the readme files,
DiskPulse DISK CHANGE MONITOR
DiskPulse DISK CHANGE MONITOR User Manual Version 7.9 Oct 2015 www.diskpulse.com [email protected] 1 1 DiskPulse Overview...3 2 DiskPulse Product Versions...5 3 Using Desktop Product Version...6 3.1 Product
5-Bay Raid Sub-System Smart Removable 3.5" SATA Multiple Bay Data Storage Device User's Manual
5-Bay Raid Sub-System Smart Removable 3.5" SATA Multiple Bay Data Storage Device User's Manual www.vipower.com Table of Contents 1. How the SteelVine (VPMP-75511R/VPMA-75511R) Operates... 1 1-1 SteelVine
EMC DATA DOMAIN OPERATING SYSTEM
EMC DATA DOMAIN OPERATING SYSTEM Powering EMC Protection Storage ESSENTIALS High-Speed, Scalable Deduplication Up to 58.7 TB/hr performance Reduces requirements for backup storage by 10 to 30x and archive
INTEROPERABILITY OF SAP BUSINESS OBJECTS 4.0 WITH GREENPLUM DATABASE - AN INTEGRATION GUIDE FOR WINDOWS USERS (64 BIT)
White Paper INTEROPERABILITY OF SAP BUSINESS OBJECTS 4.0 WITH GREENPLUM DATABASE - AN INTEGRATION GUIDE FOR WINDOWS USERS (64 BIT) Abstract This paper presents interoperability of SAP Business Objects 4.0 with Greenplum.
FileMaker Server 10 Help
FileMaker Server 10 Help 2007-2009 FileMaker, Inc. All Rights Reserved. FileMaker, Inc. 5201 Patrick Henry Drive Santa Clara, California 95054 FileMaker, the file folder logo, Bento and the Bento logo
Dell Desktop Virtualization Solutions Simplified. All-in-one VDI appliance creates a new level of simplicity for desktop virtualization
Dell Desktop Virtualization Solutions Simplified All-in-one VDI appliance creates a new level of simplicity for desktop virtualization Executive summary Desktop virtualization is a proven method for delivering
Oracle Database Deployments with EMC CLARiiON AX4 Storage Systems
Oracle Database Deployments with EMC CLARiiON AX4 Storage Systems Applied Technology Abstract This white paper investigates configuration and replication choices for Oracle Database deployment with EMC
VMware vsphere Data Protection
VMware vsphere Data Protection Replication Target TECHNICAL WHITEPAPER 1 Table of Contents Executive Summary... 3 VDP Identities... 3 vsphere Data Protection Replication Target Identity (VDP-RT)... 3 Replication
Using Red Hat Network Satellite Server to Manage Dell PowerEdge Servers
Using Red Hat Network Satellite Server to Manage Dell PowerEdge Servers Enterprise Product Group (EPG) Dell White Paper By Todd Muirhead and Peter Lillian July 2004 Contents Executive Summary... 3 Introduction...
BlackBerry Enterprise Service 10. Version: 10.2. Configuration Guide
BlackBerry Enterprise Service 10 Version: 10.2 Configuration Guide Published: 2015-02-27 SWD-20150227164548686 Contents 1 Introduction...7 About this guide...8 What is BlackBerry Enterprise Service 10?...9
2-Bay Raid Sub-System Smart Removable 3.5" SATA Multiple Bay Data Storage Device User's Manual
2-Bay Raid Sub-System Smart Removable 3.5" SATA Multiple Bay Data Storage Device User's Manual www.vipower.com Table of Contents 1. How the SteelVine (VPMP-75211R/VPMA-75211R) Operates... 1 1-1 SteelVine
EMC Documentum Content Services for SAP Document Controllers
EMC Documentum Content Services for SAP Document Controllers Version 6.0 User Guide P/N 300-005-439 Rev A01 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com Copyright
EMC Business Continuity for Microsoft SQL Server 2008
EMC Business Continuity for Microsoft SQL Server 2008 Enabled by EMC Celerra Fibre Channel, EMC MirrorView, VMware Site Recovery Manager, and VMware vsphere 4 Reference Architecture Copyright 2009, 2010
Symantec Endpoint Protection 11.0 Architecture, Sizing, and Performance Recommendations
Symantec Endpoint Protection 11.0 Architecture, Sizing, and Performance Recommendations Technical Product Management Team Endpoint Security Copyright 2007 All Rights Reserved Revision 6 Introduction This
EMC Data Domain Boost for Oracle Recovery Manager (RMAN)
White Paper EMC Data Domain Boost for Oracle Recovery Manager (RMAN) Abstract EMC delivers Database Administrators (DBAs) complete control of Oracle backup, recovery, and offsite disaster recovery with
EMC MIGRATION OF AN ORACLE DATA WAREHOUSE
EMC MIGRATION OF AN ORACLE DATA WAREHOUSE EMC Symmetrix VMAX, Virtual Improve storage space utilization Simplify storage management with Virtual Provisioning Designed for enterprise customers EMC Solutions
vrealize Operations Manager Customization and Administration Guide
vrealize Operations Manager Customization and Administration Guide vrealize Operations Manager 6.0.1 This document supports the version of each product listed and supports all subsequent versions until
Active Fabric Manager (AFM) Plug-in for VMware vcenter Virtual Distributed Switch (VDS) CLI Guide
Active Fabric Manager (AFM) Plug-in for VMware vcenter Virtual Distributed Switch (VDS) CLI Guide Notes, Cautions, and Warnings NOTE: A NOTE indicates important information that helps you make better use
EMC Backup and Recovery for Microsoft SQL Server
EMC Backup and Recovery for Microsoft SQL Server Enabled by EMC NetWorker Module for Microsoft SQL Server Copyright 2010 EMC Corporation. All rights reserved. Published February, 2010 EMC believes the
HYPERION SYSTEM 9 MASTER DATA MANAGEMENT RELEASE 9.2 N-TIER INSTALLATION GUIDE
HYPERION SYSTEM 9 MASTER DATA MANAGEMENT RELEASE 9.2 N-TIER INSTALLATION GUIDE P/N: DM90192000 Copyright 2005-2006 Hyperion Solutions Corporation. All rights reserved. Hyperion, the Hyperion logo, and
EMC Unified Storage for Microsoft SQL Server 2008
EMC Unified Storage for Microsoft SQL Server 2008 Enabled by EMC CLARiiON and EMC FAST Cache Reference Copyright 2010 EMC Corporation. All rights reserved. Published October, 2010 EMC believes the information
EMC DiskXtender File System Manager for UNIX/Linux Release 3.5
EMC DiskXtender File System Manager for UNIX/Linux Release 3.5 Administrator's Guide P/N 300-009-573 REV. A01 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com
BrightStor ARCserve Backup for Windows
BrightStor ARCserve Backup for Windows Agent for Microsoft SQL Server r11.5 D01173-2E This documentation and related computer software program (hereinafter referred to as the "Documentation") is for the
