Integrating with Hadoop


HPE Vertica Analytic Database
Software Version: 7.2.x
Document Release Date: 5/18/2016

Legal Notices

Warranty
The only warranties for Hewlett Packard Enterprise products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HPE shall not be liable for technical or editorial errors or omissions contained herein. The information contained herein is subject to change without notice.

Restricted Rights Legend
Confidential computer software. Valid license from HPE required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Copyright Notice
Copyright Hewlett Packard Enterprise Development LP

Trademark Notices
Adobe is a trademark of Adobe Systems Incorporated.
Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation.
UNIX is a registered trademark of The Open Group.
This product includes an interface of the 'zlib' general purpose compression library, which is Copyright Jean-loup Gailly and Mark Adler.

Contents

Introduction to Hadoop Integration 8
  Hadoop Distributions 8
  Integration Options 8
  File Paths 9
Cluster Layout 10
  Co-Located Clusters 10
  Hardware Recommendations 11
  Configuring Hadoop for Co-Located Clusters 12
  webhdfs 12
  YARN 12
  Hadoop Balancer 13
  Replication Factor 13
  Disk Space for Non-HDFS Use 13
  Separate Clusters 14
Choosing Which Hadoop Interface to Use 16
  Creating an HDFS Storage Location 16
  Reading ORC and Parquet Files 16
  Using the HCatalog Connector 16
  Using the HDFS Connector 17
  Using the MapReduce Connector 17
Using Kerberos with Hadoop 18
  How Vertica uses Kerberos With Hadoop 18
  User Authentication 18
  Vertica Authentication 19
  See Also 20
  Configuring Kerberos 21
  Prerequisite: Setting Up Users and the Keytab File 21
  HCatalog Connector 21
  HDFS Connector 21
  HDFS Storage Location 22
  Token Expiration 22
  See Also 23
Reading Native Hadoop File Formats 24

  Requirements 24
  Creating External Tables 24
  Loading Data 25
  Supported Data Types 26
  Kerberos Authentication 26
  Examples 26
  See Alsos 26
  Query Performance 27
  Considerations When Writing Files 27
  Predicate Pushdown 27
  Data Locality 28
  Configuring hdfs:/// Access 28
  Troubleshooting Reads from Native File Formats 29
  webhdfs Error When Using hdfs URIs 29
  Reads from Parquet Files Report Unexpected Data-Type Mismatches 29
  Time Zones in Timestamp Values Are Not Correct 30
  Some Date and Timestamp Values Are Wrong by Several Days 30
  Error 7087: Wrong Number of Columns 30
Using the HCatalog Connector 31
  Hive, HCatalog, and WebHCat Overview 31
  HCatalog Connection Features 32
  HCatalog Connection Considerations 32
  How the HCatalog Connector Works 33
  HCatalog Connector Requirements 34
  Vertica Requirements 34
  Hadoop Requirements 34
  Testing Connectivity 34
  Installing the Java Runtime on Your Vertica Cluster 35
  Installing a Java Runtime 36
  Setting the JavaBinaryForUDx Configuration Parameter 37
  Configuring Vertica for HCatalog 38
  Copy Hadoop Libraries and Configuration Files 38
  Install the HCatalog Connector 40
  Upgrading to a New Version of Vertica 41
  Additional Options for Native File Formats 41
  Using the HCatalog Connector with HA NameNode 42
  Defining a Schema Using the HCatalog Connector 42
  Querying Hive Tables Using HCatalog Connector 44
  Viewing Hive Schema and Table Metadata 44

  Synchronizing an HCatalog Schema or Table With a Local Schema or Table 49
  Examples 50
  Data Type Conversions from Hive to Vertica 51
  Data-Width Handling Differences Between Hive and Vertica 53
  Using Non-Standard SerDes 53
  Determining Which SerDe You Need 54
  Installing the SerDe on the Vertica Cluster 55
  Troubleshooting HCatalog Connector Problems 55
  Connection Errors 55
  UDx Failure When Querying Data: Error SerDe Errors 58
  Differing Results Between Hive and Vertica Queries 58
  Preventing Excessive Query Delays 59
Using the HDFS Connector 60
  HDFS Connector Requirements 60
  Uninstall Prior Versions of the HDFS Connector 60
  webhdfs Requirements 61
  Kerberos Authentication Requirements 61
  Testing Your Hadoop webhdfs Configuration 61
  Loading Data Using the HDFS Connector 64
  The HDFS File URL 65
  Copying Files in Parallel 66
  Viewing Rejected Rows and Exceptions 67
  Creating an External Table with an HDFS Source 67
  Load Errors in External Tables 69
  HDFS Connector Troubleshooting Tips 69
  User Unable to Connect to Kerberos-Authenticated Hadoop Cluster 70
  Resolving Error Transfer Rate Errors 71
  Error Loading Many Files 72
Using HDFS Storage Locations 73
  Storage Location for HDFS Requirements 73
  HDFS Space Requirements 74
  Additional Requirements for Backing Up Data Stored on HDFS 74
  How the HDFS Storage Location Stores Data 75
  What You Can Store on HDFS 75
  What HDFS Storage Locations Cannot Do 76
  Creating an HDFS Storage Location 76
  Creating a Storage Location Using Vertica for SQL on Hadoop 78
  Adding HDFS Storage Locations to New Nodes 78

  Creating a Storage Policy for HDFS Storage Locations 79
  Storing an Entire Table in an HDFS Storage Location 79
  Storing Table Partitions in HDFS 80
  Moving Partitions to a Table Stored on HDFS 82
  Backing Up Vertica Storage Locations for HDFS 83
  Configuring Vertica to Restore HDFS Storage Locations 84
  Configuration Overview 85
  Installing a Java Runtime 85
  Finding Your Hadoop Distribution's Package Repository 86
  Configuring Vertica Nodes to Access the Hadoop Distribution's Package Repository 86
  Installing the Required Hadoop Packages 88
  Setting Configuration Parameters 89
  Setting Kerberos Parameters 91
  Confirming that distcp Runs 91
  Troubleshooting 92
  Configuring Hadoop and Vertica to Enable Backup of HDFS Storage 93
  Granting Superuser Status on Hortonworks
  Granting Superuser Status on Cloudera
  Manually Enabling Snapshotting for a Directory 94
  Additional Requirements for Kerberos 95
  Testing the Database Account's Ability to Make HDFS Directories Snapshottable 95
  Performing Backups Containing HDFS Storage Locations 96
  Removing HDFS Storage Locations 96
  Removing Existing Data from an HDFS Storage Location 97
  Moving Data to Another Storage Location 97
  Clearing Storage Policies 98
  Changing the Usage of HDFS Storage Locations 100
  Dropping an HDFS Storage Location 101
  Removing Storage Location Files from HDFS 102
  Removing Backup Snapshots 102
  Removing the Storage Location Directories 103
  Troubleshooting HDFS Storage Locations 103
  HDFS Storage Disk Consumption 103
  Kerberos Authentication When Creating a Storage Location 106
  Backup or Restore Fails When Using Kerberos 106
Using the MapReduce Connector 107
  MapReduce Connector Features 107
  Prerequisites 108
  Hadoop and Vertica Cluster Scaling 108
  Installing the Connector 108
  Accessing Vertica Data From Hadoop 111
  Selecting VerticaInputFormat 111
  Setting the Query to Retrieve Data From Vertica 112

  Using a Simple Query to Extract Data From Vertica 112
  Using a Parameterized Query and Parameter Lists 113
  Using a Discrete List of Values 113
  Using a Collection Object 113
  Scaling Parameter Lists for the Hadoop Cluster 114
  Using a Query to Retrieve Parameter Values for a Parameterized Query 115
  Writing a Map Class That Processes Vertica Data 115
  Working with the VerticaRecord Class 116
  Writing Data to Vertica From Hadoop 117
  Configuring Hadoop to Output to Vertica 117
  Defining the Output Table 117
  Writing the Reduce Class 119
  Storing Data in the VerticaRecord 119
  Passing Parameters to the Vertica Connector for Hadoop Map Reduce At Run Time 122
  Specifying the Location of the Connector.jar File 122
  Specifying the Database Connection Parameters 123
  Parameters for a Separate Output Database 124
  Example Vertica Connector for Hadoop Map Reduce Application 125
  Compiling and Running the Example Application 128
  Compiling the Example (optional) 130
  Running the Example Application 131
  Verifying the Results 132
  Using Hadoop Streaming with the Vertica Connector for Hadoop Map Reduce 133
  Reading Data From Vertica in a Streaming Hadoop Job 133
  Writing Data to Vertica in a Streaming Hadoop Job 136
  Loading a Text File From HDFS into Vertica 137
  Accessing Vertica From Pig 139
  Registering the Vertica.jar Files 140
  Reading Data From Vertica 140
  Writing Data to Vertica 141
Integrating Vertica with the MapR Distribution of Hadoop 143
Send Documentation Feedback 145

Introduction to Hadoop Integration

Apache Hadoop, like Vertica, uses a cluster of nodes for distributed processing. The primary component of interest is HDFS, the Hadoop Distributed File System. You can use HDFS from Vertica in several ways:

- You can import HDFS data into locally-stored ROS files.
- You can access HDFS data in place, using external tables.
- You can use HDFS as a storage location for ROS files.

Hadoop includes two other components of interest:

- Hive, a data warehouse that provides the ability to query data stored in Hadoop.
- HCatalog, a component that makes Hive metadata available to applications, such as Vertica, outside of Hadoop.

A Hadoop cluster can use Kerberos authentication to protect data stored in HDFS. Vertica integrates with Kerberos to access HDFS data if needed. See Using Kerberos with Hadoop.

Hadoop Distributions

Vertica can be used with Hadoop distributions from Hortonworks, Cloudera, and MapR. See Vertica Integrations for Hadoop for the specific versions that are supported.

Integration Options

Vertica supports two cluster architectures. Which you use affects the decisions you make about integration.

- You can co-locate Vertica on some or all of your Hadoop nodes. Vertica can then take advantage of local data. This option is supported only for Vertica for SQL on Hadoop.

- You can build a Vertica cluster that is separate from your Hadoop cluster. In this configuration, Vertica can fully use each of its nodes; it does not share resources with Hadoop. This option is not supported for Vertica for SQL on Hadoop.

These layout options are described in Cluster Layout.

Both layouts support several interfaces for using Hadoop:

- An HDFS Storage Location uses HDFS to hold Vertica data (ROS files).
- The HCatalog Connector lets Vertica query data that is stored in a Hive database the same way you query data stored natively in a Vertica schema.
- Vertica can directly query data stored in native Hadoop file formats (ORC and Parquet), as described in Reading Native Hadoop File Formats. This option is faster than using the HCatalog Connector for this type of data.
- The HDFS Connector lets Vertica import HDFS data. It also lets Vertica read HDFS data as an external table without using Hive.
- The MapReduce Connector lets you create Hadoop MapReduce jobs that retrieve data from Vertica. These jobs can also insert data into Vertica.

File Paths

Hadoop file paths are generally expressed using the webhdfs scheme, such as 'webhdfs://somehost:port/opt/data/filename'. These paths are URIs, so if you need to escape a special character in a path, use URI escaping. For example:

webhdfs://somehost:port/opt/data/my%20file
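For instance, an escaped path like the one above can be used anywhere a Hadoop file path is accepted. The following sketch defines an external table over a single ORC file; the table definition, host, and port (50070 is a common webhdfs default) are assumptions for illustration, not values from this guide:

=> CREATE EXTERNAL TABLE sales (tx_id INT, amount FLOAT)
   AS COPY FROM 'webhdfs://somehost:50070/opt/data/my%20file' ORC;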

Cluster Layout

Vertica and Hadoop each use a cluster of nodes for distributed processing. These clusters can be co-located, meaning you run both products on the same machines, or separate. Co-Located Clusters are for use with Vertica for SQL on Hadoop licenses. Separate Clusters are for use with Premium Edition and Community Edition licenses.

With either architecture, if you are using the hdfs scheme to read ORC or Parquet files, you must do some additional configuration. See Configuring hdfs:/// Access.

Co-Located Clusters

With co-located clusters, Vertica is installed on some or all of your Hadoop nodes. The Vertica nodes use a private network in addition to the public network used by all Hadoop nodes, as the following figure shows:

You might choose to place Vertica on all of your Hadoop nodes or only on some of them. If you are using HDFS Storage Locations, you should use at least three Vertica nodes, the minimum number for K-Safety. Using more Vertica nodes can improve performance because the HDFS data needed by a query is more likely to be local.

Normally, both Hadoop and Vertica use the entire node. Because this configuration uses shared nodes, you must address potential resource contention in your configuration on those nodes. See Configuring Hadoop for Co-Located Clusters for more information. No changes are needed on Hadoop-only nodes.

You can place Hadoop and Vertica clusters within a single rack, or you can span across many racks and nodes. Spreading node types across racks can improve efficiency.

Hardware Recommendations

Hadoop clusters frequently do not have identical provisioning requirements or hardware configurations. However, Vertica nodes should be equivalent in size and capability, per the best-practice standards recommended in General Hardware and OS Requirements and Recommendations in Installing Vertica. Because Hadoop cluster specifications do not always meet these standards, Hewlett Packard Enterprise recommends the following specifications for Vertica nodes in your Hadoop cluster.

Processor: For best performance, run:
- Two-socket servers with 8-14 core CPUs, clocked at or above 2.6 GHz, for clusters over 10 TB
- Single-socket servers with 8-12 cores, clocked at or above 2.6 GHz, for clusters under 10 TB

Memory: Distribute the memory appropriately across all memory channels in the server:
- Minimum 8 GB of memory per physical CPU core in the server
- High-performance applications GB of memory per physical core
- Type at least DDR3-1600, preferably DDR

Storage:
- Read/write: Minimum 40 MB/s per physical core of the CPU. For best performance MB/s per physical core.
- Storage post RAID: Each node should have 1-9 TB. For a production setting, RAID 10 is recommended. In some cases, RAID 50 is acceptable.

Because of the heavy compression and encoding that Vertica does, SSDs are not required. In most cases, a RAID of more, less-expensive HDDs performs just as well as a RAID of fewer SSDs.

If you intend to use RAID 50 for your data partition, you should keep a spare node in every rack, allowing for manual failover of a Vertica node in the case of a drive failure. A Vertica node recovery is faster than a RAID 50 rebuild. Also, be sure to never put more than 10 TB compressed on any node, to keep node recovery times at an acceptable rate.

Network: 10 Gb networking in almost every case. With the introduction of 10 Gb Ethernet over Cat6a, the cost difference is minimal.

Configuring Hadoop for Co-Located Clusters

If you are co-locating Vertica on any HDFS nodes, there are some additional configuration requirements.

webhdfs

Hadoop has two services that can provide web access to HDFS: webhdfs and httpfs. For Vertica, you must use the webhdfs service.

YARN

The YARN service is available in newer releases of Hadoop. It performs resource management for Hadoop clusters. When co-locating Vertica on YARN-managed Hadoop nodes, you must make some changes in YARN.

HPE recommends reserving at least 16 GB of memory for Vertica on shared nodes. Reserving more will improve performance. How you do this depends on your Hadoop distribution:

- If you are using Hortonworks, create a "Vertica" node label and assign it to the nodes that are running Vertica.
- If you are using Cloudera, enable and configure static service pools.

Consult the documentation for your Hadoop distribution for details. Alternatively, you can disable YARN on the shared nodes.

Hadoop Balancer

The Hadoop Balancer can redistribute data blocks across HDFS. For many Hadoop services, this feature is useful. However, for Vertica this can reduce performance under some conditions.

If you are using HDFS storage locations, the Hadoop load balancer can move data away from the Vertica nodes that are operating on it, degrading performance. This behavior can also occur when reading ORC or Parquet files if Vertica is not running on all Hadoop nodes. (If you are using separate Vertica and Hadoop clusters, all Hadoop access is over the network, and the performance cost is less noticeable.)

To prevent the undesired movement of data blocks across the HDFS cluster, consider excluding Vertica nodes from rebalancing. See the Hadoop documentation to learn how to do this.

Replication Factor

By default, HDFS stores three copies of each data block. Vertica is generally set up to store two copies of each data item through K-Safety. Thus, lowering the replication factor to 2 can save space and still provide data protection. To lower the number of copies HDFS stores, set HadoopFSReplication, as explained in Troubleshooting HDFS Storage Locations (a brief sketch follows at the end of this section).

Disk Space for Non-HDFS Use

You also need to reserve some disk space for non-HDFS use. To reserve disk space using Ambari, set dfs.datanode.du.reserved to a value in the hdfs-site.xml configuration file. Setting this parameter preserves space for non-HDFS files that Vertica requires.
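Returning to the replication factor setting described above: assuming HadoopFSReplication can be set at the database level like the other Hadoop configuration parameters shown in this guide, a minimal sketch looks like the following (the database name is hypothetical):

=> ALTER DATABASE exampledb SET HadoopFSReplication = 2;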

Separate Clusters

In the Premium Edition product, your Vertica and Hadoop clusters must be set up on separate nodes, ideally connected by a high-bandwidth network connection. This is different from the configuration for Vertica for SQL on Hadoop, in which Vertica nodes are co-located on Hadoop nodes. The following figure illustrates the configuration for separate clusters:

The network is a key performance component of any well-configured cluster. When Vertica stores data to HDFS, it writes and reads data across the network.

The layout shown in the figure calls for two networks, and there are benefits to adding a third:

- Database Private Network: Vertica uses a private network for command and control and moving data between nodes in support of its database functions. In some networks, the command and control and passing of data are split across two networks.

- Database/Hadoop Shared Network: Each Vertica node must be able to connect to each Hadoop data node and the Name Node. Hadoop best practices generally require a dedicated network for the Hadoop cluster. This is not a technical requirement, but a dedicated network improves Hadoop performance. Vertica and Hadoop should share the dedicated Hadoop network.
- Optional Client Network: Outside clients may access the clustered networks through a client network. This is not an absolute requirement, but the use of a third network that supports client connections to either Vertica or Hadoop can improve performance. If the configuration does not support a client network, then client connections should use the shared network.

Choosing Which Hadoop Interface to Use

Vertica provides several ways to interact with data stored in Hadoop. This section explains how to choose among them. Decisions about Cluster Layout can affect the decisions you make about Hadoop interfaces.

Creating an HDFS Storage Location

Using a storage location to store data in the Vertica native file format (ROS) delivers the best query performance among the available Hadoop options. (Storing ROS files on the local disk rather than in Hadoop is faster still.) If you already have data in Hadoop, however, doing this means you are importing that data into Vertica.

For co-located clusters, which do not use local file storage, you might still choose to use an HDFS storage location for better performance. You can use the HDFS Connector to load data that is already in HDFS into Vertica. For separate clusters, which use local file storage, consider using an HDFS storage location for lower-priority data.

See Using HDFS Storage Locations and Using the HDFS Connector.

Reading ORC and Parquet Files

If your data is stored in the Optimized Row Columnar (ORC) or Parquet format, Vertica can query that data directly from HDFS. This option is faster than using the HCatalog Connector, but you cannot pull schema definitions from Hive directly into the database. Vertica reads the data in place; no extra copies are made.

See Reading Native Hadoop File Formats.

Using the HCatalog Connector

The HCatalog Connector uses Hadoop services (Hive and HCatalog) to query data stored in HDFS. Like the ORC reader, it reads data in place rather than making copies. Using this interface, you can read all file formats supported by Hadoop, including Parquet and ORC, and Vertica can use Hive's schema definitions. However, performance can be poor in some cases. The HCatalog Connector is also sensitive to changes in the Hadoop libraries on which it depends; upgrading your Hadoop cluster might affect your HCatalog connections.

See Using the HCatalog Connector.

Using the HDFS Connector

The HDFS Connector can be used to create and query external tables, reading the data in place rather than making copies. The HDFS Connector can be used with any data format for which a parser is available. It does not use Hive data; you have to define the table yourself. Its performance can be poor because, like the HCatalog Connector, it cannot take advantage of the benefits of columnar file formats.

See Using the HDFS Connector.

Using the MapReduce Connector

The other interfaces described in this section allow you to read Hadoop data from Vertica or create Vertica data in Hadoop. The MapReduce Connector, in contrast, allows you to integrate with Hadoop's MapReduce jobs. Use this connector to send Vertica data to MapReduce or to have MapReduce jobs create data in Vertica.

See Using the MapReduce Connector.

Using Kerberos with Hadoop

If your Hadoop cluster uses Kerberos authentication to restrict access to HDFS, you must configure Vertica to make authenticated connections. The details of this configuration vary, based on which methods you are using to access HDFS data:

- How Vertica uses Kerberos With Hadoop
- Configuring Kerberos

How Vertica uses Kerberos With Hadoop

Vertica authenticates with Hadoop in two ways that require different configurations:

- User Authentication: On behalf of the user, by passing along the user's existing Kerberos credentials, as occurs with the HDFS Connector and the HCatalog Connector.
- Vertica Authentication: On behalf of system processes (such as the Tuple Mover), by using a special Kerberos credential stored in a keytab file.

User Authentication

To use Vertica with Kerberos and Hadoop, the client user first authenticates with the Kerberos server (Key Distribution Center, or KDC) being used by the Hadoop cluster. A user might run kinit or sign in to Active Directory, for example.

A user who authenticates to a Kerberos server receives a Kerberos ticket. At the beginning of a client session, Vertica automatically retrieves this ticket. The database then uses this ticket to get a Hadoop token, which Hadoop uses to grant access. Vertica uses this token to access HDFS, such as when executing a query on behalf of the user. When the token expires, the database automatically renews it, also renewing the Kerberos ticket if necessary.

The user must have been granted permission to access the relevant files in HDFS. This permission is checked the first time Vertica reads HDFS data.

The following figure shows how the user, Vertica, Hadoop, and Kerberos interact in user authentication:

When using the HDFS Connector or the HCatalog Connector, or when reading an ORC or Parquet file stored in HDFS, Vertica uses the client identity as the preceding figure shows.

Vertica Authentication

Automatic processes, such as the Tuple Mover, do not log in the way users do. Instead, Vertica uses a special identity (principal) stored in a keytab file on every database node. (This approach is also used for Vertica clusters that use Kerberos but do not use Hadoop.) After you configure the keytab file, Vertica uses the principal residing there to automatically obtain and maintain a Kerberos ticket, much as in the client scenario. In this case, the client does not interact with Kerberos.

The following figure shows the interactions required for Vertica authentication:

Each Vertica node uses its own principal; it is common to incorporate the name of the node into the principal name. You can either create one keytab per node, containing only that node's principal, or you can create a single keytab containing all the principals and distribute the file to all nodes. Either way, the node uses its principal to get a Kerberos ticket and then uses that ticket to get a Hadoop token. For simplicity, the preceding figure shows the full set of interactions for only one database node.

When creating HDFS storage locations, Vertica uses the principal in the keytab file, not the principal of the user issuing the CREATE LOCATION statement.

See Also

For specific configuration instructions, see Configuring Kerberos.

Configuring Kerberos

Vertica can connect with Hadoop in several ways, and how you manage Kerberos authentication varies by connection type. This documentation assumes that you are using Kerberos for both your HDFS and Vertica clusters.

Prerequisite: Setting Up Users and the Keytab File

If you have not already configured Kerberos authentication for Vertica, follow the instructions in Configure for Kerberos Authentication. In particular:

- Create one Kerberos principal per node.
- Place the keytab file(s) in the same location on each database node and set its location in KerberosKeytabFile (see Specify the Location of the Keytab File).
- Set KerberosServiceName to the name of the principal (see Inform About the Kerberos Principal).
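As an illustration of the last two items, the following sketch sets both parameters with ALTER DATABASE, the same pattern used elsewhere in this guide for configuration parameters; the database name, keytab path, and service name are hypothetical and must match your own environment:

=> ALTER DATABASE exampledb SET KerberosKeytabFile = '/etc/krb5.keytab';
=> ALTER DATABASE exampledb SET KerberosServiceName = 'vertica';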

HCatalog Connector

You use the HCatalog Connector to query data in Hive. Queries are executed on behalf of Vertica users. If the current user has a Kerberos key, then Vertica passes it to the HCatalog Connector automatically. Verify that all users who need access to Hive have been granted access to HDFS.

In addition, in your Hadoop configuration files (core-site.xml in most distributions), make sure that you enable all Hadoop components to impersonate the Vertica user. The easiest way to do this is to set the proxyuser property using wildcards for all users on all hosts and in all groups. Consult your Hadoop documentation for instructions. Make sure you do this before running hcatutil (see Configuring Vertica for HCatalog).

HDFS Connector

The HDFS Connector loads data from HDFS into Vertica on behalf of the user, using a User Defined Source. If the user performing the data load has a Kerberos key, then the UDS uses it to access HDFS. Verify that all users who use this connector have been granted access to HDFS.

HDFS Storage Location

You can create a database storage location in HDFS. An HDFS storage location provides improved performance compared to other HDFS interfaces (such as the HCatalog Connector).

After you create Kerberos principals for each node, give all of them read and write permissions to the HDFS directory you will use as a storage location. If you plan to back up HDFS storage locations, take the following additional steps:

- Grant Hadoop superuser privileges to the new principals.
- Configure backups, including setting the HadoopConfigDir configuration parameter, following the instructions in Configuring Hadoop and Vertica to Enable Backup of HDFS Storage.
- Configure user impersonation to be able to restore from backups, following the instructions in "Setting Kerberos Parameters" in Configuring Vertica to Restore HDFS Storage Locations.

Because the keytab file supplies the principal used to create the location, you must have it in place before creating the storage location. After you deploy keytab files to all database nodes, use the CREATE LOCATION statement to create the storage location as usual.

Token Expiration

Vertica attempts to automatically refresh Hadoop tokens before they expire, but you can also set a minimum refresh frequency if you prefer. The HadoopFSTokenRefreshFrequency configuration parameter specifies the frequency in seconds:

=> ALTER DATABASE exampledb SET HadoopFSTokenRefreshFrequency = '86400';

If the current age of the token is greater than the value specified in this parameter, Vertica refreshes the token before accessing data stored in HDFS.

See Also

- How Vertica uses Kerberos With Hadoop
- Troubleshooting Kerberos Authentication

Reading Native Hadoop File Formats

When you create external tables or copy data into tables, you can access data in certain native Hadoop formats directly. Currently, Vertica supports the ORC (Optimized Row Columnar) and Parquet formats. Because this approach allows you to define your tables yourself instead of fetching the metadata through WebHCat, these readers can provide slightly better performance than the HCatalog Connector. If you are already using the HCatalog Connector for other reasons, however, you might find it more convenient to use it to read data in these formats also. See Using the HCatalog Connector.

You can use the hdfs scheme to access ORC and Parquet files stored in HDFS, as explained later in this section. To use this scheme, you must perform some additional configuration; see Configuring hdfs:/// Access.

Requirements

- The ORC or Parquet files must not use complex data types. All simple data types supported in Hive version 0.11 or later are supported.
- Files compressed by Hive or Impala require Zlib (GZIP) or Snappy compression. Vertica does not support LZO compression for these formats.

Creating External Tables

In the CREATE EXTERNAL TABLE AS COPY statement, specify a format of ORC or PARQUET as follows:

=> CREATE EXTERNAL TABLE tablename (columns) AS COPY FROM path ORC;
=> CREATE EXTERNAL TABLE tablename (columns) AS COPY FROM path PARQUET;

If the file resides on the local file system of the node where you issue the command: Use a local file path for path. Escape special characters in file paths with backslashes.

If the file resides elsewhere in HDFS: Use the hdfs:/// prefix (three slashes), and then specify the file path. Escape special characters in HDFS paths using URI encoding, for example %20 for space. Vertica automatically converts from the hdfs scheme to the webhdfs scheme if necessary. You can also directly use a webhdfs:// prefix and specify the host name, port, and file path. Using the hdfs scheme potentially provides better performance when reading files not protected by Kerberos.

When defining an external table, you must define all of the columns in the file. Unlike with some other data sources, you cannot select only the columns of interest. If you omit columns, the ORC or Parquet reader aborts with an error.

Files stored in HDFS are governed by HDFS privileges. For files stored on the local disk, however, Vertica requires that users be granted access. All users who have administrative privileges have access. For other users, you must create a storage location and grant access to it. See CREATE EXTERNAL TABLE AS COPY. HDFS privileges are still enforced, so it is safe to create a location for webhdfs://host:port. Only users who have access to both the Vertica user storage location and the HDFS directory can read from the table.

Loading Data

In the COPY statement, specify a format of ORC or PARQUET:

=> COPY tablename FROM path ORC;
=> COPY tablename FROM path PARQUET;

For files that are not local, specify ON ANY NODE to improve performance:

=> COPY t FROM 'hdfs:///opt/data/orcfile' ON ANY NODE ORC;

As with external tables, path may be a local or hdfs:/// path. Be aware that if you load from multiple files in the same COPY statement, and any of them is aborted, the entire load aborts. This behavior differs from that for delimited files, where the COPY statement loads what it can and ignores the rest.
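For example, a hypothetical statement that loads two ORC files in one COPY (the table and paths are assumptions) aborts entirely if either file fails to load:

=> COPY sales FROM 'hdfs:///data/sales/s1.orc' ON ANY NODE,
        'hdfs:///data/sales/s2.orc' ON ANY NODE ORC;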

Supported Data Types

The Vertica ORC and Parquet file readers can natively read columns of all data types supported in Hive version 0.11 and later except for complex types. If complex types such as maps are encountered, the COPY or CREATE EXTERNAL TABLE AS COPY statement aborts with an error message. The readers do not attempt to read only some columns; either the entire file is read or the operation fails. For a complete list of supported types, see HIVE Data Types.

Kerberos Authentication

If the file to be read is located on an HDFS cluster that uses Kerberos authentication, Vertica uses the current user's principal to authenticate. It does not use the database's principal.

Examples

The following example shows how you can read from all ORC files in a local directory. This example uses all supported data types.

=> CREATE EXTERNAL TABLE t (a1 TINYINT, a2 SMALLINT, a3 INT, a4 BIGINT,
       a5 FLOAT, a6 DOUBLE PRECISION, a7 BOOLEAN, a8 DATE, a9 TIMESTAMP,
       a10 VARCHAR(20), a11 VARCHAR(20), a12 CHAR(20), a13 BINARY(20),
       a14 DECIMAL(10,5))
   AS COPY FROM '/data/orc_test_*.orc' ORC;

The following example shows the error that is produced if the file you specify is not recognized as an ORC file:

=> CREATE EXTERNAL TABLE t (a1 TINYINT, a2 SMALLINT, a3 INT, a4 BIGINT, a5 FLOAT)
   AS COPY FROM '/data/not_an_orc_file.orc' ORC;
ERROR 0: Failed to read orc source [/data/not_an_orc_file.orc]: Not an ORC file

See Alsos

- Query Performance
- Troubleshooting Reads from Native File Formats

Query Performance

When working with external tables in native formats, Vertica tries to improve performance in two ways:

- Pushing query execution closer to the data so less has to be read and transmitted
- Using data locality in planning the query

Considerations When Writing Files

The decisions you make when writing ORC and Parquet files can affect performance when using them. To get the best performance from Vertica, follow these guidelines when writing your files:

- Use the latest available Hive version. (You can still read your files with earlier versions.)
- Use a large stripe size. 256 MB or greater is preferred.
- Partition the data at the table level.
- Sort the columns based on frequency of access, with most-frequently accessed columns appearing first.
- Use Snappy or Zlib/GZIP compression.

Predicate Pushdown

Predicate pushdown moves parts of the query execution closer to the data, reducing the amount of data that must be read from disk or across the network. ORC files have three levels of indexing: file statistics, stripe statistics, and row group indexes. Predicates are applied only to the first two levels. Parquet files can have statistics in the ColumnMetaData and DataPageHeader. Predicates are applied only to the ColumnMetaData.

Predicate pushdown is automatically applied for files written with Hive version 0.14 and later. Files written with earlier versions of Hive might not contain the required statistics. When executing a query against a file that lacks these statistics, Vertica logs an EXTERNAL_PREDICATE_PUSHDOWN_NOT_SUPPORTED event in the QUERY_EVENTS system table. If you are seeing performance problems with your queries, check this table for these events.
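One way to check for these events is to query the system table directly; this is only a sketch, and you can select whichever columns you find useful:

=> SELECT event_timestamp, event_description
   FROM query_events
   WHERE event_type = 'EXTERNAL_PREDICATE_PUSHDOWN_NOT_SUPPORTED';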

Data Locality

In a cluster where Vertica nodes are co-located on HDFS nodes, the query can use data locality to improve performance. For Vertica to do so, both of the following conditions must exist:

- The data is on an HDFS node where a database node is also present.
- The query is not restricted to specific nodes using ON NODE.

When both these conditions exist, the query planner uses the co-located database node to read that data locally, instead of making a network call.

You can see how much data is being read locally by inspecting the query plan. The label for LoadStep(s) in the plan contains a statement of the form: "X% of ORC data matched with co-located Vertica nodes". To increase the volume of local reads, consider adding more database nodes. HDFS data, by its nature, can't be moved to specific nodes, but if you run more database nodes you increase the likelihood that a database node is local to one of the copies of the data.
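You can produce the plan with EXPLAIN and look for the LoadStep labels described above; the table name here is hypothetical:

=> EXPLAIN SELECT COUNT(*) FROM sales_orc;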

Configuring hdfs:/// Access

When reading ORC or Parquet files from HDFS, you can use the hdfs scheme instead of the webhdfs scheme. Using the hdfs scheme can improve performance by bypassing the webhdfs service. To support the hdfs scheme, your Vertica nodes need access to certain Hadoop configuration files.

If Vertica is co-located on HDFS nodes, then those files are already present. Verify that the HadoopConfDir environment variable is correctly set. Its path should include a directory containing the core-site.xml and hdfs-site.xml files.

If Vertica is running on a separate cluster, you must copy the required files to those nodes and set the HadoopConfDir environment variable. A simple way to do so is to configure your Vertica nodes as Hadoop edge nodes. Edge nodes are used to run client applications; from Hadoop's perspective, Vertica is a client application. You can use Ambari or Cloudera Manager to configure edge nodes. For more information, see the documentation for your Hadoop vendor.

Using the hdfs scheme does not remove the need for access to the webhdfs service. The hdfs scheme is not available for all files. If hdfs is not available, then Vertica automatically uses webhdfs instead.

If you update the configuration files after starting Vertica, use the following statement to refresh them:

=> SELECT CLEAR_CACHES();

Troubleshooting Reads from Native File Formats

You might encounter the following issues when reading ORC or Parquet files.

webhdfs Error When Using hdfs URIs

When creating an external table or loading data and using the hdfs scheme, you might see errors from webhdfs failures. Such errors indicate that Vertica was not able to use the hdfs scheme and fell back to webhdfs, but that the webhdfs configuration is incorrect.

Verify that the HDFS configuration files in HadoopConfDir have the correct webhdfs configuration for your Hadoop cluster. See Configuring hdfs:/// Access for information about use of these files. See your Hadoop documentation for information about webhdfs configuration.

Reads from Parquet Files Report Unexpected Data-Type Mismatches

If a Parquet file contains a column of type STRING but the column in Vertica is of a different type, such as INT, you might see an unclear error message. In this case Vertica reports the column in the Parquet file as BYTE_ARRAY, as shown in the following example:

ERROR 0: Datatype mismatch: column 2 in the parquet_cpp source [/tmp/nation.0.parquet] has type BYTE_ARRAY, expected int

This behavior is specific to Parquet files; with an ORC file the type is correctly reported as STRING. The problem occurs because Parquet does not natively support the STRING type and uses BYTE_ARRAY for strings instead. Because the Parquet file reports its type as BYTE_ARRAY, Vertica has no way to determine if the type is actually a BYTE_ARRAY or a STRING.
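Continuing the nation example in the error message above, the remedy is to declare the Hive STRING column as a VARCHAR rather than an INT. The column names and lengths below are assumptions for illustration:

=> CREATE EXTERNAL TABLE nation (nationkey INT, name VARCHAR(64),
       regionkey INT, comment VARCHAR(200))
   AS COPY FROM '/tmp/nation.0.parquet' PARQUET;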

Time Zones in Timestamp Values Are Not Correct

Reading time stamps from an ORC or Parquet file in Vertica might result in different values, based on the local time zone. This issue occurs because the ORC and Parquet formats do not support the SQL TIMESTAMP data type. If you define the column in your table with the TIMESTAMP data type, Vertica interprets time stamps read from ORC or Parquet files as values in the local time zone. This same behavior occurs in Hive.

When this situation occurs, Vertica produces a warning at query time, such as the following:

WARNING 0: SQL TIMESTAMPTZ is more appropriate for ORC TIMESTAMP because values are stored in UTC

When creating the table in Vertica, you can avoid this issue by using the TIMESTAMPTZ data type instead of TIMESTAMP.

Some Date and Timestamp Values Are Wrong by Several Days

When Hive writes ORC or Parquet files, it converts dates before 1583 from the Gregorian calendar to the Julian calendar. Vertica does not perform this conversion. If your file contains dates before this time, values in Hive and the corresponding values in Vertica can differ by up to ten days. This difference applies to both DATE and TIMESTAMP values.

Error 7087: Wrong Number of Columns

When loading data, you might see an error stating that you have the wrong number of columns:

=> CREATE TABLE nation (nationkey bigint, name varchar(500),
       regionkey bigint, comment varchar(500));
CREATE TABLE
=> COPY nation from :orc_dir ORC;
ERROR 7087: Attempt to load 4 columns from an orc source [/tmp/orc_glob/test.orc] that has 9 columns

When you load data from Hadoop native file formats, your table must consume all of the data in the file, or this error results. To avoid this problem, add the missing columns to your table definition.

Using the HCatalog Connector

The Vertica HCatalog Connector lets you access data stored in Apache's Hive data warehouse software the same way you access it within a native Vertica table.

If your files are in the Optimized Row Columnar (ORC) or Parquet format and do not use complex types, the HCatalog Connector creates an external table and uses the ORC or Parquet reader instead of using the Java SerDe. See Reading Native Hadoop File Formats for more information about these readers.

The HCatalog Connector performs predicate pushdown to improve query performance. Instead of reading all data across the network to evaluate a query, the HCatalog Connector moves the evaluation of predicates closer to the data. Predicate pushdown applies to Hive partition pruning, ORC stripe pruning, and Parquet row-group pruning. The HCatalog Connector supports predicate pushdown for the following predicates: >, >=, =, <>, <=, <.

Hive, HCatalog, and WebHCat Overview

There are several Hadoop components that you need to understand in order to use the HCatalog Connector:

- Apache's Hive lets you query data stored in a Hadoop Distributed File System (HDFS) the same way you query data stored in a relational database. Behind the scenes, Hive uses a set of serializer and deserializer (SerDe) classes to extract data from files stored on the HDFS and break it into columns and rows. Each SerDe handles data files in a specific format. For example, one SerDe extracts data from comma-separated data files while another interprets data stored in JSON format.
- Apache HCatalog is a component of the Hadoop ecosystem that makes Hive's metadata available to other Hadoop components (such as Pig).
- WebHCat (formerly known as Templeton) makes HCatalog and Hive data available via a REST web API. Through it, you can make an HTTP request to retrieve data stored in Hive, as well as information about the Hive schema.

Vertica's HCatalog Connector lets you transparently access data that is available through WebHCat. You use the connector to define a schema in Vertica that corresponds to a Hive database or schema. When you query data within this schema, the HCatalog Connector transparently extracts and formats the data from Hadoop into tabular data.

The data within this HCatalog schema appears as if it is native to Vertica. You can even perform operations such as joins between Vertica-native tables and HCatalog tables. For more details, see How the HCatalog Connector Works.

HCatalog Connection Features

The HCatalog Connector lets you query data stored in Hive using the Vertica native SQL syntax. Some of its main features are:

- The HCatalog Connector always reflects the current state of data stored in Hive.
- The HCatalog Connector uses the parallel nature of both Vertica and Hadoop to process Hive data. The result is that querying data through the HCatalog Connector is often faster than querying the data directly through Hive.
- Since Vertica performs the extraction and parsing of data, the HCatalog Connector does not significantly increase the load on your Hadoop cluster.
- The data you query through the HCatalog Connector can be used as if it were native Vertica data. For example, you can execute a query that joins data from a table in an HCatalog schema with a native table.

HCatalog Connection Considerations

There are a few things to keep in mind when using the HCatalog Connector:

- Hive's data is stored in flat files in a distributed filesystem, requiring it to be read and deserialized each time it is queried. This deserialization causes Hive's performance to be much slower than Vertica. The HCatalog Connector has to perform the same process as Hive to read the data. Therefore, querying data stored in Hive using the HCatalog Connector is much slower than querying a native Vertica table. If you need to perform extensive analysis on data stored in Hive, you should consider loading it into Vertica through the HCatalog Connector or the WebHDFS connector. Vertica optimization often makes querying data through the HCatalog Connector faster than directly querying it through Hive.

- Hive supports complex data types such as lists, maps, and structs that Vertica does not support. Columns containing these data types are converted to a JSON representation of the data type and stored as a VARCHAR. See Data Type Conversions from Hive to Vertica.

Note: The HCatalog Connector is read only. It cannot insert data into Hive.

How the HCatalog Connector Works

When planning a query that accesses data from a Hive table, the Vertica HCatalog Connector on the initiator node contacts the WebHCat server in your Hadoop cluster to determine if the table exists. If it does, the connector retrieves the table's metadata from the metastore database so the query planning can continue. When the query executes, all nodes in the Vertica cluster directly retrieve the data necessary for completing the query from HDFS. They then use the Hive SerDe classes to extract the data so the query can execute.

This approach takes advantage of the parallel nature of both Vertica and Hadoop. In addition, by performing the retrieval and extraction of data directly, the HCatalog Connector reduces the impact of the query on the Hadoop cluster.
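To make this flow concrete, the following sketch defines an HCatalog schema and queries one of its tables. The host, port, Hive schema, user name, and table name are hypothetical; the full CREATE HCATALOG SCHEMA syntax is covered later in Defining a Schema Using the HCatalog Connector:

=> CREATE HCATALOG SCHEMA hcat WITH HOSTNAME='webhcat.example.com' PORT=50111
       HCATALOG_SCHEMA='default' HCATALOG_USER='hcatuser';
=> SELECT * FROM hcat.messages LIMIT 10;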

HCatalog Connector Requirements

Before you can use the HCatalog Connector, both your Vertica and Hadoop installations must meet the following requirements.

Vertica Requirements

All of the nodes in your cluster must have a Java Virtual Machine (JVM) installed. See Installing the Java Runtime on Your Vertica Cluster.

You must also add certain libraries distributed with Hadoop and Hive to your Vertica installation directory. See Configuring Vertica for HCatalog.

Hadoop Requirements

Your Hadoop cluster must meet several requirements to operate correctly with the Vertica Connector for HCatalog:

- It must have Hive and HCatalog installed and running. See Apache's HCatalog page for more information.
- It must have WebHCat (formerly known as Templeton) installed and running. See Apache's WebHCat page for details.
- The WebHCat server and all of the HDFS nodes that store HCatalog data must be directly accessible from all of the hosts in your Vertica database. Verify that any firewall separating the Hadoop cluster and the Vertica cluster will pass WebHCat, metastore database, and HDFS traffic.
- The data that you want to query must be in an internal or external Hive table.
- If a table you want to query uses a non-standard SerDe, you must install the SerDe's classes on your Vertica cluster before you can query the data. See Using Non-Standard SerDes.

Testing Connectivity

To test the connection between your database cluster and WebHCat, log into a node in your Vertica cluster. Then, run the following command to execute an HCatalog query:

$ curl http://webhcatserver:port/templeton/v1/status?user.name=hcatusername

Where:

- webhcatserver is the IP address or hostname of the WebHCat server
- port is the port number assigned to the WebHCat service (usually 50111)
- hcatusername is a valid username authorized to use HCatalog

Usually, you want to append ;echo to the command to add a linefeed after the curl command's output. Otherwise, the command prompt is automatically appended to the command's output, making it harder to read. For example:

$ curl http://webhcatserver:port/templeton/v1/status?user.name=hcatusername; echo

If there are no errors, this command returns a status message in JSON format, similar to the following:

{"status":"ok","version":"v1"}

This result indicates that WebHCat is running and that the Vertica host can connect to it and retrieve a result. If you do not receive this result, troubleshoot your Hadoop installation and the connectivity between your Hadoop and Vertica clusters. For details, see Troubleshooting HCatalog Connector Problems.

You can also run some queries to verify that WebHCat is correctly configured to work with Hive. The following example demonstrates listing the databases defined in Hive and the tables defined within a database:

$ curl http://webhcatserver:port/templeton/v1/ddl/database?user.name=hcatusername; echo
{"databases":["default","production"]}
$ curl http://webhcatserver:port/templeton/v1/ddl/database/default/table?user.name=hcatusername; echo
{"tables":["messages","weblogs","tweets","transactions"],"database":"default"}

See Apache's WebHCat reference for details about querying Hive using WebHCat.

Installing the Java Runtime on Your Vertica Cluster

The HCatalog Connector requires a 64-bit Java Virtual Machine (JVM). The JVM must support Java 6 or later, and must be the same version as the one installed on your Hadoop nodes.
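Once a suitable JVM is installed, you point Vertica at it with the JavaBinaryForUDx configuration parameter, described later in Setting the JavaBinaryForUDx Configuration Parameter. As a sketch only, with a hypothetical database name and Java path:

=> ALTER DATABASE exampledb SET JavaBinaryForUDx = '/usr/bin/java';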


and Hadoop Technology

and Hadoop Technology SAS and Hadoop Technology Overview SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2015. SAS and Hadoop Technology: Overview. Cary, NC: SAS Institute

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

Connectivity Pack for Microsoft Guide

Connectivity Pack for Microsoft Guide HP Vertica Analytic Database Software Version: 7.0.x Document Release Date: 2/20/2015 Legal Notices Warranty The only warranties for HP products and services are set forth in the express warranty statements

More information

HDFS Users Guide. Table of contents

HDFS Users Guide. Table of contents Table of contents 1 Purpose...2 2 Overview...2 3 Prerequisites...3 4 Web Interface...3 5 Shell Commands... 3 5.1 DFSAdmin Command...4 6 Secondary NameNode...4 7 Checkpoint Node...5 8 Backup Node...6 9

More information

ORACLE GOLDENGATE BIG DATA ADAPTER FOR HIVE

ORACLE GOLDENGATE BIG DATA ADAPTER FOR HIVE ORACLE GOLDENGATE BIG DATA ADAPTER FOR HIVE Version 1.0 Oracle Corporation i Table of Contents TABLE OF CONTENTS... 2 1. INTRODUCTION... 3 1.1. FUNCTIONALITY... 3 1.2. SUPPORTED OPERATIONS... 4 1.3. UNSUPPORTED

More information

HP Quality Center. Upgrade Preparation Guide

HP Quality Center. Upgrade Preparation Guide HP Quality Center Upgrade Preparation Guide Document Release Date: November 2008 Software Release Date: November 2008 Legal Notices Warranty The only warranties for HP products and services are set forth

More information

Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com

Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com Agenda The rise of Big Data & Hadoop MySQL in the Big Data Lifecycle MySQL Solutions for Big Data Q&A

More information

The release notes provide details of enhancements and features in Cloudera ODBC Driver for Impala 2.5.30, as well as the version history.

The release notes provide details of enhancements and features in Cloudera ODBC Driver for Impala 2.5.30, as well as the version history. Cloudera ODBC Driver for Impala 2.5.30 The release notes provide details of enhancements and features in Cloudera ODBC Driver for Impala 2.5.30, as well as the version history. The following are highlights

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation

More information

Interworks. Interworks Cloud Platform Installation Guide

Interworks. Interworks Cloud Platform Installation Guide Interworks Interworks Cloud Platform Installation Guide Published: March, 2014 This document contains information proprietary to Interworks and its receipt or possession does not convey any rights to reproduce,

More information

Innovative technology for big data analytics

Innovative technology for big data analytics Technical white paper Innovative technology for big data analytics The HP Vertica Analytics Platform database provides price/performance, scalability, availability, and ease of administration Table of

More information

Integrating with Apache Kafka HPE Vertica Analytic Database. Software Version: 7.2.x

Integrating with Apache Kafka HPE Vertica Analytic Database. Software Version: 7.2.x HPE Vertica Analytic Database Software Version: 7.2.x Document Release Date: 2/4/2016 Legal Notices Warranty The only warranties for Hewlett Packard Enterprise products and services are set forth in the

More information

Enhanced Connector Applications SupportPac VP01 for IBM WebSphere Business Events 3.0.0

Enhanced Connector Applications SupportPac VP01 for IBM WebSphere Business Events 3.0.0 Enhanced Connector Applications SupportPac VP01 for IBM WebSphere Business Events 3.0.0 Third edition (May 2012). Copyright International Business Machines Corporation 2012. US Government Users Restricted

More information

Contents. Pentaho Corporation. Version 5.1. Copyright Page. New Features in Pentaho Data Integration 5.1. PDI Version 5.1 Minor Functionality Changes

Contents. Pentaho Corporation. Version 5.1. Copyright Page. New Features in Pentaho Data Integration 5.1. PDI Version 5.1 Minor Functionality Changes Contents Pentaho Corporation Version 5.1 Copyright Page New Features in Pentaho Data Integration 5.1 PDI Version 5.1 Minor Functionality Changes Legal Notices https://help.pentaho.com/template:pentaho/controls/pdftocfooter

More information

Microsoft SQL Server Connector for Apache Hadoop Version 1.0. User Guide

Microsoft SQL Server Connector for Apache Hadoop Version 1.0. User Guide Microsoft SQL Server Connector for Apache Hadoop Version 1.0 User Guide October 3, 2011 Contents Legal Notice... 3 Introduction... 4 What is SQL Server-Hadoop Connector?... 4 What is Sqoop?... 4 Supported

More information

Hadoop Integration Guide

Hadoop Integration Guide HP Vertica Analytic Database Software Version: 7.0.x Document Release Date: 2/20/2015 Legal Notices Warranty The only warranties for HP products and services are set forth in the express warranty statements

More information

Deploying Hadoop with Manager

Deploying Hadoop with Manager Deploying Hadoop with Manager SUSE Big Data Made Easier Peter Linnell / Sales Engineer plinnell@suse.com Alejandro Bonilla / Sales Engineer abonilla@suse.com 2 Hadoop Core Components 3 Typical Hadoop Distribution

More information

Quick Install Guide. Lumension Endpoint Management and Security Suite 7.1

Quick Install Guide. Lumension Endpoint Management and Security Suite 7.1 Quick Install Guide Lumension Endpoint Management and Security Suite 7.1 Lumension Endpoint Management and Security Suite - 2 - Notices Version Information Lumension Endpoint Management and Security Suite

More information

Hadoop Job Oriented Training Agenda

Hadoop Job Oriented Training Agenda 1 Hadoop Job Oriented Training Agenda Kapil CK hdpguru@gmail.com Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module

More information

HP Project and Portfolio Management Center

HP Project and Portfolio Management Center HP Project and Portfolio Management Center Software Version: 9.20 RESTful Web Services Guide Document Release Date: February 2013 Software Release Date: February 2013 Legal Notices Warranty The only warranties

More information

Pro Apache Hadoop. Second Edition. Sameer Wadkar. Madhu Siddalingaiah

Pro Apache Hadoop. Second Edition. Sameer Wadkar. Madhu Siddalingaiah Pro Apache Hadoop Second Edition Sameer Wadkar Madhu Siddalingaiah Contents J About the Authors About the Technical Reviewer Acknowledgments Introduction xix xxi xxiii xxv Chapter 1: Motivation for Big

More information

Metalogix SharePoint Backup. Advanced Installation Guide. Publication Date: August 24, 2015

Metalogix SharePoint Backup. Advanced Installation Guide. Publication Date: August 24, 2015 Metalogix SharePoint Backup Publication Date: August 24, 2015 All Rights Reserved. This software is protected by copyright law and international treaties. Unauthorized reproduction or distribution of this

More information

HP Business Service Management

HP Business Service Management HP Business Service Management for the Windows and Linux operating systems Software Version: 9.10 Business Process Insight Server Administration Guide Document Release Date: August 2011 Software Release

More information

Integration of Apache Hive and HBase

Integration of Apache Hive and HBase Integration of Apache Hive and HBase Enis Soztutar enis [at] apache [dot] org @enissoz Page 1 About Me User and committer of Hadoop since 2007 Contributor to Apache Hadoop, HBase, Hive and Gora Joined

More information

HADOOP MOCK TEST HADOOP MOCK TEST I

HADOOP MOCK TEST HADOOP MOCK TEST I http://www.tutorialspoint.com HADOOP MOCK TEST Copyright tutorialspoint.com This section presents you various set of Mock Tests related to Hadoop Framework. You can download these sample mock tests at

More information

Sage Intelligence Financial Reporting for Sage ERP X3 Version 6.5 Installation Guide

Sage Intelligence Financial Reporting for Sage ERP X3 Version 6.5 Installation Guide Sage Intelligence Financial Reporting for Sage ERP X3 Version 6.5 Installation Guide Table of Contents TABLE OF CONTENTS... 3 1.0 INTRODUCTION... 1 1.1 HOW TO USE THIS GUIDE... 1 1.2 TOPIC SUMMARY...

More information

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Apache Sentry Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Agenda Various aspects of data security Apache Sentry for authorization Key concepts of Apache Sentry Sentry features Sentry architecture

More information

HPE Vertica & Hadoop. Tapping Innovation to Turbocharge Your Big Data. #SeizeTheData

HPE Vertica & Hadoop. Tapping Innovation to Turbocharge Your Big Data. #SeizeTheData HPE Vertica & Hadoop Tapping Innovation to Turbocharge Your Big Data #SeizeTheData The HPE Vertica portfolio One Vertica Engine running on Cloud, Bare Metal, or Hadoop Data Nodes HPE Vertica OnDemand &

More information

HP Software as a Service

HP Software as a Service HP Software as a Service Software Version: 6.1 Federated SSO Document Release Date: August 2013 Legal Notices Warranty The only warranties for HP products and services are set forth in the express warranty

More information

New Features... 1 Installation... 3 Upgrade Changes... 3 Fixed Limitations... 4 Known Limitations... 5 Informatica Global Customer Support...

New Features... 1 Installation... 3 Upgrade Changes... 3 Fixed Limitations... 4 Known Limitations... 5 Informatica Global Customer Support... Informatica Corporation B2B Data Exchange Version 9.5.0 Release Notes June 2012 Copyright (c) 2006-2012 Informatica Corporation. All rights reserved. Contents New Features... 1 Installation... 3 Upgrade

More information

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give

More information

HP OpenView AssetCenter

HP OpenView AssetCenter HP OpenView AssetCenter Software version: 5.0 Integration with software distribution tools Build number: 50 Legal Notices Warranty The only warranties for HP products and services are set forth in the

More information

EMC Documentum Connector for Microsoft SharePoint

EMC Documentum Connector for Microsoft SharePoint EMC Documentum Connector for Microsoft SharePoint Version 7.1 Installation Guide EMC Corporation Corporate Headquarters Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com Legal Notice Copyright 2013-2014

More information

ODBC Client Driver Help. 2015 Kepware, Inc.

ODBC Client Driver Help. 2015 Kepware, Inc. 2015 Kepware, Inc. 2 Table of Contents Table of Contents 2 4 Overview 4 External Dependencies 4 Driver Setup 5 Data Source Settings 5 Data Source Setup 6 Data Source Access Methods 13 Fixed Table 14 Table

More information

HP AppPulse Active. Software Version: 2.2. Real Device Monitoring For AppPulse Active

HP AppPulse Active. Software Version: 2.2. Real Device Monitoring For AppPulse Active HP AppPulse Active Software Version: 2.2 For AppPulse Active Document Release Date: February 2015 Software Release Date: November 2014 Legal Notices Warranty The only warranties for HP products and services

More information

IBM Campaign Version-independent Integration with IBM Engage Version 1 Release 3 April 8, 2016. Integration Guide IBM

IBM Campaign Version-independent Integration with IBM Engage Version 1 Release 3 April 8, 2016. Integration Guide IBM IBM Campaign Version-independent Integration with IBM Engage Version 1 Release 3 April 8, 2016 Integration Guide IBM Note Before using this information and the product it supports, read the information

More information

Integrate Master Data with Big Data using Oracle Table Access for Hadoop

Integrate Master Data with Big Data using Oracle Table Access for Hadoop Integrate Master Data with Big Data using Oracle Table Access for Hadoop Kuassi Mensah Oracle Corporation Redwood Shores, CA, USA Keywords: Hadoop, BigData, Hive SQL, Spark SQL, HCatalog, StorageHandler

More information

HDFS. Hadoop Distributed File System

HDFS. Hadoop Distributed File System HDFS Kevin Swingler Hadoop Distributed File System File system designed to store VERY large files Streaming data access Running across clusters of commodity hardware Resilient to node failure 1 Large files

More information

docs.hortonworks.com

docs.hortonworks.com docs.hortonworks.com Hortonworks Data Platform: Administering Ambari Copyright 2012-2015 Hortonworks, Inc. Some rights reserved. The Hortonworks Data Platform, powered by Apache Hadoop, is a massively

More information

WhatsUp Gold v16.2 MSP Edition Deployment Guide This guide provides information about installing and configuring WhatsUp Gold MSP Edition to central

WhatsUp Gold v16.2 MSP Edition Deployment Guide This guide provides information about installing and configuring WhatsUp Gold MSP Edition to central WhatsUp Gold v16.2 MSP Edition Deployment Guide This guide provides information about installing and configuring WhatsUp Gold MSP Edition to central and remote sites. Contents Table of Contents Using WhatsUp

More information

docs.hortonworks.com

docs.hortonworks.com docs.hortonworks.com Hortonworks Data Platform: Hive Performance Tuning Copyright 2012-2015 Hortonworks, Inc. Some rights reserved. The Hortonworks Data Platform, powered by Apache Hadoop, is a massively

More information

How To Use Hp Vertica Ondemand

How To Use Hp Vertica Ondemand Data sheet HP Vertica OnDemand Enterprise-class Big Data analytics in the cloud Enterprise-class Big Data analytics for any size organization Vertica OnDemand Organizations today are experiencing a greater

More information

Data Domain Profiling and Data Masking for Hadoop

Data Domain Profiling and Data Masking for Hadoop Data Domain Profiling and Data Masking for Hadoop 1993-2015 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or

More information

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview Programming Hadoop 5-day, instructor-led BD-106 MapReduce Overview The Client Server Processing Pattern Distributed Computing Challenges MapReduce Defined Google's MapReduce The Map Phase of MapReduce

More information

How To Install An Aneka Cloud On A Windows 7 Computer (For Free)

How To Install An Aneka Cloud On A Windows 7 Computer (For Free) MANJRASOFT PTY LTD Aneka 3.0 Manjrasoft 5/13/2013 This document describes in detail the steps involved in installing and configuring an Aneka Cloud. It covers the prerequisites for the installation, the

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2016 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

HP ProLiant Essentials Vulnerability and Patch Management Pack Planning Guide

HP ProLiant Essentials Vulnerability and Patch Management Pack Planning Guide HP ProLiant Essentials Vulnerability and Patch Management Pack Planning Guide Product overview... 3 Vulnerability scanning components... 3 Vulnerability fix and patch components... 3 Checklist... 4 Pre-installation

More information

IBM Campaign and IBM Silverpop Engage Version 1 Release 2 August 31, 2015. Integration Guide IBM

IBM Campaign and IBM Silverpop Engage Version 1 Release 2 August 31, 2015. Integration Guide IBM IBM Campaign and IBM Silverpop Engage Version 1 Release 2 August 31, 2015 Integration Guide IBM Note Before using this information and the product it supports, read the information in Notices on page 93.

More information

Hadoop & Spark Using Amazon EMR

Hadoop & Spark Using Amazon EMR Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?

More information

HP Vertica Integration with SAP Business Objects: Tips and Techniques. HP Vertica Analytic Database

HP Vertica Integration with SAP Business Objects: Tips and Techniques. HP Vertica Analytic Database HP Vertica Integration with SAP Business Objects: Tips and Techniques HP Vertica Analytic Database HP Big Data Document Release Date: June 23, 2015 Legal Notices Warranty The only warranties for HP products

More information

HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM

HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM 1. Introduction 1.1 Big Data Introduction What is Big Data Data Analytics Bigdata Challenges Technologies supported by big data 1.2 Hadoop Introduction

More information

HP 3PAR Recovery Manager 4.5.0 Software for Microsoft Exchange Server 2007, 2010, and 2013

HP 3PAR Recovery Manager 4.5.0 Software for Microsoft Exchange Server 2007, 2010, and 2013 HP 3PAR Recovery Manager 4.5.0 Software for Microsoft Exchange Server 2007, 2010, and 2013 Release Notes Abstract This release notes document is for HP 3PAR Recovery Manager 4.5.0 Software for Microsoft

More information

SQL Server 2008 Designing, Optimizing, and Maintaining a Database Session 1

SQL Server 2008 Designing, Optimizing, and Maintaining a Database Session 1 SQL Server 2008 Designing, Optimizing, and Maintaining a Database Course The SQL Server 2008 Designing, Optimizing, and Maintaining a Database course will help you prepare for 70-450 exam from Microsoft.

More information

Teradata Connector for Hadoop Tutorial

Teradata Connector for Hadoop Tutorial Teradata Connector for Hadoop Tutorial Version: 1.0 April 2013 Page 1 Teradata Connector for Hadoop Tutorial v1.0 Copyright 2013 Teradata All rights reserved Table of Contents 1 Introduction... 5 1.1 Overview...

More information

WhatsUp Gold v16.2 Installation and Configuration Guide

WhatsUp Gold v16.2 Installation and Configuration Guide WhatsUp Gold v16.2 Installation and Configuration Guide Contents Installing and Configuring Ipswitch WhatsUp Gold v16.2 using WhatsUp Setup Installing WhatsUp Gold using WhatsUp Setup... 1 Security guidelines

More information

HP reference configuration for entry-level SAS Grid Manager solutions

HP reference configuration for entry-level SAS Grid Manager solutions HP reference configuration for entry-level SAS Grid Manager solutions Up to 864 simultaneous SAS jobs and more than 3 GB/s I/O throughput Technical white paper Table of contents Executive summary... 2

More information

vcenter Chargeback User s Guide vcenter Chargeback 1.0 EN-000186-00

vcenter Chargeback User s Guide vcenter Chargeback 1.0 EN-000186-00 vcenter Chargeback 1.0 EN-000186-00 You can find the most up-to-date technical documentation on the VMware Web site at: http://www.vmware.com/support/ The VMware Web site also provides the latest product

More information

StreamServe Persuasion SP4

StreamServe Persuasion SP4 StreamServe Persuasion SP4 Installation Guide Rev B StreamServe Persuasion SP4 Installation Guide Rev B 2001-2009 STREAMSERVE, INC. ALL RIGHTS RESERVED United States patent #7,127,520 No part of this document

More information

About Recovery Manager for Active

About Recovery Manager for Active Dell Recovery Manager for Active Directory 8.6.1 May 30, 2014 These release notes provide information about the Dell Recovery Manager for Active Directory release. About Resolved issues Known issues System

More information

Ontrack PowerControls User Guide Version 8.0

Ontrack PowerControls User Guide Version 8.0 ONTRACK POWERCONTROLS Ontrack PowerControls User Guide Version 8.0 Instructions for operating Ontrack PowerControls in Microsoft SQL Server Environments NOVEMBER 2014 NOTICE TO USERS Ontrack PowerControls

More information

ADAM 5.5. System Requirements

ADAM 5.5. System Requirements ADAM 5.5 System Requirements 1 1. Overview The schema below shows an overview of the ADAM components that will be installed and set up. ADAM Server: hosts the ADAM core components. You must install the

More information

Using RDBMS, NoSQL or Hadoop?

Using RDBMS, NoSQL or Hadoop? Using RDBMS, NoSQL or Hadoop? DOAG Conference 2015 Jean- Pierre Dijcks Big Data Product Management Server Technologies Copyright 2014 Oracle and/or its affiliates. All rights reserved. Data Ingest 2 Ingest

More information

HP LeftHand SAN Solutions

HP LeftHand SAN Solutions HP LeftHand SAN Solutions Support Document Application Notes Backup Exec 11D VSS Snapshots and Transportable Offhost Backup Legal Notices Warranty The only warranties for HP products and services are set

More information

Installing and Configuring vcenter Multi-Hypervisor Manager

Installing and Configuring vcenter Multi-Hypervisor Manager Installing and Configuring vcenter Multi-Hypervisor Manager vcenter Server 5.1 vcenter Multi-Hypervisor Manager 1.1 This document supports the version of each product listed and supports all subsequent

More information

Guidelines for using Microsoft System Center Virtual Machine Manager with HP StorageWorks Storage Mirroring

Guidelines for using Microsoft System Center Virtual Machine Manager with HP StorageWorks Storage Mirroring HP StorageWorks Guidelines for using Microsoft System Center Virtual Machine Manager with HP StorageWorks Storage Mirroring Application Note doc-number Part number: T2558-96337 First edition: June 2009

More information

OLH: Oracle Loader for Hadoop OSCH: Oracle SQL Connector for Hadoop Distributed File System (HDFS)

OLH: Oracle Loader for Hadoop OSCH: Oracle SQL Connector for Hadoop Distributed File System (HDFS) Use Data from a Hadoop Cluster with Oracle Database Hands-On Lab Lab Structure Acronyms: OLH: Oracle Loader for Hadoop OSCH: Oracle SQL Connector for Hadoop Distributed File System (HDFS) All files are

More information

HP LeftHand SAN Solutions

HP LeftHand SAN Solutions HP LeftHand SAN Solutions Support Document Installation Manuals Installation and Setup Guide Health Check Legal Notices Warranty The only warranties for HP products and services are set forth in the express

More information