Revolution R Enterprise 7 Hadoop Configuration Guide



The correct bibliographic citation for this manual is as follows: Revolution Analytics, Inc. 2015. Revolution R Enterprise 7 Hadoop Configuration Guide. Revolution Analytics, Inc., Mountain View, CA.

Revolution R Enterprise 7 Hadoop Configuration Guide

Copyright 2015 Revolution Analytics, Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of Revolution Analytics.

U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the Government is subject to restrictions as set forth in subdivision (c)(1)(ii) of the Rights in Technical Data and Computer Software clause.

Revolution R, Revolution R Enterprise, RPE, RevoScaleR, DeployR, RevoTreeView, and Revolution Analytics are trademarks of Revolution Analytics. Revolution R includes the Intel Math Kernel Library. RevoScaleR includes Stat/Transfer software under license from Circle Systems, Inc. Stat/Transfer is a trademark of Circle Systems, Inc. Other product names mentioned herein are used for identification purposes only and may be trademarks of their respective owners.

Revolution Analytics
2570 West El Camino Real, Suite 222
Mountain View, CA
U.S.A.

Revised on August 20, 2015

We want our documentation to be useful, and we want it to address your needs. If you have comments on this or any Revolution document, write to doc@revolutionanalytics.com.

Table of Contents

1 Introduction
  1.1 System Requirements
  1.2 Basic Hadoop Terminology
  1.3 Verifying the Hadoop Installation
  1.4 Adjusting Hadoop Memory Limits (Hadoop 2.x Systems Only)
2 Hadoop Security with Kerberos Authentication
3 Installing Revolution R Enterprise on a Cluster
  3.1 Standard Command Line Install
  3.2 Distributed Installation with RevoMPM
  3.3 Installing the Revolution R Enterprise JAR File
  3.4 Environment Variables for Hadoop
  3.5 Creating Directories for Revolution R Enterprise
  3.6 Installing on a Cloudera Manager System Using a Cloudera Manager Parcel
4 Verifying Installation
5 Troubleshooting the Installation
  5.1 No Valid Credentials
  5.2 Unable to Load Class RevoScaleR
  5.3 Classpath Errors
  5.4 Unable to Load Shared Library
6 Getting Started with Hadoop
7 Using HDFS Caching
8 Creating an R Package Parcel for Cloudera Manager


1 Introduction

Revolution R Enterprise is the scalable data analytics solution, designed to work seamlessly whether your computing environment is a single-user workstation, a local network of connected servers, or a cluster in the cloud. This manual is intended for those who need to configure a Hadoop cluster for use with Revolution R Enterprise. It assumes that you have download instructions for Revolution R Enterprise and the related files; if you do not have those instructions, please contact Revolution Analytics Technical Support for assistance.

1.1 System Requirements

Revolution R Enterprise works with the following Hadoop distributions:

- Cloudera CDH 5.0, 5.1, 5.2, 5.3
- Hortonworks HDP 1.3.0, HDP 2.0.0, HDP 2.1.0, HDP 2.2.0
- MapR 3.0.2, MapR 3.0.3, MapR 3.1.0, MapR 3.1.1, MapR 4.0.1, MapR 4.0.2 (provided this version of MapR has been updated to the GA mapr-patch; contact MapR to obtain the patch)

Your cluster installation must include the C APIs contained in the libhdfs package; these are required for Revolution R Enterprise. See your Hadoop documentation for information on installing this package.

The Hadoop distribution must be installed on Red Hat Enterprise Linux 5 or 6, or a fully compatible operating system. Revolution R Enterprise should be installed on all nodes of the cluster.

Revolution R Enterprise requires Hadoop MapReduce and the Hadoop Distributed File System (HDFS) for HDP 1.x and MapR 3.x installations, or HDFS, Hadoop YARN, and Hadoop MapReduce2 for CDH5, HDP 2.x, and MapR 4.0.x installations. The HDFS, YARN, and MapReduce clients must be installed on all nodes on which you plan to run Revolution R Enterprise, as must Revolution R Enterprise itself.

Minimum system configuration requirements for Revolution R Enterprise are as follows:

- Processor: 64-bit CPU with x86-compatible architecture (variously known as AMD64, Intel64, x86-64, IA-32e, EM64T, or x64). Itanium-architecture CPUs (also known as IA-64) are not supported. Multiple-core CPUs are recommended.
- Operating System: Red Hat Enterprise Linux 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, or 6.6. Only 64-bit operating systems are supported. (For HDP systems only, RHEL 5.x operating systems are also supported.)
- Memory: A minimum of 4GB of RAM is required for Revolution R Enterprise; 8GB or more are recommended. Hadoop itself has substantial memory requirements; see your Hadoop distribution's documentation for specific recommendations.

- Disk Space: A minimum of 500MB of disk space is required on each node for the RRE installation. Hadoop itself has substantial disk space requirements; see your Hadoop distribution's documentation for specific recommendations.
- Package Dependencies: Revolution R Enterprise, like most Linux applications, depends upon a number of Linux packages. A few of these, listed in Table 1, are explicitly required by Revolution R Enterprise; the remainder, listed in Table 2, are in turn required by those dependencies and are installed automatically while the install script runs. (A yum sketch for pre-installing the Table 1 packages follows the tables.)

Table 1. Packages Explicitly Required by Revolution R Enterprise

ed, tk-devel, gcc-objc, readline-devel, ncurses-devel, perl, libgfortran, libicu, ghostscript-fonts, libSM-devel, libXmu-devel, cairo-devel, make, gcc-c++, libtiff-devel, pango-devel, texinfo, pango, libjpeg*-devel, gcc-gfortran, libicu-devel, bzip2-devel

Table 2. Secondary Dependencies Installed for Revolution R Enterprise

cloog-ppl, fontconfig-devel, freetype-devel, glib2-devel, libobjc, libstdc++-devel, libXau-devel, libXext-devel, libXmu, libXt-devel, pixman-devel, tcl, gmp, tk, zlib-devel, cpp, freetype, gcc, libICE-devel, libpng-devel, libX11-devel, libxcb-devel, libXft-devel, libXrender-devel, mpfr, ppl, glibc-headers, tcl-devel, kernel-headers, xorg-x11-proto-devel
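If you want to confirm the minimum configuration and pre-install the Table 1 packages before running the installers, a minimal sketch such as the following can be run as root on each node. It assumes a yum-based RHEL 6 system and simply mirrors the Table 1 list; the RRE install script resolves these dependencies itself, so this step is optional.

# Quick checks of the minimum system configuration described above
uname -m                  # should report x86_64
cat /etc/redhat-release   # should report RHEL 6.x (or 5.x for HDP-only systems)
free -m                   # at least 4096 MB of RAM; 8192 MB or more recommended

# Optional pre-install of the packages listed in Table 1 (run as root on each node)
yum install -y ed tk-devel gcc-objc readline-devel ncurses-devel perl \
    libgfortran libicu ghostscript-fonts libSM-devel libXmu-devel cairo-devel \
    make gcc-c++ libtiff-devel pango-devel texinfo pango "libjpeg*-devel" \
    gcc-gfortran libicu-devel bzip2-devel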

1.2 Basic Hadoop Terminology

The following terms apply to computers and services within the Hadoop cluster, and define the roles of hosts within the cluster:

Hadoop 1.x Installations (HDP 1.3.0, MapR 3.x)

- JobTracker: The Hadoop service that distributes MapReduce tasks to specific nodes in the cluster. The JobTracker queries the NameNode to find the location of the data needed for the tasks, then distributes the tasks to TaskTracker nodes near (or coextensive with) the data. For small clusters, the JobTracker may run on the NameNode, but this is not recommended for production use.
- NameNode: A host in the cluster that is the master node of the HDFS file system, managing the directory tree of all files in the file system. In small clusters, the NameNode may host the JobTracker, but this is not recommended for production use.
- TaskTracker: Any host that can accept tasks (Map, Reduce, and Shuffle operations) from a JobTracker. TaskTrackers are usually, but not always, also DataNodes, so that tasks assigned to a TaskTracker can work on data stored on the same node.
- DataNode: A host that stores data in the Hadoop Distributed File System. DataNodes connect to the NameNode and respond to requests from the NameNode for file system operations.

Hadoop 2.x Installations (CDH5, HDP 2.x, MapR 4.0.x)

- Resource Manager: The Hadoop service that distributes MapReduce and other Hadoop tasks to specific nodes in the cluster. The Resource Manager takes over the scheduling functions of the old JobTracker, determining which nodes are appropriate for the current job.
- NameNode: A host in the cluster that is the master node of the HDFS file system, managing the directory tree of all files in the file system.
- Application Master: New in MapReduce2/YARN, the Application Master takes over task progress coordination from the old JobTracker, working with node managers on the individual task nodes. The Application Master negotiates with the Resource Manager for cluster resources, which are allocated as a set of containers, with each container running an application-specific task on a particular node.
- NodeManager: NodeManagers manage the containers allocated for a given task on a given node, coordinating with the Resource Manager and the Application Masters. NodeManagers are usually, but not always, also DataNodes, and most frequently the containers on a given node work with data on the same node.
- DataNode: A host that stores data in the Hadoop Distributed File System. DataNodes connect to the NameNode and respond to requests from the NameNode for file system operations.
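To see how these roles map onto your own cluster, the stock Hadoop command-line tools can report which hosts are currently acting as data and compute nodes. This is only a convenience sketch; it assumes the hdfs and yarn (or, on 1.x systems, hadoop) clients are on your PATH and that you have permission to query the cluster:

# On a Hadoop 2.x (YARN) cluster
hdfs dfsadmin -report     # DataNodes known to the NameNode
yarn node -list           # NodeManagers known to the Resource Manager

# On a Hadoop 1.x cluster, the equivalent report comes from the hadoop client
hadoop dfsadmin -report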

1.3 Verifying the Hadoop Installation

We assume you have already installed Hadoop on your cluster. If not, use the documentation provided with your Hadoop distribution to help you perform the installation; Hadoop installation is complicated and involves many steps. Following the documentation carefully does not guarantee success, but it does make troubleshooting easier. In our testing, we have found the following documents helpful:

- Cloudera CDH5, package install
- Cloudera CDH5, Cloudera Manager parcel install
- Hortonworks HDP 1.3
- Hortonworks HDP 2.1
- Hortonworks HDP 1.x or 2.x, Ambari install
- MapR 3.x install
- MapR (M5 Edition)

If you are using Cloudera Manager, it is important to know whether your installation was via packages or parcels; the Revolution R Enterprise Cloudera Manager parcel can be used only with parcel installs. If you have installed Cloudera Manager via packages, do not attempt to use the RRE Cloudera Manager parcel; use the standard Revolution R Enterprise for Linux installer instead.

It is useful to confirm that Hadoop itself is running correctly before attempting to install Revolution R Enterprise on the cluster. Hadoop comes with example programs, in the jar file hadoop-mapreduce-examples.jar, that you can run to verify that your Hadoop installation is working properly. The following command should display a list of the available examples:

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar

(On MapR, the quick installation installs the Hadoop files to /opt/mapr by default; the path to the examples jar file is /opt/mapr/hadoop/hadoop-<version>/hadoop-<version>-dev-examples.jar. Similarly, on Cloudera Manager parcel installs, the default path to the examples is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar.)

The following runs the pi example, which uses Monte Carlo sampling to estimate pi; the 5 tells Hadoop to use 5 mappers, and the 300 says to use 300 samples per map:

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 5 300

If you can successfully run one or more of the Hadoop examples, your Hadoop installation was successful and you are ready to install Revolution R Enterprise.
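It is also worth confirming that HDFS itself accepts reads and writes from the node where you will work. The sketch below uses only the standard Hadoop fs shell in its Hadoop 2.x form; the file and directory names are arbitrary, and on Hadoop 1.x the final cleanup command is hadoop fs -rmr instead of -rm -r:

# Write a small local file into HDFS and read it back
echo "hello hadoop" > /tmp/smoke.txt
hadoop fs -mkdir /tmp/rre-smoketest
hadoop fs -put /tmp/smoke.txt /tmp/rre-smoketest/
hadoop fs -cat /tmp/rre-smoketest/smoke.txt

# Clean up the scratch directory (use "hadoop fs -rmr" on Hadoop 1.x)
hadoop fs -rm -r /tmp/rre-smoketest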

1.4 Adjusting Hadoop Memory Limits (Hadoop 2.x Systems Only)

On YARN-based Hadoop systems (CDH5, HDP 2.x, MapR 4.0.x), we have found that the default settings for Map and Reduce memory limits are inadequate for large RevoScaleR jobs. The memory available for R is the difference between the container's memory limit and the memory given to the Java Virtual Machine. To allow large RevoScaleR jobs to run, we need to modify four properties in mapred-site.xml and one in yarn-site.xml (these files are typically found in /etc/hadoop/conf), as follows:

(in mapred-site.xml)

<name>mapreduce.map.memory.mb</name>
<value>2048</value>

<name>mapreduce.reduce.memory.mb</name>
<value>2048</value>

<name>mapreduce.map.java.opts</name>
<value>-Xmx1229m</value>

<name>mapreduce.reduce.java.opts</name>
<value>-Xmx1229m</value>

(in yarn-site.xml)

<name>yarn.nodemanager.resource.memory-mb</name>
<value>3198</value>

If you are using a cluster manager such as Cloudera Manager or Ambari, these settings must usually be modified using the web interface.

2 Hadoop Security with Kerberos Authentication

By default, most Hadoop configurations are relatively insecure. Security features such as SELinux and iptables firewalls are often turned off to help get the Hadoop cluster up and running quickly. However, the Cloudera and Hortonworks distributions of Hadoop support Kerberos authentication, which allows Hadoop to operate in a much more secure manner. To use Kerberos authentication with your particular version of Hadoop, see one of the following documents:

- Cloudera CDH5
- Cloudera CDH5 with Cloudera Manager 5
- Hortonworks HDP 1.3
- Hortonworks HDP 2.x
- Hortonworks HDP (1.3 or 2.x) with Ambari

If you have trouble restarting your Hadoop cluster after enabling Kerberos authentication, the problem is most likely with your keytab files. Be sure you have created all the required Kerberos principals and generated appropriate keytab entries for all of your nodes, and that the keytab files have been placed correctly with the appropriate permissions. (We have found that in Hortonworks clusters managed with Ambari, it is important that the spnego.service.keytab file be present on all the nodes of the cluster, not just the NameNode and secondary NameNode.)

The MapR distribution also supports Kerberos authentication, but most MapR installations use that distribution's wire-level security feature. See the MapR Security Guide for details.
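When Kerberos is enabled, each user needs a valid ticket before Revolution R Enterprise can submit Hadoop jobs (see also Sections 4 and 5.1). A minimal sketch of the ticket workflow with the standard MIT Kerberos client tools follows; it assumes your administrator has already created a principal for you, and username@YOUR.REALM.COM is a placeholder for that principal:

# Obtain a Kerberos ticket (you will be prompted for your password)
kinit username@YOUR.REALM.COM

# Confirm that a valid ticket is now in your credential cache
klist

# Discard the ticket when you are finished
kdestroy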

3 Installing Revolution R Enterprise on a Cluster

It is highly recommended that you install Revolution R Enterprise as root on each node of your Hadoop cluster. This ensures that all users will have access to it by default. Non-root installs are supported, but require that the path to the R executable files be added to each user's path. If you are installing on a Cloudera Manager system using a parcel install, skip to Section 3.6, Installing on a Cloudera Manager System Using a Cloudera Manager Parcel.

3.1 Standard Command Line Install

For most users, installing on the cluster means simply running the standard Revolution R Enterprise installers on each node of the cluster:

1. Log in as root or as a user with sudo privileges. If the latter, precede commands requiring root privileges with sudo. (If you do not have root access or sudo privileges, you can install as a non-root user. See Section 3.2 for details.)
2. Make sure the system repositories are up to date prior to installing Revolution R Open:
   sudo yum clean all
3. Download the Revolution R Open tarball.
4. Change to the directory to which you downloaded the tarball (for example, /tmp):
   cd /tmp
5. Unpack the contents of the RRO installer tarball (the asterisk will be a 5 or 6, depending on your RHEL operating system):
   tar xvzf RRO-8.0.3-el*.x86_64.tar.gz
6. Change to the RRO directory:
   cd RRO-8.0.3
7. Run the RRO install script:
   ./install.sh
8. Download and unpack the Revolution R Connector tarball, then run the installer script, as follows (the tarball name may include an operating system ID, denoted below by <OS>; the complete name of the tarball will be in your download letter):
   tar xvzf Revolution-R-Connector-<OS>.tar.gz
   pushd rrconn
   ./install_rr_conn.sh -y -p /usr/lib64/RRO-8.0.3/R-3.1.3
   popd
9. Download and unpack the Revolution R Enterprise tarball, then run the installer script, as follows (the tarball name may include an operating system ID, denoted below by <OS>; the complete name of the tarball will be in your download letter):
   tar xvzf Revolution-R-Enterprise-<OS>.tar.gz
   pushd rrent
   ./install_rr_ent.sh -a -y -p /usr/lib64/RRO-8.0.3/R-3.1.3
   popd

This installs Revolution R Enterprise with the standard options (including loading the rpart and lattice packages by default when RevoScaleR is loaded).
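After the three installers finish, it is worth confirming on at least one node that the Revo64 launcher is on the PATH and starts the installation you expect. This is only a sketch; it assumes the default install locations and that Revo64 passes standard R command-line options through to R:

# Confirm the launcher is installed and on the PATH
which Revo64

# Start R non-interactively; print the R version and the RRE installation directory
Revo64 -q -e 'R.version.string; Revo.home()'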

3.2 Distributed Installation with RevoMPM

If your Hadoop cluster is configured to allow passwordless ssh access among the various nodes, you can use the Revolution Multi-Node Package Manager (RevoMPM) to deploy Revolution R Enterprise across your cluster. RevoMPM is a lightweight wrapper around the Python fabric package.

On any one node of the cluster, create a directory for the installer, such as /var/tmp/revo-install, and download the following files to that directory (you can find the links in your welcome email):

RevoMPM-*.x86_64.rpm
RRO-*.x86_64.tar.gz
Revolution-R-Connector-*.tar.gz
Revolution-R-Enterprise-*.tar.gz

(The * in the above file names represents an operating system indicator, such as RHEL5 or RHEL6.)

For best results, install RevoMPM as root. You install RevoMPM directly from the rpm as follows:

rpm -i RevoMPM-*.x86_64.rpm

When run as root, this installs RevoMPM to the /opt/revompm directory. Add that directory's bin subdirectory (/opt/revompm/bin) to your system PATH variable.

To ensure ready access to your nodes via RevoMPM, edit the file /opt/revompm/hosts.cfg to list the nodes in your cluster. The host configuration file follows the standard Python config file format. Only one section, groups, is required in this config file, e.g.:

[groups]
nodes = <ip-address-1>
    <ip-address-2>
    <ip-address-3>
    <ip-address-4>

Note the four spaces of indentation for continuation lines; if this is missing, the underlying Python interpreter will report a parsing error. (Any consistent number of spaces or tabs can be used; four spaces is the Python standard.)

Issue the following commands to distribute and install Revolution R Enterprise (ensure that each revompm command is run on a single logical line, even if it wraps below):

revompm cmd:"mkdir -p /var/tmp/revo-install"
revompm dcp:/var/tmp/revo-install/RRO-*.x86_64.tar.gz
revompm cmd:"cd /var/tmp/revo-install; tar -xzf RRO-*.x86_64.tar.gz"
revompm cmd:"yum clean all"
revompm cmd:"cd /var/tmp/revo-install/RRO-8.0.3; ./install.sh"
revompm dcp:/var/tmp/revo-install/Revolution-R-Connector-<OS>.tar.gz
revompm cmd:"cd /var/tmp/revo-install; tar zxf Revolution-R-Connector-*"
revompm cmd:"cd /var/tmp/revo-install/rrconn; ./install_rr_conn.sh -y -p /usr/lib64/RRO-8.0.3/R-3.1.3"
revompm dcp:/var/tmp/revo-install/Revolution-R-Enterprise-<OS>.tar.gz
revompm cmd:"cd /var/tmp/revo-install; tar zxf Revolution-R-Enterprise-*"
revompm cmd:"cd /var/tmp/revo-install/rrent; ./install_rr_ent.sh -y -a -p /usr/lib64/RRO-8.0.3/R-3.1.3"

For complete instructions on installing and running RevoMPM (including instructions for installing as a non-root user), see the RevoMPM User's Guide.
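Because RevoMPM relies on passwordless ssh to every host listed in hosts.cfg, it can save time to confirm connectivity before pushing the tarballs. The revompm cmd: syntax shown above can be reused for a quick check; this is only a sketch, and it assumes /opt/revompm/bin is already on your PATH:

# Run a trivial command on every node in the [groups] nodes list;
# each host should report its hostname without prompting for a password
revompm cmd:"hostname"

# Confirm the staging directory is usable on every node
revompm cmd:"ls -ld /var/tmp/revo-install"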

3.3 Installing the Revolution R Enterprise JAR File

Using Revolution R Enterprise in Hadoop requires the presence of the Revolution R Enterprise Java Archive (JAR) file scaleR-hadoop-0.1-SNAPSHOT.jar. This file is installed in the scripts directory of your Revolution R Enterprise installation (typically /usr/lib64/Revo-7.4/scripts), and is typically linked to the standard Hadoop jar file location (typically $HADOOP_HOME/lib or $HADOOP_PREFIX/lib). If you are installing RRE as a non-root user, you may need to obtain root access to link this file appropriately.

3.4 Environment Variables for Hadoop

The file RevoHadoopEnvVars.site in the scripts directory of your Revolution R Enterprise installation (typically /usr/lib64/Revo-7.4/scripts) should be sourced by all users, by adding the following line to the .bash_profile file:

. /usr/lib64/Revo-7.4/scripts/RevoHadoopEnvVars.site

(The period (.) at the beginning is part of the command, and must be included.) This file sets the following environment variables for use by Revolution R Enterprise:

- HADOOP_HOME: This should be set to the directory containing the Hadoop files.
- HADOOP_CMD: This should be set to the command used to invoke Hadoop.

- HADOOP_CLASSPATH: This should be set to include the full path to the RRE jar files (typically /usr/lib64/Revo-7.4/scripts).
- CLASSPATH: This should be a fully expanded CLASSPATH with access to all required Hadoop JAR files.
- JAVA_LIBRARY_PATH: If necessary, this should be set to include paths to the directories containing Hadoop jar files.
- HADOOP_STREAMING: This should be set to the path of the Hadoop streaming jar file.

These environment variables are written to the file automatically on installation, but can be edited by hand if necessary.

3.5 Creating Directories for Revolution R Enterprise

Each user should ensure that the appropriate user directories exist, and if necessary, create them with the following commands:

hadoop fs -mkdir /user/RevoShare/$USER
hadoop fs -chmod uog+rwx /user/RevoShare/$USER
mkdir -p /var/RevoShare/$USER
chmod uog+rwx /var/RevoShare/$USER

The HDFS directory can also be created in a user's R session (provided the top-level /user/RevoShare has the appropriate permissions) using the following RevoScaleR commands (substitute your actual user name for "username"):

rxHadoopMakeDir("/user/RevoShare/username")
rxHadoopCommand("fs -chmod uog+rwx /user/RevoShare/username")

3.6 Installing on a Cloudera Manager System Using a Cloudera Manager Parcel

If you are running a Cloudera Hadoop cluster managed by Cloudera Manager, and if Cloudera itself was installed via a Cloudera Manager parcel, you can use the Revolution R Enterprise Cloudera Manager parcels to install Revolution R Enterprise on all the nodes of your cluster. Three parcels are required:

- The Revolution R Open parcel installs open-source R on the nodes of your Cloudera cluster.
- The Revolution R Connector parcel installs open-source Revolution components on the nodes of your Cloudera cluster.
- The Revolution R Enterprise parcel installs proprietary Revolution components on the nodes of your Cloudera cluster.

Revolution R Enterprise requires several packages that may not be in a default Red Hat Enterprise Linux installation; run the following yum command as root to install them:

yum install gcc-gfortran cairo-devel python-devel \
    tk-devel libicu-devel

Run this command on all the nodes of your cluster that will be running Revolution R Enterprise. If you have installed RevoMPM, you can distribute the command using RevoMPM's cmd command:

revompm cmd:"yum install gcc-gfortran cairo-devel python-devel tk-devel libicu-devel"
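You can confirm the prerequisites landed on a node with a simple rpm query; this is just a convenience check using the standard rpm tool:

# Each package should report an installed version rather than "is not installed"
rpm -q gcc-gfortran cairo-devel python-devel tk-devel libicu-devel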

Once you have installed the Revolution R Enterprise prerequisites, install the Cloudera Manager parcels as follows:

1. Download the Revolution R Enterprise Cloudera Manager parcels using the links provided in your welcome email. (Note that each parcel consists of two files, the parcel itself and its associated .sha file. They may be packaged as a single .tar.gz file for convenience in downloading, but that must be unpacked and the two files copied to the parcel-repo for Cloudera Manager to recognize them as a parcel.)
2. Copy the parcel files to your local parcel-repo, typically /opt/cloudera/parcel-repo. You should have the following files in your parcel repo:
   RRO-8.0.3-el6.parcel
   RRO-8.0.3-el6.parcel.sha
   RevolutionR-<version>-el6.parcel
   RevolutionR-<version>-el6.parcel.sha
   RRE-<version>-el6.parcel
   RRE-<version>-el6.parcel.sha
   Be sure all the files are owned by root and have 755 permissions (that is, read, write, and execute permission for root, and read and execute permissions for group and others).
3. In your browser, open Cloudera Manager.
4. Click Hosts in the upper navigation bar to bring up the All Hosts page.
5. Click Parcels to bring up the Parcels page.
6. Click Check for New Parcels. RRO 8.0.3, RevolutionR, and RRE should each appear with a Distribute button. After clicking Check for New Parcels, you may need to click All Clusters under the Location section on the left to see the new parcels.
7. Click the RRO Distribute button. Revolution R Open will be distributed to all the nodes of your cluster. When the distribution is complete, the Distribute button is replaced with an Activate button.
8. Click Activate. Activation prepares Revolution R Open to be used by the cluster.
9. Click the Revolution R Distribute button. The Revolution R Connector will be distributed to all the nodes of your cluster. When the distribution is complete, the Distribute button is replaced with an Activate button.
10. Click Activate. Activation prepares the Revolution R Connector to be used by the cluster.
11. Click the RRE Distribute button. Revolution R Enterprise will be distributed to all the nodes of your cluster. When the distribution is complete, the Distribute button is replaced with an Activate button.
12. Click Activate. Activation prepares Revolution R Enterprise to be used by the cluster.
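Distribution and activation are driven entirely from the Cloudera Manager web interface, but you can spot-check the result from a shell on any cluster node. This is a convenience sketch and assumes the default Cloudera Manager parcel directories:

# The downloaded .parcel and .sha files you copied in step 2
ls -l /opt/cloudera/parcel-repo

# After distribution and activation, the unpacked parcels appear here on each node
ls -l /opt/cloudera/parcels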

When you have installed the three parcels, download, install, and run the Revolution Custom Service Descriptor (CSD) as follows:

1. Download the Custom Service Descriptor from the links in your welcome email to the Cloudera CSD directory, typically /opt/cloudera/csd.
2. Stop and restart the cloudera-scm-server service using the following shell commands:
   service cloudera-scm-server stop
   service cloudera-scm-server start
3. Confirm the CSD is installed by checking the Custom Service Descriptor list in Cloudera Manager at <hostname>/cmf/csd/list, where <hostname> is the host name of your Cloudera Manager server.
4. On the Cloudera Manager home page, click the dropdown beside the cluster name and click Add a Service.
5. From the Add Service Wizard, select Revolution R and click Continue.
6. Select all hosts, and click Continue.
7. Accept the defaults through the remainder of the wizard.

Each user should ensure that the appropriate user directories exist, and if necessary, create them with the following shell commands:

hadoop fs -mkdir /user/RevoShare/$USER
hadoop fs -chmod uog+rwx /user/RevoShare/$USER
mkdir -p /var/RevoShare/$USER
chmod uog+rwx /var/RevoShare/$USER

The HDFS directory can also be created in a user's R session (provided the top-level /user/RevoShare has the appropriate permissions) using the following RevoScaleR commands (substitute your actual user name for "username"):

rxHadoopMakeDir("/user/RevoShare/username")
rxHadoopCommand("fs -chmod uog+rwx /user/RevoShare/username")

As part of this process, make sure that the base directories /user and /user/RevoShare have uog+rwx permissions as well.

4 Verifying Installation

After completing installation, do the following to verify that Revolution R Enterprise will actually run commands in Hadoop:

1. If the cluster is security-enabled, obtain a ticket using kinit (for Kerberos authentication) or maprlogin password (for MapR wire security).
2. Start Revolution R Enterprise on a cluster node by typing Revo64 at a shell prompt.

3. At the R prompt >, enter the following commands (these commands are drawn from the RevoScaleR Hadoop Getting Started Guide, which explains what all of them are doing; for now, we are just trying to see if everything works):

bigDataDirRoot <- "/share"
myHadoopCluster <- RxHadoopMR(consoleOutput=TRUE)
rxSetComputeContext(myHadoopCluster)
source <- system.file("SampleData/AirlineDemoSmall.csv", package="RevoScaleR")
inputDir <- file.path(bigDataDirRoot, "AirlineDemoSmall")
rxHadoopMakeDir(inputDir)
rxHadoopCopyFromLocal(source, inputDir)
hdfsFS <- RxHdfsFileSystem()
colInfo <- list(DayOfWeek = list(type = "factor",
    levels = c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")))
airDS <- RxTextData(file = inputDir, missingValueString = "M",
    colInfo = colInfo, fileSystem = hdfsFS)
adsSummary <- rxSummary(~ArrDelay + CRSDepTime + DayOfWeek, data = airDS)
adsSummary

If you installed Revolution R Enterprise in a non-default location, you must specify the location using both the hadoopRPath and revoPath arguments to RxHadoopMR:

myHadoopCluster <- RxHadoopMR(hadoopRPath="/path/to/Revo64", revoPath="/path/to/Revo64")

If you see output like the following, congratulations:

Call:
rxSummary(formula = ~ArrDelay + CRSDepTime + DayOfWeek, data = airDS)

Summary Statistics Results for: ~ArrDelay + CRSDepTime + DayOfWeek
Data: airDS (RxTextData Data Source)
File name: /share/AirlineDemoSmall
Number of valid observations: 6e+05

 Name        Mean  StdDev  Min  Max  ValidObs  MissingObs
 ArrDelay
 CRSDepTime

Category Counts for DayOfWeek
Number of categories: 7
Number of valid observations: 6e+05
Number of missing observations: 0

 DayOfWeek  Counts
 Monday
 Tuesday
 Wednesday
 Thursday
 Friday
 Saturday   86159
 Sunday

Next, try to run a simple rxExec job:

rxExec(list.files)

That should return a list of files in the native file system. If either the call to rxSummary or the call to rxExec results in an error, see Section 5, Troubleshooting the Installation, for a few of the more common errors and how to fix them.

5 Troubleshooting the Installation

No two Hadoop installations are exactly alike, but most are quite similar. This section brings together a number of common errors seen when attempting to run Revolution R Enterprise commands on Hadoop clusters, together with the most likely causes of such errors in our experience.

5.1 No Valid Credentials

If you see a message such as "No valid credentials provided", this means you do not have a valid Kerberos ticket. Quit Revolution R Enterprise, obtain a Kerberos ticket using kinit, and then restart Revolution R Enterprise.

5.2 Unable to Load Class RevoScaleR

If you see a message about being unable to find or load main class RevoScaleR, this means that the jar file scaleR-hadoop-0.1-SNAPSHOT.jar could not be found. This jar file must be in a location where it can be found by the getHadoopEnvVars.py script, or its location must be explicitly added to the CLASSPATH.

5.3 Classpath Errors

If you see other errors related to Java classes, these are likely related to the settings of the following environment variables:

- PATH
- CLASSPATH
- JAVA_LIBRARY_PATH

Of these, the most commonly misconfigured is the CLASSPATH.
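A quick way to narrow down classpath problems is to inspect what RevoHadoopEnvVars.site actually exported in your shell; this sketch assumes you added the sourcing line from Section 3.4 to your .bash_profile:

# Check the Hadoop-related settings in the current shell
echo $HADOOP_CMD
echo $HADOOP_STREAMING

# The RRE scripts directory (which holds the scaleR-hadoop jar) should appear here
echo $CLASSPATH | tr ':' '\n' | grep -i revo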

5.4 Unable to Load Shared Library

If you see a message about being unable to load libhdfs.so, you may need to create a symbolic link from your installed version of libhdfs.so to the system library, such as the following:

ln -s /path/to/libhdfs.so /usr/lib64/libhdfs.so

Or, update your LD_LIBRARY_PATH environment variable to include the path to the libhdfs shared object:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/libhdfs.so

(This step is normally performed automatically during the RRE install. If you continue to see errors about libhdfs.so, you may need to both create the symbolic link as above and set LD_LIBRARY_PATH.)

Similarly, if you see a message about being unable to load libjvm.so, you may need to create a symbolic link from your installed version of libjvm.so to the system library, such as the following:

ln -s /path/to/libjvm.so /usr/lib64/libjvm.so

Or, update your LD_LIBRARY_PATH environment variable to include the path to the libjvm shared object:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/libjvm.so

6 Getting Started with Hadoop

To get started with Revolution R Enterprise on Hadoop, we recommend the RevoScaleR 7 Hadoop Getting Started Guide PDF, which provides a tutorial introduction to using RevoScaleR with Hadoop.

7 Using HDFS Caching

HDFS caching, more formally known as centralized cache management in HDFS, can greatly improve the performance of your Hadoop jobs by keeping frequently used data in memory. You enable HDFS caching on a path-by-path basis, first by creating a pool of cached paths and then adding paths to the pool. The HDFS command cacheadmin is used to perform these tasks. This command should be run by the hdfs user (the mapr user on MapR installations). The cacheadmin command has many subcommands; the Apache Software Foundation provides complete documentation. To get started, the -addPool and -addDirective subcommands will suffice. For example, to specify HDFS caching for our /share/AirlineDemoSmall directory, we first create a pool as follows:

hdfs cacheadmin -addPool rrepool

You can then add the path /share/AirlineDemoSmall to the pool with an -addDirective command as follows:

hdfs cacheadmin -addDirective -path /share/AirlineDemoSmall -pool rrepool

8 Creating an R Package Parcel for Cloudera Manager

If you are using Cloudera Manager to manage your Cloudera Hadoop cluster, you can use the Revolution R Enterprise Parcel Generator to create a Cloudera Manager parcel containing additional R packages, and use the resulting parcel to distribute those packages across all the nodes of your cluster.

The Revolution R Enterprise Parcel Generator is a Python script that takes a library of R packages and creates a Cloudera Manager parcel that excludes any base or recommended packages, or packages included with the standard Revolution R Enterprise distribution. Make sure to consider any dependencies your packages might have, and be sure to include those in your library.

If you installed Revolution R Enterprise with Cloudera Manager parcels, you will find the Parcel Generator in the Revo.home()/scripts directory. (You may need to ensure that the script has execute permission using the chmod command, or you can call it as python generate_r_parcel.py.) When you call the script, you must provide a name and a version number for the resulting parcel, together with the path to the library you would like to package. When choosing a name for your parcel, be sure to pick a name that is unique in your parcel repository (typically /opt/cloudera/parcel-repo).

For example, to package the library /home/RevoUser/R/library, you might call the script as follows:

generate_r_parcel.py -p "RevoUserPkgs" -v "0.1" -l /home/RevoUser/R/library

By default, the path to the library you package should be the same as the path to the library on the Hadoop cluster. You can specify a different destination using the -d flag:

generate_r_parcel.py -p "RevoUserPkgs" -v "0.1" \
    -l /home/RevoUser/R/library -d /var/RevoShare/RevoUser/library

To distribute and activate your parcel, perform the following steps:

1. Copy or move your .parcel and .sha files to the parcel repository on your Cloudera cluster (typically /opt/cloudera/parcel-repo).
2. Ensure that the .parcel and .sha files are owned by root and have 755 permissions (that is, read, write, and execute permission for root, and read and execute permissions for group and others).
3. In your browser, open Cloudera Manager.
4. Click Hosts in the upper navigation bar to bring up the All Hosts page.
5. Click Parcels to bring up the Parcels page.
6. Click Check for New Parcels. Your new parcel should appear with a Distribute button. After clicking Check for New Parcels, you may need to click All Clusters under the Location section on the left to see the new parcel.

7. Click the Distribute button for your parcel. The parcel will be distributed to all the nodes of your cluster. When the distribution is complete, the Distribute button is replaced with an Activate button.
8. Click Activate. Activation prepares your parcel to be used by the cluster.

After your parcel is distributed and activated, your R packages should be present in the libraries on each node and can be loaded into your next R session.
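As a final check, you can confirm from a shell on any node that the activated parcel's packages are visible to Revolution R Enterprise. This is only a sketch: it assumes Revo64 passes standard R options through to R, and "yourPackage" is a placeholder for one of the packages you actually included in the parcel:

# List the library paths R will search, then try loading one of your packages
Revo64 -q -e '.libPaths(); library("yourPackage")'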


More information

OpenGeo Suite for Linux Release 3.0

OpenGeo Suite for Linux Release 3.0 OpenGeo Suite for Linux Release 3.0 OpenGeo October 02, 2012 Contents 1 Installing OpenGeo Suite on Ubuntu i 1.1 Installing OpenGeo Suite Enterprise Edition............................... ii 1.2 Upgrading.................................................

More information

Hadoop Installation. Sandeep Prasad

Hadoop Installation. Sandeep Prasad Hadoop Installation Sandeep Prasad 1 Introduction Hadoop is a system to manage large quantity of data. For this report hadoop- 1.0.3 (Released, May 2012) is used and tested on Ubuntu-12.04. The system

More information

Kaspersky Endpoint Security 8 for Linux INSTALLATION GUIDE

Kaspersky Endpoint Security 8 for Linux INSTALLATION GUIDE Kaspersky Endpoint Security 8 for Linux INSTALLATION GUIDE A P P L I C A T I O N V E R S I O N : 8. 0 Dear User! Thank you for choosing our product. We hope that this documentation will help you in your

More information

Metalogix SharePoint Backup. Advanced Installation Guide. Publication Date: August 24, 2015

Metalogix SharePoint Backup. Advanced Installation Guide. Publication Date: August 24, 2015 Metalogix SharePoint Backup Publication Date: August 24, 2015 All Rights Reserved. This software is protected by copyright law and international treaties. Unauthorized reproduction or distribution of this

More information

Control-M for Hadoop. Technical Bulletin. www.bmc.com

Control-M for Hadoop. Technical Bulletin. www.bmc.com Technical Bulletin Control-M for Hadoop Version 8.0.00 September 30, 2014 Tracking number: PACBD.8.0.00.004 BMC Software is announcing that Control-M for Hadoop now supports the following: Secured Hadoop

More information

October 2011. Gluster Virtual Storage Appliance - 3.2 User Guide

October 2011. Gluster Virtual Storage Appliance - 3.2 User Guide October 2011 Gluster Virtual Storage Appliance - 3.2 User Guide Table of Contents 1. About the Guide... 4 1.1. Disclaimer... 4 1.2. Audience for this Guide... 4 1.3. User Prerequisites... 4 1.4. Documentation

More information

Deploying Hadoop with Manager

Deploying Hadoop with Manager Deploying Hadoop with Manager SUSE Big Data Made Easier Peter Linnell / Sales Engineer plinnell@suse.com Alejandro Bonilla / Sales Engineer abonilla@suse.com 2 Hadoop Core Components 3 Typical Hadoop Distribution

More information

Revolution R Enterprise DeployR 7.1 Enterprise Security Guide. Authentication, Authorization, and Access Controls

Revolution R Enterprise DeployR 7.1 Enterprise Security Guide. Authentication, Authorization, and Access Controls Revolution R Enterprise DeployR 7.1 Enterprise Security Guide Authentication, Authorization, and Access Controls The correct bibliographic citation for this manual is as follows: Revolution Analytics,

More information

Virtual Web Appliance Setup Guide

Virtual Web Appliance Setup Guide Virtual Web Appliance Setup Guide 2 Sophos Installing a Virtual Appliance Installing a Virtual Appliance This guide describes the procedures for installing a Virtual Web Appliance. If you are installing

More information

Partek Flow Installation Guide

Partek Flow Installation Guide Partek Flow Installation Guide Partek Flow is a web based application for genomic data analysis and visualization, which can be installed on a desktop computer, compute cluster or cloud. Users can access

More information

Source Code Management for Continuous Integration and Deployment. Version 1.0 DO NOT DISTRIBUTE

Source Code Management for Continuous Integration and Deployment. Version 1.0 DO NOT DISTRIBUTE Source Code Management for Continuous Integration and Deployment Version 1.0 Copyright 2013, 2014 Amazon Web Services, Inc. and its affiliates. All rights reserved. This work may not be reproduced or redistributed,

More information

docs.hortonworks.com

docs.hortonworks.com docs.hortonworks.com Hortonworks Data Platform: Administering Ambari Copyright 2012-2015 Hortonworks, Inc. Some rights reserved. The Hortonworks Data Platform, powered by Apache Hadoop, is a massively

More information

Architecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7

Architecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7 Architecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7 Yan Fisher Senior Principal Product Marketing Manager, Red Hat Rohit Bakhshi Product Manager,

More information

EMC Documentum Connector for Microsoft SharePoint

EMC Documentum Connector for Microsoft SharePoint EMC Documentum Connector for Microsoft SharePoint Version 7.1 Installation Guide EMC Corporation Corporate Headquarters Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com Legal Notice Copyright 2013-2014

More information

Intel IoT Gateway Software Development Kit SK100

Intel IoT Gateway Software Development Kit SK100 Intel IoT Gateway Software Development Kit SK100 Order No.: 331568-001 By using this document, in addition to any agreements you have with Intel, you accept the terms set forth below. You may not use or

More information

HSearch Installation

HSearch Installation To configure HSearch you need to install Hadoop, Hbase, Zookeeper, HSearch and Tomcat. 1. Add the machines ip address in the /etc/hosts to access all the servers using name as shown below. 2. Allow all

More information

HDFS Users Guide. Table of contents

HDFS Users Guide. Table of contents Table of contents 1 Purpose...2 2 Overview...2 3 Prerequisites...3 4 Web Interface...3 5 Shell Commands... 3 5.1 DFSAdmin Command...4 6 Secondary NameNode...4 7 Checkpoint Node...5 8 Backup Node...6 9

More information

INSTALL AND CONFIGURATION GUIDE. Atlas 5.1 for Microsoft Dynamics AX

INSTALL AND CONFIGURATION GUIDE. Atlas 5.1 for Microsoft Dynamics AX INSTALL AND CONFIGURATION GUIDE Atlas 5.1 for Microsoft Dynamics AX COPYRIGHT NOTICE Copyright 2012, Globe Software Pty Ltd, All rights reserved. Trademarks Dynamics AX, IntelliMorph, and X++ have been

More information

BrightStor ARCserve Backup for Linux

BrightStor ARCserve Backup for Linux BrightStor ARCserve Backup for Linux Agent for MySQL Guide r11.5 D01213-2E This documentation and related computer software program (hereinafter referred to as the "Documentation") is for the end user's

More information

Actian Vortex Express 3.0

Actian Vortex Express 3.0 Actian Vortex Express 3.0 Quick Start Guide AH-3-QS-09 This Documentation is for the end user's informational purposes only and may be subject to change or withdrawal by Actian Corporation ("Actian") at

More information

Cloudera Backup and Disaster Recovery

Cloudera Backup and Disaster Recovery Cloudera Backup and Disaster Recovery Important Notice (c) 2010-2013 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service names or slogans

More information

insync Installation Guide

insync Installation Guide insync Installation Guide 5.2 Private Cloud Druva Software June 21, 13 Copyright 2007-2013 Druva Inc. All Rights Reserved. Table of Contents Deploying insync Private Cloud... 4 Installing insync Private

More information

Dell SupportAssist Version 2.0 for Dell OpenManage Essentials Quick Start Guide

Dell SupportAssist Version 2.0 for Dell OpenManage Essentials Quick Start Guide Dell SupportAssist Version 2.0 for Dell OpenManage Essentials Quick Start Guide Notes, Cautions, and Warnings NOTE: A NOTE indicates important information that helps you make better use of your computer.

More information

Cloudera Distributed Hadoop (CDH) Installation and Configuration on Virtual Box

Cloudera Distributed Hadoop (CDH) Installation and Configuration on Virtual Box Cloudera Distributed Hadoop (CDH) Installation and Configuration on Virtual Box By Kavya Mugadur W1014808 1 Table of contents 1.What is CDH? 2. Hadoop Basics 3. Ways to install CDH 4. Installation and

More information