Revolution R Enterprise 7 Hadoop Configuration Guide


The correct bibliographic citation for this manual is as follows: Revolution Analytics, Inc. 2014. Revolution R Enterprise 7 Hadoop Configuration Guide. Revolution Analytics, Inc., Mountain View, CA.

Revolution R Enterprise 7 Hadoop Configuration Guide

Copyright 2014 Revolution Analytics, Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of Revolution Analytics.

U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the Government is subject to restrictions as set forth in subdivision (c)(1)(ii) of The Rights in Technical Data and Computer Software clause.

Revolution R, Revolution R Enterprise, RPE, RevoScaleR, DeployR, RevoTreeView, and Revolution Analytics are trademarks of Revolution Analytics. Other product names mentioned herein are used for identification purposes only and may be trademarks of their respective owners.

Revolution Analytics, 2570 West El Camino Real, Suite 222, Mountain View, CA, U.S.A.

We want our documentation to be useful, and we want it to address your needs. If you have comments on this or any Revolution document, write to doc@revolutionanalytics.com.

Table of Contents

1 Introduction
  1.1 System Requirements
  1.2 Basic Hadoop Terminology
  1.3 Verifying the Hadoop Installation
  1.4 Adjusting Hadoop Memory Limits (Hadoop 2.x Systems Only)
2 Hadoop Security with Kerberos Authentication
3 Installing Revolution R Enterprise on a Cluster
  3.1 Standard Command Line Install
  3.2 Distributed Installation with RevoMPM
  3.3 Installing the Revolution R Enterprise JAR File
  3.4 Setting Environment Variables for Hadoop
  3.5 Creating Directories for Revolution R Enterprise
  3.6 Installing on a Cloudera Manager System Using a Cloudera Manager Parcel
4 Verifying Installation
5 Troubleshooting the Installation
  5.1 No Valid Credentials
  5.2 Unable to Load Class RevoScaleR
  5.3 Classpath Errors
  5.4 Unable to Load Shared Library
6 Getting Started with Hadoop


1 Introduction

Revolution R Enterprise is the scalable data analytics solution, designed to work seamlessly whether your computing environment is a single-user workstation, a local network of connected servers, or a cluster in the cloud. This manual is intended for those who need to configure a Hadoop cluster for use with Revolution R Enterprise. It assumes that you have download instructions for Revolution R Enterprise and the related files; if you do not have those instructions, please contact Revolution Analytics Technical Support for assistance.

1.1 System Requirements

Revolution R Enterprise works with the following Hadoop distributions:

- Cloudera CDH4 and CDH5
- HortonWorks HDP 1.3.0 and HDP 2.x (including HDP 2.0.0)
- MapR 3.x (including MapR 3.0.2 and MapR 3.1.0)

Your cluster installation must include the C APIs contained in the libhdfs package; these are required by Revolution R Enterprise. See your Hadoop documentation for information on installing this package.

The Hadoop distribution must be installed on Red Hat Enterprise Linux 5 or 6, or a fully compatible operating system. Revolution R Enterprise should be installed on all nodes of the cluster.

Revolution R Enterprise requires Hadoop MapReduce and the Hadoop Distributed File System (HDFS) for CDH4, HDP 1.3.0, and MapR 3.x installations, or HDFS, Hadoop YARN, and Hadoop MapReduce2 for CDH5 and HDP 2.x installations. The HDFS, YARN, and MapReduce clients must be installed on all nodes on which you plan to run Revolution R Enterprise, as must Revolution R Enterprise itself.

Minimum system configuration requirements for Revolution R Enterprise are as follows:

- Processor: a 64-bit CPU with x86-compatible architecture (variously known as AMD64, Intel64, x86-64, IA-32e, EM64T, or x64). Itanium-architecture CPUs (also known as IA-64) are not supported. Multiple-core CPUs are recommended.
- Operating System: Red Hat Enterprise Linux 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6.0, 6.1, 6.2, or 6.3. Only 64-bit operating systems are supported.
- Memory: a minimum of 4GB of RAM is required; 8GB or more are recommended.
- Disk Space: a minimum of 500MB of free disk space is required on each node.
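
As a quick sanity check before installing, you can confirm the operating system release, CPU architecture, available memory, and the presence of the libhdfs C library on a node. This is only a sketch; the library location varies by distribution, and the paths shown are typical rather than definitive.

    cat /etc/redhat-release                                      # operating system release
    uname -m                                                     # CPU architecture; expect x86_64
    free -g                                                      # installed RAM, in GB
    ls /usr/lib64/libhdfs.so* /usr/lib/libhdfs.so* 2>/dev/null   # libhdfs C library, if in a standard location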

1.2 Basic Hadoop Terminology

The following terms apply to computers and services within the Hadoop cluster, and define the roles of hosts within the cluster:

Hadoop 1.x Installations (CDH4, HDP 1.3.0, MapR 3.x)

- JobTracker: The Hadoop service that distributes MapReduce tasks to specific nodes in the cluster. The JobTracker queries the NameNode to find the location of the data needed for the tasks, then distributes the tasks to TaskTracker nodes near (or coextensive with) the data. In small clusters, the JobTracker may run on the NameNode, but this is not recommended for production use.
- NameNode: A host in the cluster that is the master node of the HDFS file system, managing the directory tree of all files in the file system. In small clusters, the NameNode may host the JobTracker, but this is not recommended for production use.
- TaskTracker: Any host that can accept tasks (Map, Reduce, and Shuffle operations) from a JobTracker. TaskTrackers are usually, but not always, also DataNodes, so that tasks assigned to a TaskTracker can work on data stored on the same node.
- DataNode: A host that stores data in the Hadoop Distributed File System. DataNodes connect to the NameNode and respond to requests from the NameNode for file system operations.

Hadoop 2.x Installations (CDH5, HDP 2.x)

- Resource Manager: The Hadoop service that distributes MapReduce and other Hadoop tasks to specific nodes in the cluster. The Resource Manager takes over the scheduling functions of the old JobTracker, determining which nodes are appropriate for the current job.
- NameNode: A host in the cluster that is the master node of the HDFS file system, managing the directory tree of all files in the file system.
- Application Master: New in MapReduce2/YARN, the Application Master takes over task progress coordination from the old JobTracker, working with the node managers on the individual task nodes. The Application Master negotiates with the Resource Manager for cluster resources, which are allocated as a set of containers, with each container running an application-specific task on a particular node.
- NodeManager: Node managers manage the containers allocated for a given task on a given node, coordinating with the Resource Manager and the Application Masters. NodeManagers are usually, but not always, also DataNodes, and most frequently the containers on a given node work with data stored on the same node.
- DataNode: A host that stores data in the Hadoop Distributed File System. DataNodes connect to the NameNode and respond to requests from the NameNode for file system operations.
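
To see which of these roles a particular host is playing on a CDH or HDP node, one simple check (assuming a JDK is installed on the host) is to list the running Hadoop daemons with jps; the daemon names correspond to the roles described above.

    sudo jps     # look for NameNode, DataNode, JobTracker, TaskTracker (Hadoop 1.x)
                 # or NameNode, DataNode, ResourceManager, NodeManager (Hadoop 2.x)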

1.3 Verifying the Hadoop Installation

We assume you have already installed Hadoop on your cluster. If not, use the documentation provided with your Hadoop distribution to help you perform the installation; Hadoop installation is complicated and involves many steps. Following the documentation carefully does not guarantee success, but it does make troubleshooting easier. In our testing, we have found the following documents helpful:

- Cloudera CDH4, package install
- Cloudera CDH4, Cloudera Manager parcel install
- Cloudera CDH5, package install
- Cloudera CDH5, Cloudera Manager parcel install
- Hortonworks HDP 1.3
- Hortonworks HDP 2.1
- Hortonworks HDP 1.x or 2.x, Ambari install
- MapR 3.1 (M5 Edition)

If you are using Cloudera Manager, it is important to know whether your installation was via packages or parcels; the Revolution R Enterprise Cloudera Manager parcel can be used only with parcel installs. If you have installed Cloudera Manager via packages, do not attempt to use the RRE Cloudera Manager parcel; use the standard Revolution R Enterprise for Linux installer instead.

It is useful to confirm that Hadoop itself is running correctly before attempting to install Revolution R Enterprise on the cluster. Hadoop comes with example programs, packaged in the jar file hadoop-mapreduce-examples.jar, that you can run to verify that your Hadoop installation is working properly. The following command should display a list of the available examples:

    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar

(On MapR, the quick installation installs the Hadoop files to /opt/mapr by default; the path to the examples jar file is /opt/mapr/hadoop/hadoop-<version>/hadoop-<version>-dev-examples.jar. Similarly, on Cloudera Manager parcel installs, the default path to the examples is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar.)

The following runs the pi example, which uses Monte Carlo sampling to estimate pi; the 5 tells Hadoop to use 5 mappers, and the 300 says to use 300 samples per map:

    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 5 300

If you can successfully run one or more of the Hadoop examples, your Hadoop installation was successful and you are ready to install Revolution R Enterprise.
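
If you are not sure where your distribution has placed the examples jar, one generic way to locate it (a sketch, not specific to any one distribution) is to search the usual installation directories and then pass the reported path to hadoop jar:

    find /usr/lib /opt -name "hadoop*examples*.jar" 2>/dev/null
    hadoop jar /path/reported/by/find/hadoop-mapreduce-examples.jar pi 5 300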

1.4 Adjusting Hadoop Memory Limits (Hadoop 2.x Systems Only)

On Hadoop 2.x systems only (CDH5 and HDP 2.x), we have found that the default settings for Map and Reduce memory limits are inadequate for large RevoScaleR jobs. We need to modify four properties in mapred-site.xml and one in yarn-site.xml, as follows:

    (in mapred-site.xml)
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx1229m</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx1229m</value>
    </property>

    (in yarn-site.xml)
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>3198</value>
    </property>

If you are using a cluster manager such as Cloudera Manager or Ambari, these settings must usually be modified using the manager's Web interface.

2 Hadoop Security with Kerberos Authentication

By default, most Hadoop configurations are relatively insecure. Security features such as SELinux and IPtables firewalls are often turned off to help get the Hadoop cluster up and running quickly. However, the Cloudera and Hortonworks distributions of Hadoop support Kerberos authentication, which allows Hadoop to operate in a much more secure manner. To use Kerberos authentication with your particular version of Hadoop, see one of the following documents:

- Cloudera CDH4
- Cloudera CDH4 with Cloudera Manager 4
- Cloudera CDH5
- Cloudera CDH5 with Cloudera Manager 5
- Hortonworks HDP 1.3
- Hortonworks HDP 2.x
- Hortonworks HDP (1.3 or 2.x) with Ambari

If you have trouble restarting your Hadoop cluster after enabling Kerberos authentication, the problem is most likely with your keytab files. Be sure you have created all the required Kerberos principals and generated appropriate keytab entries for all of your nodes, and that the keytab files have been placed correctly with the appropriate permissions. (We have found that in Hortonworks clusters managed with Ambari, it is important that the spnego.service.keytab file be present on all the nodes of the cluster, not just the name node and secondary namenode.)

The MapR distribution also supports Kerberos authentication, but most MapR installations use that distribution's wire-level security feature. See the MapR Security Guide for details.
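
If Kerberos-enabled services fail to start, it can also help to confirm that each keytab file actually contains the principals you expect and can be used to authenticate. The keytab path and principal below are examples only (the path shown is a common Ambari-managed HDP location); substitute the paths and principals used on your cluster.

    klist -kt /etc/security/keytabs/spnego.service.keytab                        # list the principals stored in the keytab
    kinit -kt /etc/security/keytabs/spnego.service.keytab HTTP/$(hostname -f)    # test authentication with the keytab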

3 Installing Revolution R Enterprise on a Cluster

It is highly recommended that you install Revolution R Enterprise as root on each node of your Hadoop cluster. This ensures that all users will have access to it by default. Non-root installs are supported, but require that the path to the R executable files be added to each user's path.

3.1 Standard Command Line Install

For most users, installing on the cluster means simply running the standard Revolution R Enterprise installer on each node of the cluster. This can be done most quickly by doing the following on each node:

1. Copy the installer Revo-Ent RHELn.tar.gz to the node (where n is either 5 or 6, depending on your cluster's operating system).

2. Unpack the installer by issuing the command:

       tar zxf Revo-Ent RHELn.tar.gz

3. Change directory to the RevolutionR_7.2.0 directory created, and issue the following command:

       ./install.py -n -d -a

   This installs Revolution R Enterprise with the standard options.

3.2 Distributed Installation with RevoMPM

If your Hadoop cluster is configured to allow passwordless ssh access among the various nodes, you can use the Revolution Multi-Node Package Manager (RevoMPM) to deploy Revolution R Enterprise across your cluster.

On any one node of the cluster, create a directory for the installer, such as /var/tmp/revo-install, and download the following files to that directory (you can find the links in your welcome e-mail):

- install_mpm.py
- RevoMPM x86_64.rpm

In that same directory, create an empty file named hosts.cfg and a subdirectory named packages, and download the Revolution R Enterprise installer Revo-Ent RHEL*.tar.gz to that packages subdirectory.

To ensure ready access to your nodes via RevoMPM, edit the file hosts.cfg to list the nodes in your cluster. For example (the host names below are placeholders; substitute the host names or IP addresses of your own nodes):

    [groups]
    nodes = node1.example.com node2.example.com node3.example.com node4.example.com node5.example.com

Change directory to the /var/tmp/revo-install directory, and issue the following command:

    python install_mpm.py

This launches a script that prompts you for the location of your Revolution R Enterprise installer (accept the default), and then either for a hosts.cfg file (accept the default if you have edited it as described above) or to manually specify groups. In the latter case, you will be prompted for a group name (this is just a convenient way of referring to your cluster) and then the names of the hosts in the group (the nodes you want to install to). You can define multiple groups (you can do this in the hosts.cfg file as well). You will also be asked which version of Revolution R Enterprise you want to install. Answer the prompts and RevoMPM will install Revolution R Enterprise on all the requested nodes.

If you are not running as root, you must specify a Revolution installation directory when running install_mpm.py. The directory must be writable by the user running install_mpm.py:

    python install_mpm.py --nonroot /home/ec2-user/revolution

For complete instructions on installing and running RevoMPM, see the RevoMPM User's Guide.

3.3 Installing the Revolution R Enterprise JAR File

Using Revolution R Enterprise in Hadoop requires the presence of the Revolution R Enterprise Java Archive (JAR) file scaleR-hadoop-0.1-SNAPSHOT.jar. This file can be found in the RevolutionR_7.2.0 directory created when the installer tarball is unpacked; it should be installed on each node of your Hadoop cluster in the standard location for Hadoop JAR files, typically /usr/lib/hadoop/lib.

If you are using RevoMPM, you can install the JAR file on all the nodes of your group with the following command (enter the command on a single line; note that revo-install contains no space):

    revompm cmd:'sudo cp /var/tmp/revo-install/RevolutionR_7.2.0/scaleR-hadoop-0.1-SNAPSHOT.jar /usr/lib/hadoop/lib'

Ensure that the file has execute permissions by executing the following command:

    revompm cmd:'sudo chmod a+x /usr/lib/hadoop/lib/scaleR-hadoop-0.1-SNAPSHOT.jar'
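
If you are not using RevoMPM, you can copy the JAR file to each node over ssh instead. The loop below is only a sketch: the host names are placeholders, and it assumes passwordless ssh access and sudo rights on each node.

    for host in node1.example.com node2.example.com node3.example.com; do
        # stage the JAR on the node, then move it into place and make it executable
        scp /var/tmp/revo-install/RevolutionR_7.2.0/scaleR-hadoop-0.1-SNAPSHOT.jar $host:/tmp/
        ssh $host 'sudo cp /tmp/scaleR-hadoop-0.1-SNAPSHOT.jar /usr/lib/hadoop/lib/ && sudo chmod a+x /usr/lib/hadoop/lib/scaleR-hadoop-0.1-SNAPSHOT.jar'
    done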

3.4 Setting Environment Variables for Hadoop

The following steps must be performed on each host that will be involved in HDFS-based computations.

1. Install the gethadoopclasspath.py file, available in the RevolutionR_7.2.0 directory, on each host upon which Revolution R Enterprise will be run by logging in as root and downloading the file to root's home directory. Then execute the following commands:

       cd
       mkdir -p /usr/local/sbin/
       chmod uog+rx /usr/local/sbin/
       cp gethadoopclasspath.py /usr/local/sbin
       chmod uog+rx /usr/local/sbin/gethadoopclasspath.py

2. As the root user, ensure that a Revolution directory under /etc/profile.d exists and has the correct permissions:

       mkdir -p /etc/profile.d/revolution
       chmod uog+rx /etc/profile.d/revolution

3. Place the file bash_profile_additions (again, available in the RevolutionR_7.2.0 directory) into the newly created directory:

       cp bash_profile_additions /etc/profile.d/revolution
       chmod uog+rx /etc/profile.d/revolution/bash_profile_additions

4. Edit the following configuration file:

       /etc/profile.d/revolution/bash_profile_additions

   Uncomment the lines required for your configuration, as described in the file's inline comments; in particular, make sure that the line that runs gethadoopclasspath.py is uncommented. Modify any host-specific paths as directed by the inline comments.

5. Place the file rhadoop.sh (again, available in the RevolutionR_7.2.0 directory) into the directory /etc/profile.d. Ensure the file is world-readable.

6. Edit the file /etc/profile.d/rhadoop.sh as appropriate for your Hadoop distribution.

7. Each user who will be using Revolution R on the host must add the following line to his or her $HOME/.bash_profile file:

       . /etc/profile.d/revolution/bash_profile_additions

   (Note the dot at the beginning of the line; it is part of the command to source the file.)

8. Each user should log out and back in to pick up the environment changes.

The following environment variables should be set when you are done (the easiest way to ensure that this is done consistently is to edit and uncomment the corresponding lines in the bash_profile_additions file):

- HADOOP_HOME: This should be set to the directory containing the Hadoop files.
- HADOOP_VERSION: This should be set to the current Hadoop version, such as cdh3u3.
- HADOOP_CMD: This should be set to the command used to invoke Hadoop.
- PATH: This should be updated to include your Hadoop command and your Java executables.
- CLASSPATH: This should be a fully expanded CLASSPATH with access to all required Hadoop JAR files.
- JAVA_LIBRARY_PATH: If necessary, this should be set as described in the bash_profile_additions file.

The one environment variable that should NOT be set in bash_profile_additions is the following:

- HADOOP_STREAMING: This should be set (in /etc/profile.d/rhadoop.sh) to the path of the Hadoop streaming jar file.

3.5 Creating Directories for Revolution R Enterprise

The /var/revoshare directory should be created for use by Revolution R Enterprise and its users on the Hadoop cluster's native file system. The /user/revoshare directory should be created on the Hadoop Distributed File System. Both should have read, write, and execute permissions for all authorized users, and each user should have a personal user directory beneath each top-level directory. The /tmp and /share directories should also exist on HDFS with read, write, and execute permissions.

The following commands, run from the shell prompt as the hdfs user (as the mapr user on MapR systems), should create the necessary HDFS directories:

    hadoop fs -mkdir /tmp
    hadoop fs -chmod uog+rwx /tmp
    hadoop fs -mkdir /share
    hadoop fs -chmod uog+rwx /share
    hadoop fs -mkdir /user
    hadoop fs -chmod uog+rwx /user
    hadoop fs -mkdir /user/revoshare/
    hadoop fs -chmod uog+rwx /user/revoshare/
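
To confirm that the HDFS directories now exist with the intended permissions, you can simply list them; the exact output format varies slightly between Hadoop versions.

    hadoop fs -ls /          # /tmp, /share, and /user should appear with rwx permissions for user, group, and other
    hadoop fs -ls /user      # /user/revoshare should appear here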

The root user should then create the following directory in the native file system:

    sudo mkdir -p /var/revoshare
    chmod uog+rwx /var/revoshare

Each user should then ensure that the appropriate user directories exist, and if necessary, create them with the following commands:

    hadoop fs -mkdir /user/revoshare/$USER
    hadoop fs -chmod uog+rwx /user/revoshare/$USER
    mkdir -p /var/revoshare/$USER
    chmod uog+rwx /var/revoshare/$USER

The HDFS directory can also be created in a user's R session (provided the top-level /user/revoshare has the appropriate permissions) using the following RevoScaleR commands (substitute your actual user name for "username"):

    rxHadoopMakeDir("/user/revoshare/username")
    rxHadoopCommand("fs -chmod uog+rwx /user/revoshare/username")

3.6 Installing on a Cloudera Manager System Using a Cloudera Manager Parcel

If you are running a Cloudera Hadoop cluster managed by Cloudera Manager, and if Cloudera itself was installed via a Cloudera Manager parcel, you can use the Revolution R Enterprise Cloudera Manager parcel to install Revolution R Enterprise on all the nodes of your cluster.

Revolution R Enterprise requires several packages that may not be part of a default Red Hat Enterprise Linux installation; run the following yum command as root to install them:

    yum install gcc-gfortran cairo-devel python-devel \
        tk-devel libicu

Once you have installed the Revolution R Enterprise prerequisites, install the Cloudera Manager parcel as follows:

1. Download the Revolution R Enterprise Cloudera Manager parcel using the links provided in your welcome e-mail. (Note that the parcel consists of two files, the parcel itself and its associated .sha file.)
2. Copy the parcel files to your local parcel-repo directory, typically /opt/cloudera/parcel-repo.
3. In your browser, open Cloudera Manager.
4. Click Hosts in the upper navigation bar to bring up the All Hosts page.
5. Click Parcels to bring up the Parcels page.
6. Click Check for New Parcels. RRE should appear with a Distribute button.
7. Click the RRE Distribute button. Revolution R Enterprise will be distributed to all the nodes of your cluster. When the distribution is complete, the Distribute button is replaced with an Activate button.
8. Click Activate. Activation prepares Revolution R Enterprise to be used by the cluster after a restart.
9. Run the following script as root on the node running Cloudera Manager:

       /opt/cloudera/parcels/RRE/scripts/rre_hdfs_run_once.sh

   If you get a "permission denied" error when running the script, make sure that the file is executable; if not, use the chmod command to make it so:

       cd /opt/cloudera/parcels/RRE/scripts
       chmod +x rre_hdfs_run_once.sh

10. Have any users who will be using any of your managed nodes to run Revolution R Enterprise add the following line to their .bash_profile files (the period at the beginning represents the bash source command):

        . /etc/profile.d/revolution/bash_profile_additions

4 Verifying Installation

After completing installation, do the following to verify that Revolution R Enterprise will actually run commands in Hadoop:

1. If the cluster is security-enabled, obtain a ticket using kinit (for Kerberos authentication) or maprlogin password (for MapR wire security).
2. Start Revolution R Enterprise on a cluster node by typing Revo64 at a shell prompt.
3. At the R prompt (>), enter the following commands. (These commands are drawn from the RevoScaleR Hadoop Getting Started Guide, which explains what each of them does; for now, we are just trying to see if everything works.)

       bigDataDirRoot <- "/share"
       myHadoopCluster <- RxHadoopMR()
       rxSetComputeContext(myHadoopCluster)
       source <- system.file("SampleData/AirlineDemoSmall.csv", package="RevoScaleR")
       inputDir <- file.path(bigDataDirRoot, "AirlineDemoSmall")
       rxHadoopMakeDir(inputDir)
       rxHadoopCopyFromLocal(source, inputDir)
       hdfsFS <- RxHdfsFileSystem()
       colInfo <- list(DayOfWeek = list(type = "factor",
           levels = c("Monday", "Tuesday", "Wednesday", "Thursday",
                      "Friday", "Saturday", "Sunday")))
       airDS <- RxTextData(file = inputDir, missingValueString = "M",
                           colInfo = colInfo, fileSystem = hdfsFS)
       adsSummary <- rxSummary(~ArrDelay + CRSDepTime + DayOfWeek, data = airDS)
       adsSummary

If you installed Revolution R Enterprise in a non-default location, you must specify the location using both the hadoopRPath and revoPath arguments to RxHadoopMR:

    myHadoopCluster <- RxHadoopMR(hadoopRPath="/path/to/Revo64", revoPath="/path/to/Revo64")

If you see results like the following (numeric values omitted here; yours will be filled in), congratulations:

    Call:
    rxSummary(formula = ~ArrDelay + CRSDepTime + DayOfWeek, data = airDS)

    Summary Statistics Results for: ~ArrDelay + CRSDepTime + DayOfWeek
    Data: airDS (RxTextData Data Source)
    File name: /share/AirlineDemoSmall
    Number of valid observations: 6e+05

     Name       Mean  StdDev  Min  Max  ValidObs  MissingObs
     ArrDelay
     CRSDepTime

    Category Counts for DayOfWeek
    Number of categories: 7
    Number of valid observations: 6e+05
    Number of missing observations: 0

     DayOfWeek  Counts
     Monday
     Tuesday
     Wednesday
     Thursday
     Friday
     Saturday
     Sunday

Next, try running a simple rxExec job:

    rxExec(list.files)

That should return a list of files in the native file system. If either the call to rxSummary or the call to rxExec results in an error, see section 5, Troubleshooting the Installation, for a few of the more common errors and how to fix them.

5 Troubleshooting the Installation

No two Hadoop installations are exactly alike, but most are quite similar. This section brings together a number of common errors seen when attempting to run Revolution R Enterprise commands on Hadoop clusters, along with the most likely causes of such errors in our experience.

5.1 No Valid Credentials

If you see a message such as "No valid credentials provided", you do not have a valid Kerberos ticket. Quit Revolution R Enterprise, obtain a Kerberos ticket using kinit, and then restart Revolution R Enterprise.
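
For example, assuming your Kerberos administrator has created a principal for your user account (the user name and realm below are placeholders):

    kinit username@EXAMPLE.COM    # prompts for your Kerberos password and obtains a ticket
    klist                         # confirms that a valid ticket is now in your credentials cache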

5.2 Unable to Load Class RevoScaleR

If you see a message about being unable to find or load the main class RevoScaleR, the jar file scaleR-hadoop-0.1-SNAPSHOT.jar could not be found. This jar file must be in a location where it can be found by the gethadoopclasspath.py script, or its location must be explicitly added to the CLASSPATH.

5.3 Classpath Errors

If you see other errors related to Java classes, these are likely related to the settings of the following environment variables:

- PATH
- CLASSPATH
- JAVA_LIBRARY_PATH

Of these, the most commonly misconfigured is the CLASSPATH. Ensure that the script gethadoopclasspath.py has execute permission and is actually executed when the bash_profile_additions script is sourced.

5.4 Unable to Load Shared Library

If you see a message about being unable to load libhdfs.so, you may need to create a symbolic link from your installed version of libhdfs.so to the system library location, such as the following:

    ln -s /path/to/libhdfs.so /usr/lib64/libhdfs.so

Or, update your LD_LIBRARY_PATH environment variable to include the path to the libhdfs shared object:

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/libhdfs.so

Similarly, if you see a message about being unable to load libjvm.so, you may need to create a symbolic link from your installed version of libjvm.so to the system library location, such as the following:

    ln -s /path/to/libjvm.so /usr/lib/libjvm.so

Or, update your LD_LIBRARY_PATH environment variable to include the path to the libjvm shared object:

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/libjvm.so
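
To find where these shared libraries actually live before creating the symbolic link or editing LD_LIBRARY_PATH, a simple (if slow) approach is to search for them; the locations reported will vary by Hadoop distribution and JDK.

    find / -name "libhdfs.so*" 2>/dev/null
    find / -name "libjvm.so*" 2>/dev/null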

6 Getting Started with Hadoop

To get started with Revolution R Enterprise on Hadoop, we recommend the RevoScaleR 7 Hadoop Getting Started Guide, which provides a tutorial introduction to using RevoScaleR with Hadoop.
