Hadoop Data Warehouse Manual

Ruben Vervaeke & Jonas Lesy

To start off, we advise you to read the thesis written about this project before applying any changes to the setup. The thesis can be found on the same Moodle page as this manual, or at the following link: https://www.theseus.fi/handle/10024/96600

Before making any changes, we also advise you to first run a local virtual machine with Hadoop installed and to develop on that. This is a safer way to test, since the setup on the Hadoop cluster is not optimised for debugging. A virtual machine can be found on the same Moodle page. It has Hadoop and HBase already installed and should be used for development. The machine can be downloaded and then run with VMware Player.

Setting up the virtual machine and obtaining the program

This chapter describes how to work with the virtual machine and what needs to be done before development or testing can continue.

First, as mentioned above, download the virtual machine from the Moodle page. After unzipping, the machine can be opened in VMware Player. When the virtual machine runs for the first time, a screen with three options might pop up. If this happens, click "I Moved It". This adjusts the networking and related settings to the appropriate values.

A login screen should now appear, as shown in the image below. The password for the Ruben Vervaeke account is karelia.

Before doing anything else, make sure the virtual machine has Internet access, otherwise development will not be possible. The project uses Maven, and Maven pulls all necessary dependencies from the Internet when the application is built.

The first thing to notice is that there are two files on the desktop. Please do not alter them. They can be moved, for example to a Documents folder, but make sure to remember where you put them.
The script hadoopscript.sh has to be run to start all Hadoop components and functionalities. This is done by opening the command shell and typing the following lines.

> su
Switch to the root user, since the script has to be run as root. You will be asked for the root password, which is also karelia, like the user account's password.

> cd Desktop/
Change to the Desktop folder.

> sh hadoopscript.sh
Run the script (which is currently still in the Desktop folder).

The script is now running and all you need to do is wait. When it has finished, check the running Hadoop services by typing the jps command. The following results should appear:

Here you can see the different services. All of them should be running. If any are not, please (re)start them manually. The commands for this can be found in the other file on the desktop.

The complete application is stored on a server under GitHub version control. Currently Ruben Vervaeke is the owner of the repository, so anyone who wants access to the application needs a GitHub account and has to request access. To do so, send an email to ruben.vervaeke@hotmail.com with your GitHub username, so that you can be added to the GitHub project. Once this is done, you can clone the repository in NetBeans. A tutorial about this can be found here: https://netbeans.org/kb/docs/ide/git.html
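Returning to the jps check after the start script: comparing the output by eye is easy to get wrong, so a small helper can list any service that is missing. The following is only a sketch. The helper name missing_services and the daemon names in EXPECTED are assumptions for a typical single-node Hadoop and HBase machine, not the list from this manual, so adjust them to match what jps shows on a healthy machine.

```shell
#!/bin/sh
# Print every expected daemon name that is missing from jps-style output
# read on stdin. jps prints one "<pid> <ClassName>" pair per line.
missing_services() {
  ms_running=$(awk '{print $2}')
  for ms_name in "$@"; do
    printf '%s\n' "$ms_running" | grep -qxF "$ms_name" || printf '%s\n' "$ms_name"
  done
}

# Assumed daemon list (hypothetical); replace it with the services that
# jps reports on your own machine when everything is running.
EXPECTED="NameNode DataNode SecondaryNameNode HMaster"

# Only query jps when it is installed, so the sketch is harmless elsewhere.
if command -v jps >/dev/null 2>&1; then
  # EXPECTED is split on spaces on purpose: one argument per daemon name.
  jps | missing_services $EXPECTED
fi
```

An empty result means every name in EXPECTED was found. Any name that is printed should be (re)started manually with the commands from the second file on the desktop.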

Creating a new resource

The implementation in the application is now slightly different from what is explained in the thesis. Adding a new resource has become much more flexible and easier. As mentioned in the thesis, new resources are added by defining them in the XML config files. In addition, a driverjar tag has been added, which defines the name of the MapReduce jar file that was created for the resource.

A new resource therefore needs a MapReduce application. For how to write one, refer to the Hadoop and HBase documentation. The only important requirement is that the mapper outputs its result to the HBase database. Therefore, before working with new domain objects, you need to create a schema for them via the HBase shell. All this information can be found in the HBase documentation.

After this is done, the data can be stored on the file system. However, if you defined new resources that did not yet have domain objects defined in the program, you will also have to create a corresponding web service class for them. You can use the existing SensorService class as a reference on how to do this.
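As an illustration of the schema step, the sketch below builds an HBase shell create statement from a table name and a list of column families. The helper name hbase_create_stmt and the example names sensor and data are hypothetical. The real table and column family names depend on the domain objects of the new resource.

```shell
#!/bin/sh
# Build an HBase shell "create" statement: a table name followed by one or
# more column families. All names used below are hypothetical examples.
hbase_create_stmt() {
  hc_stmt="create '$1'"
  shift
  for hc_family in "$@"; do
    hc_stmt="$hc_stmt, '$hc_family'"
  done
  printf '%s\n' "$hc_stmt"
}

# Print the statement for review before running it anywhere.
hbase_create_stmt sensor data
```

On a machine where HBase is running, the printed statement can be typed into the HBase shell, or piped into it with hbase_create_stmt sensor data | hbase shell.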

Connect to the setup with Cloudera

Once the project is ready to be deployed on the Cloudera setup, this chapter describes how to connect to that system. The image below shows the complete network setup. The bottom three machines are the ones the system runs on. These machines are virtual and run on a blade server at the Karelia University of Applied Sciences. They are protected by a gateway, which requires a username and password to connect. This networking setup was configured by Tiainen Henri and Janne Puustinen, so full credit for it goes to them.

Connecting to the full setup requires the installation of X2Go Client (http://wiki.x2go.org/doku.php). After the software has been installed, the first thing that needs to happen is connecting to the gateway. This is done by opening the X2Go Client software and clicking the Session button, followed by New Session. In this window the following details are filled in. And under Session type select the following option:

The first field is the host to connect to. This is the IP address shown on the network setup image. The next field is the default user to log in with, which is user. The last field is the SSH port, which is 22 in this case.

After configuring this, the X2Go Client should have a session available, as shown in the image below. Clicking on this session initiates the connection, but a password is required first. The password to connect to the gateway is Password1!.

You are now connected to the desktop of the gateway. If the X2Go Client software is not already open on the gateway, you have to open it before you can connect any further. This is done by right-clicking on the desktop and selecting the option shown in the image below.

The familiar program now opens and the three other machines can be seen. The software's screen should look like the image below.
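X2Go connects over SSH, so when a session refuses to start it can help to try a plain SSH login with the same three values first. The sketch below only prints the matching ssh command. The helper name gateway_ssh_cmd and the placeholder host gateway.example are assumptions. The real gateway address is the one on the network setup image.

```shell
#!/bin/sh
# Build the ssh command that matches the X2Go session fields:
# host (required), login (default "user"), SSH port (default 22).
gateway_ssh_cmd() {
  printf 'ssh -p %s %s@%s\n' "${3:-22}" "${2:-user}" "$1"
}

# GATEWAY_HOST is a placeholder; set it to the address from the network image.
GATEWAY_HOST="${GATEWAY_HOST:-gateway.example}"
gateway_ssh_cmd "$GATEWAY_HOST"
```

If the printed command logs in from a terminal with the gateway password, the host, user and port entered in the X2Go session are correct.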

You can now connect to whichever machine you need with the same credentials as before: the default user user and the password Password1!. If the project is ready, it can now be deployed on the Master machine.

The terminal can be opened by clicking this button:

The browser (Mozilla Firefox in this case) can be opened by clicking this button:

Manage and configure the Cloudera Manager

To open the Cloudera Manager, take the following steps. On one of the virtual clients, open the Firefox browser and go to the following address: 192.168.72.11:7180. This will bring you to the following page:

Now you can log in with the following credentials:
- Username: admin
- Password: fibekarelia4

From there on, you can manage the different installed components and check for errors.

How to start Cloudera after restart

It is possible that the servers were down for some reason. In that case Cloudera will not restart automatically after the servers have finished starting up. You can start Cloudera using the following steps:

1. Start the Cloudera database using the following command as root user on the master node:
$ sudo service cloudera-scm-server-db start

2. Start the Cloudera Manager server using the following command as root user on the master node:
$ sudo service cloudera-scm-server start

3. Start the Cloudera Manager agents using the following command as root user on BOTH slave nodes:
$ sudo service cloudera-scm-agent start

After this you should be able to log in to the admin console and monitor the system's status.
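The restart steps above can be wrapped in two small functions, one for the master node and one for each slave node. This is a sketch, not part of the original setup. The function names and the DRY_RUN switch are assumptions added for illustration, while the service commands are the ones listed above. With DRY_RUN left at its default of 1, the commands are only printed.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (the default here, so the
# sketch can be read and tried without touching any service).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Master node: start the database first, then the Cloudera Manager server.
# The server is only started if the database command succeeded.
start_master() {
  run sudo service cloudera-scm-server-db start &&
  run sudo service cloudera-scm-server start
}

# Each slave node: start the Cloudera Manager agent.
start_slave() {
  run sudo service cloudera-scm-agent start
}
```

On the master node, set DRY_RUN=0 and call start_master. On both slave nodes, set DRY_RUN=0 and call start_slave. Afterwards the admin console at 192.168.72.11:7180 should accept logins again.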