Hadoop Kelvin An Overview
- Paula Evangeline Evans
Aviad Pines and Lev Faerman, HUJI LAWA Group

Introduction:
This document outlines the Hadoop Kelvin monitoring system and the benefits it brings to a Hadoop cluster.

Why Hadoop is (currently) sub-optimal:
At present, Hadoop has only a simple concept of network locality. At best, it can be configured to know that a specific group of machines is located in the same rack and therefore enjoys a better communication link than machines in general. This allows it to schedule computation tasks close to the data.

However, this notion of locality is limited. In a larger data center, a single rack houses only a few machines compared to the whole data center inventory of tens, hundreds or even thousands. Once a task cannot be scheduled on the same rack, it is scheduled on one of the remaining available machines in the cluster. Since traffic between machines on different racks behaves differently for adjacent and non-adjacent racks, this leads to sub-optimal task assignment.

This issue is compounded when a Hadoop cluster is deployed in a cloud environment, where the administrator cannot know the rack topology of the machines currently in use. As such, a need arises for Hadoop to be able to detect strong and weak network links between machines in order to improve its scheduling mechanism. Even more detailed information will be required to manage a Hadoop workload when it is distributed across two or more distinct clusters, as is expected to become a common future cloud configuration.

What is Hadoop Kelvin, and What It Enables:
Hadoop Kelvin is a network monitoring system designed for the Hadoop Map-Reduce framework. It monitors data (not control) traffic between Hadoop nodes and provides multiple ways to store, visualize and access the stored monitoring data. It is designed to be easily extensible and flexible, and to operate with a minimal effect on the running time of Hadoop jobs.

Because Hadoop Kelvin is tightly integrated into Hadoop by instrumenting some of the Hadoop source code, it is only available as part of the Hadoop-LAWA version of Hadoop. Once Hadoop Kelvin has amassed a certain amount of monitoring data, it presents any agent utilizing it with data on the network links in the cluster. This data can later be used to improve Hadoop's scheduling decisions by placing computation tasks as close to the data as possible, expanding on the rack concept used in the regular Hadoop scheduler.
Method:
Hadoop Kelvin collects data about the following data transfers:
- HDFS reads (regardless of who is performing the read).
- HDFS writes (regardless of who is the origin of the data).
- Data transfers between Mappers and Reducers during a Map-Reduce job execution.

The data collected about each transfer includes:
- Source machine.
- Destination machine.
- Starting timestamp.
- Duration of the transfer, in milliseconds.
- Size of the transferred data, in bytes.

The data is collected by a statistics server, several of which may be present in a sufficiently large Hadoop cluster. The number of statistics servers is configurable, and each machine in the cluster is configured to report to a single statistics server.

This method of operation was chosen because it offers a complete view of the heavy network traffic in the system, which is the movement of data. It ignores the light-weight management traffic (such as requests for blocks, heartbeats, and other traffic that would be caught by external monitoring tools).

The single-location (or, more correctly, few-location) statistics storage is designed to give the future scheduler quick access to the stored data. If this data were stored locally on each cluster machine, it would slow down the scheduler's decision-making process while it waited for the measurement data relevant to it at that particular moment, because the collection
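As a rough illustration of the per-transfer record listed in the Method section, the following Java sketch models one traffic report and derives an average throughput from it. The class name TrafficReport, its field names and the throughputBytesPerSecond helper are hypothetical and are not taken from the Kelvin source; only the five recorded fields come from the description above.

```java
// Hypothetical model of a single Kelvin traffic report; not the actual Kelvin class.
final class TrafficReport {
    final String sourceMachine;
    final String destinationMachine;
    final long startTimestampMillis; // starting timestamp (epoch milliseconds is an assumption)
    final long durationMillis;       // duration of the transfer, in milliseconds
    final long sizeBytes;            // size of the transferred data, in bytes

    TrafficReport(String sourceMachine, String destinationMachine,
                  long startTimestampMillis, long durationMillis, long sizeBytes) {
        if (durationMillis < 0 || sizeBytes < 0) {
            throw new IllegalArgumentException("duration and size must be non-negative");
        }
        this.sourceMachine = sourceMachine;
        this.destinationMachine = destinationMachine;
        this.startTimestampMillis = startTimestampMillis;
        this.durationMillis = durationMillis;
        this.sizeBytes = sizeBytes;
    }

    /** Average throughput in bytes per second; 0 when the duration is 0 ms. */
    double throughputBytesPerSecond() {
        if (durationMillis == 0) {
            return 0.0;
        }
        return sizeBytes * 1000.0 / durationMillis;
    }
}

class TrafficReportDemo {
    public static void main(String[] args) {
        // 4,000,000 bytes moved in 2 seconds between two illustrative host names.
        TrafficReport report = new TrafficReport("node-a", "node-b", 0L, 2000L, 4_000_000L);
        System.out.println(report.sourceMachine + " -> " + report.destinationMachine
                + ": " + report.throughputBytesPerSecond() + " bytes/s");
    }
}
```

A derived value such as throughput is one way a scheduler could compare links, since raw sizes and durations are not comparable across transfers on their own.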
Architectural Overview:
High-Level Design:
There are two main parts in Hadoop Kelvin: the Statistics Server and the Statistics Client.

The Statistics Server is a program which runs on a single machine in the cluster and serves as a sink for all the traffic reports arriving from the cluster nodes. If a single Statistics Server is present, it typically runs on one of the master machines in the cluster; if several such servers are required, a set of slave machines can be used instead. The server operates a set of user-configurable (via XML) data storers (which are write-only), data retrievers (which are read-only) and data manipulators (which provide read-and-write access). Measurement data is stored to these components, and queries about past measurement data are answered from them. Currently, Hadoop Kelvin provides a log-based information store, which stores all traffic reports in plaintext form, and a data manipulator based on a SQL database, which collects the traffic reports.

The Statistics Client allows a 3rd-party program to access the data stored inside the data storers of the Statistics Server and also to submit reports of its own. All Hadoop Kelvin traffic uses HTTP.
Data reports and data requests are abstracted into Packets and Queries respectively. The system includes a Packet which contains the information described above in the Method section, and a Query which allows a user to retrieve information from the Statistics Server's SQL database.

The system additionally includes two hook points for web applications deployed on top of an embedded web server (Jetty). They are usually used by the following:
1. Kelvin In Action: a web application designed for displaying the monitoring data collected by Kelvin about the Hadoop cluster in a convenient, easy-to-understand, web-based fashion. Kelvin In Action is discussed in more detail in Appendix A.
2. Typically, this hook point is occupied by additional visualization tools which are accessible via any Internet browser.
Data Storers, Data Retrievers and Data Manipulators
Why Hadoop Kelvin is More than a Logger:
As briefly described above, the system incorporates the notions of a Data Storer, a Data Retriever and a Data Manipulator; we refer to them collectively as Data Handlers. The first two each define a Java interface which can be implemented by anyone seeking to expand upon the functionality of Kelvin, while the latter is simply an entity implementing both these interfaces at once. Adding such elements does not require recompiling Hadoop (they just need to be packaged in a JAR file on the classpath and enabled in the XML configuration files), but it does require a restart of the statistics server(s).

The default Kelvin implementation supplies one Data Storer (LogStatisticStore) and one Data Manipulator (H2DBManipulator), which is also a Storer.

The LogStatisticStore logs all traffic reports to a log4j log file. This is the simplest form of a Data Storer, and should mainly be used for debugging or research purposes. The log files tend to grow very large rather quickly, so this store is not suited for long-term, constant deployment in a production environment.

The H2DBManipulator stores the traffic reports in an H2 (SQL) database. This database provides the basic building block for the future Hadoop scheduler, as it allows other code to access the traffic reports collected over a period of time.

The existence of the SQL database and the extensibility of Kelvin set it apart from a simple logger interface (it in fact contains just one such logger by default).
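The relationship between the three kinds of Data Handler can be sketched in Java as follows. This is a deliberately simplified illustration under stated assumptions: the names SimpleStorer, SimpleRetriever, ListLogStore, InMemoryManipulator and ToyStatisticsServer are hypothetical, reports are plain strings, and an in-memory list stands in for both the log file and the H2 database. It is not Kelvin's actual interface set.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the write-only and read-only handler interfaces.
interface SimpleStorer { void store(String report); }
interface SimpleRetriever { List<String> retrieve(); }

// Write-only handler: keeps reports in a list, standing in for a log file.
final class ListLogStore implements SimpleStorer {
    final List<String> lines = new ArrayList<>();
    public void store(String report) { lines.add(report); }
}

// A manipulator is simply one object that implements both interfaces.
final class InMemoryManipulator implements SimpleStorer, SimpleRetriever {
    private final List<String> rows = new ArrayList<>();
    public void store(String report) { rows.add(report); }
    public List<String> retrieve() { return new ArrayList<>(rows); }
}

// Toy server: reports fan out to every storer, queries go to one named retriever.
final class ToyStatisticsServer {
    private final List<Object> handlers = new ArrayList<>();

    void register(Object handler) { handlers.add(handler); }

    void submit(String report) {
        for (Object handler : handlers) {
            if (handler instanceof SimpleStorer) {
                ((SimpleStorer) handler).store(report);
            }
        }
    }

    List<String> query(String retrieverClassName) {
        for (Object handler : handlers) {
            if (handler instanceof SimpleRetriever
                    && handler.getClass().getName().equals(retrieverClassName)) {
                return ((SimpleRetriever) handler).retrieve();
            }
        }
        throw new IllegalArgumentException("No retriever named " + retrieverClassName);
    }
}

class HandlerDemo {
    public static void main(String[] args) {
        ToyStatisticsServer server = new ToyStatisticsServer();
        server.register(new ListLogStore());
        server.register(new InMemoryManipulator());
        server.submit("node-a -> node-b, 1024 bytes");
        System.out.println(server.query(InMemoryManipulator.class.getName()));
    }
}
```

Addressing a retriever by its class name mirrors the way Kelvin queries name their target retriever, but the lookup shown here is only one possible way to do it.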
Data Flow Through Kelvin: Packets and Queries
Submitting and Requesting Information to/from Kelvin:
Communication with Kelvin is done via two serializable Java types.

Objects inheriting the StatisticQuery abstract class can be sent via the Statistic Client to the Statistic Server in order to obtain a response from a retriever (the specific response depends on the data retriever the query is addressed to). These queries allow access to the data stored within Kelvin. A specific type of query is created for each target retriever (and more than one is possible per retriever). After the query has been processed by the target retriever, the resulting data is sent back to the client. Currently, the NetworkMatrixStatisticQuery exists to allow retrieving matrices of traffic reports from the H2DBManipulator.

Objects inheriting the StatisticPacket abstract class can also be sent via the Statistic Client to the Statistic Server. Unlike queries, which are addressed to one specific retriever, a packet is stored in all the storers which support that packet type. Currently, the NetworkMatrixStatisticsPacket stores traffic reports in the H2DBManipulator, LogStatisticsStore and the DebugStore.

Sending data to the Statistics Server
Inheriting the StatisticsPacket class
Sending data to the statistics server is done by sending objects which inherit the StatisticPacket abstract class. Inheriting this class compels the user to implement the method addTo(DataStorer collector). Since the implementation follows the Visitor design pattern, the method has to be implemented as follows for the process to work:

    public void addTo(DataStorer collector) { collector.accept(this); }

Other than that, the user is free to choose how to construct the Packet and which methods to implement inside it, to be later used inside the Storers and Manipulators of his choosing.

Retrieving data from the Statistics Server
Inheriting the StatisticsQuery class
Retrieving data from the statistics server is done by sending objects which inherit the StatisticsQuery abstract class. This compels the user to implement the following methods:

    public Verifiable perform(DataRetriever retriever)
    public Verifiable query()
    public Verifiable query(HttpStatisticClient client)

Of these, perform(DataRetriever retriever) is again constrained by the Visitor design pattern, and the user must implement it as:

    public Verifiable perform(DataRetriever retriever) { return retriever.retrieve(this); }

A Verifiable object is an object whose data can be assessed to be valid or not via the boolean isDataValid() method. If a query result is always valid, the method should always return true.

As with the Statistics Packet, other than the specified methods the user is free to choose how to construct the Query and which methods to implement inside it, to be later used inside the Retrievers and Manipulators of his choosing.

Data Storers, Retrievers and Manipulators
Statistic Packets and Queries are sent from the Statistics Client to the Statistics Server, where they are processed by the Storers (Packets), Retrievers (Queries) and Manipulators (both Packets and Queries).

Data Storers are objects which implement the DataStorer interface. A Data Storer is designed to store the data in a particular way, such as in a log file or a database. A Data Storer can support multiple kinds of Statistic Packets. For each Statistic Packet supported by the Storer, an accept method needs to be implemented:

    public void accept(<PacketType> packet)

where <PacketType> extends the StatisticPacket class.
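The double dispatch that the addTo/accept pair relies on can be seen in a small self-contained Java sketch. All names here (PacketStorer, SketchPacket, TransferPacket, NotePacket, ByteCountingStorer, RecordingStorer) are hypothetical miniatures chosen for illustration; they only mirror the shape of the mechanism described above and are not the Kelvin classes.

```java
import java.util.ArrayList;
import java.util.List;

// Miniature storer interface: one accept overload per supported packet type.
interface PacketStorer {
    void accept(TransferPacket packet);
    void accept(NotePacket packet);
}

// Miniature packet base class with the visitor entry point.
abstract class SketchPacket {
    abstract void addTo(PacketStorer collector);
}

final class TransferPacket extends SketchPacket {
    final long sizeBytes;
    TransferPacket(long sizeBytes) { this.sizeBytes = sizeBytes; }
    // "this" is statically a TransferPacket, so accept(TransferPacket) is chosen.
    @Override void addTo(PacketStorer collector) { collector.accept(this); }
}

final class NotePacket extends SketchPacket {
    final String text;
    NotePacket(String text) { this.text = text; }
    // Same body, but here it resolves to accept(NotePacket).
    @Override void addTo(PacketStorer collector) { collector.accept(this); }
}

// Supports TransferPacket only; the unsupported packet gets an empty method.
final class ByteCountingStorer implements PacketStorer {
    long totalBytes;
    public void accept(TransferPacket packet) { totalBytes += packet.sizeBytes; }
    public void accept(NotePacket packet) { /* unsupported packet: intentionally empty */ }
}

// Supports both packet types.
final class RecordingStorer implements PacketStorer {
    final List<String> seen = new ArrayList<>();
    public void accept(TransferPacket packet) { seen.add("transfer:" + packet.sizeBytes); }
    public void accept(NotePacket packet) { seen.add("note:" + packet.text); }
}

class VisitorDemo {
    public static void main(String[] args) {
        ByteCountingStorer counter = new ByteCountingStorer();
        RecordingStorer recorder = new RecordingStorer();
        SketchPacket[] packets = { new TransferPacket(100L), new NotePacket("hello") };
        for (SketchPacket packet : packets) {
            // The caller only knows the base type; each packet picks the right overload.
            packet.addTo(counter);
            packet.addTo(recorder);
        }
        System.out.println(counter.totalBytes + " " + recorder.seen);
    }
}
```

The sketch also shows why every packet must repeat the same one-line addTo body: overload resolution happens at compile time, so the call has to be written where the concrete packet type is known.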
Looking at the previously mentioned H2DBManipulator class, it has two accept methods, one for each supported Packet:

    public void accept(DebugPacket packet);
    public void accept(NetworkStatisticPacket packet);

Note that due to limitations of the Visitor design pattern, each accept method needs to be declared in the DataStorer interface as well, and implemented in all the implementing classes. If a class does not support a specific packet, it should implement an empty method.

Data Retrievers are objects which implement the DataRetriever interface, and their job is to retrieve data that was stored on the server and send it back to the user. The H2DBManipulator, for example, retrieves data from an H2 database according to the parameters given by the user. Similar to the DataStorer, for each supported Statistic Query a method Verifiable retrieve(<QueryType> query), where <QueryType> is a class which inherits the StatisticsQuery class, needs to be declared in the DataRetriever interface itself and implemented in all the implementing classes. The H2DBManipulator, for example, implements the method Verifiable retrieve(NetworkMatrixStatisticQuery query) in order to be able to respond to NetworkMatrixStatisticQueries.

User API:
The following section details how someone using Kelvin can access the information it currently provides. In order to access Kelvin via user code, the Hadoop HUJI Common jar needs to be on the classpath, since all Kelvin classes are located there. This file is a part of the standard Hadoop HUJI distribution.

Retrieving H2DBManipulator Data:
In order to retrieve information from the Kelvin database, the user code needs to create a NetworkMatrixStatisticQuery object. This object has three constructors:

    public NetworkMatrixStatisticQuery(String queryTarget)
    public NetworkMatrixStatisticQuery(String queryTarget, Date timestamp)
    public NetworkMatrixStatisticQuery(String queryTarget, Date timeStart, Date timeEnd)

The first receives only the class name of the retriever it tries to access. This needs to be a full class name, for example org.apache.hadoop.statistics.waldoes.H2DBManipulator. In this case, the data returned will be only the latest measurements between all nodes, one per node pair (or none, if no traffic passed between these particular nodes). The second takes an additional Date object; only measurements that took place at this particular second will be returned. The final constructor requires two dates and returns the results of all measurements between all nodes falling into this time period. Aggregation is currently done by the user as he sees fit.

After the query object has been created, its query() method needs to be called. This retrieves an instance of the Statistic Client singleton, performs the query and returns the result as a NetworkStatisticsMatrix object, which can then be used to access the query results.

Appendix A - Kelvin In Action
Kelvin In Action is a web front-end for the H2 Database data manipulator (although it is designed to be easily extensible for displaying other data). It gives the user a web-based means of accessing the traffic reports collected by Kelvin by allowing the user to generate NetworkMatrixStatisticQueries from a browser.

The query interface allows the user to specify the time frame for which the measurement data will be returned (the default is the latest measurements) and also to specify the aggregation method (sum, mean, max, min and so on) applied to the results, so that they can be displayed on the heat-map (which shows only a single square for each pair of nodes).

The aggregated results are then displayed as a heat-map, showing the current hot-spots and cold-spots of the cluster. The page also shows a list of the cluster's machines and their IPs. A hover-over tooltip allows the user to expand the heat-map in order to obtain additional information about the color-represented data transfer. The legend of the heat-map is highly configurable as well, and the user can define the color scheme and the ranges of the legend. A screenshot illustrates further:

The Kelvin In Action WAR file is included in the LAWA CDH3 Hadoop release. Additionally, it can be found at: / username: lawa, password: thisislawa!
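The per-pair aggregation that Appendix A describes (reducing many measurements between the same two nodes to one heat-map value) can be sketched in Java as follows. This is a minimal illustration under assumptions: PairMeasurement, PairAggregation and PairAggregator are hypothetical names, each measurement carries a single numeric value, and the real Kelvin In Action code may aggregate differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Aggregation methods named in the text; the enum itself is hypothetical.
enum PairAggregation { SUM, MEAN, MAX, MIN }

// One measured value for a directed node pair (hypothetical shape).
final class PairMeasurement {
    final String source;
    final String destination;
    final double value;
    PairMeasurement(String source, String destination, double value) {
        this.source = source;
        this.destination = destination;
        this.value = value;
    }
}

final class PairAggregator {
    /** Reduces all measurements of each source->destination pair to one value. */
    static Map<String, Double> aggregate(List<PairMeasurement> measurements, PairAggregation how) {
        // Group values by directed pair, preserving first-seen order.
        Map<String, List<Double>> grouped = new LinkedHashMap<>();
        for (PairMeasurement m : measurements) {
            String key = m.source + "->" + m.destination;
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(m.value);
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Double>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue(), how));
        }
        return result;
    }

    // Each group holds at least one value, so the mean never divides by zero.
    private static double reduce(List<Double> values, PairAggregation how) {
        double sum = 0.0;
        double max = Double.NEGATIVE_INFINITY;
        double min = Double.POSITIVE_INFINITY;
        for (double v : values) {
            sum += v;
            max = Math.max(max, v);
            min = Math.min(min, v);
        }
        switch (how) {
            case SUM: return sum;
            case MEAN: return sum / values.size();
            case MAX: return max;
            default: return min; // MIN
        }
    }
}

class PairAggregatorDemo {
    public static void main(String[] args) {
        List<PairMeasurement> measurements = new ArrayList<>();
        measurements.add(new PairMeasurement("node-a", "node-b", 10.0));
        measurements.add(new PairMeasurement("node-a", "node-b", 30.0));
        measurements.add(new PairMeasurement("node-b", "node-a", 5.0));
        System.out.println(PairAggregator.aggregate(measurements, PairAggregation.MEAN));
    }
}
```

Pairs are kept directed here (a to b is distinct from b to a); whether the heat-map treats them that way is not stated in the source.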
Appendix B - Deployment Instructions:
To configure Hadoop Kelvin, you first need to configure your LAWA Hadoop cluster (as a regular Hadoop cluster). In a fashion similar to Hadoop itself, Hadoop Kelvin uses XML configuration files which are added to the Hadoop conf directory. This section describes the various configuration files and their possible parameters. Please note that parameters which are marked as mandatory must be specified, or the system will fail to load.

statistics-site.xml:

| Property Name | Description / Valid Values | Mandatory |
|---|---|---|
| server.hostname | The host on which the Statistics Server is running. | Yes |
| aggregation.threshold | The minimum number of traffic reports which are submitted to the Statistics Server at once. | Yes |
| sleep.cycles | The number of sleep cycles done by the statistics client before submitting all the waiting reports, even if their number does not reach the aggregation threshold. | Yes |
| sleep.cycle.duration | The duration, in milliseconds, the thread sleeps in each sleep cycle. | Yes |
| statistics.port | The port on which the Statistics Server is listening for traffic reports and data queries. | Yes |
| statistics.webapp.port | The port on which the Statistics Server web application is accessible. | Yes |
| server.statistics.kia.port | The port on which the Kelvin In Action application is accessible. | Yes |
| request.timeout | The timeout, in milliseconds, for HTTP transactions directed at the Statistics Server. | Yes |
| server.statistics.sub.url | The sub-URL of the Statistics Server on server.hostname. | Yes |
| server.statistics.webapp.sub.url | The sub-URL of the Statistics Server web application on server.hostname. | Yes |
| server.statistisics.kia.sub.url | The sub-URL of the Kelvin In Action application. | Yes |
| visualizer.war | The path to the web application WAR file. | Yes |
| kelvin.in.action.war | The path to Kelvin In Action's WAR file. | Yes |
| statistics.stores | A comma-separated list of the fully-qualified Java class names of the data stores and data retrievers to be used by the Statistics Server. | Yes |
| mapreduce.fetcher.enable.report | Whether reporting from the Fetch phase is enabled. Fetching is the transmission of data from the Mappers to the Reducers during a Map-Reduce job. | Yes |
| hdfs.blockreader.enable.report | Whether reporting of network flows which are HDFS reads is enabled. If enabled, any read from the HDFS will be monitored. | Yes |
| hdfs.blockreceiver.enable.report | Whether reporting of HDFS writes is enabled. If enabled, any write to the HDFS will be monitored. | Yes |
| hdfs.blockreader.aggregation.factor | The level of local aggregation between reports of HDFS reads. If set to X, then every X reports will be aggregated into a single report. This is used to reduce the report load in cases where many small reads are performed (as opposed to a single large read), which often occurs while processing text in a Map-Reduce job. | Yes |

dbmanipulator-site.xml:

| Property Name | Description / Valid Values | Mandatory |
|---|---|---|
| database.xml.definition | The full path and file name of the database definitions file used for the DB manipulator. | No. Required if H2DBManipulator is to be used with the system. |
| database.name | The name of the database, as defined in the configuration file specified in database.xml.definition, to use for the DB manipulator. | No. Required if H2DBManipulator is to be used with the system. |

Database configuration file (the one referenced by database.xml.definition):
Each database used in the statistics package needs to be defined in a database configuration file. Each database definition consists of four properties:

| Field Name | Description / Valid Values | Mandatory |
|---|---|---|
| name | The name by which the database will be referred to in the code via the get method. If no name is supplied, the default name is the full path supplied. | No |
| path | The full path to the database file. | Yes |
| username | The username to access the database, if needed. | No |
| Password | The password to access the database, if needed. | No |

Example Hadoop Kelvin configuration files are included with the Hadoop LAWA release, in the conf directory.
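The interplay of aggregation.threshold, sleep.cycles and sleep.cycle.duration can be illustrated with a small Java sketch. This is only one plausible reading of the property descriptions above, not Kelvin's client code: the class ReportBatcher and its methods are hypothetical, reports are plain strings, the HTTP submission is replaced by an in-memory list of batches, and the sleep itself is left to the caller so that the logic stays deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side batching logic driven by two of the configuration values.
final class ReportBatcher {
    private final int aggregationThreshold; // cf. aggregation.threshold
    private final int sleepCycles;          // cf. sleep.cycles
    private final List<String> waiting = new ArrayList<>();
    private final List<List<String>> submitted = new ArrayList<>(); // stands in for HTTP submission
    private int cyclesWaited = 0;

    ReportBatcher(int aggregationThreshold, int sleepCycles) {
        if (aggregationThreshold < 1 || sleepCycles < 1) {
            throw new IllegalArgumentException("threshold and cycles must be at least 1");
        }
        this.aggregationThreshold = aggregationThreshold;
        this.sleepCycles = sleepCycles;
    }

    void add(String report) { waiting.add(report); }

    /**
     * Called once per sleep cycle; a real client would sleep for
     * sleep.cycle.duration milliseconds between calls.
     */
    void onCycle() {
        if (waiting.isEmpty()) {
            cyclesWaited = 0;
            return;
        }
        cyclesWaited++;
        boolean enoughReports = waiting.size() >= aggregationThreshold;
        boolean waitedLongEnough = cyclesWaited >= sleepCycles;
        if (enoughReports || waitedLongEnough) {
            submitted.add(new ArrayList<>(waiting));
            waiting.clear();
            cyclesWaited = 0;
        }
    }

    List<List<String>> submittedBatches() { return submitted; }
}

class ReportBatcherDemo {
    public static void main(String[] args) {
        ReportBatcher batcher = new ReportBatcher(3, 2);
        batcher.add("r1");
        batcher.onCycle(); // below threshold, first cycle: nothing sent
        batcher.onCycle(); // second cycle: the lone report is flushed anyway
        System.out.println(batcher.submittedBatches());
    }
}
```

Under this reading, a larger threshold reduces the number of HTTP requests, while sleep.cycles multiplied by sleep.cycle.duration bounds how stale a waiting report can become.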
Starting the Statistics Server:
After configuring your Hadoop cluster, you need to start the Hadoop Kelvin Statistics Server. This is done by executing the regular hadoop script (located in the bin subdirectory of your Hadoop location) with the stats parameter (bin/hadoop stats). The Hadoop cluster itself will function even without the Statistics Server running, and the Statistics Server can be left running while the cluster itself is taken down (this can be used to keep its data accessible even during maintenance periods). Once the Statistics Server is running, it will begin receiving traffic reports from all cluster nodes as data is moved around, according to the reporting options which are enabled in the configuration.

Accessing the Web Frontend:
Similar to the HDFS and Map-Reduce web front-ends, Hadoop Kelvin's information storage can be accessed via a regular browser. Just point your browser to the following URL:
where <server.hostname> and <statistics.webapp.port> are the values you have specified in the configuration.
More informationRMCS Installation Guide
RESTRICTED RIGHTS Use, duplication, or disclosure by the Government is subject to restrictions as set forth in subparagraph (C)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS
More informationHow To Install An Aneka Cloud On A Windows 7 Computer (For Free)
MANJRASOFT PTY LTD Aneka 3.0 Manjrasoft 5/13/2013 This document describes in detail the steps involved in installing and configuring an Aneka Cloud. It covers the prerequisites for the installation, the
More informationWEBAPP PATTERN FOR APACHE TOMCAT - USER GUIDE
WEBAPP PATTERN FOR APACHE TOMCAT - USER GUIDE Contents 1. Pattern Overview... 3 Features 3 Getting started with the Web Application Pattern... 3 Accepting the Web Application Pattern license agreement...
More informationHADOOP MOCK TEST HADOOP MOCK TEST II
http://www.tutorialspoint.com HADOOP MOCK TEST Copyright tutorialspoint.com This section presents you various set of Mock Tests related to Hadoop Framework. You can download these sample mock tests at
More informationCRM Setup Factory Installer V 3.0 Developers Guide
CRM Setup Factory Installer V 3.0 Developers Guide Who Should Read This Guide This guide is for ACCPAC CRM solution providers and developers. We assume that you have experience using: Microsoft Visual
More informationUniversal Event Monitor for SOA 5.2.0 Reference Guide
Universal Event Monitor for SOA 5.2.0 Reference Guide 2015 by Stonebranch, Inc. All Rights Reserved. 1. Universal Event Monitor for SOA 5.2.0 Reference Guide.............................................................
More informationConfiguring and Monitoring the Client Desktop Component
Configuring and Monitoring the Client Desktop Component eg Enterprise v5.6 Restricted Rights Legend The information contained in this document is confidential and subject to change without notice. No part
More informationUsing the DataDirect Connect for JDBC Drivers with WebLogic 8.1
Using the DataDirect Connect for JDBC Drivers with WebLogic 8.1 Introduction This document explains the steps required to use the DataDirect Connect for JDBC drivers with the WebLogic Application Server
More informationHow to Install and Configure EBF15328 for MapR 4.0.1 or 4.0.2 with MapReduce v1
How to Install and Configure EBF15328 for MapR 4.0.1 or 4.0.2 with MapReduce v1 1993-2015 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic,
More information1 Introduction FrontBase is a high performance, scalable, SQL 92 compliant relational database server created in the for universal deployment.
FrontBase 7 for ios and Mac OS X 1 Introduction FrontBase is a high performance, scalable, SQL 92 compliant relational database server created in the for universal deployment. On Mac OS X FrontBase can
More informationSymantec Endpoint Protection Shared Insight Cache User Guide
Symantec Endpoint Protection Shared Insight Cache User Guide Symantec Endpoint Protection Shared Insight Cache User Guide The software described in this book is furnished under a license agreement and
More informationWEBTITAN CLOUD. User Identification Guide BLOCK WEB THREATS BOOST PRODUCTIVITY REDUCE LIABILITIES
BLOCK WEB THREATS BOOST PRODUCTIVITY REDUCE LIABILITIES WEBTITAN CLOUD User Identification Guide This guide explains how to install and configure the WebTitan Cloud Active Directory components required
More informationCA APM Cloud Monitor. Scripting Guide. Release 8.2
CA APM Cloud Monitor Scripting Guide Release 8.2 This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the Documentation ) is for
More informationLoad Balancing Microsoft Sharepoint 2010 Load Balancing Microsoft Sharepoint 2013. Deployment Guide
Load Balancing Microsoft Sharepoint 2010 Load Balancing Microsoft Sharepoint 2013 Deployment Guide rev. 1.4.2 Copyright 2015 Loadbalancer.org, Inc. 1 Table of Contents About this Guide... 3 Appliances
More informationDeploying the BIG-IP System with Oracle E-Business Suite 11i
Deploying the BIG-IP System with Oracle E-Business Suite 11i Introducing the BIG-IP and Oracle 11i configuration Configuring the BIG-IP system for deployment with Oracle 11i Configuring the BIG-IP system
More informationHP OO 10.X - SiteScope Monitoring Templates
HP OO Community Guides HP OO 10.X - SiteScope Monitoring Templates As with any application continuous automated monitoring is key. Monitoring is important in order to quickly identify potential issues,
More informationCA SiteMinder. Policy Server Management Guide. r6.0 SP6. Second Edition
CA SiteMinder Policy Server Management Guide r6.0 SP6 Second Edition This documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the Documentation
More informationGetting Started with SandStorm NoSQL Benchmark
Getting Started with SandStorm NoSQL Benchmark SandStorm is an enterprise performance testing tool for web, mobile, cloud and big data applications. It provides a framework for benchmarking NoSQL, Hadoop,
More informationDocumentum Content Distribution Services TM Administration Guide
Documentum Content Distribution Services TM Administration Guide Version 5.3 SP5 August 2007 Copyright 1994-2007 EMC Corporation. All rights reserved. Table of Contents Preface... 7 Chapter 1 Introducing
More informationRakam: Distributed Analytics API
Rakam: Distributed Analytics API Burak Emre Kabakcı May 30, 2014 Abstract Today, most of the big data applications needs to compute data in real-time since the Internet develops quite fast and the users
More informationCERTIFIED MULESOFT DEVELOPER EXAM. Preparation Guide
CERTIFIED MULESOFT DEVELOPER EXAM Preparation Guide v. November, 2014 2 TABLE OF CONTENTS Table of Contents... 3 Preparation Guide Overview... 5 Guide Purpose... 5 General Preparation Recommendations...
More informationMonitoring Hybrid Cloud Applications in VMware vcloud Air
Monitoring Hybrid Cloud Applications in ware vcloud Air ware vcenter Hyperic and ware vcenter Operations Manager Installation and Administration Guide for Hybrid Cloud Monitoring TECHNICAL WHITE PAPER
More informationKINETIC SR (Survey and Request)
KINETIC SR (Survey and Request) Installation and Configuration Guide Version 5.0 Revised October 14, 2010 Kinetic SR Installation and Configuration Guide 2007-2010, Kinetic Data, Inc. Kinetic Data, Inc,
More informationFileMaker Server 12. FileMaker Server Help
FileMaker Server 12 FileMaker Server Help 2010-2012 FileMaker, Inc. All Rights Reserved. FileMaker, Inc. 5201 Patrick Henry Drive Santa Clara, California 95054 FileMaker is a trademark of FileMaker, Inc.
More informationMarkLogic Server. Connector for SharePoint Administrator s Guide. MarkLogic 8 February, 2015
Connector for SharePoint Administrator s Guide 1 MarkLogic 8 February, 2015 Last Revised: 8.0-1, February, 2015 Copyright 2015 MarkLogic Corporation. All rights reserved. Table of Contents Table of Contents
More informationConfiguring Nex-Gen Web Load Balancer
Configuring Nex-Gen Web Load Balancer Table of Contents Load Balancing Scenarios & Concepts Creating Load Balancer Node using Administration Service Creating Load Balancer Node using NodeCreator Connecting
More informationDEPLOYMENT GUIDE DEPLOYING THE BIG-IP LTM SYSTEM WITH CITRIX PRESENTATION SERVER 3.0 AND 4.5
DEPLOYMENT GUIDE DEPLOYING THE BIG-IP LTM SYSTEM WITH CITRIX PRESENTATION SERVER 3.0 AND 4.5 Deploying F5 BIG-IP Local Traffic Manager with Citrix Presentation Server Welcome to the F5 BIG-IP Deployment
More informationHDFS Architecture Guide
by Dhruba Borthakur Table of contents 1 Introduction... 3 2 Assumptions and Goals... 3 2.1 Hardware Failure... 3 2.2 Streaming Data Access...3 2.3 Large Data Sets... 3 2.4 Simple Coherency Model...3 2.5
More informationHadoop Tutorial. General Instructions
CS246: Mining Massive Datasets Winter 2016 Hadoop Tutorial Due 11:59pm January 12, 2016 General Instructions The purpose of this tutorial is (1) to get you started with Hadoop and (2) to get you acquainted
More informationJava Web Services SDK
Java Web Services SDK Version 1.5.1 September 2005 This manual and accompanying electronic media are proprietary products of Optimal Payments Inc. They are to be used only by licensed users of the product.
More informationOracle EXAM - 1Z0-102. Oracle Weblogic Server 11g: System Administration I. Buy Full Product. http://www.examskey.com/1z0-102.html
Oracle EXAM - 1Z0-102 Oracle Weblogic Server 11g: System Administration I Buy Full Product http://www.examskey.com/1z0-102.html Examskey Oracle 1Z0-102 exam demo product is here for you to test the quality
More informationSOLR INSTALLATION & CONFIGURATION GUIDE FOR USE IN THE NTER SYSTEM
SOLR INSTALLATION & CONFIGURATION GUIDE FOR USE IN THE NTER SYSTEM Prepared By: Leigh Moulder, SRI International leigh.moulder@sri.com TABLE OF CONTENTS Table of Contents. 1 Document Change Log 2 Solr
More informationRobert Honeyman Honeyman IT Consulting. http://www.honeymanit.co.uk rob.honeyman@honeymanit.co.uk
Robert Honeyman Honeyman IT Consulting http://www.honeymanit.co.uk rob.honeyman@honeymanit.co.uk Requirement for HA with SSO Centralized access control SPOF for dependent apps SSO failure = no protected
More informationDetection of Distributed Denial of Service Attack with Hadoop on Live Network
Detection of Distributed Denial of Service Attack with Hadoop on Live Network Suchita Korad 1, Shubhada Kadam 2, Prajakta Deore 3, Madhuri Jadhav 4, Prof.Rahul Patil 5 Students, Dept. of Computer, PCCOE,
More informationChapter 1 - Web Server Management and Cluster Topology
Objectives At the end of this chapter, participants will be able to understand: Web server management options provided by Network Deployment Clustered Application Servers Cluster creation and management
More informationELIXIR LOAD BALANCER 2
ELIXIR LOAD BALANCER 2 Overview Elixir Load Balancer for Elixir Repertoire Server 7.2.2 or greater provides software solution for load balancing of Elixir Repertoire Servers. As a pure Java based software
More informationCloudera Certified Developer for Apache Hadoop
Cloudera CCD-333 Cloudera Certified Developer for Apache Hadoop Version: 5.6 QUESTION NO: 1 Cloudera CCD-333 Exam What is a SequenceFile? A. A SequenceFile contains a binary encoding of an arbitrary number
More informationActive-Active ImageNow Server
Active-Active ImageNow Server Getting Started Guide ImageNow Version: 6.7. x Written by: Product Documentation, R&D Date: March 2014 2014 Perceptive Software. All rights reserved CaptureNow, ImageNow,
More informationAlfresco Enterprise on AWS: Reference Architecture
Alfresco Enterprise on AWS: Reference Architecture October 2013 (Please consult http://aws.amazon.com/whitepapers/ for the latest version of this paper) Page 1 of 13 Abstract Amazon Web Services (AWS)
More informationThe full setup includes the server itself, the server control panel, Firebird Database Server, and three sample applications with source code.
Content Introduction... 2 Data Access Server Control Panel... 2 Running the Sample Client Applications... 4 Sample Applications Code... 7 Server Side Objects... 8 Sample Usage of Server Side Objects...
More informationData Collection and Analysis: Get End-to-End Security with Cisco Connected Analytics for Network Deployment
White Paper Data Collection and Analysis: Get End-to-End Security with Cisco Connected Analytics for Network Deployment Cisco Connected Analytics for Network Deployment (CAND) is Cisco hosted, subscription-based
More informationvcenter Operations Manager for Horizon Supplement
vcenter Operations Manager for Horizon Supplement vcenter Operations Manager for Horizon 1.6 This document supports the version of each product listed and supports all subsequent versions until the document
More informationConfiguring Apache HTTP Server With Pramati
Configuring Apache HTTP Server With Pramati 45 A general practice often seen in development environments is to have a web server to cater to the static pages and use the application server to deal with
More informationSAIP 2012 Performance Engineering
SAIP 2012 Performance Engineering Author: Jens Edlef Møller (jem@cs.au.dk) Instructions for installation, setup and use of tools. Introduction For the project assignment a number of tools will be used.
More informationDEPLOYMENT GUIDE. Deploying F5 for High Availability and Scalability of Microsoft Dynamics 4.0
DEPLOYMENT GUIDE Deploying F5 for High Availability and Scalability of Microsoft Dynamics 4.0 Introducing the F5 and Microsoft Dynamics CRM configuration Microsoft Dynamics CRM is a full customer relationship
More informationInstallation and Configuration Guide
Entrust Managed Services PKI Auto-enrollment Server 7.0 Installation and Configuration Guide Document issue: 1.0 Date of Issue: July 2009 Copyright 2009 Entrust. All rights reserved. Entrust is a trademark
More informationAssignment # 1 (Cloud Computing Security)
Assignment # 1 (Cloud Computing Security) Group Members: Abdullah Abid Zeeshan Qaiser M. Umar Hayat Table of Contents Windows Azure Introduction... 4 Windows Azure Services... 4 1. Compute... 4 a) Virtual
More informationIntroducing the BIG-IP and SharePoint Portal Server 2003 configuration
Deployment Guide Deploying Microsoft SharePoint Portal Server 2003 and the F5 BIG-IP System Introducing the BIG-IP and SharePoint Portal Server 2003 configuration F5 and Microsoft have collaborated on
More informationSDK Code Examples Version 2.4.2
Version 2.4.2 This edition of SDK Code Examples refers to version 2.4.2 of. This document created or updated on February 27, 2014. Please send your comments and suggestions to: Black Duck Software, Incorporated
More informationEnhanced Connector Applications SupportPac VP01 for IBM WebSphere Business Events 3.0.0
Enhanced Connector Applications SupportPac VP01 for IBM WebSphere Business Events 3.0.0 Third edition (May 2012). Copyright International Business Machines Corporation 2012. US Government Users Restricted
More informationImplement Hadoop jobs to extract business value from large and varied data sets
Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to
More informationagileworkflow Manual 1. agileworkflow 2. The repository 1 of 29 Contents Definition
agileworkflow Manual Contents 1. Intro 2. Repository 3. Diagrams 4. Agents 4.1. Dispatcher Service 4.2. Event Service 4.3. Execution Service 5. Variables 6. Instances 7. Events 7.1. External 7.2. File
More information