Bright Cluster Manager

Bright Cluster Manager: A Unified Management Solution for HPC and Hadoop
Martijn de Vries, CTO

Introduction

Architecture
[Diagram of a Bright cluster. Labeled components: CMDaemon, head node, node001, node002, node003. Labeled clients: Cluster Management GUI, Cluster Management Shell, Web-Based User Portal, Third-Party Applications. Labeled connections: SOAP/JSON API + SSL, SOAP + SSL.]

Management Interface
Graphical User Interface (GUI)
- Offers administrator full cluster control
- Standalone desktop application
- Manages multiple clusters simultaneously
- Runs natively on Linux, Windows and OS X
Cluster Management Shell (CMSH)
- All GUI functionality also available through Cluster Management Shell
- Interactive and scriptable in batch mode

Hadoop Integration

Managing Clusters
Bright Cluster Manager can be used for several types of clusters:
- HPC
- Compute
- Storage
- Private cloud (OpenStack)
- Server farms
- Big Data (Hadoop)
All types of clusters need to be:
- Deployed
- Configured
- Provisioned
- Managed
- Monitored
- Health-checked

Managing Hadoop Clusters
- Managing Hadoop clusters is just as difficult as managing other types of clusters
- Without proper infrastructure, Hadoop will not run and the cluster will not be usable for data processing
- Bright Cluster Manager provides a single pane of glass to manage and monitor all aspects of a Hadoop cluster, including:
  - Hardware (setup, configuration, monitoring)
  - Operating system (provisioning, updates)
  - Hadoop distribution
  - Hadoop configuration
  - Users
- Bright Cluster Manager provides an environment for Hadoop to run on
- Hadoop distribution agnostic (switching is easy)

Bright for Hadoop Cluster Management
- Bright Cluster Manager 7.0 for Apache Hadoop
- Provides a single pane of glass for managing both the physical cluster and Hadoop
- Easy installation of Hadoop:
  - Apache Hadoop 1.2, 2.2, 2.3 & 2.4 (on Bright DVD)
  - Cloudera CDH 4 & 5
  - Hortonworks HDP 1.3 & 2.1
- Configuration, monitoring and health checking of Hadoop instances
- Graphical UI, command-line interface and API access

Hadoop Configuration
- Hadoop configuration through roles
- Nodes can be configured to run certain Hadoop related services by assigning roles
- Example roles: DataNode, JobTracker, TaskTracker, NameNode, SecondaryNameNode, YARNServer, YARNClient, HBaseServer, HBaseClient, ZooKeeper
- Assigning/unassigning a role will:
  - Write out configuration files based on role parameters
  - Start/stop/monitor relevant services
- Most important Hadoop configuration aspects can be changed from inside Bright
- Exotic Hadoop configuration parameters can be set directly in the (partially generated) configuration file
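
The role mechanism can be pictured with a small model. The sketch below is not Bright's API: the `Node` class, the `ROLE_SERVICES` table, the service names and the `role.key=value` output format are all hypothetical, chosen only to show how assigning or unassigning a role could drive both the generated configuration and the set of running services.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from role name to the services that role implies.
# Role names come from the slide; the service names are illustrative only.
ROLE_SERVICES = {
    "NameNode": ["hdfs-namenode"],
    "SecondaryNameNode": ["hdfs-secondarynamenode"],
    "DataNode": ["hdfs-datanode"],
    "JobTracker": ["mapred-jobtracker"],
    "TaskTracker": ["mapred-tasktracker"],
    "ZooKeeper": ["zookeeper-server"],
}


@dataclass
class Node:
    name: str
    roles: dict = field(default_factory=dict)   # role name -> parameters
    running: set = field(default_factory=set)   # services currently started

    def assign(self, role, **params):
        """Assign a role: record its parameters and start its services."""
        if role not in ROLE_SERVICES:
            raise ValueError(f"unknown role: {role}")
        self.roles[role] = params
        self.running.update(ROLE_SERVICES[role])

    def unassign(self, role):
        """Unassign a role: drop its parameters and stop its services."""
        self.roles.pop(role, None)
        still_needed = {s for r in self.roles for s in ROLE_SERVICES[r]}
        self.running &= still_needed

    def render_config(self):
        """Render role parameters as sorted 'role.key=value' lines.

        This is an illustrative format, not a real Hadoop config file.
        """
        lines = []
        for role in sorted(self.roles):
            for key, value in sorted(self.roles[role].items()):
                lines.append(f"{role}.{key}={value}")
        return "\n".join(lines)
```

For example, `Node("node001")` followed by `assign("DataNode", data_dir="/data/hdfs")` leaves `hdfs-datanode` in `running`, and a later `unassign("DataNode")` removes it again and drops its parameters from the rendered configuration.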

Hadoop Management Features
- Integrated user management and HDFS access control
- Ability to re-purpose nodes between Hadoop and e.g. HPC
- Multiple HDFS instances on same cluster (different Hadoop distributions possible)
- Most Hadoop configuration aspects controlled through GUI and CLI
- Health checking and monitoring of Hadoop related services
- Ability to use alternative filesystems to HDFS (e.g. Lustre)

Re-purposing nodes
- Node tasks are determined by assignment of roles (e.g. Hadoop DataNode, Slurm)
- By default, a node runs all tasks that it has been assigned roles for in parallel (e.g. Hadoop + Slurm)
- Two methods to stop running Hadoop on a node:
  - Method 1 (temporary): property at category and device level: "Use exclusively for"
    - Values: <empty>, HPC, OpenStack, Hadoop, Ceph, Nothing
    - Setting "Use exclusively for" causes all other tasks to be stopped immediately
  - Method 2 (permanent): Hadoop related operations: decommission/recommission
    - Decommission: move data to other nodes to maintain replication factor and stop using the node for jobs (could take a while)
    - Recommission: move data back to the node and use it for Hadoop jobs
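
Both methods can be illustrated with a toy model. The sketch below makes several assumptions that are not stated on the slide: the `ROLE_GROUP` table is hypothetical, "Nothing" is read as "run no tasks at all", and `plan_decommission` uses a deliberately naive target choice rather than HDFS's real block placement policy. Neither function is part of Bright or Hadoop.

```python
# Values listed on the slide; "" stands for <empty>.
EXCLUSIVE_VALUES = {"", "HPC", "OpenStack", "Hadoop", "Ceph", "Nothing"}

# Hypothetical mapping from role name to the task group it belongs to.
ROLE_GROUP = {
    "DataNode": "Hadoop",
    "TaskTracker": "Hadoop",
    "Slurm": "HPC",
}


def effective_roles(assigned_roles, use_exclusively_for=""):
    """Method 1: roles that stay active under 'Use exclusively for'."""
    if use_exclusively_for not in EXCLUSIVE_VALUES:
        raise ValueError(f"invalid value: {use_exclusively_for!r}")
    if use_exclusively_for == "":
        # Default: every assigned role runs in parallel.
        return sorted(assigned_roles)
    if use_exclusively_for == "Nothing":
        # Assumption: "Nothing" stops all tasks on the node.
        return []
    return sorted(
        r for r in assigned_roles if ROLE_GROUP.get(r) == use_exclusively_for
    )


def plan_decommission(block_locations, node, live_nodes):
    """Method 2: pick one new holder for each block stored on `node`.

    block_locations maps block id -> set of nodes holding a replica.
    Copying each block to its target keeps the replica count unchanged
    once `node` stops serving data. Target choice here is naive
    (first eligible node by name).
    """
    plan = {}
    for block, holders in sorted(block_locations.items()):
        if node not in holders:
            continue
        candidates = sorted(
            n for n in live_nodes if n != node and n not in holders
        )
        if not candidates:
            raise RuntimeError(f"no target available for block {block}")
        plan[block] = candidates[0]
    return plan
```

With roles `["Slurm", "DataNode"]`, setting the property to `"HPC"` leaves only `Slurm` active, which matches the temporary method. The decommission plan shows why the permanent method "could take a while": every block on the node must be copied elsewhere before the node can stop serving data.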

Conclusion
- Bright provides a tried & tested method of cluster management
- Hundreds of clusters world-wide are being managed using Bright Cluster Manager
- Inclusion of Hadoop management capabilities provides a complete solution for setup, management & monitoring of Hadoop clusters
- Single pane of glass for cluster & Hadoop
- Especially well suited for clusters that must support both HPC compute and Hadoop jobs

Questions?