1 CHEF PATTERNS AT BLOOMBERG SCALE
HADOOP INFRASTRUCTURE TEAM
https://github.com/bloomberg/chef-bach
Freenode: #chef-bach

2 BLOOMBERG CLUSTERS
APPLICATION SPECIFIC
  Hadoop, Kafka
ENVIRONMENT SPECIFIC
  Networking, Storage
BUILT REGULARLY
DEDICATED BOOTSTRAP SERVER
  Virtual Machine
DEDICATED CHEF-SERVER

3 WHY A VM?
LIGHTWEIGHT PRE-REQUISITE
  Low memory/storage requirements
RAPID DEPLOYMENT
  Vagrant for bring-up
  Vagrant for re-configuration
EASY RELEASE MANAGEMENT
MULTIPLE VMS PER HYPERVISOR
  Multiple clusters
EASY RELOCATION

4 SERVICES OFFERED
REPOSITORIES
  APT
  Ruby Gems
  Static Files (Chef!)
CHEF SERVER
KERBEROS KDC
PXE SERVER
  DHCP/TFTP Server
  Cobbler
  Bridged Networking (for test VMs)
STRONG ISOLATION

5 BUILDING BOOTSTRAP
CHEF AND VAGRANT
  Generic Image (Jenkins)
NETWORK CONFIGURATION
CORRECTING KNIFE.RB
CHEF SERVER RECONFIGURATION
CLEAN UP (CHEF REST API)
CONVERT BOOTSTRAP TO BE AN ADMIN CLIENT
  Secrets/Keys

6 BUILDING BOOTSTRAP
CHEF-SOLO PROVISIONER

  # Chef provisioning
  bootstrap.vm.provision "chef_solo" do |chef|
    chef.environments_path = [[:vm, ""]]
    chef.environment = env_name
    chef.cookbooks_path = [[:vm, ""]]
    chef.roles_path = [[:vm, ""]]
    chef.add_recipe("bcpc::bootstrap_network")
    chef.log_level = "debug"
    chef.verbose_logging = true
    chef.provisioning_path = "/home/vagrant/chef-bcpc/"
  end

CHEF SERVER RECONFIGURATION
  NGINX, SOLR, RABBITMQ

  # Reconfigure chef-server
  bootstrap.vm.provision :shell, :inline => "chef-server-ctl reconfigure"

7 BUILDING BOOTSTRAP
CLEAN UP (REST API)

  ruby_block "cleanup-old-environment-databag" do
    block do
      rest = Chef::REST.new(node[:chef_client][:server_url], "admin",
                            "/etc/chef-server/admin.pem")
      rest.delete("/environments/generic")
      rest.delete("/data/configs/generic")
    end
  end

  ruby_block "cleanup-old-clients" do
    block do
      # Clients that must survive the clean-up
      system_clients = ["chef-validator", "chef-webui"]
      rest = Chef::REST.new(node[:chef_client][:server_url], "admin",
                            "/etc/chef-server/admin.pem")
      rest.get_rest("/clients").each do |client|
        if !system_clients.include?(client.first)
          rest.delete("/clients/#{client.first}")
        end
      end
    end
  end
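The client clean-up reduces to a filter over the client list returned by the server. The following is a minimal sketch in plain Ruby, assuming the `/clients` response is a hash of client name to URL (which is why the slide code reads `client.first`). The helper `clients_to_delete`, the constant `SYSTEM_CLIENTS`, and the sample data are hypothetical, and no Chef server is contacted.

```ruby
# Hypothetical helper: decide which API clients a clean-up pass would delete.
SYSTEM_CLIENTS = ["chef-validator", "chef-webui"].freeze

def clients_to_delete(clients, keep = SYSTEM_CLIENTS)
  # Iterating a hash yields [name, url] pairs; only the name matters here.
  clients.keys.reject { |name| keep.include?(name) }
end

# Sample data standing in for the response of GET /clients (assumed shape).
sample = {
  "chef-validator" => "https://chef.example/clients/chef-validator",
  "chef-webui"     => "https://chef.example/clients/chef-webui",
  "old-node-1"     => "https://chef.example/clients/old-node-1"
}

clients_to_delete(sample).each do |name|
  puts "would DELETE /clients/#{name}"
end
```

Keeping the decision in a pure function makes it possible to check the keep-list before any destructive REST call is issued.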

8 BUILDING BOOTSTRAP
CONVERT TO ADMIN (BOOTSTRAP_CONFIG.RB)

  ruby_block "convert-bootstrap-to-admin" do
    block do
      rest = Chef::REST.new(node[:chef_client][:server_url], "admin",
                            "/etc/chef-server/admin.pem")
      rest.put_rest("/clients/#{node[:hostname]}", { :admin => true })
      rest.put_rest("/nodes/#{node[:hostname]}",
                    { :name => node[:hostname],
                      :run_list => ['role[bcpc-bootstrap]'] })
    end
  end

9 CLUSTER USABILITY
CODE DEPLOYMENT
APPLICATION COOKBOOKS
RUBY GEMS
  Zookeeper, WebHDFS
CLUSTERS ARE NOT SINGLE MACHINE
  Which machine to deploy
  Idempotency; Races

10 DEPLOY TO HDFS
USE CHEF DIRECTORY RESOURCE
USE CUSTOM PROVIDER

  directory "/projects/myapp" do
    mode "0755"
    owner "foo"
    recursive true
    provider BCPC::HdfsDirectory
  end

11 DEPLOY KAFKA TOPIC
USE LWRP
  Dynamic Topic; Right Zookeeper
PROVIDER CODE AVAILABLE AT

  # Kafka Topic Resource
  actions :create, :update
  attribute :name, :kind_of => String, :name_attribute => true
  attribute :partitions, :kind_of => Integer, :default => 1
  attribute :replication, :kind_of => Integer, :default => 1

12 KERBEROS
KEYTABS
  Per Service / Host
  Up to 10 Keytabs per Host
WHAT ABOUT MULTI HOMED HOSTS?
  Hadoop imputes _HOST
PROVIDERS
  WebHDFS uses SPNEGO
SYSTEM ROLE ACCOUNTS
TENANT ROLE ACCOUNTS
AVAILABLE AT

13 LOGIC INJECTION
Statutory Warning
  Code snippets are edited to fit the slides, which may have resulted in logic incoherence, bugs and unreadability. Reader's discretion requested.
COMPLETE CODE CAN BE FOUND AT
  Community cookbook
  Wrapper custom recipe

14 LOGIC INJECTION
WE USE COMMUNITY COOKBOOKS
  Takes care of standard install, enable and starting of services
NEED TO ADD LOGIC TO COOKBOOK RECIPES
  Take action on a service only when conditions are satisfied
  Take action on a service based on dependent service state

15 LOGIC INJECTION
VANILLA COMMUNITY COOKBOOK:

  template ::File.join(node.kafka.config_dir, 'server.properties') do
    source 'server.properties.erb'
    ...
    helpers(Kafka::Configuration)
    if restart_on_configuration_change?
      notifies :restart, 'service[kafka]', :delayed
    end
  end

  service 'kafka' do
    provider kafka_init_opts[:provider]
    supports start: true, stop: true, restart: true, status: true
    action kafka_service_actions
  end

16 LOGIC INJECTION
VANILLA COMMUNITY COOKBOOK:

  template ::File.join(node.kafka.config_dir, 'server.properties') do
    source 'server.properties.erb'
    ...
    helpers(Kafka::Configuration)
    if restart_on_configuration_change?
      notifies :restart, 'service[kafka]', :delayed
    end
  end

  #----- Remove -----#
  service 'kafka' do
    provider kafka_init_opts[:provider]
    supports start: true, stop: true, restart: true, status: true
    action kafka_service_actions
  end
  #----- Remove -----#

17 LOGIC INJECTION
VANILLA COMMUNITY COOKBOOK 2.0:

  template ::File.join(node.kafka.config_dir, 'server.properties') do
    source 'server.properties.erb'
    ...
    helpers(Kafka::Configuration)
    if restart_on_configuration_change?
      notifies :create, 'ruby_block[pre-shim]', :immediately
    end
  end

  #----- Replace -----#
  include_recipe node["kafka"]["start_coordination"]["recipe"]
  #----- Replace -----#

18 LOGIC INJECTION
COOKBOOK COORDINATOR RECIPE:

  ruby_block 'pre-shim' do
    # pre-restart no-op
    notifies :restart, 'service[kafka]', :delayed
  end

  service 'kafka' do
    provider kafka_init_opts[:provider]
    supports start: true, stop: true, restart: true, status: true
    action kafka_service_actions
  end

19 LOGIC INJECTION
WRAPPER COORDINATOR RECIPE:

  ruby_block 'pre-shim' do
    # pre-restart done here
    notifies :restart, 'service[kafka]', :delayed
  end

  service 'kafka' do
    provider kafka_init_opts[:provider]
    supports start: true, stop: true, restart: true, status: true
    action kafka_service_actions
    notifies :create, 'ruby_block[post-shim]', :immediately
  end

  ruby_block 'post-shim' do
    # clean-up done here
  end

20 SERVICE ON DEMAND
COMMON SERVICE WHICH CAN BE REQUESTED
  Copy log files from applications into a centralized location
  Single location for users to review logs and helps with security
  Service available on all the nodes
  Applications can request the service dynamically

21 SERVICE ON DEMAND
NODE ATTRIBUTE TO STORE SERVICE REQUESTS

  default['bcpc']['hadoop']['copylog'] = {}

DATA STRUCTURE TO MAKE SERVICE REQUESTS

  {
    'app_id' => {
      'logfile' => "/path/file_name_of_log_file",
      'docopy' => true (or false)
    },
    ...
  }

22 SERVICE ON DEMAND
APPLICATION RECIPES MAKE SERVICE REQUESTS

  #
  # Updating node attributes to copy HBase master log file to HDFS
  #
  node.default['bcpc']['hadoop']['copylog']['hbase_master'] = {
    'logfile' => "/var/log/hbase/hbase-master-#{node.hostname}.log",
    'docopy' => true
  }
  node.default['bcpc']['hadoop']['copylog']['hbase_master_out'] = {
    'logfile' => "/var/log/hbase/hbase-master-#{node.hostname}.out",
    'docopy' => true
  }

23 SERVICE ON DEMAND
RECIPE FOR THE COMMON SERVICE

  node['bcpc']['hadoop']['copylog'].each do |id, f|
    if f['docopy']
      template "/etc/flume/conf/flume-#{id}.conf" do
        source "flume_flume-conf.erb"
        action :create
        ...
        variables(:agent_name => "#{id}",
                  :log_location => "#{f['logfile']}")
        notifies :restart, "service[flume-agent-multi-#{id}]", :delayed
      end

      service "flume-agent-multi-#{id}" do
        supports :status => true, :restart => true, :reload => false
        service_name "flume-agent-multi"
        action :start
        start_command "service flume-agent-multi start #{id}"
        restart_command "service flume-agent-multi restart #{id}"
        status_command "service flume-agent-multi status #{id}"
      end
    end
  end
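The request/consume split can be tried outside Chef. This is a minimal sketch, assuming a plain hash stands in for `node['bcpc']['hadoop']['copylog']`. The function `copylog_plan`, the host name `host1`, and the `old_app` entry are hypothetical. It only computes which Flume config files and service instances the common recipe would manage and does not touch the system.

```ruby
# Hypothetical stand-in for node['bcpc']['hadoop']['copylog'].
copylog = {
  'hbase_master'     => { 'logfile' => '/var/log/hbase/hbase-master-host1.log', 'docopy' => true },
  'hbase_master_out' => { 'logfile' => '/var/log/hbase/hbase-master-host1.out', 'docopy' => true },
  'old_app'          => { 'logfile' => '/var/log/old/app.log', 'docopy' => false }
}

# Turn the request table into one plan entry per enabled request.
def copylog_plan(requests)
  requests.select { |_id, req| req['docopy'] }.map do |id, req|
    {
      :conf    => "/etc/flume/conf/flume-#{id}.conf",
      :service => "flume-agent-multi-#{id}",
      :start   => "service flume-agent-multi start #{id}",
      :source  => req['logfile']
    }
  end
end

copylog_plan(copylog).each do |entry|
  puts "#{entry[:service]}: #{entry[:source]} -> #{entry[:conf]}"
end
```

Requests with `'docopy' => false` stay in the attribute but produce no resources, so an application can switch the service off without removing its entry.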

24 PLUGGABLE ALERTS
SINGLE SOURCE FOR MONITORED STATS
  Allows users to visualize stats across different parameters
  Didn't want to duplicate the stats collection in the alerting system
  Need to feed data to the alerting system to generate alerts

25 PLUGGABLE ALERTS
ATTRIBUTE WHERE USERS CAN DEFINE ALERTS

  default["bcpc"]["hadoop"]["graphite"]["queries"] = {
    'hbase_master' => [
      {
        # Query to pull stats from data source
        'type' => "jmx",
        'query' => "memory.nonheapmemoryusage_committed",
        'key' => "hbasenonheapmem",
        # Define alert criteria
        'trigger_val' => "max(61,0)",
        'trigger_cond' => "=0",
        'trigger_name' => "HBaseMasterAvailability",
        'trigger_dep' => ["NameNodeAvailability"],
        'trigger_desc' => "HBase master seems to be down",
        'severity' => 1
      },
      {
        'type' => "jmx",
        'query' => "memory.heapmemoryusage_committed",
        'key' => "hbaseheapmem",
        ...
      },
      ...
    ],
    'namenode' => [...],
    ...
  }
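Because triggers name other triggers in `trigger_dep`, a quick consistency check over the attribute can catch dangling references before they reach the alerting system. This is a rough sketch under assumptions: the helpers `triggers` and `missing_dependencies` are hypothetical, and the `namenode` entry with its `nnheapmem` key is invented for illustration because the slide elides it.

```ruby
# Trimmed, partly hypothetical copy of the queries attribute.
queries = {
  'hbase_master' => [
    { 'key' => 'hbasenonheapmem', 'trigger_name' => 'HBaseMasterAvailability',
      'trigger_dep' => ['NameNodeAvailability'], 'severity' => 1 },
    { 'key' => 'hbaseheapmem' } # stats only, no alert defined
  ],
  'namenode' => [
    { 'key' => 'nnheapmem', 'trigger_name' => 'NameNodeAvailability', 'severity' => 1 }
  ]
}

# Flatten the per-service lists into one list of alert definitions.
def triggers(queries)
  queries.flat_map do |service, entries|
    entries.select { |e| e['trigger_name'] }.map do |e|
      { :service => service, :name => e['trigger_name'], :deps => e.fetch('trigger_dep', []) }
    end
  end
end

# Dependencies that refer to a trigger nobody defines.
def missing_dependencies(queries)
  all = triggers(queries)
  all.flat_map { |t| t[:deps] }.uniq - all.map { |t| t[:name] }
end

puts triggers(queries).map { |t| t[:name] }.inspect
puts missing_dependencies(queries).inspect
```

Entries without a `trigger_name` are treated as stats-only queries, which matches the idea of a single source feeding both graphs and alerts.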

26 TEMPLATE PITFALLS
LIBRARY FUNCTION CALLS IN WRAPPER COOKBOOKS
  Community cookbook provider accepts template as an attribute
  Template passed from wrapper makes a library function call
  Wrapper recipe includes the module of library function

27 TEMPLATE PITFALLS
WRAPPER RECIPE

  ...
  Chef::Resource.send(:include, Bcpc::OSHelper)
  ...
  cobbler_profile "bcpc_host" do
    kickstart "cobbler.bcpc_ubuntu_host.preseed"
    distro "ubuntu mini-x86_

FUNCTION CALL IN TEMPLATE

  ...
  d-i passwd/user-password-crypted password 'cobbler-root-password-salted')}"%>
  d-i passwd/user-uid string
  ...

28 TEMPLATE PITFALLS
MODIFIED FUNCTION CALL IN TEMPLATE

  ...
  d-i passwd/user-password-crypted password 'cobbler-root-password-salted')}"%>
  d-i passwd/user-uid string
  ...

29 DYNAMIC RESOURCES
ANTI-PATTERN?

  ruby_block "create namenode directories" do
    block do
      node[:bcpc][:storage][:mounts].each do |d|
        # Build and converge a directory resource at converge time
        dir = Chef::Resource::Directory.new("#{mount_root}/#{d}/dfs/nn", run_context)
        dir.owner "hdfs"
        dir.group "hdfs"
        dir.mode 0755
        dir.recursive true
        dir.run_action :create

        # Fix ownership only when the tree is not already owned by hdfs
        exe = Chef::Resource::Execute.new("fixup nn owner", run_context)
        exe.command "chown -Rf hdfs:hdfs #{mount_root}/#{d}/dfs"
        exe.only_if { Etc.getpwuid(File.stat("#{mount_root}/#{d}/dfs/").uid).name != "hdfs" }
        exe.run_action :run
      end
    end
  end

30 DYNAMIC RESOURCES
SYSTEM CONFIGURATION
  Lengthy Configuration of a Storage Controller
  Setting Attributes at Converge Time
  Compile Time Actions?
MUST WRAP IN RUBY_BLOCK'S
  Does not Update the Resource Collection
  Lazy's everywhere:
  Guards: not_if{lazy{node[ ]}.call.map{ }}
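The compile-time versus converge-time distinction behind these notes can be illustrated in plain Ruby. This is only an analogy under assumptions: the hash `attributes` stands in for node attributes, the mount names are made up, and a bare `lambda` plays the role that Chef's `lazy` block plays for resource properties.

```ruby
# Stand-in for node attributes that are only discovered during the converge phase.
attributes = { 'mounts' => [] }

# Compile time: the value is read immediately, so it is frozen at "empty".
eager = attributes['mounts'].dup

# Deferred: the lookup is wrapped and only runs when called later.
deferred = lambda { attributes['mounts'] }

# Converge time: an earlier step (e.g. storage controller setup) fills in the mounts.
attributes['mounts'] = ['disk1', 'disk2']

puts "eager:    #{eager.inspect}"
puts "deferred: #{deferred.call.inspect}"
```

Any value that depends on work done earlier in the same run has to be read through the deferred form, which is why the guards on the slide end up wrapping attribute lookups.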

31 SERVICE RESTART
WE USE JMXTRANS TO MONITOR JMX STATS
  Service to be monitored varies with node
  There can be more than one service to be monitored
  Monitored service restart requires JMXtrans to be restarted**

32 SERVICE RESTART
DATA STRUCTURE IN ROLES TO DEFINE THE SERVICES

  "default_attributes": {
    "jmxtrans": {
      "servers": [
        {
          "type": "datanode",
          "service": "hadoop-hdfs-datanode",
          "service_cmd": "org.apache.hadoop.hdfs.server.datanode.DataNode"
        },
        {
          "type": "hbase_rs",
          "service": "hbase-regionserver",
          "service_cmd": "org.apache.hadoop.hbase.regionserver.HRegionServer"
        }
      ]
    }
    ...

  "service": dependent service name
  "service_cmd": string to uniquely identify the service process

33 SERVICE RESTART
JMXTRANS SERVICE RESTART LOGIC BUILT DYNAMICALLY

  # Store the dependent service names and process identifiers in local variables
  jmx_services = Array.new
  jmx_srvc_cmds = Hash.new
  node['jmxtrans']['servers'].each do |server|
    jmx_services.push(server['service'])
    jmx_srvc_cmds[server['service']] = server['service_cmd']
  end

  service "restart jmxtrans on dependent service" do
    service_name "jmxtrans"
    supports :restart => true, :status => true, :reload => true
    action :restart
    # Subscribes from all dependent services
    jmx_services.each do |jmx_dep_service|
      subscribes :restart, "service[#{jmx_dep_service}]", :delayed
    end
    # What if a process is re/started externally?
    only_if { process_require_restart?("jmxtrans", "jmxtrans-all.jar", jmx_srvc_cmds) }
  end

34 SERVICE RESTART

  def process_require_restart?(process_name, process_cmd, dep_cmds)
    # Start time of the service process
    tgt_process_pid = `pgrep -f #{process_cmd}`
    ...
    tgt_process_stime = `ps --no-header -o start_time #{tgt_process_pid}`
    ...
    ret = false
    restarted_processes = Array.new
    # Start time of all the service processes on which it depends
    dep_cmds.each do |dep_process, dep_cmd|
      dep_pids = `pgrep -f #{dep_cmd}`
      if dep_pids != ""
        dep_pids_arr = dep_pids.split("\n")
        dep_pids_arr.each do |dep_pid|
          dep_process_stime = `ps --no-header -o start_time #{dep_pid}`
          # Compare the start times
          if DateTime.parse(tgt_process_stime) < DateTime.parse(dep_process_stime)
            restarted_processes.push(dep_process)
            ret = true
          end
        end
      end
    end
    ...
  end
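The start-time comparison can be separated from the process lookup so that it is testable on its own. The following is a minimal sketch, assuming the start times have already been collected as `Time` objects. How they are obtained is out of scope here, and the function names `stale_dependencies` and `restart_required?` and the sample timestamps are hypothetical.

```ruby
# Hypothetical pure function: names of dependencies with a process newer than the target.
# dep_start_times maps a service name to an array of Time objects, one per process.
def stale_dependencies(target_started_at, dep_start_times)
  newer = dep_start_times.select do |_name, times|
    times.any? { |t| t > target_started_at }
  end
  newer.keys
end

# The monitoring process needs a restart if any dependency started after it did.
def restart_required?(target_started_at, dep_start_times)
  !stale_dependencies(target_started_at, dep_start_times).empty?
end

jmxtrans_start = Time.at(1_000)
deps = {
  "hadoop-hdfs-datanode" => [Time.at(900)],
  "hbase-regionserver"   => [Time.at(950), Time.at(1_200)]
}

puts stale_dependencies(jmxtrans_start, deps).inspect
puts restart_required?(jmxtrans_start, deps)
```

Working with full timestamps rather than the formatted `start_time` column avoids ambiguity when a process has been running for more than a day.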

35 ROLLING RESTART
AUTOMATIC CONVERGENCE
AVAILABILITY
HOW
  High Availability
  Toxic Configuration
  Check Masters for Slave Status
  Synchronous Communication
  Locking

36 ROLLING RESTART
FLAGGING
  Negative Flagging: flag when a service is down
  Positive Flagging: flag when a service is reconfiguring
  Deadlock Avoidance
CONTENTION
  Poll & Wait
  Fail the Run
  Simply Skip Service Restart and Go On
    Store the Need for Restart
    Breaks Assumptions of Procedural Chef Runs
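The contention options can be made concrete with a small lock-guarded restart. This is a sketch under stated assumptions, not the deck's implementation: `LockStore` is a hypothetical in-memory stand-in for a shared state store and is not safe across machines, and `restart_with_lock`, its parameters, and the node names are invented for illustration. It polls a few times and then skips, leaving the caller free to fail the run or record that a restart is still needed.

```ruby
# In-memory stand-in for a shared lock store (hypothetical, single process only).
class LockStore
  def initialize
    @locks = {}
  end

  # Take the lock if it is free or already ours; return whether we hold it.
  def acquire(key, owner)
    @locks[key] ||= owner
    @locks[key] == owner
  end

  # Release only a lock we actually hold.
  def release(key, owner)
    @locks.delete(key) if @locks[key] == owner
  end
end

# Poll & wait: retry a few times, then give up and report :skipped.
def restart_with_lock(store, service, owner, attempts: 3, delay: 0)
  attempts.times do
    if store.acquire(service, owner)
      begin
        yield
      ensure
        store.release(service, owner)
      end
      return :restarted
    end
    sleep delay
  end
  :skipped
end

store = LockStore.new
store.acquire("kafka", "node-b") # another node is mid-restart
first = restart_with_lock(store, "kafka", "node-a") { puts "restarting kafka" }
store.release("kafka", "node-b")
second = restart_with_lock(store, "kafka", "node-a") { puts "restarting kafka" }
puts "first=#{first} second=#{second}"
```

Returning a status instead of raising keeps the choice between failing the run and skipping the restart with the caller.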

37 ROLLING RESTART
SERVICE DEFINITION

  hadoop_service "zookeeper-server" do
    dependencies ["template[/etc/zookeeper/conf/zoo.cfg]",
                  "template[/usr/lib/zookeeper/bin/zkServer.sh]",
                  "template[/etc/default/zookeeper-server]"]
    process_identifier "org.apache.zookeeper... QuorumPeerMain"
  end

38 ROLLING RESTART
SYNCH STATE STORE
  Zookeeper
SERVICE RESTART (KAFKA)
VALIDATION CHECK
  Based on Jenkins pattern for wait_until_ready!
  Verifies that the service is up to an acceptable level
  Passes or stops the Chef run
FUTURE DIRECTIONS
  Topology Aware Deployment
  Data Aware Deployment
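A validation check of this kind is essentially a bounded polling loop. The sketch below is an assumption about its general shape, not the code used in the deck or in the Jenkins cookbook: the signature of `wait_until_ready!`, its `timeout` and `interval` parameters, and the counter-based health check are all hypothetical.

```ruby
# Hypothetical validation helper: poll a health check until it passes or time runs out.
# Raising on timeout is what would stop the surrounding run.
def wait_until_ready!(name, timeout: 30, interval: 1)
  deadline = Time.now + timeout
  loop do
    return true if yield
    raise "#{name} not ready after #{timeout}s" if Time.now >= deadline
    sleep interval
  end
end

# Example check that succeeds on the third poll.
polls = 0
wait_until_ready!("kafka", timeout: 5, interval: 0) do
  polls += 1
  polls >= 3
end
puts "ready after #{polls} polls"
```

The block decides what "an acceptable level" means, so the same helper can wrap a port probe, a broker count, or a replication check.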

39 WE ARE HIRING
JOBS.BLOOMBERG.COM:
  Hadoop Infrastructure Engineer
  DevOps Engineer Search Infrastructure
Freenode: #chef-bach


More information

CORD Monitoring Service

CORD Monitoring Service CORD Design Notes CORD Monitoring Service Srikanth Vavilapalli, Ericsson Larry Peterson, Open Networking Lab November 17, 2015 Introduction The XOS Monitoring service provides a generic platform to support

More information

Release Notes for Fuel and Fuel Web Version 3.0.1

Release Notes for Fuel and Fuel Web Version 3.0.1 Release Notes for Fuel and Fuel Web Version 3.0.1 June 21, 2013 1 Mirantis, Inc. is releasing version 3.0.1 of the Fuel Library and Fuel Web products. This is a cumulative maintenance release to the previously

More information

Deploying and Managing SolrCloud in the Cloud ApacheCon, April 8, 2014 Timothy Potter. Search Discover Analyze

Deploying and Managing SolrCloud in the Cloud ApacheCon, April 8, 2014 Timothy Potter. Search Discover Analyze Deploying and Managing SolrCloud in the Cloud ApacheCon, April 8, 2014 Timothy Potter Search Discover Analyze My SolrCloud Experience Currently, working on scaling up to a 200+ node deployment at LucidWorks

More information

000-420. IBM InfoSphere MDM Server v9.0. Version: Demo. Page <<1/11>>

000-420. IBM InfoSphere MDM Server v9.0. Version: Demo. Page <<1/11>> 000-420 IBM InfoSphere MDM Server v9.0 Version: Demo Page 1. As part of a maintenance team for an InfoSphere MDM Server implementation, you are investigating the "EndDate must be after StartDate"

More information

Hadoop Setup. 1 Cluster

Hadoop Setup. 1 Cluster In order to use HadoopUnit (described in Sect. 3.3.3), a Hadoop cluster needs to be setup. This cluster can be setup manually with physical machines in a local environment, or in the cloud. Creating a

More information

Insights to Hadoop Security Threats

Insights to Hadoop Security Threats Insights to Hadoop Security Threats Presenter: Anwesha Das Peipei Wang Outline Attacks DOS attack - Rate Limiting Impersonation Implementation Sandbox HDP version 2.1 Cluster Set-up Kerberos Security Setup

More information

Introduction to HDFS. Prasanth Kothuri, CERN

Introduction to HDFS. Prasanth Kothuri, CERN Prasanth Kothuri, CERN 2 What s HDFS HDFS is a distributed file system that is fault tolerant, scalable and extremely easy to expand. HDFS is the primary distributed storage for Hadoop applications. HDFS

More information

SUSE Cloud Installation: Best Practices Using an Existing SMT and KVM Environment

SUSE Cloud Installation: Best Practices Using an Existing SMT and KVM Environment Best Practices Guide www.suse.com SUSE Cloud Installation: Best Practices Using an Existing SMT and KVM Environment Written by B1 Systems GmbH Table of Contents Introduction...3 Use Case Overview...3 Hardware

More information

Cisco UCS CPA Workflows

Cisco UCS CPA Workflows This chapter contains the following sections: Workflows for Big Data, page 1 About Service Requests for Big Data, page 2 Workflows for Big Data Cisco UCS Director Express for Big Data defines a set of

More information

This How To guide will take you through configuring Network Load Balancing and deploying MOSS 2007 in SharePoint Farm.

This How To guide will take you through configuring Network Load Balancing and deploying MOSS 2007 in SharePoint Farm. Quick Brief This How To guide will take you through configuring Network Load Balancing and deploying MOSS 2007 in SharePoint Farm. This document will serve as prerequisite for Enterprise Portal deployment

More information

Exam Name: IBM InfoSphere MDM Server v9.0

Exam Name: IBM InfoSphere MDM Server v9.0 Vendor: IBM Exam Code: 000-420 Exam Name: IBM InfoSphere MDM Server v9.0 Version: DEMO 1. As part of a maintenance team for an InfoSphere MDM Server implementation, you are investigating the "EndDate must

More information

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Apache Sentry Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Agenda Various aspects of data security Apache Sentry for authorization Key concepts of Apache Sentry Sentry features Sentry architecture

More information

The Top 10 7 Hadoop Patterns and Anti-patterns. Alex Holmes @

The Top 10 7 Hadoop Patterns and Anti-patterns. Alex Holmes @ The Top 10 7 Hadoop Patterns and Anti-patterns Alex Holmes @ whoami Alex Holmes Software engineer Working on distributed systems for many years Hadoop since 2008 @grep_alex grepalex.com what s hadoop...

More information

User and Group-Based Reporting in TRITON - Web Security: Best Practices and Troubleshooting

User and Group-Based Reporting in TRITON - Web Security: Best Practices and Troubleshooting User and Group-Based Reporting in TRITON - Web Security: Best Practices and Troubleshooting Websense Support Webinar March 2012 web security data security email security Support Webinars 2012 Websense,

More information

HADOOP MOCK TEST HADOOP MOCK TEST I

HADOOP MOCK TEST HADOOP MOCK TEST I http://www.tutorialspoint.com HADOOP MOCK TEST Copyright tutorialspoint.com This section presents you various set of Mock Tests related to Hadoop Framework. You can download these sample mock tests at

More information

DEPLOYING EMC DOCUMENTUM BUSINESS ACTIVITY MONITOR SERVER ON IBM WEBSPHERE APPLICATION SERVER CLUSTER

DEPLOYING EMC DOCUMENTUM BUSINESS ACTIVITY MONITOR SERVER ON IBM WEBSPHERE APPLICATION SERVER CLUSTER White Paper DEPLOYING EMC DOCUMENTUM BUSINESS ACTIVITY MONITOR SERVER ON IBM WEBSPHERE APPLICATION SERVER CLUSTER Abstract This white paper describes the process of deploying EMC Documentum Business Activity

More information

Cloudera Backup and Disaster Recovery

Cloudera Backup and Disaster Recovery Cloudera Backup and Disaster Recovery Important Notice (c) 2010-2013 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service names or slogans

More information

1. GridGain In-Memory Accelerator For Hadoop. 2. Hadoop Installation. 2.1 Hadoop 1.x Installation

1. GridGain In-Memory Accelerator For Hadoop. 2. Hadoop Installation. 2.1 Hadoop 1.x Installation 1. GridGain In-Memory Accelerator For Hadoop GridGain's In-Memory Accelerator For Hadoop edition is based on the industry's first high-performance dual-mode in-memory file system that is 100% compatible

More information

Advantages and Disadvantages of Application Network Marketing Systems

Advantages and Disadvantages of Application Network Marketing Systems Application Deployment Softwaretechnik II 2014/15 Thomas Kowark Outline Options for Application Hosting Automating Environment Setup Deployment Scripting Application Monitoring Continuous Deployment and

More information

SUSE Cloud 2.0. Pete Chadwick. Douglas Jarvis. Senior Product Manager pchadwick@suse.com. Product Marketing Manager djarvis@suse.

SUSE Cloud 2.0. Pete Chadwick. Douglas Jarvis. Senior Product Manager pchadwick@suse.com. Product Marketing Manager djarvis@suse. SUSE Cloud 2.0 Pete Chadwick Douglas Jarvis Senior Product Manager pchadwick@suse.com Product Marketing Manager djarvis@suse.com SUSE Cloud SUSE Cloud is an open source software solution based on OpenStack

More information

The future of middleware: enterprise application integration and Fuse

The future of middleware: enterprise application integration and Fuse The future of middleware: enterprise application integration and Fuse Giuseppe Brindisi EMEA Solution Architect/Red Hat AGENDA Agenda Build an enterprise application integration platform that is: Resilient

More information

TIBCO Spotfire Statistics Services Installation and Administration Guide. Software Release 5.0 November 2012

TIBCO Spotfire Statistics Services Installation and Administration Guide. Software Release 5.0 November 2012 TIBCO Spotfire Statistics Services Installation and Administration Guide Software Release 5.0 November 2012 Important Information SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH

More information

Deploy Big Data Extensions on vsphere Standard Edition

Deploy Big Data Extensions on vsphere Standard Edition Deploy Big Data Extensions on vsphere Standard Edition You can deploy Big Data Extensions 2.1.1 Fling on VMware vsphere Standard Edition for the purpose of experimentation and proof-of-concept projects

More information

docs.hortonworks.com

docs.hortonworks.com docs.hortonworks.com : Security Administration Tools Guide Copyright 2012-2014 Hortonworks, Inc. Some rights reserved. The, powered by Apache Hadoop, is a massively scalable and 100% open source platform

More information

Understanding MySQL storage and clustering in QueueMetrics. Loway

Understanding MySQL storage and clustering in QueueMetrics. Loway Understanding MySQL storage and clustering in QueueMetrics Loway Understanding MySQL storage and clustering in QueueMetrics Loway Table of Contents 1. Understanding MySQL storage and clustering... 1 2.

More information

Hadoop Basics with InfoSphere BigInsights

Hadoop Basics with InfoSphere BigInsights An IBM Proof of Technology Hadoop Basics with InfoSphere BigInsights Unit 4: Hadoop Administration An IBM Proof of Technology Catalog Number Copyright IBM Corporation, 2013 US Government Users Restricted

More information

Real-time Streaming Analysis for Hadoop and Flume. Aaron Kimball odiago, inc. OSCON Data 2011

Real-time Streaming Analysis for Hadoop and Flume. Aaron Kimball odiago, inc. OSCON Data 2011 Real-time Streaming Analysis for Hadoop and Flume Aaron Kimball odiago, inc. OSCON Data 2011 The plan Background: Flume introduction The need for online analytics Introducing FlumeBase Demo! FlumeBase

More information

Unicenter NSM Integration for Remedy (v 1.0.5)

Unicenter NSM Integration for Remedy (v 1.0.5) Unicenter NSM Integration for Remedy (v 1.0.5) The Unicenter NSM Integration for Remedy package brings together two powerful technologies to enable better tracking, faster diagnosis and reduced mean-time-to-repair

More information

Cloudera Manager Health Checks

Cloudera Manager Health Checks Cloudera, Inc. 220 Portage Avenue Palo Alto, CA 94306 info@cloudera.com US: 1-888-789-1488 Intl: 1-650-362-0488 www.cloudera.com Cloudera Manager Health Checks Important Notice 2010-2013 Cloudera, Inc.

More information

IceWarp to IceWarp Server Migration

IceWarp to IceWarp Server Migration IceWarp to IceWarp Server Migration Registered Trademarks iphone, ipad, Mac, OS X are trademarks of Apple Inc., registered in the U.S. and other countries. Microsoft, Windows, Outlook and Windows Phone

More information

Hadoop IST 734 SS CHUNG

Hadoop IST 734 SS CHUNG Hadoop IST 734 SS CHUNG Introduction What is Big Data?? Bulk Amount Unstructured Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per day) If a regular machine need to

More information

vcenter Operations Manager for Horizon Supplement

vcenter Operations Manager for Horizon Supplement vcenter Operations Manager for Horizon Supplement vcenter Operations Manager for Horizon 1.6 This document supports the version of each product listed and supports all subsequent versions until the document

More information

Deployment Planning Guide

Deployment Planning Guide Deployment Planning Guide Community 1.5.0 release The purpose of this document is to educate the user about the different strategies that can be adopted to optimize the usage of Jumbune on Hadoop and also

More information

STREAM ANALYTIX. Industry s only Multi-Engine Streaming Analytics Platform

STREAM ANALYTIX. Industry s only Multi-Engine Streaming Analytics Platform STREAM ANALYTIX Industry s only Multi-Engine Streaming Analytics Platform One Platform for All Create real-time streaming data analytics applications in minutes with a powerful visual editor Get a wide

More information

Table Of Contents. 1. GridGain In-Memory Database

Table Of Contents. 1. GridGain In-Memory Database Table Of Contents 1. GridGain In-Memory Database 2. GridGain Installation 2.1 Check GridGain Installation 2.2 Running GridGain Examples 2.3 Configure GridGain Node Discovery 3. Starting Grid Nodes 4. Management

More information

The Big Data Ecosystem at LinkedIn. Presented by Zhongfang Zhuang

The Big Data Ecosystem at LinkedIn. Presented by Zhongfang Zhuang The Big Data Ecosystem at LinkedIn Presented by Zhongfang Zhuang Based on the paper The Big Data Ecosystem at LinkedIn, written by Roshan Sumbaly, Jay Kreps, and Sam Shah. The Ecosystems Hadoop Ecosystem

More information

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data

More information

Chase Wu New Jersey Ins0tute of Technology

Chase Wu New Jersey Ins0tute of Technology CS 698: Special Topics in Big Data Chapter 4. Big Data Analytics Platforms Chase Wu New Jersey Ins0tute of Technology Some of the slides have been provided through the courtesy of Dr. Ching-Yung Lin at

More information