Complete Java Classes Hadoop Syllabus Contact No: 8888022204



Similar documents
HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM

Qsoft Inc

Peers Techno log ies Pv t. L td. HADOOP

BIG DATA HADOOP TRAINING

COURSE CONTENT Big Data and Hadoop Training

Workshop on Hadoop with Big Data

Big Data Course Highlights

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview

ITG Software Engineering

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.

Hadoop Job Oriented Training Agenda

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February ISSN

BIG DATA - HADOOP PROFESSIONAL amron

Certified Big Data and Apache Hadoop Developer VS-1221

Hadoop: The Definitive Guide

Cloudera Certified Developer for Apache Hadoop

ITG Software Engineering

Big Data and Hadoop. Module 1: Introduction to Big Data and Hadoop. Module 2: Hadoop Distributed File System. Module 3: MapReduce

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Implement Hadoop jobs to extract business value from large and varied data sets

The Hadoop Eco System Shanghai Data Science Meetup

BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION

Hadoop: The Definitive Guide

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM. An Overview

t] open source Hadoop Beginner's Guide ij$ data avalanche Garry Turkington Learn how to crunch big data to extract meaning from

HADOOP. Revised 10/19/2015

[Type text] Week. National summer training program on. Big Data & Hadoop. Why big data & Hadoop is important?

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Pro Apache Hadoop. Second Edition. Sameer Wadkar. Madhu Siddalingaiah

Hadoop Ecosystem B Y R A H I M A.

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

brief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 PART 2 PART 3 BIG DATA PATTERNS PART 4 BEYOND MAPREDUCE...385

Deploying Hadoop with Manager

Introduction to Big Data Training

Prepared By : Manoj Kumar Joshi & Vikas Sawhney

HADOOP MOCK TEST HADOOP MOCK TEST II

Hadoop Development & BI- 0 to 100

TRAINING PROGRAM ON BIGDATA/HADOOP

Operations and Big Data: Hadoop, Hive and Scribe. Zheng 铮 9 12/7/2011 Velocity China 2011

Session: Big Data get familiar with Hadoop to use your unstructured data Udo Brede Dell Software. 22 nd October :00 Sesión B - DB2 LUW

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

Hadoop implementation of MapReduce computational model. Ján Vaňo

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware

Hadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

HADOOP BIG DATA DEVELOPER TRAINING AGENDA

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Overview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. Big Data Management and Analytics

Hadoop 只 支 援 用 Java 開 發 嘛? Is Hadoop only support Java? 總 不 能 全 部 都 重 新 設 計 吧? 如 何 與 舊 系 統 相 容? Can Hadoop work with existing software?

Hadoop 101. Lars George. NoSQL- Ma4ers, Cologne April 26, 2013

MySQL and Hadoop. Percona Live 2014 Chris Schneider

Open source Google-style large scale data analysis with Hadoop

Spring,2015. Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA

Internals of Hadoop Application Framework and Distributed File System

Building Scalable Big Data Infrastructure Using Open Source Software. Sam William

Pivotal HD Enterprise

Data-Intensive Programming. Timo Aaltonen Department of Pervasive Computing

The Big Data Ecosystem at LinkedIn. Presented by Zhongfang Zhuang

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

CURSO: ADMINISTRADOR PARA APACHE HADOOP

Constructing a Data Lake: Hadoop and Oracle Database United!

Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks

Weekly Report. Hadoop Introduction. submitted By Anurag Sharma. Department of Computer Science and Engineering. Indian Institute of Technology Bombay

Chase Wu New Jersey Ins0tute of Technology

Large scale processing using Hadoop. Ján Vaňo

ISSN: (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies

How to Install and Configure EBF15328 for MapR or with MapReduce v1

Hive Interview Questions

Apache HBase. Crazy dances on the elephant back

Lecture 2 (08/31, 09/02, 09/09): Hadoop. Decisions, Operations & Information Technologies Robert H. Smith School of Business Fall, 2015

Processing of massive data: MapReduce. 2. Hadoop. New Trends In Distributed Systems MSc Software and Systems

Hadoop Certification (Developer, Administrator HBase & Data Science) CCD-410, CCA-410 and CCB-400 and DS-200

How to Hadoop Without the Worry: Protecting Big Data at Scale

Apache Sentry. Prasad Mujumdar

Fundamentals Curriculum HAWQ

Ankush Cluster Manager - Hadoop2 Technology User Guide

Training Catalog. Summer 2015 Training Catalog. Apache Hadoop Training from the Experts. Apache Hadoop Training From the Experts

Apache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc.

Big Data Development CASSANDRA NoSQL Training - Workshop. March 13 to am to 5 pm HOTEL DUBAI GRAND DUBAI

HareDB HBase Client Web Version USER MANUAL HAREDB TEAM

Open source large scale distributed data management with Google s MapReduce and Bigtable

Map Reduce & Hadoop Recommended Text:

Pivotal HD Enterprise 1.0 Stack and Tool Reference Guide. Rev: A03

COSC 6397 Big Data Analytics. 2 nd homework assignment Pig and Hive. Edgar Gabriel Spring 2015

Hadoop and Hive Development at Facebook. Dhruba Borthakur Zheng Shao {dhruba, Presented at Hadoop World, New York October 2, 2009

Facebook s Petabyte Scale Data Warehouse using Hive and Hadoop

Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related

Apache Hadoop: Past, Present, and Future

Big Data Management and Security

Cloudera Manager Training: Hands-On Exercises

HiBench Introduction. Carson Wang Software & Services Group

Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 15

Hadoop and Map-Reduce. Swati Gore

Transcription:

1) Introduction to BigData & Hadoop What is Big Data? Why all industries are talking about Big Data? What are the issues in Big Data? Storage What are the challenges for storing big data? Processing What are the challenges for processing big data? What are the technologies support big data? Hadoop Data Base Traditional NO SQL Hadoop What is Hadoop? History of Hadoop Why Hadoop? Hadoop Use cases Advantages and Disadvantages of Hadoop Importance of Different Ecosystems of Hadoop Importance of Integration with other BigData solutions Big Data Real time Use Cases 2) HDFS (Hadoop Distributed File System) HDFS Architecture Daemons in Hadoop a) Name Node Importance of Name Node What are the roles of Name Node What are the drawbacks in Name Node b) Secondary Name Node Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 1

Importance of Secondary Name Node What are the roles of Secondary Name Node What are the drawbacks in Secondary Name Node c) Data Node Importance of Data Node What are the roles of Data Node What are the drawbacks in Data Node Data Storage in HDFS How blocks are stored in Data Nodes How replication works in Data Nodes How to write the files in HDFS How to read the files in HDFS HDFS Block size Importance of HDFS Block size Why Block size is so large? How it is related to MapReduce split size HDFS Replication factor Importance of HDFS Replication factor in production environment Can we change the replication for a particular file or folders. Can we change the replication for all files or folders. Accessing HDFS CLI(Command Line Interface) using HDFS commands Java Based Approach HDFS Commands Importance of each command How to execute the command HDFS admin related commands explanation Configurations Can we change the existing configurations of HDFS or not? Importance of configurations Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 2

How to overcome the Drawbacks in HDFS Name Node failures Secondary Name Node failures Data Node failures How to configure the Hadoop Cluster How to add the new nodes (Commissioning ) How to remove the existing nodes (De-Commissioning ) How to verify the Dead Nodes How to start the Dead Nodes Hadoop 2.x.x version features Introduction to Namenode fedoration Introduction to Namenode High Availabilty Difference between Hadoop 1.x.x and Hadoop 2.x.x versions 3) MapReduce Architecure a) JobTracker Importance of JobTracker What are the roles of JobTracker What are the drawbacks in JobTracker b) TaskTracker Importance of TaskTracker What are the roles of TaskTracker What are the drawbacks in TaskTracker c) Data Types in Hadoop What are the Data types in Map Reduce Why these are importance in Map Reduce Can we write custom Data Types in MapReduce Input Formats in Map Reduce Text Input Format Key Value Text Input Format Sequence File Input Format Nline Input Format Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 3

Importance of Input Format in Map Reduce How to use Input Format in Map Reduce How to write custom Input Formats and its Record Readers Output Formats in Map Reduce Text Output Format Sequence File Output Format Importance of Output Format in Map Reduce How to use Output Format in Map Reduce How to write custom Output Format's and its Record Writers Mapper What is mapper in Map Reduce Job Why we need mapper? What are the Advantages and Disadvantages of mapper Writing mapper programs Reducer What is reducer in Map Reduce Job Why we need reducer? What are the Advantages and Disadvantages of reducer Writing reducer programs Driver What is Driver in Map Reduce Job Why we need Driver? Writing Driver program Input Split InputSplit Need Of Input Split in Map Reduce lnputsplit Size InputSplit Size Vs Block Size InputSplit Vs Mappers Map Reduce Job execution flow Combiner What is combiner in Map Reduce Job Why we need combiner? What are the Advantages and Disadvantages of Combiner Writing Combiner programs Identity Mapper and Identity Reducer Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 4

Partitioner What is Partitioner in Map Reduce Job Why we need Partitioner? What are the Advantages and Disadvantages of Partitioner Writing Partitioner programs Distributed Cache What is Distributed Cache in Map Reduce Job Importance of Distributed Cache in Map Reduce job What are the Advantages and Disadvantages of Distributed Cache Writing Distributed Cache programs Counters What is Counter in Map Reduce Job Why we need Counters in production environment? How to Write Counters in Map Reduce programs Importance of Writable and Writable Comparable Api's How to write custom Map Reduce Keys using Writable How to write custom Map Reduce Values using Writable Comparable 4) Joins Map Side Join What is the importance of Map Side Join Where we are using it Reduce Side Join What is the importance of Reduce Side Join Where we are using it What is the difference between Map Side join and Reduce Side Join? Compression techniques Importance of Compression techniques in production environment Compression Types NONE, RECORD and BLOCK Compression Codecs Default, Gzip, Bzip, Snappy and LZO Enabling and Disabling these techniques for all the Jobs Enabling and Disabling these techniques for a particular Job Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 5

Map Reduce Programming Model How to write the Map Reduce jobs in Java Running the Map Reduce jobs in local mode Running the Map Reduce jobs in pseudo mode Running the Map Reduce jobs in cluster mode Debugging Map Reduce Jobs How to debug Map Reduce Jobs in Local Mode. How to debug Map Reduce Jobs in Remote Mode. 5) YARN (Next Generation Map Reduce) What is YARN? What is the importance of YARN? Where we can use the concept of YARN in Real Time What is difference between YARN and Map Reduce Data Locality What is Data Locality? Will Hadoop follows Data Locality? Speculative Execution What is Speculative Execution? Will Hadoop follows Speculative Execution? Map Reduce Commands Importance of each command How to execute the command Mapreduce admin related commands explanation Configurations Can we change the existing configurations of mapreduce or not? Importance of configurations Writing Unit Tests for Map Reduce Jobs Use of Secondary Sorting and how to solve using MapReduce How to Identify Performance Bottlenecks in MR jobs and tuning MR jobs. Map Reduce Streaming and Pipes with examples Exploring the Apache MapReduce Web UI 6) Apache Pig Introduction to Apache Pig Map Reduce Vs Apache Pig SQL Vs Apache Pig Different data types in Pig Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 6

Modes Of Execution in Pig Local Mode Map Reduce Mode Execution Mechanism Grunt Shell Script Embedded UDF's How to write the UDF's in Pig How to use the UDF's in Pig Importance of UDF's in Pig Filter's How to write the Filter's in Pig How to use the Filter's in Pig Importance of Filter's in Pig Load Functions How to write the Load Functions in Pig How to use the Load Functions in Pig Importance of Load Functions in Pig Store Functions How to use the Store Functions in Pig Importance of Store Functions in Pig Transformations in Pig How to write the complex pig scripts How to integrate the Pig and Hbas 7) Apache HIVE Hive Introduction Hive architecture Driver Compiler Semantic Analyzer Hive Integration with Hadoop Hive Query Language(Hive QL) SQL VS Hive QL Hive Installation and Configuration Hive, Map-Reduce and Local-Mode Hive DLL and DML Operations Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 7

Hive Services CLI Hiveserver Hwi Metastore embedded metastore configuration external metastore configuration UDF's How to write the UDF's in Hive How to use the UDF's in Hive Importance of UDF's in Hive UDAF's How to use the UDAF's in Hive Importance of UDAF's in Hive UDTF's How to use the UDTF's in Hive Importance of UDTF's in Hive How to write a complex Hive queries What is Hive Data Model? Partitions Importance of Hive Partitions in production environment Limitations of Hive Partitions How to write Partitions Buckets Importance of Hive Buckets in production environment How to write Buckets SerDe Importance of Hive SerDe's in production environment How to write SerDe programs How to integrate the Hive and Hbase 8) Apache Zookeeper Introduction to zookeeper Pseudo mode installations Zookeeper cluster installations Basic commands execution Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 8

9) Apache HBase Hbase introduction Hbase usecases Hbase basics Column families Scans Hbase installation Local mode Psuedo mode Cluster mode Hbase Architecture Storage WriteAhead Log Log Structured MergeTrees Mapreduce integration Mapreduce over Hbase Hbase Usage Key design Bloom Filters Versioning Coprocessors Filters Hbase Clients REST Thrift Hive Web Based UI Hbase Admin Schema definition Basic CRUD operations 10) Apache Scoop Introduction to Sqoop MySQL client and Server Installation Sqoop Installation How to connect to Relational Database using Sqoop Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 9

Sqoop Commands and Examples on Import and Export commands 11) Apache Flume Introduction to flume Flume installation Flume agent usage and Flume examples execution 12) Apache Oozie Introduction to oozie Oozie installation Executing oozie workflow jobs Monitering Oozie workflow jobs 13) MongoDB Introduction to MongoDB MongoDB installatio MongoDB examples Introduction to Apache Cassandra, Storm, Mahout and Apache Nutch What we are offering to you? Hands on MapReduce programming around 20+ programs these will make you to pefect in MapReduce both conceptwise and programmatically Hands on installation for all the Hadoop and ecosystems in your laptop. Real time projects explanation will be provided. Mock Interviews will be conducted on one-to-one basis. Discussing about hadoop interview questions daily base. Resume preparation with POC's or Project's based on your experiance. Address: Flat No.3, Kusai Market Above Tirupati Hospital, Karvenagar, Pune-52 Page 10