BIG DATA - HADOOP PROFESSIONAL amron




Training Details

Course Duration: 30-35 hours of training, plus assignments and project-based case studies.

Training Materials: All attendees will receive:
- An assignment after each module
- A video recording of every session
- Notes and study material for the examples covered
- Access to the training blog and repository of materials

Training Format: This course is delivered as a highly interactive session with extensive live examples. It is live, instructor-led online training delivered using the Cisco WebEx Meeting Center web and audio conferencing tool.

Timing: Weekdays and weekends, after work hours.

Audience: This course is designed for anyone who:
- Wants to architect a project using Hadoop and its ecosystem components
- Wants to develop MapReduce programs
- Is a business analyst or data warehousing professional looking for an alternative approach to data analysis and storage

Prerequisites: Participants should have at least basic knowledge of Java. Any experience with a Linux environment will be very helpful.

Training Highlights
- Focus on hands-on training
- 30 hours of assignments and live case studies
- Video recordings of sessions provided
- Demonstration of concepts using different tools such as MS SQL, IIS, and Business Objects Crystal Reports
- One problem statement discussed across ASP.NET, VB.NET, WPF, WCF, WWF and LINQ
- Hadoop certification guidance
- Resume preparation and interview questions provided
- SOA fundamentals and products covered
- Cloud computing for .NET developers

Introduction to Hadoop and Big Data: Road Map

Modules Covered in this Training

Basic Hadoop
1. Introduction and overview of Hadoop
2. Hadoop Distributed File System (HDFS)
3. HBase: the Hadoop database
4. MapReduce 2.0/YARN
5. MapReduce workflows
6. Pig
7. Hive
8. Putting it all together

Advanced Hadoop
1. Integrating Hadoop into the workflow
2. Delving deeper into the Hadoop API
3. Common MapReduce algorithms
4. Using Hive and Pig
5. Practical development tips and techniques
6. More advanced MapReduce programming
7. Joining data sets in MapReduce
8. Graph manipulation in Hadoop
9. Creating workflows with Oozie
10. Hands-on exercise

Attendees Also Learn:
1. Resume preparation guidelines and tips
2. Mock interviews and interview preparation tips

Topics Covered

Basic Hadoop

1. Introduction And Overview Of Hadoop
- What is Hadoop?
- History of Hadoop
- Building blocks: Hadoop ecosystem
- Who is behind Hadoop?
- What Hadoop is good for and what it is not

2. Hadoop Distributed File System (HDFS)
- HDFS overview and architecture
- HDFS installation
- HDFS use cases
- Hadoop file system shell
- File system Java API
- Hadoop configuration

3. HBase: The Hadoop Database
- HBase overview and architecture
- HBase installation
- HBase shell
- Java client API
- Java administrative API
- Filters
- Scan caching and batching
- Key design
- Table design

4. MapReduce 2.0/YARN
- MapReduce 2.0 and YARN overview
- MapReduce 2.0 and YARN architecture
- Installation
- YARN and MapReduce command line tools
- Developing MapReduce jobs
- Input and output formats
- HDFS and HBase as source and sink
- Job configuration
- Job submission and monitoring
- Anatomy of Mappers, Reducers, Combiners and Partitioners
- Anatomy of Job Execution on YARN
- Distributed cache
- Hadoop streaming

5. MapReduce Workflows
- Decomposing problems into MapReduce workflows
- Using job control
- Oozie introduction and architecture
- Oozie installation
- Developing, deploying, and executing Oozie workflows
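As a rough illustration of the "Anatomy of Mappers, Reducers, Combiners and Partitioners" topic in module 4, the sketch below simulates the four roles for a word count in plain Python. It does not use the Hadoop API: the function names (`mapper`, `combiner`, `partitioner`, `reducer`, `run_job`), the sample input and the CRC32-based partitioning are all assumptions chosen for illustration, and a real job would implement these roles as Java classes.

```python
from collections import defaultdict
import zlib

def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def combiner(pairs):
    """Pre-aggregate one mapper's output locally to shrink the shuffle."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())

def partitioner(key, num_reducers):
    """Choose a reducer for a key; a stable hash keeps runs repeatable."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def reducer(key, values):
    """Sum all counts that arrived for one key."""
    return key, sum(values)

def run_job(splits, num_reducers=2):
    """Simulate one job: map and combine per split, shuffle, then reduce."""
    # One bucket per simulated reducer; each bucket groups values by key.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        map_output = [pair for line in split for pair in mapper(line)]
        for word, count in combiner(map_output):
            buckets[partitioner(word, num_reducers)][word].append(count)
    result = {}
    for bucket in buckets:
        for word in sorted(bucket):  # each reducer sees its keys in sorted order
            key, total = reducer(word, bucket[word])
            result[key] = total
    return result

# Hypothetical input: two splits, each holding one line of text.
splits = [["big data big ideas"], ["hadoop stores big data"]]
word_counts = run_job(splits)
```

The combiner here performs the same summation as the reducer, which is only valid because addition is associative and commutative; the final counts do not depend on how many reducers the keys are spread across.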

6. Pig
- Pig overview
- Installation
- Pig Latin
- Developing Pig scripts
- Processing big data with Pig
- Joining data sets with Pig

7. Hive
- Hive overview
- Installation
- HiveQL

8. Putting It All Together
- Distributed installations
- Best practices

Advanced Hadoop Outline

Our Advanced Hadoop course extends the essential Hadoop module, with the objective of providing in-depth coverage illustrated by case studies.

1. Integrating Hadoop Into The Workflow
- Relational database management systems
- Storage systems
- Importing data from RDBMSs with Sqoop
- Importing real-time data with Flume
- Accessing HDFS using FuseDFS and Hoop

2. Delving Deeper Into The Hadoop API
- More about ToolRunner
- Testing with MRUnit
- Reducing intermediate data with combiners
- The configure and close methods for map/reduce setup and teardown
- Writing partitioners for better load balancing

- Directly accessing HDFS
- Using the distributed cache

3. Common MapReduce Algorithms
- Sorting and searching
- Indexing
- Machine learning with Mahout
- Term frequency-inverse document frequency (TF-IDF)
- Word co-occurrence

4. Using Hive and Pig
- Hive basics
- Pig basics

5. Practical Development Tips And Techniques
- Debugging MapReduce code
- Using LocalJobRunner mode for easier debugging
- Retrieving job information with counters
- Logging
- Splittable file formats
- Determining the optimal number of reducers
- Map-only MapReduce jobs
- Hands-on exercise

More Advanced MapReduce Programming
- Custom Writables and WritableComparables
- Saving binary data using sequence files and Avro files
- Creating input formats and output formats

6. Joining Data Sets In MapReduce
- Map-side joins
- The secondary sort
- Reduce-side joins

7. Graph Manipulation In Hadoop

- Introduction to graph techniques
- Representing graphs in Hadoop
- Implementing a sample algorithm: Single Source Shortest Path

8. Creating Workflows With Oozie
- The motivation for Oozie
- Oozie's workflow definition format
- Hands-on exercises
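For the Single Source Shortest Path sample algorithm listed under graph manipulation, one common way to express it in MapReduce terms is to repeat a map/reduce round until no distance changes. The sketch below simulates that idea in plain Python under stated assumptions: the graph is a hypothetical adjacency dictionary with non-negative edge weights, the names `sssp_map`, `sssp_reduce` and `shortest_paths` are illustrative only, and this is not necessarily the implementation used in the course. On a cluster, each round would typically run as a separate job.

```python
import math
from collections import defaultdict

def sssp_map(node, state):
    """Re-emit the node's own record, plus a candidate distance per neighbor."""
    distance, edges = state
    yield node, ("NODE", distance, edges)
    if distance != math.inf:
        for neighbor, weight in edges.items():
            yield neighbor, ("DIST", distance + weight, None)

def sssp_reduce(node, values):
    """Keep the adjacency list and the smallest distance seen for the node."""
    best, edges = math.inf, {}
    for kind, distance, node_edges in values:
        if kind == "NODE":
            edges = node_edges
        best = min(best, distance)
    return node, (best, edges)

def shortest_paths(graph, source):
    """Repeat map/shuffle/reduce rounds until no distance changes."""
    state = {n: (0 if n == source else math.inf, edges)
             for n, edges in graph.items()}
    while True:
        shuffled = defaultdict(list)  # stands in for the shuffle phase
        for node, node_state in state.items():
            for key, value in sssp_map(node, node_state):
                shuffled[key].append(value)
        new_state = dict(sssp_reduce(n, vals) for n, vals in shuffled.items())
        if new_state == state:  # converged: nothing changed this round
            return {n: d for n, (d, _) in new_state.items()}
        state = new_state

# Hypothetical weighted graph: node -> {neighbor: edge weight}.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
    "E": {},  # unreachable from A, so its distance stays infinite
}
distances = shortest_paths(graph, "A")
```

Each node's record carries its adjacency list through every round, so the graph structure survives the shuffle alongside the distance updates.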

Copyright 2014 Amron IT Solutions & Resource Management. All Rights Reserved. No part of this document or website may be reproduced without Amron IT Solutions & Resource Management's express consent. www.amronitsolutions.com