[Type text] Week. National summer training program on. Big Data & Hadoop. Why big data & Hadoop is important?

Size: px
Start display at page:

Download "[Type text] Week. National summer training program on. Big Data & Hadoop. Why big data & Hadoop is important?"

Transcription

1 1 Week National summer training program on Big Data & Hadoop Why big data & Hadoop is important?

2 Highlights of Big Data & Hadoop Implement a Hadoop Project Learn to write Complex MapReduce programs Perform Data Analytics using Pig and Hive Understand Data Loading Techniques using SQOOP and Flume Master the concepts of Hadoop Distributed File System and MapReduce Framework Work on a Real Life Project on Big Data Analytics and gain Hands on Project Experience 1-week Big Data & Hadooop Projects details 1. Word Count for a large amount of data With Using (MapReduce, Pig, Hive ) 2. Temperature Sensor Conversation With Using Pig or HQL 3. DATA Sorting With Large data Set With Pig and Sqoop 4. Write Complex MapReduce programs 5. Master the concepts of Hadoop Distributed File System and MapReduce Framework 6. Work on a Real Life Project on Big Data Analytics and gain Hands on Project Experience Big Data & Hadoop Certification 1. Certificate of Participation from NSTP-2016 EDC IIT-Roorkee. 2. Certificate of merit from NSTP-2016 EDC IIT-Roorkee. 3. Certificate of Coordination from NSTP-2016 EDC IIT-Roorkee. 4. College will get Center of Excellence form NSTP-2016 EDC IIT-Roorkee.

3 1-week Big Data & Hadoop Course Content Day 1 A. Introduction to Big Data and Hadoop What is a Data? Type of Data Need of Big Data Characteristics of Big Data B. Different Components of Hadoop C. Big Data Technology Traditional IT approach Big Data Capabilities Milestones of Hadoop Day 2 A. Software Introduction VMware Player VMware installed with BIOS system Horton Works Sand Box Introduction B. Hadoop Architecture Hadoop cluster Hadoop Core Services Hadoop Core Components Map reduce Introduction HDFS

4 Day 3 A. Starting With Hadoop Map reduce Analogy Map reduce Example Map Execution Real time Using With Hadoop B. Pig Introduction to Apache Pig Components of pig How to Works pig, or Data model Pig vs. SQL Pig Execution Modes Day 4 A. Pig Map Reduce vs. Apache Pig Different Data Types in Pig Modes of Execution in Pig Local Mode Execution Mechanism Grunt Shell Scrip Pig Commands Examples Of pig 6.1 Word Count 6.2 Batting Examples

5 Day 5 A. Hive Hive Introduction Hive characteristics System Architecture & components of Hive Query Compiler Hive Server HIVE Architecture HIVE Meta Store HIVE Integration with Hadoop HIVE Query Language (Hive QL) SQL Vs Hive QL B. HBase Introduction of HBase Characteristics of HBase HBase Architecture HBase Vs RDBMS HBase Shell Commands Day 6 A. HBase Architecture Clients REST Thrift Java Based AVTO B. MapReduce Integration MapReduce over HBase HBase Admin Schema Definition MapReduce OR Distributed Mode

6 Day 7 A. SQOOP Introduction to SQOOP MySQL Client and Server Installation How to Connect to Relational Database Using SQOOP Different SQOOP Commands Different Flavors of Import Export HIVE Imports B. Introduction Of Zookeeper Features of Zookeeper Use of ZooKeeper Zookeeper Data Model 1-week Big Data & Hadoop Projects details 1. Word Count for a large amount of data With Using (MapReduce, Pig, Hive ) 2. Temperature Sensor Conversation With Using Pig or HQL 3. DATA Sorting With Large data Set With Pig and Sqoop 4. Write Complex MapReduce programs 5. Master the concepts of Hadoop Distributed File System and MapReduce Framework 6. Work on a Real Life Project on Big Data Analytics and gain Hands on Project Experience

7 Who should go for this Course? Predictions say 2017 will be the year Hadoop finally becomes a cornerstone of your business technology agenda. To stay ahead in the game, Hadoop has become a must-know technology for the Graduates aiming to build a career in Big Data. What are the pre-requisites for this Course? Knowledge of core java concepts is the pre-requisite for this course. We provide a complimentary course i.e. "Java Essentials for Hadoop" to all the participants who enroll for the Hadoop Training. What are the other requirement for this training program? Seminar hall/classroom having the enough capacity to conduct hands-on-session for all participants. Good Quality public address system ideally two cordless mikes will be required. Projector/ Screen along with black/white board for teaching and presentation purposes. One small stereo jack cord to in laptop for its sound system. This training center can only be arranged for a minimum of 30 students 1-week accommodation to our resource team Training Duration: 1-week (3-4 hours each day) Training Fee: Rs.4000/- Participant + Services 14.50%, (The fee includes training, certification, and Event registration and a free Big Data & Hadoop Kit to each Student) Please be free to contact us for any kind of discussions related to NSTP-2016 Mobile : l

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture. Big Data Hadoop Administration and Developer Course This course is designed to understand and implement the concepts of Big data and Hadoop. This will cover right from setting up Hadoop environment in

More information

ITG Software Engineering

ITG Software Engineering Introduction to Apache Hadoop Course ID: Page 1 Last Updated 12/15/2014 Introduction to Apache Hadoop Course Overview: This 5 day course introduces the student to the Hadoop architecture, file system,

More information

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools

More information

Workshop on Hadoop with Big Data

Workshop on Hadoop with Big Data Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly

More information

Complete Java Classes Hadoop Syllabus Contact No: 8888022204

Complete Java Classes Hadoop Syllabus Contact No: 8888022204 1) Introduction to BigData & Hadoop What is Big Data? Why all industries are talking about Big Data? What are the issues in Big Data? Storage What are the challenges for storing big data? Processing What

More information

Peers Techno log ies Pv t. L td. HADOOP

Peers Techno log ies Pv t. L td. HADOOP Page 1 Peers Techno log ies Pv t. L td. Course Brochure Overview Hadoop is a Open Source from Apache, which provides reliable storage and faster process by using the Hadoop distibution file system and

More information

Introduction to Big Data Training

Introduction to Big Data Training Introduction to Big Data Training The quickest way to be introduce with NOSQL/BIG DATA offerings Learn and experience Big Data Solutions including Hadoop HDFS, Map Reduce, NoSQL DBs: Document Based DB

More information

HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM

HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM 1. Introduction 1.1 Big Data Introduction What is Big Data Data Analytics Bigdata Challenges Technologies supported by big data 1.2 Hadoop Introduction

More information

Qsoft Inc www.qsoft-inc.com

Qsoft Inc www.qsoft-inc.com Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:

More information

BIG DATA HADOOP TRAINING

BIG DATA HADOOP TRAINING BIG DATA HADOOP TRAINING DURATION 40hrs AVAILABLE BATCHES WEEKDAYS (7.00AM TO 8.30AM) & WEEKENDS (10AM TO 1PM) MODE OF TRAINING AVAILABLE ONLINE INSTRUCTOR LED CLASSROOM TRAINING (MARATHAHALLI, BANGALORE)

More information

BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM. An Overview

BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM. An Overview BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM An Overview Contents Contents... 1 BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM... 1 Program Overview... 4 Curriculum... 5 Module 1: Big Data: Hadoop

More information

Hadoop Job Oriented Training Agenda

Hadoop Job Oriented Training Agenda 1 Hadoop Job Oriented Training Agenda Kapil CK hdpguru@gmail.com Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module

More information

HareDB HBase Client Web Version USER MANUAL HAREDB TEAM

HareDB HBase Client Web Version USER MANUAL HAREDB TEAM 2013 HareDB HBase Client Web Version USER MANUAL HAREDB TEAM Connect to HBase... 2 Connection... 3 Connection Manager... 3 Add a new Connection... 4 Alter Connection... 6 Delete Connection... 6 Clone Connection...

More information

Big Data and Hadoop. Module 1: Introduction to Big Data and Hadoop. Module 2: Hadoop Distributed File System. Module 3: MapReduce

Big Data and Hadoop. Module 1: Introduction to Big Data and Hadoop. Module 2: Hadoop Distributed File System. Module 3: MapReduce Big Data and Hadoop Module 1: Introduction to Big Data and Hadoop Learn about Big Data and the shortcomings of the prevailing solutions for Big Data issues. You will also get to know, how Hadoop eradicates

More information

Big Data Course Highlights

Big Data Course Highlights Big Data Course Highlights The Big Data course will start with the basics of Linux which are required to get started with Big Data and then slowly progress from some of the basics of Hadoop/Big Data (like

More information

BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION

BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION FACT SHEET BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION BIGDATA & HADOOP CLASS ROOM SESSION GreyCampus provides Classroom sessions for Big Data & Hadoop Developer Certification. This course will

More information

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

More information

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview Programming Hadoop 5-day, instructor-led BD-106 MapReduce Overview The Client Server Processing Pattern Distributed Computing Challenges MapReduce Defined Google's MapReduce The Map Phase of MapReduce

More information

Certified Big Data and Apache Hadoop Developer VS-1221

Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer Certification Code VS-1221 Vskills certification for Big Data and Apache Hadoop Developer Certification

More information

Big Data Training - Hackveda

Big Data Training - Hackveda Big Data Training - Hackveda Become a Hackveda Certified Big Data Professional - (Beginner) Skill level: Beginner Training fee: INR 9000 only (Topics covered: 108) Chief Trainer: Mr. Devanshu Shukla Training

More information

An Introduction to MOHAMMAD REZA KARIMI DASTJERDI SPRING

An Introduction to MOHAMMAD REZA KARIMI DASTJERDI SPRING An Introduction to MOHAMMAD REZA KARIMI DASTJERDI SPRING 2015 1 Table Of Contents Introduction Problems with RDBMs What is Hadoop? Who use Hadoop? Job Positions History Hadoop Distributions Hadoop Ecosystem

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Firebird meets NoSQL (Apache HBase) Case Study

Firebird meets NoSQL (Apache HBase) Case Study Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI

More information

Oracle Big Data Fundamentals Ed 1

Oracle Big Data Fundamentals Ed 1 Oracle University Contact Us: 001-855-844-3881 Oracle Big Data Fundamentals Ed 1 Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big Data

More information

Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster. Nov 7, 2012

Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster. Nov 7, 2012 Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster Nov 7, 2012 Who I Am Robert Lancaster Solutions Architect, Hotel Supply Team rlancaster@orbitz.com @rob1lancaster Organizer of Chicago

More information

COURSE CONTENT Big Data and Hadoop Training

COURSE CONTENT Big Data and Hadoop Training COURSE CONTENT Big Data and Hadoop Training 1. Meet Hadoop Data! Data Storage and Analysis Comparison with Other Systems RDBMS Grid Computing Volunteer Computing A Brief History of Hadoop Apache Hadoop

More information

CHAPTER 1: NOSQL: WHAT IT IS AND WHY YOU NEED IT 3

CHAPTER 1: NOSQL: WHAT IT IS AND WHY YOU NEED IT 3 INTRODUCTION xvii PART I: GETTING STARTED CHAPTER 1: NOSQL: WHAT IT IS AND WHY YOU NEED IT 3 Definition and Introduction 4 Context and a Bit of History 4 Big Data 7 Scalability 9 Defi nition and Introduction

More information

BIG DATA - HADOOP PROFESSIONAL amron

BIG DATA - HADOOP PROFESSIONAL amron 0 Training Details Course Duration: 30-35 hours training + assignments + actual project based case studies Training Materials: All attendees will receive: Assignment after each module, video recording

More information

project collects data from national events, both natural and manmade, to be stored and evaluated by

project collects data from national events, both natural and manmade, to be stored and evaluated by Joseph Sebastian CS 2994 Spring 2014 Undergraduate Research Final Paper GOALS The goal of my research was to assist the Integrated Digital Event Archive (IDEAL) team in transferring their Twitter data

More information

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Apache Sentry Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Agenda Various aspects of data security Apache Sentry for authorization Key concepts of Apache Sentry Sentry features Sentry architecture

More information

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

More information

Implement Hadoop jobs to extract business value from large and varied data sets

Implement Hadoop jobs to extract business value from large and varied data sets Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

Constructing a Data Lake: Hadoop and Oracle Database United!

Constructing a Data Lake: Hadoop and Oracle Database United! Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.

More information

HackINFINITY Workshop (A two day workshop on Ethical Hacking & Information Security)

HackINFINITY Workshop (A two day workshop on Ethical Hacking & Information Security) HackINFINITY Workshop (A two day workshop on Ethical Hacking & Information Security) About Us Since the beginning we have always followed the philosophy of the 3 "joys": "The joy of learning" "The joy

More information

Spring,2015. Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE

Spring,2015. Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE Spring,2015 Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE Contents: Briefly About Big Data Management What is hive? Hive Architecture Working

More information

Internals of Hadoop Application Framework and Distributed File System

Internals of Hadoop Application Framework and Distributed File System International Journal of Scientific and Research Publications, Volume 5, Issue 7, July 2015 1 Internals of Hadoop Application Framework and Distributed File System Saminath.V, Sangeetha.M.S Abstract- Hadoop

More information

COSC 6397 Big Data Analytics. 2 nd homework assignment Pig and Hive. Edgar Gabriel Spring 2015

COSC 6397 Big Data Analytics. 2 nd homework assignment Pig and Hive. Edgar Gabriel Spring 2015 COSC 6397 Big Data Analytics 2 nd homework assignment Pig and Hive Edgar Gabriel Spring 2015 2 nd Homework Rules Each student should deliver Source code (.java files) Documentation (.pdf,.doc,.tex or.txt

More information

Data-Intensive Programming. Timo Aaltonen Department of Pervasive Computing

Data-Intensive Programming. Timo Aaltonen Department of Pervasive Computing Data-Intensive Programming Timo Aaltonen Department of Pervasive Computing Data-Intensive Programming Lecturer: Timo Aaltonen University Lecturer timo.aaltonen@tut.fi Assistants: Henri Terho and Antti

More information

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required. What is this course about? This course is an overview of Big Data tools and technologies. It establishes a strong working knowledge of the concepts, techniques, and products associated with Big Data. Attendees

More information

Map Reduce & Hadoop Recommended Text:

Map Reduce & Hadoop Recommended Text: Big Data Map Reduce & Hadoop Recommended Text:! Large datasets are becoming more common The New York Stock Exchange generates about one terabyte of new trade data per day. Facebook hosts approximately

More information

Bringing Big Data to People

Bringing Big Data to People Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process

More information

Introduction to Big Data and the Lambda Architecture

Introduction to Big Data and the Lambda Architecture Introduction to Big Data and the Lambda Architecture Marc Schöni Meinrad Weiss April 2014 BASEL BERN BRUGG LAUSANNE ZUERICH DUESSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MUNICH STUTTGART VIENNA 1 What

More information

Oracle Big Data Essentials

Oracle Big Data Essentials Oracle University Contact Us: Local: 1800 103 4775 Intl: +91 80 40291196 Oracle Big Data Essentials Duration: 3 Days What you will learn This Oracle Big Data Essentials training deep dives into using the

More information

WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley

WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley Disclaimer: This material is protected under copyright act AnalytixLabs, 2011. Unauthorized use and/ or duplication of this material or

More information

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation

More information

HADOOP. Revised 10/19/2015

HADOOP. Revised 10/19/2015 HADOOP Revised 10/19/2015 This Page Intentionally Left Blank Table of Contents Hortonworks HDP Developer: Java... 1 Hortonworks HDP Developer: Apache Pig and Hive... 2 Hortonworks HDP Developer: Windows...

More information

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

More information

Event Details. Stage 1 (Workshop Round to be held at Zonal Centers) Stage 2 (Zonal Round to be held at zonal center)

Event Details. Stage 1 (Workshop Round to be held at Zonal Centers) Stage 2 (Zonal Round to be held at zonal center) Introducing India s First Network Security Championship To, The HOD Date: 09/01/2014 Caarmel Engineering College Sub: National Network Security Championship by Network Bulls in association with ACM-IIT

More information

Fundamentals Curriculum HAWQ

Fundamentals Curriculum HAWQ Fundamentals Curriculum Pivotal Hadoop 2.1 HAWQ Education Services zdata Inc. 660 4th St. Ste. 176 San Francisco, CA 94107 t. 415.890.5764 zdatainc.com Pivotal Hadoop & HAWQ Fundamentals Course Description

More information

Hadoop implementation of MapReduce computational model. Ján Vaňo

Hadoop implementation of MapReduce computational model. Ján Vaňo Hadoop implementation of MapReduce computational model Ján Vaňo What is MapReduce? A computational model published in a paper by Google in 2004 Based on distributed computation Complements Google s distributed

More information

Oracle Big Data Fundamentals Ed 1 NEW

Oracle Big Data Fundamentals Ed 1 NEW Oracle University Contact Us: +90 212 329 6779 Oracle Big Data Fundamentals Ed 1 NEW Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big

More information

ASAM ODS, Peak ODS Server and openmdm as a company-wide information hub for test and simulation data. Peak Solution GmbH, Nuremberg

ASAM ODS, Peak ODS Server and openmdm as a company-wide information hub for test and simulation data. Peak Solution GmbH, Nuremberg ASAM, and as a company-wide information hub for test and simulation data Peak Solution GmbH, Nuremberg Overview ASAM provides for over a decade a standard, that has proven itself as a reliable basis for

More information

SQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse

SQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse SQL Server 2012 PDW Ryan Simpson Technical Solution Professional PDW Microsoft Microsoft SQL Server 2012 Parallel Data Warehouse Massively Parallel Processing Platform Delivers Big Data HDFS Delivers Scale

More information

Scaling Up 2 CSE 6242 / CX 4242. Duen Horng (Polo) Chau Georgia Tech. HBase, Hive

Scaling Up 2 CSE 6242 / CX 4242. Duen Horng (Polo) Chau Georgia Tech. HBase, Hive CSE 6242 / CX 4242 Scaling Up 2 HBase, Hive Duen Horng (Polo) Chau Georgia Tech Some lectures are partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos, Le

More information

Data Analyst Program- 0 to 100

Data Analyst Program- 0 to 100 Development Data Analyst Program- 0 to 100 Master the Data Analysis tools like Pig and hive Data Science Build a recommendation engine 1 Data Analyst Program- 0 to 100 HADOOP SCHOOL OF TRAINING Basics

More information

Hadoop 只 支 援 用 Java 開 發 嘛? Is Hadoop only support Java? 總 不 能 全 部 都 重 新 設 計 吧? 如 何 與 舊 系 統 相 容? Can Hadoop work with existing software?

Hadoop 只 支 援 用 Java 開 發 嘛? Is Hadoop only support Java? 總 不 能 全 部 都 重 新 設 計 吧? 如 何 與 舊 系 統 相 容? Can Hadoop work with existing software? Hadoop 只 支 援 用 Java 開 發 嘛? Is Hadoop only support Java? 總 不 能 全 部 都 重 新 設 計 吧? 如 何 與 舊 系 統 相 容? Can Hadoop work with existing software? 可 以 跟 資 料 庫 結 合 嘛? Can Hadoop work with Databases? 開 發 者 們 有 聽 到

More information

BIG DATA IS MESSY PARTNER WITH SCALABLE

BIG DATA IS MESSY PARTNER WITH SCALABLE BIG DATA IS MESSY PARTNER WITH SCALABLE SCALABLE SYSTEMS HADOOP SOLUTION WHAT IS BIG DATA? Each day human beings create 2.5 quintillion bytes of data. In the last two years alone over 90% of the data on

More information

Big Data Too Big To Ignore

Big Data Too Big To Ignore Big Data Too Big To Ignore Geert! Big Data Consultant and Manager! Currently finishing a 3 rd Big Data project! IBM & Cloudera Certified! IBM & Microsoft Big Data Partner 2 Agenda! Defining Big Data! Introduction

More information

Hadoop Development & BI- 0 to 100

Hadoop Development & BI- 0 to 100 Development Master the Data Analysis tools like Pig and hive Data Science Hadoop Development & BI- 0 to 100 Build a recommendation engine Hadoop Development - 0 to 100 HADOOP SCHOOL OF TRAINING Basics

More information

The Best Database for Hadoop

The Best Database for Hadoop The Best Database for Hadoop Justin Makeig, Director, Product Management, MarkLogic April 9, 2013 Disclaimer Forward-looking Statements All statements describing future releases and capabilities, estimated

More information

HADOOP BIG DATA DEVELOPER TRAINING AGENDA

HADOOP BIG DATA DEVELOPER TRAINING AGENDA HADOOP BIG DATA DEVELOPER TRAINING AGENDA About the Course This course is the most advanced course available to Software professionals This has been suitably designed to help Big Data Developers and experts

More information

Instance Types. Standard Instances:

Instance Types. Standard Instances: Instance Types Standard Instances: 1EC2CU: equivalent of 1.0-1.2GHz 2007 AMD Opteron or 2007 Intel Xeon processor Small: 1.7GBmem, 1EC2Compute Unit (EC2CU), 160GB local instance storage(lis), 32/64bits.

More information

Why all industries are talking about Big Data. What are the technologies support big data. Traditional NO SQL

Why all industries are talking about Big Data. What are the technologies support big data. Traditional NO SQL Intrductin t Big Data and Hadp Big Data What is Big Data Why all industries are talking abut Big Data What are the issues in Big Data Strage What are the challenges fr string big data Prcessing What are

More information

Comprehensive Analytics on the Hortonworks Data Platform

Comprehensive Analytics on the Hortonworks Data Platform Comprehensive Analytics on the Hortonworks Data Platform We do Hadoop. Page 1 Page 2 Back to 2005 Page 3 Vertical Scaling Page 4 Vertical Scaling Page 5 Vertical Scaling Page 6 Horizontal Scaling Page

More information

Hadoop vs Apache Spark

Hadoop vs Apache Spark Innovate, Integrate, Transform Hadoop vs Apache Spark www.altencalsoftlabs.com Introduction Any sufficiently advanced technology is indistinguishable from magic. said Arthur C. Clark. Big data technologies

More information

The Greenplum Analytics Workbench

The Greenplum Analytics Workbench The Greenplum Analytics Workbench External Overview 1 The Greenplum Analytics Workbench Definition Is a 1000-node Hadoop Cluster. Pre-configured with publicly available data sets. Contains the entire Hadoop

More information

ITG Software Engineering

ITG Software Engineering Introduction to Cloudera Course ID: Page 1 Last Updated 12/15/2014 Introduction to Cloudera Course : This 5 day course introduces the student to the Hadoop architecture, file system, and the Hadoop Ecosystem.

More information

Big Data and Analytics by Seema Acharya and Subhashini Chellappan Copyright 2015, WILEY INDIA PVT. LTD. Introduction to Pig

Big Data and Analytics by Seema Acharya and Subhashini Chellappan Copyright 2015, WILEY INDIA PVT. LTD. Introduction to Pig Introduction to Pig Agenda What is Pig? Key Features of Pig The Anatomy of Pig Pig on Hadoop Pig Philosophy Pig Latin Overview Pig Latin Statements Pig Latin: Identifiers Pig Latin: Comments Data Types

More information

brief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 PART 2 PART 3 BIG DATA PATTERNS...253 PART 4 BEYOND MAPREDUCE...385

brief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 PART 2 PART 3 BIG DATA PATTERNS...253 PART 4 BEYOND MAPREDUCE...385 brief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 1 Hadoop in a heartbeat 3 2 Introduction to YARN 22 PART 2 DATA LOGISTICS...59 3 Data serialization working with text and beyond 61 4 Organizing and

More information

Hadoop IST 734 SS CHUNG

Hadoop IST 734 SS CHUNG Hadoop IST 734 SS CHUNG Introduction What is Big Data?? Bulk Amount Unstructured Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per day) If a regular machine need to

More information

Has been into training Big Data Hadoop and MongoDB from more than a year now

Has been into training Big Data Hadoop and MongoDB from more than a year now NAME NAMIT EXECUTIVE SUMMARY EXPERTISE DELIVERIES Around 10+ years of experience on Big Data Technologies such as Hadoop and MongoDB, Java, Python, Big Data Analytics, System Integration and Consulting

More information

EXPERIMENTATION. HARRISON CARRANZA School of Computer Science and Mathematics

EXPERIMENTATION. HARRISON CARRANZA School of Computer Science and Mathematics BIG DATA WITH HADOOP EXPERIMENTATION HARRISON CARRANZA Marist College APARICIO CARRANZA NYC College of Technology CUNY ECC Conference 2016 Poughkeepsie, NY, June 12-14, 2016 Marist College AGENDA Contents

More information

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763 International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 A Discussion on Testing Hadoop Applications Sevuga Perumal Chidambaram ABSTRACT The purpose of analysing

More information

Big Data Big Data/Data Analytics & Software Development

Big Data Big Data/Data Analytics & Software Development Big Data Big Data/Data Analytics & Software Development Danairat T. danairat@gmail.com, 081-559-1446 1 Agenda Big Data Overview Business Cases and Benefits Hadoop Technology Architecture Big Data Development

More information

Big Data and Data Science. The globally recognised training program

Big Data and Data Science. The globally recognised training program Big Data and Data Science The globally recognised training program Certificate in Big Data Analytics Duration 5 days Big Data and Data Science enables value creation from data, through the use of calculative

More information

Cloudera Certified Developer for Apache Hadoop

Cloudera Certified Developer for Apache Hadoop Cloudera CCD-333 Cloudera Certified Developer for Apache Hadoop Version: 5.6 QUESTION NO: 1 Cloudera CCD-333 Exam What is a SequenceFile? A. A SequenceFile contains a binary encoding of an arbitrary number

More information

Course 20467: Designing Self-Service Business Intelligence and Big Data Solutions

Course 20467: Designing Self-Service Business Intelligence and Big Data Solutions Course 20467: Designing Self-Service Business Intelligence and Big Data Solutions Type:Course Audience(s):IT Professionals Technology:Microsoft SQL Server Level:300 This Revision:C Delivery method: Instructor-led

More information

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2 Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue

More information

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop

More information

Apache HBase. Crazy dances on the elephant back

Apache HBase. Crazy dances on the elephant back Apache HBase Crazy dances on the elephant back Roman Nikitchenko, 16.10.2014 YARN 2 FIRST EVER DATA OS 10.000 nodes computer Recent technology changes are focused on higher scale. Better resource usage

More information

Securing the Big Data Ecosystem

Securing the Big Data Ecosystem Securing the Big Data Ecosystem SESSION ID: STU-T07A Davi Ottenheimer Senior Director of Trust, EMC @daviottenheimer COWS NOT PETS ( ) (xx) /-------\/ / * ---- ^^ ^^ Systematic Treatment of Illness Easily

More information

PASS4TEST. IT Certification Guaranteed, The Easy Way! We offer free update service for one year

PASS4TEST. IT Certification Guaranteed, The Easy Way!  We offer free update service for one year PASS4TEST IT Certification Guaranteed, The Easy Way! \ http://www.pass4test.com We offer free update service for one year Exam : CCD-410 Title : Cloudera Certified Developer for Apache Hadoop (CCDH) Vendor

More information

Lecture 10: HBase! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl

Lecture 10: HBase! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl Big Data Processing, 2014/15 Lecture 10: HBase!! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind the

More information

Hadoop Ecosystem B Y R A H I M A.

Hadoop Ecosystem B Y R A H I M A. Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open

More information

Ge6ng to Know Hadoop. Sriram Mohan Senior Consultant, Avalon Consul,ng, LLC Associate Professor of CSSE, Rose- Hulman Ins,tute of Technology

Ge6ng to Know Hadoop. Sriram Mohan Senior Consultant, Avalon Consul,ng, LLC Associate Professor of CSSE, Rose- Hulman Ins,tute of Technology Ge6ng to Know Hadoop Sriram Mohan Senior Consultant, Avalon Consul,ng, LLC Associate Professor of CSSE, Rose- Hulman Ins,tute of Technology Presenter Overview Who we are Consultants providing expert technical

More information

Click the link below to get more detail

Click the link below to get more detail Click the link below to get more detail http://www.examkill.com/ ExamCode: Apache-Hadoop-Developer ExamName: Hadoop 2.0 Certification exam for Pig and Hive Developer Vendor Name: Hortonworks Edition =

More information

Hive Interview Questions

Hive Interview Questions HADOOPEXAM LEARNING RESOURCES Hive Interview Questions www.hadoopexam.com Please visit www.hadoopexam.com for various resources for BigData/Hadoop/Cassandra/MongoDB/Node.js/Scala etc. 1 Professional Training

More information

A Brief Outline on Bigdata Hadoop

A Brief Outline on Bigdata Hadoop A Brief Outline on Bigdata Hadoop Twinkle Gupta 1, Shruti Dixit 2 RGPV, Department of Computer Science and Engineering, Acropolis Institute of Technology and Research, Indore, India Abstract- Bigdata is

More information

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning

More information

TRAINING PROGRAM ON BIGDATA/HADOOP

TRAINING PROGRAM ON BIGDATA/HADOOP Course: Training on Bigdata/Hadoop with Hands-on Course Duration / Dates / Time: 4 Days / 24th - 27th June 2015 / 9:30-17:30 Hrs Venue: Eagle Photonics Pvt Ltd First Floor, Plot No 31, Sector 19C, Vashi,

More information

Integrating Cloudera and SAP HANA

Integrating Cloudera and SAP HANA Integrating Cloudera and SAP HANA Version: 103 Table of Contents Introduction/Executive Summary 4 Overview of Cloudera Enterprise 4 Data Access 5 Apache Hive 5 Data Processing 5 Data Integration 5 Partner

More information

#TalendSandbox for Big Data

#TalendSandbox for Big Data Evalua&on von Apache Hadoop mit der #TalendSandbox for Big Data Julien Clarysse @whatdoesdatado @talend 2015 Talend Inc. 1 Connecting the Data-Driven Enterprise 2 Talend Overview Founded in 2006 BRAND

More information

Large scale processing using Hadoop. Ján Vaňo

Large scale processing using Hadoop. Ján Vaňo Large scale processing using Hadoop Ján Vaňo What is Hadoop? Software platform that lets one easily write and run applications that process vast amounts of data Includes: MapReduce offline computing engine

More information

Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks

Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks Hadoop Introduction Olivier Renault Solution Engineer - Hortonworks Hortonworks A Brief History of Apache Hadoop Apache Project Established Yahoo! begins to Operate at scale Hortonworks Data Platform 2013

More information

Spring Data. Modern Data Access for Enterprise Java. Jon Brisbin, and Michael Hunger O'REILLY* Mark Pollack, Oliver Gierke, Thomas Risberg, Cambridge

Spring Data. Modern Data Access for Enterprise Java. Jon Brisbin, and Michael Hunger O'REILLY* Mark Pollack, Oliver Gierke, Thomas Risberg, Cambridge Spring Data Modern Data Access for Enterprise Java Mark Pollack, Oliver Gierke, Thomas Risberg, Jon Brisbin, and Michael Hunger O'REILLY* Beijing Cambridge Farnham Koln Sebastopol Tokyo Table of Contents

More information

HBase Schema Design. NoSQL Ma4ers, Cologne, April 2013. Lars George Director EMEA Services

HBase Schema Design. NoSQL Ma4ers, Cologne, April 2013. Lars George Director EMEA Services HBase Schema Design NoSQL Ma4ers, Cologne, April 2013 Lars George Director EMEA Services About Me Director EMEA Services @ Cloudera ConsulFng on Hadoop projects (everywhere) Apache Commi4er HBase and Whirr

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

Big Data Visualization. Apache Spark and Zeppelin

Big Data Visualization. Apache Spark and Zeppelin Big Data Visualization using Apache Spark and Zeppelin Prajod Vettiyattil, Software Architect, Wipro Agenda Big Data and Ecosystem tools Apache Spark Apache Zeppelin Data Visualization Combining Spark

More information

E6893 Big Data Analytics: Demo Session for HW I. Ruichi Yu, Shuguan Yang, Jen-Chieh Huang Meng-Yi Hsu, Weizhen Wang, Lin Haung.

E6893 Big Data Analytics: Demo Session for HW I. Ruichi Yu, Shuguan Yang, Jen-Chieh Huang Meng-Yi Hsu, Weizhen Wang, Lin Haung. E6893 Big Data Analytics: Demo Session for HW I Ruichi Yu, Shuguan Yang, Jen-Chieh Huang Meng-Yi Hsu, Weizhen Wang, Lin Haung 1 Oct 2, 2014 2 Part I: Pig installation and Demo Pig is a platform for analyzing

More information