BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION

Save this PDF as:
 WORD  PNG  TXT  JPG

Size: px
Start display at page:

Download "BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION"

Transcription

1 FACT SHEET BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION BIGDATA & HADOOP CLASS ROOM SESSION GreyCampus provides Classroom sessions for Big Data & Hadoop Developer Certification. This course will teach Big Data Eco System, the need for Big Data and its applications. After completion of the course participants will gain adequate knowledge to be a successful Hadoop Developer. GreyCampus will provide certify participants who successfully complete the course. COURSE INCLUSION GreyCampus provides classes on Big Data and Hadoop Development. This course is an introduction to the Big Data Eco System, the need for Big Data and its applications. The course also covers Hadoop Architecture, Map-reduce framework, starting with installations, and explores other technologies like Pig, Hive, HBase, ZooKeeper, Oozie, and Flume. It focuses to provide participants hands on experience, so there would be multiple assignments, quizzes and a project. 5 Days Classroom Training 30 Hours of E-Learning content 10 Class Recordings of Live Online Batches 10 Modules with Assignments and Chapter-end Quizzes 30 PDUs Free 365 Days e-learning access Receive Certification on successful submission of Project Downloadable Classroom Material SALIENT FEATURES High impact, proven training over professionals globally have participated in our training programs Experienced, expert instructors our Instructors come with a rich, 10+ years industry experience & 3 Years experience in Big Data and Hadoop We have dierent training methodologies that help participants learn better

2 REGISTRATION PROCESS 2 Step 1: Check programs available at your location/ nearest city (check here) Step 2: Make a payment to register and book your seat for the program Step 3: Immediately on registration, you will receive an from us with the payment detail and access details for the on-demand elearning component Step 4: Receive an introduction and further communication from your GreyCampus training coordinator for the Prep program you have registered for FAQS Q. Who will be the instructor? A.GreyCampus is very rigorous in its process of selecting the instructors and only go for the best. All our instructors come with an Industry experience of around 10 yrs. Our instructors are well experienced in delivering classroom trainings. Q. Who do I talk to if I need any help? A. You can call , (toll free) or write to for any assistance. Q. What is the pre requisite of the course? A. Having a background with Core Java/SQL and some familiarity with Databases, ETL tools etc. would help make the learning process easier. Q. What are System Requirements? A. To run Hadoop participant s need to carry their own laptops with the following requirements: Multimedia PC with Speakers/Headphones and Microphone Windows 7 (or newer) / Mac OS 10.7 (Lion) or newer Processor: Intel Core i5 or newer RAM: 4GB minimum, 8GB would be recommended Operating System: Windows 7 and above Internet Connectivity: 2MBPS, preferably via LAN (wired connection) Browser: Latest version of Chrome or Firefox

3 COURSE AGENDA 3 SECTION 1: INTRODUCTION TO BIG DATA Introduction to Big Data Concepts of Big Data Velocity Volume Variety Facts about Big Data Introduction to Hadoop Concepts of HadoopNormal Distribution Curve or Bell Curve HDFS (Hadoop Distributed File System) MapReduce Introduction to the Big Data Ecosystem SECTION 2: HDFS ARCHITECTURE HDFS Architecture NameNode SecondaryNameNode DataNode Configuring Hadoop Environment on Ubuntu Dierent modes of installation Pseudo Distributed Mode Standalone mode Fully Distributed Mode HDFS Vs Regular File System SECTION 3: MAPREDUCE Introduction to MapReduce MapReduce Engine JobTracker TaskTracker MapReduce Programming Model Mapper Class Reducer Class Executing MapReduce Jobs MapReduce and Java MapReduce Programs in Java and Eclipse SECTION 4: ADVANCED MAPREDUCE Advanced MapReduce Datatypes in MapReduce Input Formats in MapReduce Output Formats in MapReduce Joins in MapReduce Reduce side join Replicated join Composite join Cartesian Use cases of MapReduce SECTION 5: HIVE Hive The Data Warehouse in Hadoop Concepts of Hive Hive Architecture Metastore Driver Thrift Server Web Interface JDBC / ODBC CLI Introduction to HQL (Hive Query Language) The Hive Data Model Partitions Data types Hive Configuration Sample Hive Queries and commands SECTION 6: PIG PIG Pig Execution Mode Local Mode MapReduce Mode Pig Engine Pig Latin Scripts Interactive Mode Batch Mode Configuring PIG Sample PIG Scripts SECTION 7: HBASE AND ZOOKEEER HBase Distributed Columnar Database NoSQL Movement HBase Architecture Region Servers HBase Storage Introduction to Zookeeper Entities of Zookeeper Leader Follower Observer Zookeeper Data Model Configuring HBase and Zookeeper HBase Examples HBase Use Cases

4 COURSE AGENDA 3 SECTION 8: SQOOP AND FLUME MOVING DATA TO AND FROM HDFS Introduction to Sqoop Sqoop Connectors to RDBMS Importing Data from Sqoop to Hive Sqoop Commands Introduction to Flume Flume Data Model Flume Examples Use Cases of Sqoop and Flume SECTION 9: HADOOP ECOSYSTEM AND COMPONENTS - INTRODUCTION Mahout Oozie Phoenix Squirel Cassandra MongoDB SECTION 10: COMMERCIAL DISTRIBUTIONS OF HADOOP Cloudera Hortonworks MapR

5 ABOUT GREYCAMPUS GreyCampus is a leading provider of training for professional certifications in the areas of project management (PMP, PRINCE2 ), IT service management (ITIL ) and quality management (Six Sigma). Our aim is to provide quality classroom and online training enabling professionals to achieve their career enhancement goals. GreyCampus has trained more than 10,000 professionals across 50 countries in the last twelve months. GreyCampus is a Registered Education Provider for PMI, USA and an accredited training organization for PRINCE2 and ITIL by PEOPLECERT. TRAINED OVER 10,000 PROFESSIONALS REACH ACROSS 50+ COUNTRIES EXAM PASS RATE OF OVER 97 % COURSES ACCREDITED BY LEADING GLOBAL BODIES ACCREDITATIONS & ASSOCIATIONS Provider ID : 3871 DISCLAIMER PMI, PMBOK, PMP CAPM and PMI-ACP are registered marks of the Project Management Institute, Inc. The Swirl logo is a trade mark of AXELOS Limited. ITIL is a registered trade mark of AXELOS Limited. PRINCE2 is a Registered Trade Mark of AXELOS Limited. MSP is a Registered Trade Mark of AXELOS Limited. COBIT is a trademark of ISACA registered in the United States and other countries. CISA is a Registered Trade Mark of the Information Systems Audit and Control Association (ISACA) and the IT Governance Institute. CiSSP is a registered mark of the The International Information Systems Security Certification Consortium ((ISC.)2)

Qsoft Inc www.qsoft-inc.com

Qsoft Inc www.qsoft-inc.com Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:

More information

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture. Big Data Hadoop Administration and Developer Course This course is designed to understand and implement the concepts of Big data and Hadoop. This will cover right from setting up Hadoop environment in

More information

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools

More information

Complete Java Classes Hadoop Syllabus Contact No: 8888022204

Complete Java Classes Hadoop Syllabus Contact No: 8888022204 1) Introduction to BigData & Hadoop What is Big Data? Why all industries are talking about Big Data? What are the issues in Big Data? Storage What are the challenges for storing big data? Processing What

More information

BIG DATA HADOOP TRAINING

BIG DATA HADOOP TRAINING BIG DATA HADOOP TRAINING DURATION 40hrs AVAILABLE BATCHES WEEKDAYS (7.00AM TO 8.30AM) & WEEKENDS (10AM TO 1PM) MODE OF TRAINING AVAILABLE ONLINE INSTRUCTOR LED CLASSROOM TRAINING (MARATHAHALLI, BANGALORE)

More information

ITG Software Engineering

ITG Software Engineering Introduction to Apache Hadoop Course ID: Page 1 Last Updated 12/15/2014 Introduction to Apache Hadoop Course Overview: This 5 day course introduces the student to the Hadoop architecture, file system,

More information

HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM

HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM 1. Introduction 1.1 Big Data Introduction What is Big Data Data Analytics Bigdata Challenges Technologies supported by big data 1.2 Hadoop Introduction

More information

Workshop on Hadoop with Big Data

Workshop on Hadoop with Big Data Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly

More information

ON-DEMAND TRAINING FOR PROFESSIONALS

ON-DEMAND TRAINING FOR PROFESSIONALS FACT SHEET ON-DEMAND TRAINING FOR PROFESSIONALS REP ID : 3871 GET PMP CERTIFIED. GROW IN YOUR CAREER GreyCampus offers Instructor Led Training Program on Project Management Professional (PMP) Certification

More information

Peers Techno log ies Pv t. L td. HADOOP

Peers Techno log ies Pv t. L td. HADOOP Page 1 Peers Techno log ies Pv t. L td. Course Brochure Overview Hadoop is a Open Source from Apache, which provides reliable storage and faster process by using the Hadoop distibution file system and

More information

Certified Big Data and Apache Hadoop Developer VS-1221

Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer Certification Code VS-1221 Vskills certification for Big Data and Apache Hadoop Developer Certification

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

Big Data Course Highlights

Big Data Course Highlights Big Data Course Highlights The Big Data course will start with the basics of Linux which are required to get started with Big Data and then slowly progress from some of the basics of Hadoop/Big Data (like

More information

Bringing Big Data to People

Bringing Big Data to People Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process

More information

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop

More information

Hadoop Job Oriented Training Agenda

Hadoop Job Oriented Training Agenda 1 Hadoop Job Oriented Training Agenda Kapil CK hdpguru@gmail.com Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module

More information

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview Programming Hadoop 5-day, instructor-led BD-106 MapReduce Overview The Client Server Processing Pattern Distributed Computing Challenges MapReduce Defined Google's MapReduce The Map Phase of MapReduce

More information

BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM. An Overview

BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM. An Overview BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM An Overview Contents Contents... 1 BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM... 1 Program Overview... 4 Curriculum... 5 Module 1: Big Data: Hadoop

More information

Data Analyst Program- 0 to 100

Data Analyst Program- 0 to 100 Development Data Analyst Program- 0 to 100 Master the Data Analysis tools like Pig and hive Data Science Build a recommendation engine 1 Data Analyst Program- 0 to 100 HADOOP SCHOOL OF TRAINING Basics

More information

Has been into training Big Data Hadoop and MongoDB from more than a year now

Has been into training Big Data Hadoop and MongoDB from more than a year now NAME NAMIT EXECUTIVE SUMMARY EXPERTISE DELIVERIES Around 10+ years of experience on Big Data Technologies such as Hadoop and MongoDB, Java, Python, Big Data Analytics, System Integration and Consulting

More information

Spring,2015. Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE

Spring,2015. Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE Spring,2015 Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE Contents: Briefly About Big Data Management What is hive? Hive Architecture Working

More information

Communicating with the Elephant in the Data Center

Communicating with the Elephant in the Data Center Communicating with the Elephant in the Data Center Who am I? Instructor Consultant Opensource Advocate http://www.laubersoltions.com sml@laubersolutions.com Twitter: @laubersm Freenode: laubersm Outline

More information

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop

More information

Introduction to Big Data Training

Introduction to Big Data Training Introduction to Big Data Training The quickest way to be introduce with NOSQL/BIG DATA offerings Learn and experience Big Data Solutions including Hadoop HDFS, Map Reduce, NoSQL DBs: Document Based DB

More information

Deploying Hadoop with Manager

Deploying Hadoop with Manager Deploying Hadoop with Manager SUSE Big Data Made Easier Peter Linnell / Sales Engineer plinnell@suse.com Alejandro Bonilla / Sales Engineer abonilla@suse.com 2 Hadoop Core Components 3 Typical Hadoop Distribution

More information

ITG Software Engineering

ITG Software Engineering Introduction to Cloudera Course ID: Page 1 Last Updated 12/15/2014 Introduction to Cloudera Course : This 5 day course introduces the student to the Hadoop architecture, file system, and the Hadoop Ecosystem.

More information

BIG DATA - HADOOP PROFESSIONAL amron

BIG DATA - HADOOP PROFESSIONAL amron 0 Training Details Course Duration: 30-35 hours training + assignments + actual project based case studies Training Materials: All attendees will receive: Assignment after each module, video recording

More information

Implement Hadoop jobs to extract business value from large and varied data sets

Implement Hadoop jobs to extract business value from large and varied data sets Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to

More information

TRAINING PROGRAM ON BIGDATA/HADOOP

TRAINING PROGRAM ON BIGDATA/HADOOP Course: Training on Bigdata/Hadoop with Hands-on Course Duration / Dates / Time: 4 Days / 24th - 27th June 2015 / 9:30-17:30 Hrs Venue: Eagle Photonics Pvt Ltd First Floor, Plot No 31, Sector 19C, Vashi,

More information

Constructing a Data Lake: Hadoop and Oracle Database United!

Constructing a Data Lake: Hadoop and Oracle Database United! Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.

More information

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

More information

[Type text] Week. National summer training program on. Big Data & Hadoop. Why big data & Hadoop is important?

[Type text] Week. National summer training program on. Big Data & Hadoop. Why big data & Hadoop is important? 1 Week National summer training program on Big Data & Hadoop Why big data & Hadoop is important? Highlights of Big Data & Hadoop Implement a Hadoop Project Learn to write Complex MapReduce programs Perform

More information

Microsoft SQL Server 2012 with Hadoop

Microsoft SQL Server 2012 with Hadoop Microsoft SQL Server 2012 with Hadoop Debarchan Sarkar Chapter No. 1 "Introduction to Big Data and Hadoop" In this package, you will find: A Biography of the author of the book A preview chapter from the

More information

MySQL and Hadoop. Percona Live 2014 Chris Schneider

MySQL and Hadoop. Percona Live 2014 Chris Schneider MySQL and Hadoop Percona Live 2014 Chris Schneider About Me Chris Schneider, Database Architect @ Groupon Spent the last 10 years building MySQL architecture for multiple companies Worked with Hadoop for

More information

Hadoop Ecosystem B Y R A H I M A.

Hadoop Ecosystem B Y R A H I M A. Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open

More information

Next Gen Hadoop Gather around the campfire and I will tell you a good YARN

Next Gen Hadoop Gather around the campfire and I will tell you a good YARN Next Gen Hadoop Gather around the campfire and I will tell you a good YARN Akmal B. Chaudhri* Hortonworks *about.me/akmalchaudhri My background ~25 years experience in IT Developer (Reuters) Academic (City

More information

<Insert Picture Here> Big Data

<Insert Picture Here> Big Data Big Data Kevin Kalmbach Principal Sales Consultant, Public Sector Engineered Systems Program Agenda What is Big Data and why it is important? What is your Big

More information

Big Data and Hadoop. Module 1: Introduction to Big Data and Hadoop. Module 2: Hadoop Distributed File System. Module 3: MapReduce

Big Data and Hadoop. Module 1: Introduction to Big Data and Hadoop. Module 2: Hadoop Distributed File System. Module 3: MapReduce Big Data and Hadoop Module 1: Introduction to Big Data and Hadoop Learn about Big Data and the shortcomings of the prevailing solutions for Big Data issues. You will also get to know, how Hadoop eradicates

More information

Ankush Cluster Manager - Hadoop2 Technology User Guide

Ankush Cluster Manager - Hadoop2 Technology User Guide Ankush Cluster Manager - Hadoop2 Technology User Guide Ankush User Manual 1.5 Ankush User s Guide for Hadoop2, Version 1.5 This manual, and the accompanying software and other documentation, is protected

More information

Hadoop Development & BI- 0 to 100

Hadoop Development & BI- 0 to 100 Development Master the Data Analysis tools like Pig and hive Data Science Hadoop Development & BI- 0 to 100 Build a recommendation engine Hadoop Development - 0 to 100 HADOOP SCHOOL OF TRAINING Basics

More information

Hadoop implementation of MapReduce computational model. Ján Vaňo

Hadoop implementation of MapReduce computational model. Ján Vaňo Hadoop implementation of MapReduce computational model Ján Vaňo What is MapReduce? A computational model published in a paper by Google in 2004 Based on distributed computation Complements Google s distributed

More information

Hadoop 101. Lars George. NoSQL- Ma4ers, Cologne April 26, 2013

Hadoop 101. Lars George. NoSQL- Ma4ers, Cologne April 26, 2013 Hadoop 101 Lars George NoSQL- Ma4ers, Cologne April 26, 2013 1 What s Ahead? Overview of Apache Hadoop (and related tools) What it is Why it s relevant How it works No prior experience needed Feel free

More information

Big Data Too Big To Ignore

Big Data Too Big To Ignore Big Data Too Big To Ignore Geert! Big Data Consultant and Manager! Currently finishing a 3 rd Big Data project! IBM & Cloudera Certified! IBM & Microsoft Big Data Partner 2 Agenda! Defining Big Data! Introduction

More information

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give

More information

t] open source Hadoop Beginner's Guide ij$ data avalanche Garry Turkington Learn how to crunch big data to extract meaning from

t] open source Hadoop Beginner's Guide ij$ data avalanche Garry Turkington Learn how to crunch big data to extract meaning from Hadoop Beginner's Guide Learn how to crunch big data to extract meaning from data avalanche Garry Turkington [ PUBLISHING t] open source I I community experience distilled ftu\ ij$ BIRMINGHAMMUMBAI ')

More information

ON-DEMAND TRAINING FOR PROFESSIONALS

ON-DEMAND TRAINING FOR PROFESSIONALS ON-DEMAND TRAINING FOR PROFESSIONALS ONLINE SELF LEARNING LIVE ONLINE TRAINING IN-PERSON CLASSROOM TRAINING PROJECT MANAGEMENT QUALITY MANAGEMENT DATA ANALYTICS & CLOUD COMPUTING WORKPLACE TOOLS AGILE

More information

Virtual Machine (VM) These VMs are to be used for teaching: they are not workstations for calculation.

Virtual Machine (VM) These VMs are to be used for teaching: they are not workstations for calculation. Computer and Software Infrastructure Available to Teachers and Students of the MSc in Big Data Virtual Machine (VM) All students have at their disposal a VM Windows 7 64-bit, 3 GB RAM, 1 vcpu. This VM

More information

Lecture 2 (08/31, 09/02, 09/09): Hadoop. Decisions, Operations & Information Technologies Robert H. Smith School of Business Fall, 2015

Lecture 2 (08/31, 09/02, 09/09): Hadoop. Decisions, Operations & Information Technologies Robert H. Smith School of Business Fall, 2015 Lecture 2 (08/31, 09/02, 09/09): Hadoop Decisions, Operations & Information Technologies Robert H. Smith School of Business Fall, 2015 K. Zhang BUDT 758 What we ll cover Overview Architecture o Hadoop

More information

Dominik Wagenknecht Accenture

Dominik Wagenknecht Accenture Dominik Wagenknecht Accenture Improving Mainframe Performance with Hadoop October 17, 2014 Organizers General Partner Top Media Partner Media Partner Supporters About me Dominik Wagenknecht Accenture Vienna

More information

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763 International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 A Discussion on Testing Hadoop Applications Sevuga Perumal Chidambaram ABSTRACT The purpose of analysing

More information

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

More information

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social

More information

Big Data Training - Hackveda

Big Data Training - Hackveda Big Data Training - Hackveda Become a Hackveda Certified Big Data Professional - (Beginner) Skill level: Beginner Training fee: INR 9000 only (Topics covered: 108) Chief Trainer: Mr. Devanshu Shukla Training

More information

From Relational to Hadoop Part 1: Introduction to Hadoop. Gwen Shapira, Cloudera and Danil Zburivsky, Pythian

From Relational to Hadoop Part 1: Introduction to Hadoop. Gwen Shapira, Cloudera and Danil Zburivsky, Pythian From Relational to Hadoop Part 1: Introduction to Hadoop Gwen Shapira, Cloudera and Danil Zburivsky, Pythian Tutorial Logistics 2 Got VM? 3 Grab a USB USB contains: Cloudera QuickStart VM Slides Exercises

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

Apache Hadoop: Past, Present, and Future

Apache Hadoop: Past, Present, and Future The 4 th China Cloud Computing Conference May 25 th, 2012. Apache Hadoop: Past, Present, and Future Dr. Amr Awadallah Founder, Chief Technical Officer aaa@cloudera.com, twitter: @awadallah Hadoop Past

More information

How to Hadoop Without the Worry: Protecting Big Data at Scale

How to Hadoop Without the Worry: Protecting Big Data at Scale How to Hadoop Without the Worry: Protecting Big Data at Scale SESSION ID: CDS-W06 Davi Ottenheimer Senior Director of Trust EMC Corporation @daviottenheimer Big Data Trust. Redefined Transparency Relevance

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

HADOOP. Revised 10/19/2015

HADOOP. Revised 10/19/2015 HADOOP Revised 10/19/2015 This Page Intentionally Left Blank Table of Contents Hortonworks HDP Developer: Java... 1 Hortonworks HDP Developer: Apache Pig and Hive... 2 Hortonworks HDP Developer: Windows...

More information

Hadoop Introduction. 2012 coreservlets.com and Dima May. 2012 coreservlets.com and Dima May

Hadoop Introduction. 2012 coreservlets.com and Dima May. 2012 coreservlets.com and Dima May 2012 coreservlets.com and Dima May Hadoop Introduction Originals of slides and source code for examples: http://www.coreservlets.com/hadoop-tutorial/ Also see the customized Hadoop training courses (onsite

More information

Session: Big Data get familiar with Hadoop to use your unstructured data Udo Brede Dell Software. 22 nd October 2013 10:00 Sesión B - DB2 LUW

Session: Big Data get familiar with Hadoop to use your unstructured data Udo Brede Dell Software. 22 nd October 2013 10:00 Sesión B - DB2 LUW Session: Big Data get familiar with Hadoop to use your unstructured data Udo Brede Dell Software 22 nd October 2013 10:00 Sesión B - DB2 LUW 1 Agenda Big Data The Technical Challenges Architecture of Hadoop

More information

How Cisco IT Built Big Data Platform to Transform Data Management

How Cisco IT Built Big Data Platform to Transform Data Management Cisco IT Case Study August 2013 Big Data Analytics How Cisco IT Built Big Data Platform to Transform Data Management EXECUTIVE SUMMARY CHALLENGE Unlock the business value of large data sets, including

More information

Upcoming Announcements

Upcoming Announcements Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within

More information

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Talend Open Studio for Big Data. Release Notes 5.2.1

Talend Open Studio for Big Data. Release Notes 5.2.1 Talend Open Studio for Big Data Release Notes 5.2.1 Talend Open Studio for Big Data Copyleft This documentation is provided under the terms of the Creative Commons Public License (CCPL). For more information

More information

HADOOP BIG DATA DEVELOPER TRAINING AGENDA

HADOOP BIG DATA DEVELOPER TRAINING AGENDA HADOOP BIG DATA DEVELOPER TRAINING AGENDA About the Course This course is the most advanced course available to Software professionals This has been suitably designed to help Big Data Developers and experts

More information

Internals of Hadoop Application Framework and Distributed File System

Internals of Hadoop Application Framework and Distributed File System International Journal of Scientific and Research Publications, Volume 5, Issue 7, July 2015 1 Internals of Hadoop Application Framework and Distributed File System Saminath.V, Sangeetha.M.S Abstract- Hadoop

More information

The Greenplum Analytics Workbench

The Greenplum Analytics Workbench The Greenplum Analytics Workbench External Overview 1 The Greenplum Analytics Workbench Definition Is a 1000-node Hadoop Cluster. Pre-configured with publicly available data sets. Contains the entire Hadoop

More information

COURSE CONTENT Big Data and Hadoop Training

COURSE CONTENT Big Data and Hadoop Training COURSE CONTENT Big Data and Hadoop Training 1. Meet Hadoop Data! Data Storage and Analysis Comparison with Other Systems RDBMS Grid Computing Volunteer Computing A Brief History of Hadoop Apache Hadoop

More information

Supported Platforms. HP Vertica Analytic Database. Software Version: 7.0.x

Supported Platforms. HP Vertica Analytic Database. Software Version: 7.0.x HP Vertica Analytic Database Software Version: 7.0.x Document Release Date: 5/7/2014 Legal Notices Warranty The only warranties for HP products and services are set forth in the express warranty statements

More information

#TalendSandbox for Big Data

#TalendSandbox for Big Data Evalua&on von Apache Hadoop mit der #TalendSandbox for Big Data Julien Clarysse @whatdoesdatado @talend 2015 Talend Inc. 1 Connecting the Data-Driven Enterprise 2 Talend Overview Founded in 2006 BRAND

More information

Fundamentals Curriculum HAWQ

Fundamentals Curriculum HAWQ Fundamentals Curriculum Pivotal Hadoop 2.1 HAWQ Education Services zdata Inc. 660 4th St. Ste. 176 San Francisco, CA 94107 t. 415.890.5764 zdatainc.com Pivotal Hadoop & HAWQ Fundamentals Course Description

More information

Bright Cluster Manager

Bright Cluster Manager Bright Cluster Manager A Unified Management Solution for HPC and Hadoop Martijn de Vries CTO Introduction Architecture Bright Cluster CMDaemon Cluster Management GUI Cluster Management Shell SOAP/ JSONAPI

More information

Big Data in a Relational World Presented by: Kerry Osborne JPMorgan Chase December, 2012

Big Data in a Relational World Presented by: Kerry Osborne JPMorgan Chase December, 2012 Big Data in a Relational World Presented by: Kerry Osborne JPMorgan Chase December, 2012 whoami Never Worked for Oracle Worked with Oracle DB Since 1982 (V2) Working with Exadata since early 2010 Work

More information

Chase Wu New Jersey Ins0tute of Technology

Chase Wu New Jersey Ins0tute of Technology CS 698: Special Topics in Big Data Chapter 4. Big Data Analytics Platforms Chase Wu New Jersey Ins0tute of Technology Some of the slides have been provided through the courtesy of Dr. Ching-Yung Lin at

More information

Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks

Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks Hadoop Introduction Olivier Renault Solution Engineer - Hortonworks Hortonworks A Brief History of Apache Hadoop Apache Project Established Yahoo! begins to Operate at scale Hortonworks Data Platform 2013

More information

Lessons Learned: Building a Big Data Research and Education Infrastructure

Lessons Learned: Building a Big Data Research and Education Infrastructure Lessons Learned: Building a Big Data Research and Education Infrastructure G. Hsieh, R. Sye, S. Vincent and W. Hendricks Department of Computer Science, Norfolk State University, Norfolk, Virginia, USA

More information

Native Connectivity to Big Data Sources in MSTR 10

Native Connectivity to Big Data Sources in MSTR 10 Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single

More information

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look IBM BigInsights Has Potential If It Lives Up To Its Promise By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based

More information

Hadoop: The Definitive Guide

Hadoop: The Definitive Guide FOURTH EDITION Hadoop: The Definitive Guide Tom White Beijing Cambridge Famham Koln Sebastopol Tokyo O'REILLY Table of Contents Foreword Preface xvii xix Part I. Hadoop Fundamentals 1. Meet Hadoop 3 Data!

More information

Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015

Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015 Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015 We Do Hadoop Fall 2014 Page 1 HDP delivers a comprehensive data management platform GOVERNANCE Hortonworks Data Platform

More information

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning

More information

BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand?

BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand? BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand? The Big Data Buzz big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database

More information

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after

More information

PassTest. Bessere Qualität, bessere Dienstleistungen!

PassTest. Bessere Qualität, bessere Dienstleistungen! PassTest Bessere Qualität, bessere Dienstleistungen! Q&A Exam : CCD-410 Title : Cloudera Certified Developer for Apache Hadoop (CCDH) Version : DEMO 1 / 4 1.When is the earliest point at which the reduce

More information

Data Integration Checklist

Data Integration Checklist The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media

More information

Virtual Machine (VM) For Hadoop Training

Virtual Machine (VM) For Hadoop Training 2012 coreservlets.com and Dima May Virtual Machine (VM) For Hadoop Training Originals of slides and source code for examples: http://www.coreservlets.com/hadoop-tutorial/ Also see the customized Hadoop

More information

Agenda. ! Strengths of PostgreSQL. ! Strengths of Hadoop. ! Hadoop Community. ! Use Cases

Agenda. ! Strengths of PostgreSQL. ! Strengths of Hadoop. ! Hadoop Community. ! Use Cases Postgres & Hadoop Agenda! Strengths of PostgreSQL! Strengths of Hadoop! Hadoop Community! Use Cases Best of Both World Postgres Hadoop World s most advanced open source database solution Enterprise class

More information

Hive Interview Questions

Hive Interview Questions HADOOPEXAM LEARNING RESOURCES Hive Interview Questions www.hadoopexam.com Please visit www.hadoopexam.com for various resources for BigData/Hadoop/Cassandra/MongoDB/Node.js/Scala etc. 1 Professional Training

More information

AtScale Intelligence Platform

AtScale Intelligence Platform AtScale Intelligence Platform PUT THE POWER OF HADOOP IN THE HANDS OF BUSINESS USERS. Connect your BI tools directly to Hadoop without compromising scale, performance, or control. TURN HADOOP INTO A HIGH-PERFORMANCE

More information

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved. Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!

More information

Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 15

Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 15 Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases Lecture 15 Big Data Management V (Big-data Analytics / Map-Reduce) Chapter 16 and 19: Abideboul et. Al. Demetris

More information

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required. What is this course about? This course is an overview of Big Data tools and technologies. It establishes a strong working knowledge of the concepts, techniques, and products associated with Big Data. Attendees

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2016 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

Cisco IT Hadoop Journey

Cisco IT Hadoop Journey Cisco IT Hadoop Journey Alex Garbarini, IT Engineer, Cisco 2015 MapR Technologies 1 Agenda Hadoop Platform Timeline Key Decisions / Lessons Learnt Data Lake Hadoop s place in IT Data Platforms Use Cases

More information

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013 Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software SC13, November, 2013 Agenda Abstract Opportunity: HPC Adoption of Big Data Analytics on Apache

More information

L1: Introduction to Hadoop

L1: Introduction to Hadoop L1: Introduction to Hadoop Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revision: December 1, 2014 Today we are going to learn... 1 General

More information

Like what you hear? Tweet it using: #Sec360

Like what you hear? Tweet it using: #Sec360 Like what you hear? Tweet it using: #Sec360 HADOOP SECURITY Like what you hear? Tweet it using: #Sec360 HADOOP SECURITY About Robert: School: UW Madison, U St. Thomas Programming: 15 years, C, C++, Java

More information

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform

More information