Click the link below to get more detail http://www.examkill.com/

Exam Code: Apache-Hadoop-Developer
Exam Name: Hadoop 2.0 Certification Exam for Pig and Hive Developer
Vendor Name: Hortonworks
Edition: DEMO

Question: 1
Identify the MapReduce v2 (MRv2 / YARN) daemon responsible for launching application containers and monitoring application resource usage.
A. ResourceManager
B. NodeManager
C. ApplicationMaster
D. ApplicationMasterService
E. TaskTracker
F. JobTracker

Answer: B
Reference: Apache Hadoop YARN - Concepts & Applications

Question: 2
You want to run Hadoop jobs on your development workstation for testing before you submit them to your production cluster. Which mode of operation in Hadoop allows you to most closely simulate a production cluster while using a single machine?
A. Run all the nodes in your production cluster as virtual machines on your development workstation.
B. Run the hadoop command with the -jt local and the -fs file:/// options.
C. Run the DataNode, TaskTracker, NameNode and JobTracker daemons on a single machine.
D. Run simldooop, the Apache open-source software for simulating Hadoop clusters.

Answer: C
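Answer C describes pseudo-distributed mode, in which all daemons run on one machine but still communicate over sockets as they would in a real cluster. A minimal Hadoop 1.x configuration for this mode looks roughly like the following sketch; the localhost ports shown (9000 for the NameNode, 9001 for the JobTracker) are conventional choices, not requirements.

```xml
<!-- core-site.xml: point the default filesystem at a local NameNode -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml: with a single DataNode, use a replication factor of 1 -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

<!-- mapred-site.xml: run the JobTracker locally as well -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
```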

Question: 3
You have the following key-value pairs as output from your Map task:
(the, 1) (fox, 1) (faster, 1) (than, 1) (the, 1) (dog, 1)
How many keys will be passed to the Reducer's reduce method?
A. Six
B. Five
C. Four
D. Two
E. One
F. Three

Answer: B
The two (the, 1) pairs are grouped under the single key "the" during the shuffle, so only five distinct keys reach the reduce method.

Question: 4
Which project gives you a distributed, scalable data store that allows random, realtime read/write access to hundreds of terabytes of data?
A. HBase
B. Hue
C. Pig
D. Hive
E. Oozie
F. Flume
G. Sqoop

Answer: A
Use Apache HBase when you need random, realtime read/write access to your Big Data. Note: This project's goal is the hosting of very large tables (billions of rows by millions of columns) atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS. Features: linear and modular scalability; strictly consistent reads and writes; automatic and configurable sharding of tables; automatic failover support between RegionServers; convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables; an easy-to-use Java API for client access; block cache and Bloom filters for real-time queries; query predicate push-down via server-side filters; a Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding

options; an extensible JRuby-based (JIRB) shell; and support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX.
Reference: http://hbase.apache.org/ (When would I use HBase? First sentence)

Question: 5
Which one of the following statements describes a Pig bag, tuple, and map, respectively?
A. Unordered collection of maps, ordered collection of tuples, ordered set of key/value pairs
B. Unordered collection of tuples, ordered set of fields, set of key/value pairs
C. Ordered set of fields, ordered collection of tuples, ordered collection of maps
D. Ordered collection of maps, ordered collection of bags, and unordered set of key/value pairs

Answer: B

Question: 6
Which HDFS command copies an HDFS file named foo to the local filesystem as LocalFoo?
A. hadoop fs -get foo LocalFoo
B. hadoop -cp foo LocalFoo
C. hadoop fs -ls foo
D. hadoop fs -put foo LocalFoo

Answer: A

Question: 7
You are developing a MapReduce job for sales reporting. The mapper will process input keys representing the year (IntWritable) and input values representing product identifiers (Text). Identify what determines the data types used by the Mapper for a given job.
A. The key and value types specified in the JobConf.setMapInputKeyClass and JobConf.setMapInputValuesClass methods
B. The data types specified in the HADOOP_MAP_DATATYPES environment variable
C. The mapper-specification.xml file submitted with the job determines the mapper's input key and value types.
D. The InputFormat used by the job determines the mapper's input key and value types.

Answer: D
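The (LongWritable, Text) pairs produced by Hadoop's default TextInputFormat can be imitated in a few lines of Python. This is an illustrative sketch of the pairing only (byte offset of each line as the key, line contents as the value), not actual Hadoop code.

```python
def text_input_format(data: bytes):
    """Yield (byte_offset, line) pairs, mimicking how Hadoop's default
    TextInputFormat hands (LongWritable, Text) records to the mapper."""
    offset = 0
    for line in data.splitlines(keepends=True):
        # The key is the byte offset of the line's start within the file;
        # the value is the line's text without the trailing newline.
        yield offset, line.rstrip(b"\n").decode("utf-8")
        offset += len(line)

records = list(text_input_format(b"the fox\nfaster than\nthe dog\n"))
# The first record starts at offset 0; "faster than" starts at byte 8.
```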

The input types fed to the mapper are controlled by the InputFormat used. The default input format, TextInputFormat, will load data in as (LongWritable, Text) pairs. The long value is the byte offset of the line in the file; the Text object holds the string contents of the line. Note: The data types emitted by the reducer are identified by setOutputKeyClass() and setOutputValueClass(). By default, it is assumed that these are the output types of the mapper as well. If this is not the case, the setMapOutputKeyClass() and setMapOutputValueClass() methods of the JobConf class will override them.
Reference: Yahoo! Hadoop Tutorial, THE DRIVER METHOD

Question: 8
All keys used for intermediate output from mappers must:
A. Implement a splittable compression algorithm.
B. Be a subclass of FileInputFormat.
C. Implement WritableComparable.
D. Override isSplitable.
E. Implement a comparator for speedy sorting.

Answer: C
The MapReduce framework operates exclusively on <key, value> pairs: the framework views the input to the job as a set of <key, value> pairs and produces a set of <key, value> pairs as the output of the job, conceivably of different types. The key and value classes have to be serializable by the framework and hence need to implement the Writable interface. Additionally, the key classes have to implement the WritableComparable interface to facilitate sorting by the framework.
Reference: MapReduce Tutorial

Question: 9
Review the following data and Pig code:

What command to define B would produce the output (M,62,95102) when invoking the DUMP operator on B?
A. B = FILTER A BY (zip == '95102' AND gender == 'M');
B. B = FOREACH A BY (gender == 'M' AND zip == '95102');
C. B = JOIN A BY (gender == 'M' AND zip == '95102');
D. B = GROUP A BY (zip == '95102' AND gender == 'M');

Answer: A

Question: 10
Assuming the following Hive query executes successfully:
Which one of the following statements describes the result set?
A. A bigram of the top 80 sentences that contain the substring "you are" in the lines column of the inputdata table.
B. An 80-value ngram of sentences that contain the words "you" or "are" in the lines column of the inputdata table.
C. A trigram of the top 80 sentences that contain "you are" followed by a null space in the lines column of the inputdata table.
D. A frequency distribution of the top 80 words that follow the subsequence "you are" in the lines column of the inputdata table.

Answer: D
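The behavior described in answer D (which in Hive would typically come from the context_ngrams() UDF) can be approximated in plain Python. The sketch below counts the word following each occurrence of the context phrase; the sample sentences are invented for illustration.

```python
from collections import Counter

def words_after(lines, context=("you", "are"), top_k=80):
    """Build a frequency distribution of the word that follows each
    occurrence of the context phrase, roughly like Hive's
    context_ngrams(sentences(lines), array('you', 'are', null), 80)."""
    counts = Counter()
    n = len(context)
    for line in lines:
        words = line.lower().split()
        for i in range(len(words) - n):
            if tuple(words[i:i + n]) == context:
                counts[words[i + n]] += 1
    # most_common returns the top_k (word, count) pairs, highest count first.
    return counts.most_common(top_k)

lines = ["you are welcome", "you are late", "You are welcome here"]
print(words_after(lines))  # [('welcome', 2), ('late', 1)]
```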
