COURSE: ADMINISTRATOR FOR APACHE HADOOP



SAMPLE TEST FOR THE CERTIFICATION EXAM
www.formacionhadoop.com

Question: 1

A developer has submitted a long-running MapReduce job with the wrong data sets. You want to kill the running MapReduce job so that a new job with the correct data sets can be started. What method can be used to terminate the submitted MapReduce job?

A. Use CTRL-C from the terminal where the MapReduce job was started.
B. Open a remote terminal to the node running the ApplicationMaster and kill the JVM.
C. hadoop datanode -rollback
D. yarn application -kill <application_id>

Answer: D
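The command in option D can also be issued from a script. The following is a minimal sketch, assuming the `yarn` CLI is on the PATH of the machine running it; the helper names and the application ID shown are hypothetical and only illustrate the shape of the call.

```python
import re
import subprocess

# YARN application IDs have the form application_<clusterTimestamp>_<sequence>.
APP_ID_PATTERN = re.compile(r"^application_\d+_\d+$")


def build_kill_command(application_id):
    """Return the argv list for `yarn application -kill <application_id>`."""
    if not APP_ID_PATTERN.match(application_id):
        raise ValueError(f"not a YARN application ID: {application_id!r}")
    return ["yarn", "application", "-kill", application_id]


def kill_application(application_id):
    """Run the kill command. Requires the `yarn` CLI to be installed."""
    return subprocess.run(build_kill_command(application_id), check=True)


if __name__ == "__main__":
    # Hypothetical application ID, used only for illustration.
    print(" ".join(build_kill_command("application_1428487296152_0001")))
```

Validating the ID before building the command keeps a malformed argument from reaching the CLI.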

Question: 2

A specific node in your cluster appears to be running slower than other nodes with the same hardware configuration. You suspect that the system is swapping memory to disk due to over-allocation of resources. Which commands may be used to view the memory and swap usage on the system?

A. jps
B. lsswap
C. top
D. memusage
E. free
F. df
G. vmstat

Answer: C, E, G
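On Linux, `free` reports figures that the kernel exposes in `/proc/meminfo`. As a rough sketch of the same check done programmatically, the snippet below parses text in that format and computes the swap in use. The sample values are invented for illustration, and the function names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(meminfo):
    """Swap in use is the total swap minus the free swap."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"]


# Invented sample in the /proc/meminfo layout (not from a real node).
SAMPLE = """MemTotal:       16384000 kB
MemFree:         1024000 kB
SwapTotal:       2048000 kB
SwapFree:        1536000 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("swap used (kB):", swap_used_kb(info))
```

On a real node, the input would come from reading `/proc/meminfo` instead of `SAMPLE`.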

Question: 3

What must you do if you are running a Hadoop cluster with a single NameNode and six DataNodes, and you wish to change the configuration of all DataNodes?

A. You must restart the NameNode daemon to apply the changes to the cluster.
B. You must modify the configuration files on your NameNode where the master configuration files reside for all DataNodes.
C. You must restart all six DataNode daemons to apply the changes.
D. You don't need to restart any daemon, as they will pick up changes automatically.

Answer: C

Question: 4

You are running a Hadoop cluster with a NameNode on host mynamenode. What are two ways you can determine available HDFS space in your cluster?

A. Connect to http://mynamenode:50070/ and locate the DFS Remaining value.
B. Run hdfs dfsadmin -SpaceQuota and subtract DFS Used% from Configured Capacity.
C. Run hdfs dfsadmin -report and locate the DFS Remaining value.
D. Run hadoop fs -du / and locate the DFS Remaining value.

Answer: A, C
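To illustrate option C, the sketch below extracts the cluster-wide DFS Remaining value from the text printed by `hdfs dfsadmin -report`. It assumes the summary line has the form `DFS Remaining: <bytes> (<human-readable size>)`; the sample report is invented, and the exact layout may differ between Hadoop versions.

```python
import re


def dfs_remaining_bytes(report_text):
    """Return the first 'DFS Remaining' byte count found in the report.

    The cluster summary is assumed to appear before the per-DataNode
    sections, so the first match is taken as the cluster-wide value.
    """
    match = re.search(r"^DFS Remaining:\s*(\d+)", report_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'DFS Remaining' line found")
    return int(match.group(1))


# Invented sample output (assumed layout, not captured from a real cluster).
SAMPLE_REPORT = """Configured Capacity: 1000000000 (953.67 MB)
DFS Remaining: 600000000 (572.20 MB)
DFS Used: 300000000 (286.10 MB)
"""

if __name__ == "__main__":
    print(dfs_remaining_bytes(SAMPLE_REPORT))
```

In practice the report text could be captured with `subprocess.run(["hdfs", "dfsadmin", "-report"], capture_output=True, text=True)` on a node where the `hdfs` CLI is available.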

Question: 5

You set the value of mapred.child.java.opts to -Xmx200M on all TaskTrackers in the cluster. You set the same configuration parameter to -Xmx500M on the JobTracker. What size heap will a Map task running on the cluster have?

A. 64MB
B. 128MB
C. 200MB
D. 256MB
E. 500MB
F. The job will fail because of the discrepancy

Answer: C

Question: 6

After a file has been written to HDFS, which of the following operations can you perform?

A. You can delete the file
B. You can update the file's contents
C. You can overwrite the file by creating a new file with the same name
D. You can move the file
E. You can rename the file

Answer: A, D, E

Question: 7

Identify which is a recommended configuration of disk drives for a DataNode.

A. 48 2TB disk drives in a RAID configuration
B. One 3TB disk drive
C. 12 1TB disk drives in a RAID configuration
D. 12 2TB disk drives in a JBOD configuration

Answer: D

Question: 8

Which tool is best suited to import a portion of a relational database every day as files into HDFS, and generate Java classes to interact with that imported data?

A. Pig
B. Hive
C. Sqoop
D. Hue
E. Flume
F. Oozie

Answer: C

Question: 9

The NameNode needs to know which DataNodes hold each HDFS block. How is that block location information managed?

A. The DataNodes communicate block locations to each other, peer-to-peer on startup and every 60 minutes (a changeable parameter) called the block report.
B. The NameNode stores the block locations in RAM and in the fsimage file.
C. The NameNode stores the block locations in the fsimage file only.
D. The NameNode stores the block locations in RAM. They are never stored on disk.

Answer: D
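The behaviour in answer D can be pictured with a toy model: block locations live only in an in-memory map that is rebuilt from the reports each DataNode sends. This is a simplified sketch for illustration, not Hadoop's implementation; the class, method names, and block IDs are hypothetical.

```python
from collections import defaultdict


class ToyNameNode:
    """Toy model: block locations are kept in RAM only."""

    def __init__(self):
        # Maps a block ID to the set of DataNodes that reported it.
        self.block_locations = defaultdict(set)

    def receive_block_report(self, datanode, block_ids):
        """Replace everything known about one DataNode with its new report."""
        for holders in self.block_locations.values():
            holders.discard(datanode)
        for block_id in block_ids:
            self.block_locations[block_id].add(datanode)

    def locate(self, block_id):
        """Return the DataNodes currently known to hold the block."""
        return sorted(self.block_locations.get(block_id, ()))

    def restart(self):
        """Nothing was persisted, so a restart forgets every location."""
        self.block_locations.clear()


if __name__ == "__main__":
    nn = ToyNameNode()
    nn.receive_block_report("dn1", ["blk_1", "blk_2"])
    nn.receive_block_report("dn2", ["blk_1"])
    print(nn.locate("blk_1"))
    nn.restart()
    print(nn.locate("blk_1"))
```

After `restart()`, the map stays empty until the DataNodes report again, which mirrors why the locations do not need to be written to disk.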

Question: 10

Identify the function performed by a Secondary NameNode daemon configured to run with a single NameNode.

A. It combines the fsimage and edits files produced by the NameNode.
B. It acts as a standby NameNode, providing a high availability profile for clients.
C. It provides an alternate HDFS endpoint when the NameNode is too busy.
D. It performs real-time backups of the NameNode.

Answer: A
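The merge described in answer A can be sketched as replaying a log of edits on top of a snapshot. The model below is deliberately simplified: the namespace is just a set of paths and the edit operations (`create`, `delete`, `rename`) are hypothetical stand-ins, not the real fsimage or edits file formats.

```python
def apply_edits(fsimage, edits):
    """Replay logged edits on a copy of the namespace snapshot."""
    namespace = set(fsimage)
    for op, *args in edits:
        if op == "create":
            namespace.add(args[0])
        elif op == "delete":
            namespace.discard(args[0])
        elif op == "rename":
            src, dst = args
            namespace.remove(src)
            namespace.add(dst)
        else:
            raise ValueError(f"unknown edit operation: {op!r}")
    return namespace


def checkpoint(fsimage, edits):
    """Merge the snapshot and the edit log; the log then starts empty."""
    return apply_edits(fsimage, edits), []


if __name__ == "__main__":
    image = {"/a", "/b"}
    log = [("create", "/c"), ("rename", "/a", "/d"), ("delete", "/b")]
    new_image, new_log = checkpoint(image, log)
    print(sorted(new_image), new_log)
```

The point of the checkpoint is that the edit log does not grow without bound: after the merge, the new snapshot already contains every logged change.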

Contact
administracion@formacionhadoop.com
www.formacionhadoop.com
Twitter: Twitter.com/formacionhadoop
Facebook: Facebook.com/formacionhadoop
LinkedIn: linkedin.com/company/formación-hadoop