HDFS 2015: Past, Present, and Future




Apache: Big Data Europe 2015
HDFS 2015: Past, Present, and Future
9/30/2015
NTT DATA Corporation
Akira Ajisaka
Copyright 2015 NTT DATA Corporation

Self introduction
  Akira Ajisaka (NTT DATA)
  Apache Hadoop Committer
    130+ commits in 2015
  Working on usability
    80+ documentation patches
  "Open-Source Professional Services" team
    Has deployed and supported 10k+ nodes of Hadoop clusters overall for 7 years
    Contributing to Apache Hadoop: 6th in the world with NTT [1]
  [1] The Activities of Apache Hadoop Community 2014 http://ajisakaa.blogspot.com/2015/02/the-activities-of-apache-hadoop.html

About
  Similar to "YARN 2015" presentation by @tshooter
  HDFS is developed faster than YARN
  [Chart: Resolved issues in 2015 (cumulative), HDFS vs. YARN, 1-Jan-15 to 1-Sep-15]
  Need a summary of HDFS new features

Agenda
  Past
  Present
  Future

Past

Past releases
  2.X is the release branch
  1.X and 0.23.X are no longer maintained
  [Figure: release timeline, 2009 to 2015]
    branch-1 (branch-0.20): 0.20.1, 0.20.205, 1.0.0, 1.1.0, 1.2.1 (stable)
    0.21.0, 0.22.0
    0.23.0 to 0.23.11 (final)
    branch-2: 2.0.0-alpha, 2.1.0-beta, 2.2.0 (GA), 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0
    trunk
    Feature labels on the timeline: New append, Security, NameNode Federation, YARN, NameNode HA

Hadoop 2.2 (2013-10-13)
  NameNode High-Availability
    No Single Point of Failure
  Federation
    Multiple NameNodes, multiple namespaces
    Improve scalability
  Snapshots
    Read only point-in-time copy (Copy on Write)
  NFSv3 mount

Hadoop 2.3 (2014-02-20)
  Heterogeneous Storages (Phase 1)
  In-memory caching
    Introduce memory-locality
    Make efficient use of memory in DNs
    [Figure: DFSClient, NameNode, and a DataNode with DISK and Memory]
    1. DFSClient asks NN to cache a file
    2. NN asks DN to cache blocks
    If cached locally, DFSClient reads directly from memory and skips checksum calculation

Hadoop 2.4 (2014-04-07)
  Rolling Upgrades
    No need to wait for hours
  ACLs
    More fine-grained permissions
    Similar to POSIX ACL
    -rw-rw-r-- 3 tester hadoop 129 2015-09-15 12:00 /user/tester/test.txt
    $ hdfs dfs -setfacl -m group:hive:rw- /user/tester/test.txt
    gives read and write permission to hive group
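The ACL check on the slide above can be sketched as follows. This is a simplified model of POSIX-style ACL evaluation, not the NameNode's implementation; the `Inode` class, the `can_access` helper, and the user `alice` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical, simplified file metadata: permission bits plus ACL entries."""
    owner: str
    group: str
    owner_perm: str                                    # e.g. "rw-"
    group_perm: str
    other_perm: str
    named_users: dict = field(default_factory=dict)    # user name -> perm string
    named_groups: dict = field(default_factory=dict)   # group name -> perm string
    mask: str = "rwx"                                  # upper bound for named/group entries


def _allows(perm, want):
    """True if every requested letter ('r', 'w', 'x') appears in perm."""
    return all(ch in perm for ch in want)


def _masked(perm, mask):
    """Keep a permission letter only where the mask carries the same letter."""
    return "".join(p if p == m else "-" for p, m in zip(perm, mask))


def can_access(inode, user, groups, want):
    """Evaluate in POSIX ACL order: owner, named user, group entries, other."""
    if user == inode.owner:
        return _allows(inode.owner_perm, want)
    if user in inode.named_users:
        return _allows(_masked(inode.named_users[user], inode.mask), want)
    group_entries = []
    if inode.group in groups:
        group_entries.append(inode.group_perm)
    group_entries += [p for g, p in inode.named_groups.items() if g in groups]
    if group_entries:
        return any(_allows(_masked(p, inode.mask), want) for p in group_entries)
    return _allows(inode.other_perm, want)


# /user/tester/test.txt from the slide: -rw-rw-r-- tester hadoop
f = Inode("tester", "hadoop", "rw-", "rw-", "r--")
before = can_access(f, "alice", {"hive"}, "w")   # alice falls through to "other"
f.named_groups["hive"] = "rw-"                   # effect of: setfacl -m group:hive:rw-
after = can_access(f, "alice", {"hive"}, "w")    # now matched by the hive entry
```

Before the ACL entry is added, a member of `hive` is treated as "other" and cannot write; afterwards the named group entry grants read and write.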

Hadoop 2.5 (2014-08-11)
  Extended Attributes (XAttrs)
    Similar to extended attributes in Linux
    -rw-r--r-- 3 tester hadoop 129 2015-09-15 12:00 /user/tester/test.txt
  Set XAttrs
    $ hdfs dfs -setfattr -n user.locale -v jp /user/tester/test.txt
    $ hdfs dfs -setfattr -n user.city -v tokyo /user/tester/test.txt
  Get XAttrs
    $ hdfs dfs -getfattr -d /user/tester/test.txt
    # file: /user/tester/test.txt
    user.locale="jp"
    user.city="tokyo"
  Currently used by transparent encryption

Hadoop 2.6 (2014-11-18)
  Hot swap volumes
    Recover from disk failures w/o stopping DNs
  Integrate Apache HTrace (incubating)
    Trace RPCs inside HDFS
    [Figure: timeline across node 1, node 2, and node 3; RPCs link Span A (trace id: 12345, parent: root), Span B (trace id: 12345, parent: A), Span C, and Span D]
    Easy to find parent-child relations
    Finding bottlenecks becomes easier
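To illustrate why parent pointers make bottlenecks easy to find, here is a minimal sketch that rebuilds the span tree and computes the time each span spends outside its direct children. It does not use the HTrace API. The dictionary layout, the durations, and the parents of Span C and Span D are assumptions (the slide only gives the parents of A and B), and child spans are assumed to run one after another.

```python
from collections import defaultdict

# Spans in the spirit of the slide's figure. Durations ("ms") and the parents
# of C and D are hypothetical values chosen for illustration.
spans = [
    {"trace": 12345, "id": "A", "parent": "root", "node": "node 1", "ms": 120},
    {"trace": 12345, "id": "B", "parent": "A", "node": "node 2", "ms": 90},
    {"trace": 12345, "id": "C", "parent": "B", "node": "node 3", "ms": 70},
    {"trace": 12345, "id": "D", "parent": "B", "node": "node 3", "ms": 10},
]


def children_of(spans):
    """Map each parent span id to the ids of its direct children."""
    tree = defaultdict(list)
    for span in spans:
        tree[span["parent"]].append(span["id"])
    return tree


def self_time(spans):
    """Duration of each span minus the durations of its direct children."""
    by_id = {s["id"]: s for s in spans}
    tree = children_of(spans)
    return {
        sid: s["ms"] - sum(by_id[c]["ms"] for c in tree[sid])
        for sid, s in by_id.items()
    }


times = self_time(spans)
bottleneck = max(times, key=times.get)   # the span with the largest self time
```

With these made-up numbers the bottleneck is Span C on node 3, even though Span A has the longest total duration.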

Hadoop 2.6 (2014-11-18) (Cont'd)
  Heterogeneous Storages (Phase 2)
    Archival Storage
    Memory as storage tier
  Transparent Encryption

Heterogeneous Storages (Hadoop 2.6)
  Problem
    SSD is getting cheaper
    Want to store hot data in SSD to achieve higher throughput
  Solution: Introduce storage type and block placement policy
    Storage: HDD, SSD, ARCHIVE,...
    Policy: One_SSD, HOT, WARM, COLD,...
  Example: A -> One_SSD, B -> HOT
  [Figure: DN1, DN2, and DN3, each with one SSD and several DISKs; block A has one replica on SSD and the others on DISK; all replicas of block B are on DISK]

Heterogeneous Storages: How to use (Hadoop 2.6)
  Configure HDFS to recognize storage type for each disk
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>[SSD]file:///data/ssd,[DISK]file:///data/hdd</value>
    </property>
  Set block placement policy to HDFS path
    $ hdfs dfsadmin -setStoragePolicy <path> <policy>    (Hadoop 2.6)
    $ hdfs storagepolicies -setStoragePolicy -path <path> -policy <policy>    (Hadoop 2.7 and later)
  Reset policy after putting data is possible
    Mover will move blocks to satisfy the policy, considering rack awareness
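The relation between a policy name and the storage types chosen for each replica can be sketched as a small lookup. The per-policy layouts below follow my reading of the HDFS storage policy documentation (HOT: all DISK; WARM: one DISK, rest ARCHIVE; COLD: all ARCHIVE; One_SSD: one SSD, rest DISK) and should be treated as an assumption; `POLICIES` and `storage_types` are hypothetical names, and fallback storage types are ignored.

```python
# policy name -> (storage types for the first replicas, type for the remaining ones)
# Assumed layouts; fallback behaviour when a type is unavailable is not modelled.
POLICIES = {
    "HOT": ([], "DISK"),
    "WARM": (["DISK"], "ARCHIVE"),
    "COLD": ([], "ARCHIVE"),
    "One_SSD": (["SSD"], "DISK"),
}


def storage_types(policy, replication=3):
    """Return the storage type wanted for each replica of a block."""
    first, rest = POLICIES[policy]
    chosen = first[:replication]
    return chosen + [rest] * (replication - len(chosen))


block_a = storage_types("One_SSD")   # ['SSD', 'DISK', 'DISK']
block_b = storage_types("HOT")       # ['DISK', 'DISK', 'DISK']
```

This matches the slide's example: block A (One_SSD) gets one replica on SSD, while every replica of block B (HOT) stays on DISK. When a policy changes after data is written, the Mover's job is to bring existing replicas in line with such a target list.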

Archival Storage (Hadoop 2.6)
  DISK or ARCHIVE?
    ARCHIVE is for cold data
    eBay reduces cost/GB by 5x [1]
  Use low-spec DNs for ARCHIVE
    No need to split cluster!

                      Regular Node   Archival Node
    Drives            12 HDDs        60 HDDs
    CPU               32 Cores       4 Cores
    Memory            128GB          64GB
    Run NodeManager   Yes            No

  [1] Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature http://www.slideshare.net/hadoop_summit/reduce-storage-costs-by-5x-using-the-new-hdfs-tiered-storage-feature

Transparent Encryption (Hadoop 2.6)
  Problem
    Cannot guard data from OS-level attacks
    DataTransferProtocol can be encrypted, but data on the DataNode's DISK is NOT encrypted!
    [Figure: Client sends encrypted data to DataNode; DataNode stores unencrypted data on DISK]
  Solution
    Provide end-to-end encryption
    Encrypt/decrypt data transparently
    No need to rewrite user application

Transparent Encryption: How to encrypt data (Hadoop 2.6)
  DEK (Data Encryption Key)
    A unique key for each file in EZ (Encryption Zone)
    Stored in an XAttr of the file, encrypted (EDEK)
  Key Management Server
    Proxy to underlying key provider
    ACLs on per key basis
    Bundled with Hadoop package
  [Figure: Client, NameNode, Key Management Server, and DataNode]
  1. Client creates file in EZ
  2. NameNode gets EDEK from Key Management Server
  3. NameNode stores EDEK in metadata
  4. EDEK returned to Client
  5. Client calls Key Management Server to decrypt EDEK to DEK
  6. Client writes encrypted data to DN using DEK
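The six steps above can be walked through with a small in-process model. This is a sketch of the key flow only: `toy_cipher` is a SHA-256 based XOR keystream standing in for AES-CTR and is not secure, and `ToyKMS`, `ToyNameNode`, `client_write`, `client_read`, and the key name `zone1` are hypothetical names, not Hadoop classes.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Toy stand-in for AES-CTR; NOT secure.
    Applying it twice with the same key returns the original data."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyKMS:
    """Holds encryption zone keys; only ever hands out EDEKs or decrypted DEKs."""
    def __init__(self):
        self._zone_keys = {"zone1": secrets.token_bytes(32)}

    def generate_edek(self, key_name):
        dek = secrets.token_bytes(32)                      # fresh DEK per file
        return toy_cipher(self._zone_keys[key_name], dek)  # wrapped as EDEK

    def decrypt_edek(self, key_name, edek):
        return toy_cipher(self._zone_keys[key_name], edek)


class ToyNameNode:
    """Stores only the EDEK in file metadata; it never sees the plain DEK."""
    def __init__(self, kms):
        self.kms = kms
        self.xattrs = {}

    def create_file(self, path, key_name):
        edek = self.kms.generate_edek(key_name)   # step 2: get EDEK
        self.xattrs[path] = (key_name, edek)      # step 3: store EDEK in metadata
        return key_name, edek                     # step 4: EDEK returned to client


def client_write(namenode, kms, datanode, path, plaintext):
    key_name, edek = namenode.create_file(path, "zone1")   # steps 1 to 4
    dek = kms.decrypt_edek(key_name, edek)                  # step 5
    datanode[path] = toy_cipher(dek, plaintext)             # step 6


def client_read(namenode, kms, datanode, path):
    key_name, edek = namenode.xattrs[path]
    dek = kms.decrypt_edek(key_name, edek)
    return toy_cipher(dek, datanode[path])


kms = ToyKMS()
nn = ToyNameNode(kms)
dn = {}                                   # path -> bytes as stored on disk
client_write(nn, kms, dn, "/ez/report.txt", b"quarterly numbers")
roundtrip = client_read(nn, kms, dn, "/ez/report.txt")
```

The point of the design is visible in the model: the DataNode stores only ciphertext, the NameNode stores only the EDEK, and the plain DEK exists only in the client after the Key Management Server has checked the request.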

Transparent Encryption: Very low overhead (Hadoop 2.6)
  Very low overhead
    Simple benchmark with 3 slaves (m3.xlarge, 4 core Xeon E5-2670 v2)
    Use AES-NI

                   Encryption Off   Encryption On
    1GB Teragen    17 sec           18 sec
    1GB Terasort   47 sec           49 sec

  Known issue
    Encryption is sometimes done incorrectly (HADOOP-11343)
    Recommend 2.7.1 or 2.6.1

Present

Hadoop 2.7 (2015-04-21)
  Quota per storage type
  Truncate API
  Files with variable-length blocks
  Web UI for NFS gateway
  NNTop: top-like tool for NameNode
    List top users for each operation
    Exposed via metric
  fsck -blockId option
    Print the file which the block ID belongs to
  INotify

INotify for HDFS (Hadoop 2.7)
  Problem
    Some components do caching
      Hive caches path names
      Impala caches block locations
    When to invalidate cache?
  Solution
    Introduce a tool similar to Linux inotify
    Client can monitor the events without parsing NN log or edits

INotify for HDFS: Technical Approach (Hadoop 2.7)
  Client polls NameNode periodically
    Not push model
    1. Client polls NameNode for any events after #XX
    2. NameNode returns events after #XX
    Client caches the highest event number
  Known issue
    Truncate is not notified (HDFS-8742)
    Fixed in 2.8.0
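The polling model above can be sketched with two small classes. This is not the HDFS inotify client API: `ToyEventLog`, `PollingClient`, the event dictionaries, and the paths are hypothetical, and a real client would poll on a timer rather than on demand.

```python
class ToyEventLog:
    """Stands in for the NameNode side: an append-only list of numbered events."""
    def __init__(self):
        self._events = []   # list of (event_number, event_dict)

    def log(self, event):
        self._events.append((len(self._events) + 1, event))

    def events_after(self, number):
        """Step 2 on the slide: return events with a number above the given one."""
        return [(n, e) for n, e in self._events if n > number]


class PollingClient:
    """Keeps a cache keyed by path and invalidates entries on matching events."""
    def __init__(self, event_log, cache):
        self.event_log = event_log
        self.cache = cache
        self.last_seen = 0   # highest event number processed so far

    def poll(self):
        """Step 1 on the slide: ask for any events after the cached number."""
        handled = []
        for number, event in self.event_log.events_after(self.last_seen):
            self.cache.pop(event["path"], None)   # drop the stale cache entry
            self.last_seen = number
            handled.append(event["type"])
        return handled


log = ToyEventLog()
client = PollingClient(log, {"/warehouse/t1": ["blk_1"], "/warehouse/t2": ["blk_2"]})
log.log({"type": "append", "path": "/warehouse/t1"})
first = client.poll()    # sees the append and invalidates /warehouse/t1
second = client.poll()   # nothing new since the cached event number
```

Because the client remembers only the highest event number, the NameNode keeps no per-client state, which is the main trade-off of polling compared with a push model.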

Future

Many features are being developed
  2.8 (not released)
    Support OAuth2 in WebHDFS
    RPC Congestion control
  Feature branches
    Erasure Coding (HDFS-7285)
    Ozone: Object store (HDFS-7240)
    BlockManager Scalability Improvements (HDFS-7836)
    HTTP/2 support for DataTransferProtocol (HDFS-7966)
    Implement an async pure C++ HDFS client (HDFS-8707)

RPC Congestion Control (Hadoop 2.8)
  Problem
    NameNode RPC queue is FIFO
    DDoS can kill entire cluster
      while (true) { dfs.exists("/data"); }
      Don't do this!
  Solution
    Fair scheduling for RPC queue (2.6.0)
    Retriable exception with exponential backoff (2.8.0)
      Enable by default in 2.8
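A rough sketch of both ideas on this slide: serving callers in turn instead of strictly first-in-first-out, and spacing out client retries exponentially. The per-caller round robin below is a deliberate simplification and not Hadoop's actual fair call queue design; the function names, the users `abuser` and `alice`, and the delay values are assumptions.

```python
from collections import OrderedDict, deque


def fifo_order(calls):
    """A FIFO queue serves calls exactly in arrival order."""
    return list(calls)


def round_robin_order(calls):
    """Serve one call per caller in turn, so a heavy caller cannot starve others."""
    queues = OrderedDict()
    for user, op in calls:
        queues.setdefault(user, deque()).append((user, op))
    order = []
    while queues:
        for user in list(queues):
            order.append(queues[user].popleft())
            if not queues[user]:
                del queues[user]
    return order


def backoff_delays(base_ms, retries, cap_ms):
    """Delay before each retry doubles every attempt, up to a cap."""
    return [min(cap_ms, base_ms * 2 ** attempt) for attempt in range(retries)]


# One client hammering exists() (the loop on the slide), then one normal call.
calls = [("abuser", "exists")] * 5 + [("alice", "open")]
fifo_position = fifo_order(calls).index(("alice", "open"))          # 5
fair_position = round_robin_order(calls).index(("alice", "open"))   # 1
delays = backoff_delays(base_ms=100, retries=5, cap_ms=1000)        # [100, 200, 400, 800, 1000]
```

Under FIFO the normal call waits behind every call of the abusive client; with turn-taking it is served second. The backoff list shows how a client that receives a retriable exception would slow itself down instead of retrying immediately.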

Erasure Coding
  Problem: Reduce costs of storage
    Blocks are replicated to 3 DNs
    3x storage overhead is costly
  Solution: Use Erasure Code

                 3-replication   (6,3)-Reed-Solomon
    Tolerates    2 failures      3 failures
    Disk Usage   3x              1.5x
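The numbers in the comparison follow from simple arithmetic, sketched here. The function names are hypothetical; the formulas are the standard ones for n-way replication and for a code with k data blocks and m parity blocks.

```python
def replication(copies):
    """n-way replication stores n full copies and survives n - 1 losses."""
    return {"disk_usage": float(copies), "tolerated_failures": copies - 1}


def reed_solomon(data_blocks, parity_blocks):
    """A (k, m) code stores k + m blocks per k blocks of data and survives m losses."""
    return {
        "disk_usage": (data_blocks + parity_blocks) / data_blocks,
        "tolerated_failures": parity_blocks,
    }


three_replication = replication(3)   # {'disk_usage': 3.0, 'tolerated_failures': 2}
rs_6_3 = reed_solomon(6, 3)          # {'disk_usage': 1.5, 'tolerated_failures': 3}
```

So (6,3)-Reed-Solomon halves the disk usage of 3-replication while tolerating one more failure.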

Erasure Coding: Write files using (6,3)-Reed-Solomon
  Write data to 9 DNs in parallel
  [Figure: ECClient splits incoming data into 6 data blocks (DN1 to DN6) and 3 parity blocks (DN7 to DN9)]

Erasure Coding: Read files
  Read data from 6 DNs in parallel
  [Figure: ECClient reading from DN1 to DN6; DN7 to DN9 hold parity]

Erasure Coding: Read files when DN fails
  Read data from (arbitrary) 6 DNs in parallel
  [Figure: ECClient reading from 6 of the 9 DNs (DN1 to DN9)]
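Reading around a failed DataNode can be illustrated with a much simpler code than Reed-Solomon. The sketch below uses a single XOR parity block, which tolerates only one lost block (whereas (6,3)-Reed-Solomon tolerates three); it is a swapped-in technique chosen because it fits in a few lines, and the function names and block contents are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the stripe: the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def read(stripe):
    """Rebuild at most one missing block (None) from the survivors, then
    return the concatenated data blocks (the parity block is dropped)."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can only repair one lost block")
    stripe = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        stripe[missing[0]] = xor_blocks(survivors)
    return b"".join(stripe[:-1])


stripe = encode([b"abcd", b"efgh", b"ijkl"])
stripe[1] = None                 # the DataNode holding the second block failed
recovered = read(stripe)         # b'abcdefghijkl'
```

The same principle carries over to the slide: with (6,3)-Reed-Solomon, any 6 of the 9 blocks are enough to reconstruct the data, at the cost of more involved decoding arithmetic.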

Erasure Coding: Current status
  Suitable for cold data
    No data locality
    Very low cost/GB with archival storage
  Now preparing for merge
  Follow on work
    Intel ISA-L support for faster encoding
    Support append/truncate/hflush/hsync
    More encoding schemas
    Pipeline error handling
    Support contiguous layout (HDFS EC Phase 2)

Summary
  Many features are still in development
    I cannot predict when a given feature will be available
    If you want a feature, I recommend contributing to it to speed up its development
  There are many ways to contribute
    Creating/Testing/Reviewing patches
    Reporting bugs
    Writing documents
    Discussing architecture design
    https://wiki.apache.org/hadoop/howtocontribute


References
  Apache Hadoop Docs: http://hadoop.apache.org/docs/current/
  In-memory caching (HDFS-4949)
    In-memory Caching in HDFS: Lower Latency, Same Great Taste: http://www.slideshare.net/hadoop_summit/inmemory-caching-inhdfs-lower-latency-same-great-taste-33921794
  Heterogeneous Storages (HDFS-5682)
    Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature: http://www.slideshare.net/hadoop_summit/reduce-storage-costs-by-5x-using-the-new-hdfs-tiered-storage-feature
  Transparent Encryption (HDFS-6134)
    Transparent Encryption in HDFS: http://www.slideshare.net/hadoop_summit/transparentencryption-in-hdfs
  INotify (HDFS-6634)
    Keep Me in the Loop: Introducing HDFS Inotify: http://www.slideshare.net/hadoop_summit/keep-me-in-the-loopinotify-in-hdfs

References (continued)
  RPC congestion control (HADOOP-9640, HADOOP-10597, HDFS-8820)
    Improving HDFS Availability with Hadoop RPC Quality of Service: http://www.slideshare.net/mingma4/hadooprpcqoshadoopsummit2015
  Erasure Coding (HDFS-7285)
    HDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency: http://www.slideshare.net/hadoop_summit/hdfserasure-code-storage-same-reliability-at-better-storage-efficiency