Implementation of Hadoop Distributed File System Protocol on OneFS Tanuj Khurana EMC Isilon Storage Division
Outline HDFS Overview OneFS Overview HDFS protocol on OneFS HDFS protocol server implementation References Q&A 2
HDFS Overview Distributed File System Inspired by Google s GFS Designed for scalability and fault tolerance Fast streaming data access Minimal data motion Master Slave Architecture NameNode (Master) DataNodes http://hadoop.apache.org/docs/r1.2.1/hdfs_design 3
HDFS Overview: NameNode Manages the file-system namespace Stores all metadata in the RAM File names, owners, group, access info Maintains file to blocks mapping Manages block replication 4
HDFS Overview: DataNode Stores blocks of files on top of native host OS file-system (e.g. EXT3, ZFS) Same block is replicated on multiple data s for redundancy (typically 3X) Has no awareness of data blocks living elsewhere (only the NameNode does) 5
HDFS Overview: Workflow NFS Web Click data Name Node reply Compute Data Decision Support Databases OLAP HTTP CIFS FTP NFS Landing Zone Servers file info HDFS file copy2 copy3 info file copy2 copy3 info file copy2 copy3 info EDW Step 1: Data is copied into the Landing Zone Step 2: Data is copied into the Cluster (3 times) 3X file copy2 copy3 info Step 3: Hadoop Jobs are run 6
OneFS Overview Built from the ground up on FreeBSD Distributed scale-out file system Posix compliant Built in support for Data Protection, Snapshots, DR, Audit, Deduplication Support for multiple protocols SMB, NFS, HTTP, SWIFT, HDFS 7
OneFS Overview: Semantics Symmetric cluster architecture Metadata distributed across all s Globally coherent file system access Distributed lock manager Two-phase commit for all write operations Reed-Solomon FEC used for data protection 8
OneFS Overview: Architecture Servers Servers Servers Client/Application Layer Ethernet Layer Isilon IQ Storage Layer Intracluster Communication Infiniband 9
HDFS protocol on OneFS Implements the HDFS interface for Client- NameNode and Client-DataNode Each Isilon runs a NameNode and DataNode service Underlying file system is OneFS 10
HDFS protocol on OneFS: Architecture R (RHIPE) Mahout Hive HBase NameNode PIG Job Tracker ZooKeeper DataNode Compute Node Compute Node Compute Node Ethernet Compute Node Compute Node Compute Node name name name name data 11
HDFS protocol on OneFS: Benefits Multi-protocol access No data ingestion, faster time to results Single repository for all data Scale compute and data independently Higher storage efficiency (OneFS: 80% usable) Active-Active NameNode architecture Simultaneous multi-distribution and multi- Hadoop version support More data management options (Snapshots, DR, Audit etc ) 12
HDFS Workflow on OneFS NFS Web Click data Hadoop Cluster Decision Support Databases OLAP SMB, NFS, HTTP, FTP, HDFS Step 2: Jobs are run info info info info name name name data EDW Step 1: Much or all of the Data lives on the Isilon/Hadoop Cluster name Isilon 13
HDFS Protocol Impl: NameNode Most RPCs translate to POSIX system calls setpermission() chmod( ) settimes() utimes( ) create() open(, O_CREAT, ) Other RPCs need creative interpretation getblocklocations(), addblock(), abandonblock() renewlease(), recoverlease() Implements multiple versions of the protocol V1, V2 and V2.2 Versions have different wire formats 14
NameNode Connection Routing NameNode is configured as single URL Easy configuration: Set fs.defaultfs to hdfs://smartconnect.isilon.com:8020/ DNS round-robin to distribute across s Metadata IOPs get spread out OneFS maintains cross- consistency IP Failover plus client retries for resiliency 15
HDFS protocol Impl: Data Path 1) Read/Write Request GetBlockLocations( /file ) AddBlock( /file ) DFSClient Hadoop Node 4) Data (Zero-Copy) 2) Response Read: [ Blk1(locs[3]), Blk2( ), ] Write: [ Blk1(locs[1]), Blk2( ), ] 3) ReadBlock(block) / WriteBlock(block) NameNode NameNode NameNode NameNode DataNode OneFS Node DataNode OneFS Node DataNode OneFS Node DataNode OneFS Node OneFS Clustered FileSystem 16
HDFS protocol Impl: Data Path Specific to OneFS 17
HDFS protocol impl: Rack locality Configure racks to limit cross switch contention Core Ethernet Switch 1+ Gbps (used only for copy phase) 1+ Gbps (used only for copy phase) Rack Ethernet Switch Rack Ethernet Switch Compute 10 Gbps SATA Compute Isilon HDFS 10 Gbps Isilon HDFS Compute 10 Gbps SATA Compute Isilon HDFS 10 Gbps Isilon HDFS Shuffle Shuffle IB Shuffle Shuffle IB Isilon InfiniBand Switch HDFS I/O ALWAYS comes through a rack-local Isilon which collects data blocks from all other Isilon s across the InfiniBand fabric 18
HDFS protocol Impl: Authentication Simple Authentication Username sent in clear-text on wire, requires name resolution on every access Integrated with different directory services (AD, LDAP, NIS) Kerberos Authentication One hdfs service SPN for the cluster Kerberos provider manages the keytab and SPNs for both MIT/AD KDC Impersonation supported via proxyusers 19
HDFS protocol Impl: Leases HDFS implements a single-writer, multiplereader model Only one client can hold a lease on a file opened for writing, other clients can still read Clients periodically renew lease by sending requests to NameNode Leases expire On OneFS, leases are cluster aware because of distributed NameNode architecture Built on top of OneFS Distributed Lock Manager 20
HDFS protocol Impl: WebHDFS RESTful API to access HDFS Popular for scripting, toolkits and integration Used by Apache Hue, a popular HDFS file browser client Runs within the hdfs daemon Communicates with Apache web server over a unix domain socket using the FastCGI interface Supports both HTTP/HTTPS Supports SPNEGO via Kerberos 21
HDFS protocol Impl: Access Zones OneFS solution to Multi-Tenancy that ties together: Cluster network configuration ( IP Pools) Authentication providers File protocol access Zone context determined based on the cluster IP address the client connects to Logically partition cluster into self-contained units 22
Access Zones + HDFS Per-zone HDFS root directory Limits the file-system namespace view Virtualize all file path accesses (e.g. /home/user1 -> /ifs/zone1/home/user1) Per-zone HDFS security settings Simple_only / Kerberos_only / All Per-zone authentication services (AD, LDAP ) Key enabler for HDFS as a Service solution 23
References EMC Isilon OneFS Overview EMC Isilon Hadoop White Paper Isilon Hadoop Best Practices http://www.emc.com/collateral/hardware/white-papers/h10719-isilon-onefstechnical-overview-wp.pdf http://www.emc.com/collateral/software/white-papers/h10528-wp-hadoopon-isilon.pdf http://www.emc.com/collateral/white-paper/h12877-wp-emc-isilon-hadoopbest-practices.pdf EMC Hadoop Starter Kit https://community.emc.com/docs/doc-26892 24
Questions? 25