Deploying a big data solution using IBM GPFS-FPO


Best practices to make smart decisions for optimal performance and scalability

Contents: Abstract; Introduction; Assumptions; Understanding big data environments; GPFS-FPO key concepts and terminology; GPFS enterprise functions overview; Understanding data flow and workload; Setting up a GPFS-FPO cluster; Preparing the GPFS file system layout; Modifying the Hadoop configuration to use GPFS; Ingesting data into GPFS clusters; Exporting data out of GPFS clusters; Monitoring and administering GPFS clusters; GPFS-FPO restrictions; Conclusion; References and resources

Abstract
Every day, the world creates 2.5 quintillion bytes of data. In fact, 90 percent of the data in the world today has been created in the last two years alone. While there is much talk about big data, it is not mere hype. Businesses are realizing tangible results from investments in big data analytics, and IBM's big data platform is helping enterprises across all industries. IBM is unique in having developed an enterprise-class big data platform that allows you to address the full spectrum of big data business challenges.

IBM General Parallel File System (GPFS) offers an enterprise-class alternative to Hadoop Distributed File System (HDFS) for building big data platforms. GPFS is a POSIX-compliant, high-performing and proven technology that is found in thousands of mission-critical commercial installations worldwide. GPFS provides a range of enterprise-class data management features. GPFS can be deployed independently or with IBM's big data platform, consisting of IBM InfoSphere BigInsights and IBM Platform Symphony. This document describes best practices for deploying GPFS in such environments to help ensure optimal performance and reliability.

Introduction
Businesses are discovering the huge potential of big data analytics across all dimensions of the business, from defining corporate strategy to managing customer relationships, and from improving operations to gaining competitive edge. The open source Apache Hadoop project, a software framework that enables high-performance analytics on unstructured data sets, is the centerpiece of big data solutions. Hadoop is designed to process data-intensive computational tasks, in parallel and at a scale, that previously were possible only in high-performance computing (HPC) environments.

The Hadoop ecosystem consists of many open source projects. One of the central components is HDFS, a distributed file system designed to run on commodity hardware. Other related projects facilitate workflow and the coordination of jobs, support data movement between Hadoop and other systems, and implement scalable machine learning and data mining algorithms. However, HDFS lacks the enterprise-class functions necessary for reliability, data management and data governance.

GPFS is a POSIX-compliant file system that offers an enterprise-class alternative to HDFS. GPFS has been used in many of the world's fastest supercomputers, including IBM Blue Gene, IBM Watson (the supercomputer featured on Jeopardy!) and the Argonne National Labs MIRA system. Apart from supercomputers, GPFS is commonly found in thousands of commercial mission-critical installations worldwide, from biomedical research to financial analytics. In 2009, GPFS was extended to work seamlessly in the Hadoop ecosystem and is available through a feature called GPFS File Placement Optimizer (GPFS-FPO). Storing your Hadoop data using GPFS-FPO allows you to gain advanced functions and the high I/O performance required for many big data operations. GPFS-FPO provides Hadoop compatibility extensions to replace HDFS in a Hadoop ecosystem, with no changes required to Hadoop applications.

Assumptions
The goal of this paper is to guide the administrator through various decision points to ensure optimal configuration based on the Hadoop application components being deployed. During the discussion, the following are assumed:

A decision to use GPFS has already been made. The paper does not focus on why GPFS makes a better choice compared to using HDFS, although differences with HDFS are highlighted where appropriate.

The reader is already familiar with Hadoop components and other IBM products referred to in this paper. The paper covers high-level concepts only to establish the required context for follow-on topics. For further details, please use the references listed at the end of this document or the respective product documentation.

The reader understands this paper is not intended to serve as a comprehensive GPFS overview. Necessary concepts are described at a high level. Please use the references for a comprehensive understanding of GPFS.

The best practices presented in this paper show how to deploy GPFS-FPO as a file system platform for big data analytics. The paper covers a variety of Hadoop deployment architectures, including InfoSphere BigInsights, Platform Symphony, direct-from-Hadoop open source (also referred to as Do-It-Yourself), or a Hadoop distribution from another vendor configured to work with GPFS. Architecture-specific notes are included where applicable.

Understanding big data environments
Making the most of your big data environment requires that you include data from many different sources. Data sources vary by source and structure. Some are text-based documents such as logs from web applications and data feeds from social-networking applications, some are spreadsheets, and others can include transactional data from relational databases and data warehouses. Integration of these data sources forms the foundation of federated analytics. Data in these various forms tends to be large and continues to grow at an unprecedented rate. The speed at which data is analyzed and turned into business intelligence is critical to enterprises.

Figure 1 shows components of a big data environment. This environment includes core Hadoop open source components along with IBM products that seamlessly extend or replace Hadoop components to make the solution more enterprise-ready. The components applicable to your deployment depend on the data sources that you plan to integrate. Consider your current plans and future needs when making decisions for initial deployment so that the infrastructure can easily grow to support your business as it changes.

Figure 1. Big data environment. The figure groups open source components (such as Hadoop MapReduce, HDFS, Hive, Pig, Jaql, HBase, ZooKeeper, Oozie, Flume, Lucene, Avro and the R runtime) and IBM extensions and options (such as GPFS-FPO, InfoSphere BigInsights, InfoSphere Streams, InfoSphere DataStage, BigSheets, Big SQL, BigIndex, text analytics, Platform Symphony/Adaptive MapReduce, IBM SPSS, IBM Netezza and IBM DB2) into layers for data sources and connectors, visualization and discovery, development tools, systems management, analytics, data store and file system.

4 Hadoop components overview The Hadoop project is comprised of three parts: HDFS, the Hadoop MapReduce model and Hadoop Common. The goal of Hadoop is to use commodity servers in a very large cluster to provide high I/O performance with minimal cost. HDFS is designed for workloads that: Use large files Access those files with sequential writes in append-only mode to store the data Use large sequential reads needed to run analytics jobs When writing a large file, HDFS distributes the contents of each file across the available storage by splitting it into large blocks and distributing each block across available servers (called data nodes). One of the nodes in the HDFS cluster, called the name node, coordinates access to the data in the cluster. The name node is assigned the critical role of managing data placement logic and storing metadata describing location details of the contents of a file. For optimal I/O performance, MapReduce uses the file metadata to assign workloads to the servers where the data is stored. Processing the data locally on a node eliminates network traffic overhead associated with storage area network (SAN) and network attached storage (NAS). To help ensure fault tolerance for common hardware failures, HDFS stores copies of each data block on more than one server. HDFS does not support in-place updates to stored data. To modify a file, it is necessary to move the file out of HDFS, modify the file and then move it back into HDFS. Since HDFS is not POSIX-compliant, you must use the HDFS shell and application programming interfaces (APIs) to work with the data stored within HDFS. MapReduce is the heart of Hadoop. The term MapReduce refers to two separate distinct tasks that Hadoop programs perform. The first is the map function, which takes a set of data and processes it to capture interesting aspects of the data. The reduce function takes the output from map jobs and combines it into a larger set of output to provide insights into the data. In a Hadoop cluster, jobs are managed by a component called JobTracker, which communicates with the name node to create multiple map-and-reduce tasks based on data locality. The tasks are managed by TaskTracker agents that run continually on the cluster nodes and report back status to the JobTracker. Hadoop Common Components are a set of libraries that support various Hadoop subprojects. These components are intended to simplify the complex task of creating MapReduce applications. Given that HDFS is not POSIX-compliant, and therefore, common Linux or UNIX tools and APIs cannot be used, it is necessary to use the hdfs dfs <args> (or in Hadoop version1.x, hadoop dfs <args>) file system shell command interface and Hadoop file system APIs. Flume is another component used for ingesting data into HDFS. The MapReduce application development environment includes several components to reduce the time and skill needed for application development. These components include Hive, Pig, Oozie, Zookeeper, HBase and Avro, to name a few in the fast-growing list. GPFS-FPO overview GPFS was originally designed as a SAN file system, which typically isn t suitable for a Hadoop cluster, since Hadoop clusters use locally attached disks to drive high performance for MapReduce applications. GPFS-FPO is an implementation of shared-nothing architecture that enables each node to operate independently, reducing the impact of failure events across multiple nodes. By storing your data in GPFS-FPO, you are freed from the architectural restrictions of HDFS. 
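For example, because HDFS is not POSIX-compliant, its data can only be reached through the Hadoop file system shell, whereas data in GPFS can be handled with ordinary operating system tools. The paths below (/user/hadoop and /gpfs/fs1) are illustrative examples only, not values defined in this paper.

# HDFS: data must be accessed through the HDFS shell interface
hdfs dfs -mkdir /user/hadoop/logs
hdfs dfs -put weblog.txt /user/hadoop/logs/
hdfs dfs -ls /user/hadoop/logs

# GPFS: the same data is visible through standard POSIX tools at the file system mount point
mkdir -p /gpfs/fs1/user/hadoop/logs
cp weblog.txt /gpfs/fs1/user/hadoop/logs/
ls -l /gpfs/fs1/user/hadoop/logs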
Additionally, you can take advantage of the GPFS-FPO pedigree as a multipurpose file system to gain tremendous management flexibility. You can manage your data using existing toolsets and processes along with a wide range of enterprise-class data management functions offered by GPFS, including:
- Full POSIX compliance
- Snapshot support for point-in-time data capture
- Simplified capacity management by using GPFS for all storage needs
- Policy-based information lifecycle management capabilities to manage petabytes of data
- Control over placement of each replica at file-level granularity
- Infrastructure to manage multi-tenant Hadoop clusters based on service-level agreements (SLAs)
- Simplified administration and automated recovery

GPFS Hadoop connector
All big data applications run seamlessly with GPFS; no application changes are required. GPFS provides seamless integration with Hadoop applications using a Hadoop connector. A GPFS Hadoop connector is shipped with GPFS, and a customized version is shipped with InfoSphere BigInsights. Platform Symphony uses the GPFS connector shipped with the GPFS product. As shown in Figure 2, the GPFS Hadoop connector implements Hadoop file system APIs using GPFS APIs to direct file system access requests to GPFS. This redirection of data access from HDFS to GPFS is enabled by changing Hadoop XML configuration files that are described later in the document.

Figure 2. GPFS Hadoop connector. The figure shows applications issuing MapReduce API and Hadoop file system API calls, with the GPFS Hadoop connector routing file system requests to GPFS in place of HDFS.

InfoSphere BigInsights overview
InfoSphere BigInsights is a platform for Hadoop with the simple goal of making Hadoop enterprise-ready. Enterprise readiness includes three key aspects:

Data integration: Providing a comprehensive view of the business by ensuring the Hadoop system integrates with the rest of the IT infrastructure, for example, providing the ability to query Hadoop data from a data warehouse environment and vice versa.

Operational excellence: Ensuring that the data stored in Hadoop adheres to the same business controls as the current infrastructure, including security, governance and administration.

Analytics support: Enabling different classes of analysts to gain value directly from data that is stored in InfoSphere BigInsights, without the need for a team of Hadoop programmers or costly consulting.

InfoSphere BigInsights features Apache Hadoop and its related open source projects as core components, along with several IBM extensions to provide enterprise-class capabilities. These extensions include a web management console; development tools such as Eclipse plug-ins and a text analytics workbench; and analytics accelerators, visualization tools and connectors to ingest and integrate data from a variety of data sources.

Platform Symphony overview
Platform Symphony is a high-performance computing infrastructure and service-oriented architecture (SOA) platform for building, running and managing compute- and data-intensive applications in large-scale and shared-cluster environments. The MapReduce framework in Platform Symphony specifically enables application developers to write or integrate Hadoop-compatible MapReduce applications. It provides an application adapter that enables users to execute Hadoop MapReduce jobs without changing application code or recompiling. The MapReduce framework in Platform Symphony is integrated with InfoSphere BigInsights as the Adaptive MapReduce component. This integration provides simplicity and ease of use for installation, configuration and integration of MapReduce applications by hiding product-specific details. Platform Symphony supports the Apache Hadoop distribution and other distributions such as Cloudera. The GPFS integration is enabled in Platform Symphony Version and is the required minimum for this discussion.

Figure 3 illustrates Hadoop applications interfacing with the Platform Symphony MapReduce application adapter. The adapter interfaces with Platform Symphony APIs and Hadoop common libraries, providing seamless application integration to GPFS through the Hadoop connector.

Figure 3. IBM Platform Symphony in a big data solution. The figure shows Hadoop MapReduce applications (Oozie, Pig, Hive and so on) using the Hadoop MapReduce APIs (org.apache.hadoop.mapreduce, org.apache.hadoop.mapred) and Hadoop common libraries (org.apache.hadoop.fs, org.apache.hadoop.io, org.apache.hadoop.util), with the Platform Symphony Hadoop MapReduce application adapter calling Platform Symphony APIs and either the HDFS or the GPFS file system connector.

The MapReduce framework is available with Platform Symphony Advanced Edition. An entitlement key is required to enable this feature. For detailed information about supported Hadoop distributions and licensing, see the Platform Symphony Integration Guide for MapReduce Applications listed under References and resources.

Product versions
The following product versions are used as the basis for this document:
1. GPFS V3.5. Use of the latest available GPFS 3.5 PTF is recommended, unless BigInsights is used. Refer to GPFS Support websites for available PTFs.
2. InfoSphere BigInsights V (embeds GPFS V )
3. Platform Symphony V

GPFS-FPO key concepts and terminology
GPFS-FPO represents a set of features that have been added in GPFS Version 3.5, which provides formal support for GPFS-FPO. FPO extends the core GPFS architecture, providing greater control and flexibility to leverage data location, reduce hardware costs and improve I/O performance. It is important to understand key concepts and terms before you configure GPFS-FPO:

Block: A block is the largest unit for a single I/O operation and space allocation in a GPFS file system. The block size is specified when a file system is created. The block size defines the stripe width for distributing data on each disk used by GPFS. GPFS supports block sizes ranging from 16 KB to 16 MB and defaults to 256 KB. GPFS allows different block sizes for metadata and data, if disks for data and metadata are kept separate.

Chunk: A chunk is a logical grouping of blocks that behaves like one large block. A block group factor dictates the number of blocks that are laid out on the disks attached to a node in an FPO configuration to form a chunk. The file data is divided into chunks, and each chunk is mapped to a node according to the write-affinity depth setting. A chunk is then mapped to all available disks within a node. This allocation scheme is performed on a best-effort basis. For example, if a node does not have enough contiguous free blocks, GPFS attempts to allocate blocks on disks attached to other nodes. The block group factor allows applications to choose a chunk size at the file-level granularity needed to support data access requirements. The chunk size is defined by multiplying the block group factor and the block size. The block group factor ranges from 1 to 1,024. The default block group factor is 1 for compatibility with standard GPFS file systems. Setting the block size to 1 MB and the block group factor to 128 results in a chunk size of 128 MB.
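As a quick worked example of the block and chunk relationship described above (the file system device name bigfs is hypothetical):

# Display the data block size of an existing GPFS file system
mmlsfs bigfs -B

# Chunk size = block size x block group factor, for example:
#   1 MB block size   x block group factor 128 = 128 MB chunk
#   256 KB block size x block group factor 128 = 32 MB chunk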

Network shared disk: When a LUN provided by a storage subsystem is configured for use by GPFS, it is referred to as a network shared disk (NSD). In a traditional GPFS installation, a LUN is typically made up of multiple physical disks provided by a RAID device. Once the LUN is defined as an NSD, GPFS allows all cluster nodes to have direct access to the NSDs. Alternatively, a subset of the nodes connected to the disks may provide access to the LUNs as an NSD server. In GPFS-FPO deployments, a physical disk and an NSD can have a 1:1 mapping. In this case, each node in the cluster is an NSD server providing access to its disks from the rest of the cluster. In GPFS, the term disk is used synonymously with NSD unless specified otherwise, mostly for historical reasons. NSDs are specified in an input file when using the mmcrnsd command or when creating a file system using the command mmcrfs.

Storage pool: A storage pool is a collection of NSDs. Storage pools in GPFS allow you to group storage devices based on performance, locality or reliability characteristics within a file system. Storage pools provide a method to partition file system storage to offer several benefits, including improved price-performance, reduced contention on premium resources, seamless archiving and advanced failure containment. For example, one pool could be an enterprise-class storage system using high-performance flash devices, and another pool might consist of numerous disk controllers that host a large set of economical Serial ATA (SATA) disks. GPFS has two types of storage pools, internal and external. GPFS manages the data storage for internal storage pools. For external storage pools, GPFS does the policy processing but the data is managed by an external application such as IBM Tivoli Storage Manager. The storage pool is declared as an attribute of the disk when defining disks (NSDs).

Failure group: A failure group is a set of disks that share a common point of failure that could cause them all to become simultaneously unavailable. When creating multiple replicas of a given block, GPFS uses failure group information to ensure that no two replicas exist within the same failure group. Traditionally, GPFS failure groups have been identified by simple integers. FPO introduced a multipart failure group concept to further convey the topology information of the nodes in the cluster that GPFS can exploit when making data placement decisions. A failure group can be specified as a list of up to three comma-separated numbers that convey topology information. In general, this list is a way to specify which disks are closer together and which are farther away. In practice, the three elements of the failure group definition may represent the rack number of the node to which the disk belongs, a position within the rack (lower or upper half) and a node number. For example, the failure group 2,0,1 identifies rack 2, bottom half, first node. Note that the second number in the locality information is restricted to be either 0 or 1. When considering two disks for striping or replica placement purposes, it is important to understand the following:
- Disks that differ in the first of the three numbers are considered farthest apart. In the rack topology example, this would be disks located in different racks.
- Disks that have the same first number but differ in the second number are considered closer. In the rack topology example, this would be disks located in a different half of the rack.
- Disks that differ only in the third number are assumed to reside in different nodes placed close together. In the rack topology example, this would be disks located in the same half of the rack.
- Only disks that have all three numbers in common reside in the same node.

The use of the terms farthest apart and close together in failure group definitions refers to network topology. Two nodes close together may be on the same Ethernet switch, whereas nodes farther apart may traverse multiple switches to communicate. These parameters allow you the flexibility to design your data access patterns to best fit your environment, whether it is in a single data center or spread across many kilometers.

8 The failure group definition allows locality information of the cluster nodes and the associated disks to be described to GPFS and enables intelligent decision making for data block placements. Failure group is declared as an attribute of the disk when defining disks (NSDs). Metadata and data placement: GPFS allows for separation of storage used for metadata and data blocks since they may have different requirements for availability, access and performance. Metadata includes file metadata (file size, name, owner, last modified date); internal data structures such as quota information allocation maps, file system configuration and log files; and user-specified metadata such as inode, directory blocks, extended attributes and access control lists. The data includes data contents of a file. Metadata is vital to file system availability and should be stored on the most reliable storage media. The file data can be placed on the storage media based on the importance of the data. GPFS provides rich policies for initial placement and subsequent movement of data between storage pools (to be discussed later). GPFS places metadata in a storage pool named system. Within the system storage pool, disks can be assigned to hold metadata only, or both data and metadata. Unless specified otherwise, using placement policies, all data is stored in the system storage pool. Metadata is striped across designated disks in the system pool. For better resiliency, distributing metadata across all available nodes is not recommended in an FPO environment. Having the metadata on too many nodes reduces the availability of the system because the probability of three nodes failing at the same time increases with the number of nodes. Therefore, it is recommended that you limit the number of nodes with metadata disks to one or two nodes per rack. Replication: GPFS supports replication for failure containment and disaster recovery. Replication can incur costs in terms of disk capacity and performance, and therefore should be used carefully. GPFS supports replication of both metadata and data independently and provides fine-grained control of the replication. You can choose to replicate data for a single file, a set of files or the entire file system. Replication is an attribute of a file and can be managed individually or automated through policies. If replication is enabled, the target storage pool for the data must have at least two failure groups defined within the storage pool. Replication means that a copy of each block of data exists in at least two failure groups for two copies, and three failure groups for three copies. GPFS supports up to three replicas for both metadata and data. The default replication factor is 1, meaning no replicas are created. In many GPFS deployments, data protection is handled by the underlying RAID storage subsystem. In FPO environments, it is more common to rely on software replication for data protection. Data block placement decisions in FPO environments are affected by the level of replication and the value of the write-affinity depth and write-affinity failure group parameters that are described in this section. Write-affinity depth (WAD): Introduced for FPO, WAD is a policy for directing writes. It indicates that the node writing the data directs the write to disks on its own node for the first copy and to the disks on other nodes for the second and third copies (if specified). 
The policy allows the application to control placement of replicas within the cluster to optimize for typical access patterns. Write affinity is specified by a depth that indicates the number of replicas to be kept closer to the node ingesting data. This attribute can be specified at the storage pool or individual file level. The possible values are 0, 1 and 2. A write-affinity depth of 0 indicates that each chunk replica is to be striped across all the available nodes, with the restriction that no two replicas are in the same failure group. 8

A write-affinity depth of 1 indicates that the first replica is written to the node writing the data. The second and third replicas are striped at the chunk level among the nodes that are farthest apart (based on the failure group definition) from the node writing the data, with the restriction that no two replicas are in the same failure group. Using the rack topology example, the rack used for the first replica is not used by the second and the third replicas. This allows data to remain available even if the entire rack fails.

A write-affinity depth of 2 indicates that the first replica is written to the node writing the data. The second replica is written to a node in the same rack as the first replica (based on the failure group definition), but in the other half of the rack. The third replica is striped at the chunk level across all available nodes that are farthest (per the failure group definition) from the nodes used by the first two replicas. Using the rack topology example, the second replica is placed in the same rack as the first replica, but in the second half of the rack. The third replica is striped across nodes that are not located in the rack used by the first and second replicas.

For compatibility reasons, the default write-affinity depth is 0 and the unit of striping is a block; however, if a block group factor is specified, striping is done at the chunk level. The replica placement is done according to the specified policy on a best-effort basis. If a node does not have sufficient free space, GPFS allocates space in other failure groups, ensuring that no two replicas of a chunk are assigned to the same node. For MapReduce and data warehousing applications, a write-affinity depth of 1 is recommended, as the application benefits from the locality of the first replica of the block.

Write-affinity failure group: The write-affinity failure group extends the write-affinity depth concept by allowing the application to control placement of each replica of a file and is therefore applicable in FPO-enabled environments only. It defines a policy that indicates the range of nodes where replicas of blocks of a particular file are to be written. This enables applications to control the layout of a file in the cluster to align with data access patterns. When the write-affinity failure group provides a specification for all replicas, write-affinity depth is completely overridden. The write-affinity depth policy is used only for a replica specification that is missing in the write-affinity failure group specification. The default policy is a null specification and uses the write-affinity depth settings for replica placement. The following format is used to specify write-affinity failure groups:

FailureGroup1[;FailureGroup2[;FailureGroup3]]

where each FailureGroupN identifies the nodes on which to place the first, second and third replicas of each chunk of data. FailureGroupN is a comma-separated string of up to three failure groups. Failure group list notation is extended to include more than one node or a range of nodes in the following format:

Rack1{:Rack2{:...{:Rackx}}},Location1{:Location2{:...{:Locationx}}},ExtLg1{:ExtLg2{:...{:ExtLgx}}}

Wildcard characters (*) are supported in these fields. A range can be specified using a hyphen (-), and a list of specific numbers can be specified using a colon (:). If any part of the field is missing, it is interpreted as 0.
For example, the attribute 1,1,1:2;2,1,1-3;2,0,* using the rack topology terminology indicates that the first replica is striped on rack 1, rack location 1, nodes 1 and 2; the second replica is striped on rack 2, rack location 1, nodes 1, 2 and 3; and the third replica is on rack 2, rack location 0 and all nodes in that location. The attribute 1,*,*;2,*,*;3,*,* results in striping the first replica on all nodes in rack 1, the second replica on all nodes in rack 2 and the third replica on all nodes in rack 3. This attribute can be set using the policy rule or by setting the extended attribute write-affinity-failure-group using the command mmchattr on a file before writing data blocks. The write-affinity failure group specification function uses a failure group list to specify the nodes to be used for each of the replicas. A write-affinity failure group can be specified for each replica of a file.
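As a concrete sketch of the mmchattr approach mentioned above, the commands below set a write-affinity failure group on an empty file before data is written. The file path is hypothetical, and the long-form --write-affinity-failure-group option should be verified against the mmchattr documentation for your GPFS level.

# Create the file first; placement attributes must be set before data blocks are written
touch /gpfs/bigfs/ingest/sales_feed.dat

# Place replica 1 in rack 1 (bottom half, node 1), replica 2 in rack 2 and replica 3 in rack 3,
# using the rack topology notation described in this paper
mmchattr --write-affinity-failure-group "1,0,1;2,0,1;3,0,1" /gpfs/bigfs/ingest/sales_feed.dat

# Display the file attributes to confirm the setting
mmlsattr -L /gpfs/bigfs/ingest/sales_feed.dat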

10 File placement policy: A file placement policy can be defined at the file system, storage pool or file level of granularity. For this purpose, GPFS has a policy engine that allows you to set data placement attributes based on the file name, fileset and other attributes. The policy engine offers a rich set of language to query and search files, take action and specify rules for data placement. In most cases, file placement policies apply to groups of files based on a set of attributes. Some of the placement attributes that can be affected by using policy rules include destination storage pool, chunk size, write-affinity depth, write-affinity failure group and replication factor. Recovery from failures: In a typical FPO cluster, nodes have direct-attached disks. These disks are not shared between nodes as in a traditional GPFS cluster, so if the node is inaccessible, the associated disks are also inaccessible. GPFS provides ways to automatically recover from these and similar common disk failure situations. In FPO environments, automated recovery from disk failures can be enabled using the restripeondiskfailure=yes configuration option. Whether an FPO-enabled file system is a subject of an automated recovery attempt is determined by the max replication values for the file system. If either the metadata or data replicas are greater than one, then a recovery action is triggered. The recovery actions are asynchronous, and GPFS continues processing while the recovery takes place. The results from the recovery actions and any errors encountered are recorded in the GPFS logs. When auto-recovery is enabled, GPFS delays recovery action for disk failures to avoid recovery from transient failures such as node reboot. The disk recovery wait period can be customized as needed. The default setting is 300 seconds for disks containing metadata and 600 seconds for disks containing data only. The recovery actions include restoring proper replication of data blocks from the failed disks by creating additional copies on other available disks. In case of a temporary failure that occurs within the recovery wait period, recovery actions include the restarting of failed disks so they are available for allocation. GPFS enterprise functions overview GPFS offers a rich set of features to simplify data management and data protection, and this section provides a brief summary of the key features. Please refer to GPFS documentation for additional details. Fileset In addition to a traditional file system directory based hierarchical structure, GPFS utilizes a file system object called a fileset. A fileset is a sub-tree of a file system namespace, and in many respects behaves like an independent file system. Filesets provide a means of partitioning the file system to allow administrative operations at a finer granularity than the entire file system: Filesets can be used to define quotas. The fileset is an attribute of each file and can be specified in a policy to control initial data placement, migration and replication of the file s data. Fileset snapshots can be created instead of creating a snapshot of an entire file system. The active file management feature is enabled on a per-fileset level. GPFS supports independent and dependent filesets. An independent fileset is a fileset with its own inode space. An inode space is a collection of inode number ranges reserved for an independent fileset. An inode space enables more efficient per-fileset functions, such as fileset snapshots. 
A dependent fileset shares the inode space of an existing, independent fileset. Files created in a dependent fileset are assigned inodes in the same inode number range that is reserved for the independent fileset from which it was created. In GPFS-FPO environments, consider snapshot and placement policy requirements as you plan the GPFS file system layout. For example, temporary and transient data that does not need to be replicated can be placed in a fileset limited to one replica. Similarly, data that cannot be easily re-created may require frequent snapshots and therefore can reside in an independent fileset.
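A minimal sketch of creating an independent fileset for such data follows; the file system device name (bigfs), fileset name and junction path are hypothetical.

# Create an independent fileset with its own inode space, suitable for per-fileset snapshots
mmcrfileset bigfs rawdata --inode-space new

# Link the fileset into the file system namespace at a junction path
mmlinkfileset bigfs rawdata -J /gpfs/bigfs/rawdata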

11 Information lifecycle management (ILM) GPFS can help you achieve data lifecycle management efficiencies through policy-driven, automated tiered storage management. Using storage pools, filesets and policies together, you can match the cost of your storage resources to the value of your data. With these tools, GPFS can automatically determine where to physically store your data regardless of its placement in the logical directory structure. You can choose to place the data in different storage tiers and move it to a lower-cost tier as the value of the data reduces over time. GPFS allows you to transparently move data to storage pools managed internally or to storage pools managed externally, such as Tivoli Storage Manager or IBM Linear Tape File System (LTFS). Snapshot You can snapshot an entire GPFS file system or only a fileset to preserve its contents at a single point in time. Snapshots are read-only, so changes can be made only to the active (that is, normal, non-snapshot) files and directories. The snapshot function allows a backup or mirror program to run concurrently with user updates and still obtain a consistent copy of the data as of the time that the snapshot was created. Snapshots can provide an online data protection capability that allows easy recovery from common problems such as accidental deletion of a file, and comparison with older versions of a file. GPFS allows snapshots of an entire file system, an independent fileset or an individual file. A file-level snapshot is called a clone. File clone A file clone is a writable snapshot of an individual file. Cloning a file is similar to creating a copy of a file, but the creation process is faster and more space-efficient because no additional disk space is consumed until the clone or the original file is modified. Multiple clones of the same file can be created with no additional space overhead. In a Hadoop workload, clones can be used to run MapReduce jobs while the original copy is being used to ingest new data to ensure your applications get a consistent view of data. Clones can be used to provision virtual machines by cloning a common base image to create a virtual disk for each machine. You can clone the virtual disk image of an individual machine as part of taking a snapshot of the machine state. Access control lists Access control protects directories and files by providing a means of specifying who is granted access. GPFS access control lists are either traditional ACLs based on the POSIX model, or network file system (NFS) V4 ACLs. NFS V4 ACLs are very different than POSIX ACLs and provide much more control of file and directory access. A comprehensive multitenancy solution can be built by leveraging GPFS ACLs to isolate and secure data. Quotas The GPFS quota system helps you to control the number of files and the amount of file data in a file system. GPFS quotas can be defined for individual users, groups of users and individual filesets. Quotas can be enabled by the system administrator to control the amount of space used by the individuals, groups of users or individual filesets. By default, user and group quota limits are enforced across the entire file system. Optionally, the scope of quota enforcement can be limited to an individual fileset boundary. In big data environments, quotas can be applied based on the data sources to effectively manage storage capacity. In multi-tenant and cloud environments, quotas can be applied on a per-tenant or application basis and can be leveraged in chargeback. 
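To make the snapshot and file clone functions described above more concrete, a brief sketch follows; the device name, snapshot name and file paths are hypothetical, and the exact syntax should be checked against the GPFS documentation for your release.

# Create a point-in-time snapshot of the entire file system
mmcrsnapshot bigfs snap_2013_06_01

# Create a space-efficient, writable clone of a file: first capture the source as a read-only
# clone parent, then create the writable copy from it
mmclone snap /gpfs/bigfs/rawdata/model.dat /gpfs/bigfs/rawdata/model.dat.base
mmclone copy /gpfs/bigfs/rawdata/model.dat.base /gpfs/bigfs/work/model_run1.dat

# Show the clone relationship
mmclone show /gpfs/bigfs/work/model_run1.dat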
Active file management (AFM) Active file management is a scalable, high-performance, file-system caching toolset integrated with the GPFS cluster file system. AFM lets you create associations from a local GPFS cluster to a remote cluster, and to define the location and flow of file data to automate the management of the data. This feature allows you to implement a single namespace view across sites around the world. AFM masks wide-area network latencies and outages by using GPFS to cache data sets, allowing data access and modifications even when the remote storage cluster is unavailable. In addition, AFM performs updates to the remote cluster asynchronously, which allows applications to continue operating unconstrained by limited outgoing network bandwidth. AFM leverages the inherent scalability of GPFS to provide a multi-node, consistent cache of data located at another GPFS cluster. By integrating with the file system, AFM offers a POSIX-compliant interface, making the cache completely transparent to applications. AFM can be enabled at the file system level or at the fileset (independent mode) level. 11

In big data environments, AFM can be used to ingest data in locations that are closer to the data source. This data can then be distributed, using AFM to move data between multiple clusters for further processing. You can use AFM to maintain an asynchronous copy of the data at a separate physical location or use GPFS synchronous replication (used by FPO replicas). For synchronous replication, data at the remote site is guaranteed to be kept in sync with the main site, but latency between sites can affect the performance of any GPFS file system using synchronous replication. Asynchronous replication as used by GPFS-AFM reduces the effect of latency or limited bandwidth between sites, but does not guarantee that the remote copy is kept in exact synchronization with the data at the main site.

Clustered NFS
GPFS allows you to configure a subset of nodes in the cluster to provide a highly available solution for exporting GPFS file systems over NFS. The participating nodes are designated as clustered NFS (CNFS) member nodes, and the entire setup is frequently referred to as CNFS or a CNFS cluster. In this solution, all CNFS nodes export the same file systems to the NFS clients. When one of the CNFS nodes fails, the NFS serving load moves from the failing node to another node in the CNFS cluster. In GPFS-FPO clusters, all data accessed over NFS should have at least two replicas (three are preferred) to allow file serving to continue as nodes are rebooted. A single replica can cause unexpected reboots of multiple nodes as the NFS high-availability component attempts to find a node to serve data. In the event of errors, it triggers a node reboot to correct data access issues and moves the NFS IP to another node. This action can lead to a node reboot if the node containing the data is not yet available. It is recommended that you dedicate two or more nodes outside the FPO data nodes for CNFS. The CNFS nodes should not have local GPFS NSDs. This ensures that NFS data is distributed evenly across all FPO nodes and also helps reduce the rolling node failure issue. A CNFS node requires a GPFS server license.

Efficient backup with Tivoli Storage Manager
Since GPFS is POSIX-compliant, you can use any standard backup solution on the file data. The GPFS policy engine can be used to scan the file system for changed files. GPFS provides an integrated solution for faster and more efficient backup to Tivoli Storage Manager (TSM) using the mmbackup command. The mmbackup command utilizes all the scalable, parallel processing capabilities of the policy engine to:
- Scan the file system
- Evaluate the metadata of all the objects in the file system
- Determine which files should be sent to backup in TSM
- Determine which deleted files should be expired from TSM

Both backup and expiration take place when running mmbackup in the incremental backup mode. The backup process causes a high volume of metadata I/O. Therefore, use of SSD/flash devices for storing file system metadata is highly recommended for fast file system scans when using the mmbackup command for backup to TSM. These devices are also recommended when creating a backup solution using the GPFS policy engine with a data protection product from another vendor.

Retention and immutability
To prevent files from being changed or deleted unexpectedly, GPFS provides immutability and appendonly restrictions. You can apply immutability and appendonly restrictions either to individual files within a fileset or to a directory.
A retention period can be set on a file to prevent its deletion before the retention period expires. An immutable file cannot be changed or renamed. An appendonly file allows append operations, but not delete, modify or rename operations. An immutable directory cannot be deleted or renamed, and files cannot be added or deleted under such a directory. An appendonly directory allows new files or subdirectories to be created with 0-byte length; all such newly created files and subdirectories are marked as appendonly automatically. In big data environments, write workloads tend to be mostly append. The GPFS appendonly restriction can be useful in some business environments to ensure authenticity of the data by protecting against inadvertent updates or deletion.
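A brief sketch of applying these restrictions with mmchattr is shown below; the file paths are hypothetical, and the -i (immutable) and -a (appendonly) options should be verified against the mmchattr documentation for your GPFS level.

# Mark a closed log file appendonly: existing content cannot be modified or deleted,
# but new records can still be appended
mmchattr -a yes /gpfs/bigfs/audit/2013-06.log

# Mark a reference data set immutable: it cannot be changed, renamed or deleted
mmchattr -i yes /gpfs/bigfs/reference/codes.csv

# Display the immutability and appendonly flags
mmlsattr -L /gpfs/bigfs/reference/codes.csv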

Understanding data flow and workload
Driving optimal performance from your resources requires understanding the workload to ensure the application data access pattern is aligned with the file system's storage layout. For instance, any temporary data that can be easily re-created does not need to be replicated. Hadoop consists of several different application components, and each may have different data storage patterns (file size, directory layout) and access patterns (random, sequential), as well as appropriate chunk sizes for MapReduce tasks.

Storage workload characteristics
To help you make informed decisions about the placement of data, Table 1 describes common Hadoop components along with their associated I/O workload characteristics and storage needs. The workload characteristics are described based on common production applications and might be different in your environment. Use this table as a guide to evaluate storage characteristics and to configure GPFS accordingly.

Flume (Data collection and aggregation); Sqoop (Bulk data transfer between Hadoop and relational databases); Misc. data sources (for example, web server, system log, application logs)
Workload characteristics: Data files: Large(1) files. Read: Does not apply. Write: Sequential append.
Comments: Given that this data represents source data, replication should be set to 3. Refer to the Ingesting data into GPFS clusters section to ensure replicas are striped across all data nodes for evenly distributed MapReduce jobs. Recommended chunk size 128 MB (block size = 1 MB, block group factor = 128).

Hive (Data warehouse for data summarization, query and analysis); Big SQL (Data summarization and query); Pig (Programming and query language)
Workload characteristics: Data files (application-specific): Large/medium(2). Read: Sequential. Write: Sequential.
Comments: If ingesting bulk data from other sources, refer to the Ingesting data into GPFS clusters section; otherwise, use the recommended default of WAD=1.

HBase (Real-time read and write database)
Workload characteristics: HBase data files (application-specific): Large/medium(2). Read: Random (~64 K). Write: Sequential (~64 K). HBase log files: Small(3). Read: Sequential. Write: Sequential ( bytes).
Comments: Data and log: Recommended chunk size 128 MB (block size = 1 MB, block group factor = 128). If ingesting bulk data from other sources, refer to the Ingesting data into GPFS clusters section; otherwise, use the recommended default of WAD=1.

Lucene (Text search)
Workload characteristics: Data files (index files): Read: Random. Write: Sequential. Data files (term/dictionary files): Read: Random. Write: Random.
Comments: No location affinity needed. Stripe over all available data nodes. Use a block group factor of 1 (default) to optimize access time to index files.

MapReduce
Workload characteristics: Data files: See application-specific entries. Intermediate and output files (application-specific): Read: Sequential (~64 K). Write: Sequential (~64 K).
Comments: If the data is temporary, use a single replica by using a fileset with replication set to 1, or use local file systems as per best practices.

Oozie (Work flow and job orchestration)
Workload characteristics: See comments.
Comments: Uses local file systems for configuration files. Application-specific JAR files are stored in GPFS. Use WAD=0 or follow the WAFG setting described in the Ingesting data into GPFS clusters section.

Chukwa (Monitoring large clustered systems)
Workload characteristics: See comments.
Comments: Chukwa uses HBase. Refer to HBase for details.

Table 1. Hadoop components with storage workload characteristics.
1. Large file size: 1 GB and above
2. Medium file size: 128 MB to 1 GB
3. Small file size: Under 128 MB

Setting up a GPFS-FPO cluster
IMPORTANT NOTES:
1. Do not copy and paste commands from this document directly, as hidden control characters could lead to unexplainable errors.
2. If using InfoSphere BigInsights Adaptive MapReduce, follow the steps under "If using InfoSphere BigInsights." Steps described under "If using Platform Symphony" should be followed only if Platform Symphony was installed explicitly and InfoSphere BigInsights is not being used.

Planning considerations for optimum performance
As you plan your GPFS-FPO cluster architecture, consider the following:
- For consistent performance, all cluster nodes should have a similar hardware configuration, including the number of processors, amount of memory, and the number and type of hard disks used by GPFS.
- For resiliency, a minimum of two racks with a minimum of two nodes per rack is recommended for a production deployment. This implies a minimum of four nodes in the cluster. If the rack configuration layout for failure group notation is being followed, an even number of nodes per rack is recommended to allow balanced grouping of nodes in rack halves.
- Prepare physical disks for GPFS consumption as follows: To obtain the best price-performance, do not use a hardware RAID configuration for disks used by GPFS for metadata and data. Ensure write-cache is enabled for all physical disks used by GPFS for storing data only. The disks used for metadata should have the write-cache disabled, since GPFS flushes metadata to the disk as necessary to ensure file system consistency.
- For Linux only: GPFS on Linux does not label disks in a way that is recognized by the operating system and other system utilities. To all other software running on GPFS nodes, direct-attached NSDs appear to be unused devices. As a result, GPFS data can be overwritten inadvertently, thus corrupting GPFS data. This often happens because the system utilities have no way to detect when a device contains GPFS data. To prevent this risk of data corruption, you should create a GUID Partition Table (GPT) and at least one partition on each disk to be used as a GPFS NSD. The example below shows disk /dev/sdy being configured with a single partition consuming the entire disk capacity.

[root@c150f1ap06 ~]# parted /dev/sdy print
/dev/sdy: unrecognised disk label
[root@c150f1ap06 ~]# parted /dev/sdy mklabel gpt
[root@c150f1ap06 ~]# parted -s -a optimal /dev/sdy mkpart primary 0% 100%
[root@c150f1ap06 ~]# parted /dev/sdy print
Model: IBM MBF2300RC (scsi)
Disk /dev/sdy: 300GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start  End    Size   Type     File system  Flags
        kB     300GB  300GB  primary

- If your analytics workload generates many temporary files smaller than 128 MB (refer to Table 1), it is strongly recommended that you use local file systems such as ext3 and ext4. You can partition the physical disks that are used for data (using fdisk or parted on Linux). Use the first 90 percent of the capacity for GPFS and leave the remaining 10 percent for a local file system. When creating multiple partitions on a disk, always use the first partition for GPFS to ensure GPFS and disk block boundaries are aligned. Create an ext3/ext4 file system independently on each of the second partitions. Do not use Linux Volume Manager to create a single file system combining all partitions, as that leads to performance degradation. To increase performance, use multiple local file systems for temporary data as opposed to a single file system on the entire disk. (A sketch of this layout follows this list.)
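The sketch below illustrates the 90/10 split described in the last item above: the first partition is reserved for GPFS and the second partition holds a local ext4 file system for temporary data. The device name /dev/sdz and the mount point are hypothetical.

[root@c150f1ap06 ~]# parted /dev/sdz mklabel gpt
[root@c150f1ap06 ~]# parted -s -a optimal /dev/sdz mkpart primary 0% 90%
[root@c150f1ap06 ~]# parted -s -a optimal /dev/sdz mkpart primary 90% 100%
[root@c150f1ap06 ~]# mkfs.ext4 /dev/sdz2
[root@c150f1ap06 ~]# mkdir -p /hadoop/local1
[root@c150f1ap06 ~]# mount /dev/sdz2 /hadoop/local1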

Metadata storage
To optimize performance and reliability, metadata nodes and disks should be planned carefully. Having too many nodes with metadata disks reduces overall reliability, while having too few nodes affects resiliency. Use the following guidelines to assign metadata nodes and disks:
- A GPFS cluster should have a minimum of four nodes with metadata disks. In a large cluster, designate one metadata node per rack. In a very small cluster with four or fewer nodes, assign metadata disks on each node.
- Each metadata node should have a minimum of two disks designated for metadata. Ensure metadata disks are distributed evenly across all metadata nodes for balanced workloads; for example, each of four nodes should have four SSD drives, as opposed to a mixed configuration of one node with eight drives, one with four drives and two nodes with two drives each.
- If you plan to use SSD for metadata, the system pool should include all SSD disks designated metadataonly. If you plan to use the same types of disks for metadata and data, it is still best to designate all disks in the system pool as metadataonly. However, this might result in larger space allocated for metadata than might be needed. If storage utilization is a concern, disks in the system pool can be designated as dataandmetadata for some performance trade-off. In that case, the file system block size must match the metadata block size. When using more than one pool, you must create GPFS policy rules to place selected data in the system pool, while directing data driving the MapReduce workload to the data pool that is FPO-enabled.

Identify the nodes in your cluster for the following GPFS node roles (a physical node can take on more than one role):
- Manager nodes: These nodes are part of the node pool from which nodes are selected for various management tasks within the cluster (some are described below). A minimum of one per rack is recommended. Provide additional memory of about 600 MB on these nodes to serve management functions.
- Primary and secondary cluster configuration server: It is best to use nodes in different racks for this purpose. Consider using nodes with disks holding metadata.
- Quorum node: Nodes in the configuration server and manager roles are expected to be more reliable in terms of availability and tend to get more attention from administrators. For quorum, use a minimum of one node per rack and a total count resulting in an odd number. A cluster-wide maximum of seven quorum nodes is recommended.

Determine the number of required file systems for your cluster. For better performance, a single file system is recommended for the entire cluster. With GPFS, you can partition the data and storage as needed using one file system. In some cases, you may want to have a small file system for testing purposes and one for production. Note that you can use only one file system at a time with a given Hadoop installation.

To optimize performance of a MapReduce workload, it is recommended that you use a larger inode size of 4,096 bytes (default 512 bytes), especially if the data files are smaller than 100 MB. This helps to improve performance of the JobTracker when it is mapping out tasks based on the location of data blocks, by avoiding the overhead of reading additional block addresses from the disk.

Block group factor (BGF)
A block group factor of 128 works well for most workloads. However, further tuning may be needed for specific workloads based on empirical measurements.
For example, in MapReduce environments, the file block size is set to 1 MB and the block group factor is set to 128, leading to an effective large block size of 128 MB. You should experiment with block group factor settings to determine the right block group factor or to validate the recommended settings for your environment. Using policy rules or the mmchattr command, the BGF can be applied to a specific file or a group of files (a sketch follows this list). Use the following guidelines:
- Identify the number of storage pools you plan to have. Storage pools are typically determined by the characteristics of the storage types. In a GPFS-FPO cluster, you will have a minimum of two storage pools: one for metadata (with standard block allocation) and one for data that is FPO-enabled.
- Carefully assign a failure group to each node in the cluster based on the failure boundaries for your rack configuration, power distribution and other factors. Typically, each node with disks should map to a unique failure group notation.
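For an individual file, the block group factor and write-affinity depth might be adjusted as sketched below; the long-form mmchattr options and the file path are assumptions to be verified against the mmchattr documentation for your GPFS level.

# Request 128 MB chunks (with a 1 MB file system block size) and first-replica locality
# for a file that will drive MapReduce scans
mmchattr --block-group-factor 128 --write-affinity-depth 1 /gpfs/bigfs/warehouse/orders.dat

# Confirm the resulting attributes
mmlsattr -L /gpfs/bigfs/warehouse/orders.dat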

Figure 4. Sample cluster configuration. The figure shows three racks (Rack No. 1, 2 and 3), each divided into a top half (1) and a bottom half (0) with two nodes per half. Each node is labeled with its node name and failure group, for example node n1_1_2 with FG:1,1,2. One node per rack holds SSD metadata disks; all nodes hold SAS data disks.

All nodes should have the same number of data disks with roughly equal capacity. If a node has fewer disks, as might be the case for nodes with metadata disks, ensure fewer MapReduce tasks are assigned to those nodes. If a node contains no data disks, no MapReduce tasks should be assigned to those nodes. For additional information on GPFS cluster planning, see the GPFS Concepts, Planning, and Installation Guide.

Sample GPFS-FPO configuration
Figure 4 describes a sample environment used as a basis to describe GPFS-FPO cluster planning and configuration in this section. The example system in Figure 4 has three racks, each containing four nodes. The nodes in each rack are logically grouped in top and bottom halves of the rack for topology specification. Each of the nodes has four SAS drives that are used for holding data. To simplify discussion, the number of drives used in this example is smaller than typically used in a real cluster. All nodes have the same hardware characteristics. One node in each rack has two SSD drives that are designated for metadata. It is sufficient to have one node per rack for metadata. To have three replicas for metadata, a minimum of three failure groups is required. For configurations with fewer racks, designate multiple nodes per rack for metadata storage.

Failure group notation is used to describe the node layout. In failure group notation, the first number is used to describe the rack number (1, 2 or 3), and the second number is used to describe the bottom (0) or top (1) half of the rack. The third number denotes the node's position (1 or 2) in the respective half of the rack. The failure group and the unique node name (used in commands and input data for cluster setup) are identified for each of the nodes. Note that metadata disks for a node have the same failure group as the data disks for that node. This configuration results in 12 failure group notations.

Installing and configuring GPFS clusters
GPFS-FPO was introduced with GPFS 3.5. It is highly recommended that you use GPFS 3.5 or later to leverage all of the GPFS-FPO functions described in this document. See the GPFS Concepts, Planning, and Installation Guide for directions specific to your operating system and hardware.

To create a cluster, the first step is to assign cluster roles as described earlier. For this sample configuration:

- Quorum nodes: A total of five quorum nodes are selected. These include the three metadata nodes (n1_0_2, n2_1_2, n3_0_2) and the two configuration server nodes (n1_1_2, n3_1_1). The configuration servers are located in different rack halves than the metadata nodes.
- Primary cluster configuration server: Node n1_1_2.
- Secondary cluster configuration server: Node n3_1_1. Note that the primary and secondary configuration server nodes are in different racks to spread out the failure points.
- Manager nodes: Use a subset of the quorum and configuration server nodes for the manager role. In this case, n1_0_2, n2_1_2 and n3_1_1 are used as the manager nodes.

For large clusters, it is easier to create a file with a list of nodes and their roles. The file gpfs-fpo-nodefile describes the nodes.

n1_0_1.ibm.com::
n1_0_2.ibm.com:quorum-manager:
n1_1_1.ibm.com::
n1_1_2.ibm.com:quorum:
n2_0_1.ibm.com::
n2_0_2.ibm.com::
n2_1_1.ibm.com::
n2_1_2.ibm.com:quorum-manager:
n3_0_1.ibm.com::
n3_0_2.ibm.com:quorum:
n3_1_1.ibm.com:quorum-manager:
n3_1_2.ibm.com::

The following mmcrcluster command is used to create a cluster named gpfs-fpo-cluster.

mmcrcluster -N gpfs-fpo-nodefile -p n1_1_2.ibm.com -s n3_1_1.ibm.com -C gpfs-fpo-cluster -A -r /usr/bin/ssh -R /usr/bin/scp

#Ensure the GPFS daemon is running on all nodes in the GPFS cluster.
mmgetstate -a

The -A option ensures that the GPFS daemon is started when a node is rebooted.
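Note that the mmgetstate check above reports the daemon as down until GPFS is started, and depending on your GPFS edition you typically must also designate a license type for each node before the daemon will run. The following is a minimal sketch, not part of the original configuration; it assumes that all nodes receive server licenses, so adjust the license designations (server, client or, on newer releases, fpo) to match your entitlement and node roles.

#Designate node licenses (assumption: server licenses on all nodes).
mmchlicense server --accept -N all
#Start the GPFS daemon cluster-wide, then confirm that every node reaches the active state.
mmstartup -a
mmgetstate -a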

Creating failure groups and storage pools
The next step is to define the storage pools along with the failure groups. All of this is accomplished with the mmcrnsd command, which takes an input file containing the storage pool definitions and the NSD configuration. Below is the stanza file fpo-poolfile for the example system. Refer to the GPFS documentation for additional details.

%pool: pool=system
       layoutmap=cluster
       blocksize=256k          #using a smaller block size for metadata
%pool: pool=fpodata
       layoutmap=cluster
       blocksize=1024k
       allowwriteaffinity=yes  #this option enables the FPO feature
       writeaffinitydepth=1    #place the 1st copy on disks local to the node writing the data
       blockgroupfactor=128    #yields a chunk size of 128 MB

#Disks in the system pool are defined for metadata use only
%nsd: nsd=n1_0_2_ssd_1 device=/dev/sdm servers=n1_0_2.ibm.com usage=metadataonly failuregroup=1,0,2 pool=system
%nsd: nsd=n1_0_2_ssd_2 device=/dev/sdn servers=n1_0_2.ibm.com usage=metadataonly failuregroup=1,0,2 pool=system
%nsd: nsd=n2_1_2_ssd_1 device=/dev/sdm servers=n2_1_2.ibm.com usage=metadataonly failuregroup=2,1,2 pool=system
%nsd: nsd=n2_1_2_ssd_2 device=/dev/sdn servers=n2_1_2.ibm.com usage=metadataonly failuregroup=2,1,2 pool=system
%nsd: nsd=n3_0_2_ssd_1 device=/dev/sdm servers=n3_0_2.ibm.com usage=metadataonly failuregroup=3,0,2 pool=system
%nsd: nsd=n3_0_2_ssd_2 device=/dev/sdn servers=n3_0_2.ibm.com usage=metadataonly failuregroup=3,0,2 pool=system

#Disks in the fpodata pool
%nsd: nsd=n1_0_1_disk1 device=/dev/sdb servers=n1_0_1.ibm.com usage=dataonly failuregroup=1,0,1 pool=fpodata
%nsd: nsd=n1_0_1_disk2 device=/dev/sdc servers=n1_0_1.ibm.com usage=dataonly failuregroup=1,0,1 pool=fpodata
%nsd: nsd=n1_0_1_disk3 device=/dev/sdd servers=n1_0_1.ibm.com usage=dataonly failuregroup=1,0,1 pool=fpodata
%nsd: nsd=n1_0_1_disk4 device=/dev/sde servers=n1_0_1.ibm.com usage=dataonly failuregroup=1,0,1 pool=fpodata
%nsd: nsd=n1_0_2_disk1 device=/dev/sdb servers=n1_0_2.ibm.com usage=dataonly failuregroup=1,0,2 pool=fpodata
%nsd: nsd=n1_0_2_disk2 device=/dev/sdc servers=n1_0_2.ibm.com usage=dataonly failuregroup=1,0,2 pool=fpodata
%nsd: nsd=n1_0_2_disk3 device=/dev/sdd servers=n1_0_2.ibm.com usage=dataonly failuregroup=1,0,2 pool=fpodata
%nsd: nsd=n1_0_2_disk4 device=/dev/sde servers=n1_0_2.ibm.com usage=dataonly failuregroup=1,0,2 pool=fpodata
%nsd: nsd=n1_1_1_disk1 device=/dev/sdb servers=n1_1_1.ibm.com usage=dataonly failuregroup=1,1,1 pool=fpodata
%nsd: nsd=n1_1_1_disk2 device=/dev/sdc servers=n1_1_1.ibm.com usage=dataonly failuregroup=1,1,1 pool=fpodata
%nsd: nsd=n1_1_1_disk3 device=/dev/sdd servers=n1_1_1.ibm.com usage=dataonly failuregroup=1,1,1 pool=fpodata
%nsd: nsd=n1_1_1_disk4 device=/dev/sde servers=n1_1_1.ibm.com usage=dataonly failuregroup=1,1,1 pool=fpodata
%nsd: nsd=n1_1_2_disk1 device=/dev/sdb servers=n1_1_2.ibm.com usage=dataonly failuregroup=1,1,2 pool=fpodata
%nsd: nsd=n1_1_2_disk2 device=/dev/sdc servers=n1_1_2.ibm.com usage=dataonly failuregroup=1,1,2 pool=fpodata
%nsd: nsd=n1_1_2_disk3 device=/dev/sdd servers=n1_1_2.ibm.com usage=dataonly failuregroup=1,1,2 pool=fpodata
%nsd: nsd=n1_1_2_disk4 device=/dev/sde servers=n1_1_2.ibm.com usage=dataonly failuregroup=1,1,2 pool=fpodata

<<< define the disks in rack 2 and rack 3 similarly, in the fpodata pool >>>
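Before creating the NSDs, a few quick counts can catch copy-and-paste mistakes in a long stanza file. This is an illustrative sketch only; the expected counts are assumptions based on the sample configuration above (six SSD metadata disks and four data disks per node), so adjust them to your own layout.

#Sanity-check the stanza file before running mmcrnsd.
grep -c "^%nsd:" fpo-poolfile              #total number of NSD definitions in the file
grep -c "usage=metadataonly" fpo-poolfile  #metadata disks; 6 are expected in this example
grep -c "usage=dataonly" fpo-poolfile      #data disks; 4 per node are expected in this example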

In the stanza file, NSD names include the host server name for easier illustration. Use the mmcrnsd command to create the storage pools and NSDs. You can view the NSDs created using the mmlsnsd command. Refer to the GPFS documentation for details on the storage pool and NSD stanza format.

mmcrnsd -F fpo-poolfile
mmlsnsd -m

Creating a file system
A file system can now be created using the newly created NSDs. The mmcrfs command is used to create the file system with an inode size of 4,096 bytes. Refer to the GPFS documentation for details on the various options. The following commands create a file system named bigfs with three copies for metadata and data, and with options set for optimized performance (-E, -S). They also mount the file system on all nodes in the cluster.

mmcrfs bigfs -F fpo-poolfile -A yes -i 4096 -m 3 -M 3 -n 32 -r 3 -R 3 -S relatime -E no
mmmount bigfs /mnt/bigfs -N all   #the /mnt/bigfs directory must exist on all nodes

In this example, the FPO feature has already been enabled for the storage pool fpodata in the storage pool and NSD configuration. The storage pool stanza includes the desired FPO-specific attributes, including the block sizes for the system and data pools. You can view the storage pool and NSD configuration of the newly created file system using the mmlspool and mmlsdisk commands. The -S option specifies the atime update mode. A value of relatime suppresses periodic updating of atime and reduces the performance overhead caused by frequent atime updates triggered by file read operations. If using InfoSphere BigInsights: the installer creates the file system with an -S setting of yes, which suppresses atime updates altogether. You can use the mmchfs command to change this value.

mmlspool bigfs all -L
mmlsdisk bigfs -L

Creating an initial placement policy
Before any data can be placed in this file system, it is necessary to create a data placement policy. By default, GPFS places all data in the system pool unless a policy specifies otherwise. Best practices for FPO require separating the data and metadata disks into different storage pools. This allows the use of a smaller block size for metadata and a larger block size for data. In the sample configuration, a different class of storage (SSD) is being used for metadata, so it is best to create a separate pool. When you have more than one pool, you need to define a placement policy. The following rules set attributes for .tmp files and direct all other data blocks to the fpodata pool defined in the sample configuration. Note that granular (per-file) policies with different FPO attributes can be added to this default policy as needed. Use the following steps:

Create a policy file named bigfs.pol containing the placement rules.

$ cat bigfs.pol
RULE 'bgf' SET POOL 'fpodata' REPLICATE(1) WHERE NAME LIKE '%.tmp' AND setbgf(4) AND setwad(1)
RULE 'default' SET POOL 'fpodata'

The first rule places all files with the extension .tmp in the fpodata pool using a chunk factor of four blocks, with the write-affinity depth set to 1 and the data replication count set to 1 for these temporary files. The second rule is the default, placing all other files in the fpodata pool using the attributes specified at the storage pool level. This rule applies to all files that do not match any prior rule and must be specified as the last rule. The first rule is optional and is included here only for demonstration purposes.

Install the policy using the mmchpolicy command. The policies that are currently in effect can be listed using the mmlspolicy command.
mmchpolicy bigfs bigfs.pol

Use the GPFS command mmlsattr -L <path to a file in GPFS> to check these settings for individual files. Note that if an attribute is not listed in the output of mmlsattr, the settings provided at the pool or file system level are being used.
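As noted above, more granular per-file rules can be layered on top of the default rule. The sketch below is illustrative only and is not part of the original configuration: the file-name pattern and the block group factor and write-affinity depth values are arbitrary assumptions, and any such rule must appear before the final 'default' rule in bigfs.pol. The mmchpolicy -I test option validates a policy file without installing it.

RULE 'ingest' SET POOL 'fpodata' WHERE NAME LIKE '%.seq' AND setbgf(256) AND setwad(2)   /* hypothetical rule for large sequential ingest files */

#Validate the edited policy file first, then install it and list the rules now in effect.
mmchpolicy bigfs bigfs.pol -I test
mmchpolicy bigfs bigfs.pol
mmlspolicy bigfs -L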

For additional details, refer to the GPFS Advanced Administration Guide > Information Lifecycle Management for GPFS > Policies and rules.

Tuning GPFS configuration for FPO
GPFS-FPO clusters have a different architecture than traditional GPFS deployments. Therefore, the default values of many configuration parameters are not suitable for FPO deployments and should be changed; use the FPO tuning guidance in the GPFS documentation, together with the examples below, to update your cluster configuration settings. If using InfoSphere BigInsights, many of these parameters are already set to values optimal for an FPO configuration, but some parameters are not set by the InfoSphere BigInsights installer or need to be tuned further for your cluster.

Use the mmchconfig and mmlsconfig commands to change and list the current value of any given parameter, for example, to change readreplicapolicy to the local setting. You can specify multiple configuration parameters in a single command, as shown below.

#shows the current setting of the readreplicapolicy parameter
mmlsconfig | grep -i readreplicapolicy
#changes the value to local. The -i option makes the change immediate and persistent across node reboots.
mmchconfig readreplicapolicy=local -i
mmchconfig maxstatcache=100000,maxfilestocache=<value>

Replace the parameter names and values as necessary. Simply run these commands on one node in the cluster, and GPFS propagates the changes to all other nodes.

Note: Some parameters, such as maxstatcache and maxfilestocache, do not take effect until GPFS is restarted. GPFS can be restarted using the following commands.

/usr/lpp/mmfs/bin/mmumount all -a
/usr/lpp/mmfs/bin/mmshutdown -a
/usr/lpp/mmfs/bin/mmstartup -a
#Ensure the GPFS daemon has started on all nodes in the GPFS cluster.
mmgetstate -a
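When tuning several parameters, it helps to capture their current values in one pass before and after the change. The loop below is a minimal sketch; the parameter list is illustrative (names mentioned in this section plus pagepool) and should be replaced with the parameters from your own tuning plan. Parameters still at their default values may not appear in mmlsconfig output at all.

#Show the current values of a few parameters of interest; extend the list as needed.
for param in readreplicapolicy maxstatcache maxfilestocache pagepool; do
    mmlsconfig | grep -i "^${param}" || echo "${param}: not listed (default value in effect)"
done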
