MagFS: The Ideal File System for the Cloud


MagFS is the first true file system for the cloud. It provides lower cost, easier administration, and better scalability and performance than any alternative in-cloud file system. MagFS is the only storage solution that enables organizations to take full advantage of the economics, agility and elasticity of the cloud for running scale-out workloads.

Inhibitors to Cloud Adoption

The cloud is the ideal platform for running the scale-out workloads found in a range of disciplines, from Life Sciences (e.g., genomic analysis) to Media and Entertainment (e.g., video rendering). Even large-scale Web farms can benefit from in-cloud operation. The reason, quite simply, is the effectively infinite compute and storage resources available in the cloud.

To run these applications in the cloud, organizations either rewrite them or, more often, use a legacy file system backed by volumes of block storage. At scale, these legacy file system approaches are costly to build and maintain because of:

- The number of compute nodes dedicated to the storage cluster
- The requirement to pre-allocate capacity
- The need to use block storage rather than more cost-effective object-based storage
- The inefficiency of the local and network RAID required for data protection

Maginatics File System (MagFS) delivers the scalability and performance of cloud computing without trade-offs or compromises, and it eliminates the price premium that accompanies the use of legacy file systems for these deployments.

Translating Traditional Storage Architectures to the Cloud

Most legacy file systems replicate, in the cloud, the same storage configuration used in physical data centers. This forces system administrators to go through capacity and performance planning just as they would with a traditional storage system, which limits one key benefit of the cloud.
In addition, the fault tolerance and data protection mechanisms required by these file systems increase cost by reducing data efficiency and, for certain workloads, can adversely affect system performance. Costs of alternative solutions are also driven by the need for a cluster of compute nodes just to stand up the storage cluster, and by the relatively expensive block storage that must be pre-allocated whether it is used or not.

MagFS boasts a significant cost advantage over these systems because of the small compute footprint dedicated to the storage cluster, the use of less expensive, on-demand capacity, and the built-in reliability of object storage, which eliminates the need for local or network RAID. The following figures provide examples of the cost difference between MagFS and a leading alternative open source solution over a six-month period:

[Bar chart: 6-month cost, 50TB workload. Series: MagFS, Other DFS (6TB/node), Other DFS (0.6TB/node). Data labels in the source: $85,956, $158,478, $709,657.]
Figure 1: Cost comparison of MagFS and a leading open source solution, including cloud, support and license expenses, for a 50TB workload

[Bar chart: 6-month cost, 15TB workload. Series: MagFS, Other DFS (6TB/node), Other DFS (0.6TB/node). Data labels in the source: $35,1 (truncated), $51,681, $214,352.]
Figure 2: Cost comparison of MagFS and a leading open source solution, including cloud, support and license expenses, for a 15TB workload

With scale-out workloads moving to the cloud, there is a need for new storage architectures that take full advantage of its benefits, especially elasticity. MagFS addresses that need head-on.

Another inhibitor to migrating workloads to the cloud is the time required to move data to the cloud in the first place. MagFS not only provides a superior in-cloud file system, it also accelerates initial data injection thanks to its native WAN optimization capabilities and distributed architecture.

The Benefits of MagFS vs. Alternative Distributed File Systems in the Cloud

The benefits of MagFS versus alternative solutions can be summarized as follows:

No scale-performance trade-off. The MagFS agent on each worker node provides consistent, unabated access to all data in the underlying shared object storage capacity pool. Because every worker node has its own native connectivity to the object store yet maintains a consistent view of the namespace, adding more worker nodes, more data or both does not degrade data access.

The following figures provide examples of the aggregate throughput of MagFS and a leading alternative open source solution, GlusterFS, with multiple concurrent clients. The GlusterFS cluster consists of three replicated nodes in a 4x50GB RAID0 configuration.

[Bar chart: Multi-Client Write, Small Files, 4 Clients x (1,000 x 1MB). Aggregate bandwidth (MB/sec) by block size (64 KB, 256 KB, 512 KB, 1024 KB) for MagFS and GlusterFS. Advantage labels in the source range from 123% to 304%.]
Figure 3: Multi-client throughput comparison between MagFS and a leading open source solution, GlusterFS, for small files.

[Bar chart: Multi-Client Write, Large Files, 4 Clients x (2 x 5GB x 2). Aggregate bandwidth (MB/sec) by block size (64 KB, 256 KB, 512 KB, 1024 KB) for MagFS and GlusterFS. Advantage labels in the source range from 284% to 382%.]
Figure 4: Multi-client throughput comparison between MagFS and a leading open source solution, GlusterFS, for large files.

Up to 98.5% data efficiency without sacrificing reliability. The inherent data durability afforded by object storage obviates the need for data striping schemes that impact performance and reduce data efficiency. For example, a public in-cloud storage cluster built with a legacy, block-based file system and replication across nodes can reduce data efficiency by more than 60%. This means that for a terabyte of raw capacity, the actual usable capacity is under 400GB. With MagFS, essentially each terabyte of raw capacity is usable, because of its efficient metadata-to-data storage ratio.

Elastic storage capacity. With MagFS, you can add worker nodes as needed without having to reconfigure a storage cluster to add capacity. As new compute nodes are added, they immediately see a file system that is fully consistent across all nodes and that scales with the capacity of the underlying object storage.

Optimized data injection. MagFS enables organizations to accelerate data injection into the cloud via its native WAN optimization capabilities and distributed architecture. This is a major advantage over alternative solutions, where the time needed to move data into the cloud is often an inhibitor to cloud adoption. The ability of MagFS to accelerate data over distance also accounts for the efficient access it provides to geographically distributed clients, e.g., in hybrid cloud and bursting scenarios.
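The data efficiency figures quoted above can be turned into a rough capacity estimate. The following is a minimal sketch, not a sizing tool: the function names are hypothetical, the 40% figure is the upper bound implied by "reduced by more than 60%", and the 98.5% figure is the best case quoted for MagFS. The 50TB workload size is borrowed from the cost comparison purely for illustration.

```python
# Hypothetical helpers illustrating the data-efficiency arithmetic in the text.
# Efficiency is the fraction of raw capacity that ends up usable (0 < e <= 1).

LEGACY_EFFICIENCY = 0.40   # assumption: upper bound after a >60% reduction
MAGFS_EFFICIENCY = 0.985   # "up to 98.5% data efficiency" (best case)


def usable_capacity_tb(raw_tb: float, efficiency: float) -> float:
    """Usable capacity obtained from a given amount of raw capacity."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return raw_tb * efficiency


def raw_capacity_needed_tb(usable_tb: float, efficiency: float) -> float:
    """Raw capacity that must be provisioned to store usable_tb of data."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return usable_tb / efficiency


for label, eff in (("legacy (upper bound)", LEGACY_EFFICIENCY),
                   ("MagFS (best case)", MAGFS_EFFICIENCY)):
    usable = usable_capacity_tb(1.0, eff)
    raw = raw_capacity_needed_tb(50.0, eff)
    print(f"{label}: {usable:.3f} TB usable per raw TB, "
          f"{raw:.1f} TB raw for a 50 TB workload")
```

Under these assumptions, the same 50TB data set needs roughly two and a half times as much raw capacity at 40% efficiency as it does at 98.5%, before any difference in per-terabyte pricing between block and object storage is considered.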

Conclusion

Compared to alternative in-cloud file systems, MagFS reduces expenses, eases or eliminates the burden of administrative overhead, enhances scalability and improves performance for scale-out workloads running in the cloud. By delivering the full advantages of cloud economics, agility and elasticity for running these workloads, MagFS can improve an organization's efficiency, productivity and profitability while lowering its risk profile.

XNU-87
Maginatics, Inc.
info@maginatics.com
(800) 360-1620 or (650) 265-1659
www.maginatics.com