Maginatics File System: The Ideal File System for the Cloud

The Maginatics File System is the first true file system for the cloud. It provides lower cost, easier administration, and better scalability and performance than any alternative in-cloud file system. It is the only storage solution that enables organizations to take full advantage of the economics, agility and elasticity of the cloud for running scale-out workloads.

Inhibitors to Cloud Adoption

The cloud is the ideal platform for running the scale-out workloads found in a range of disciplines from Life Sciences (e.g., genomic analysis) to Media and Entertainment (e.g., video rendering). Even large-scale Web farms can benefit from in-cloud operation. The reason, quite simply, is the effectively infinite compute and storage resources available in the cloud.

To run these applications in the cloud, organizations either rewrite them or, more often, use a legacy file system backed by volumes of block storage. At scale, these legacy file system approaches are costly to build and maintain because of the number of compute nodes dedicated to the storage cluster, the requirement to pre-allocate capacity, the need to use block storage rather than more cost-effective object-based storage, and the inefficiency of the local and network RAID required for data protection.

The Maginatics File System delivers the scalability and performance of cloud computing without trade-offs or compromises and eliminates the price premium that accompanies the use of legacy file systems for these deployments.

Translating Traditional Storage Architectures to the Cloud

Most legacy file systems replicate, in the cloud, the same storage configuration used in physical data centers. This forces system administrators to go through capacity and performance planning just as they would with a traditional storage system, limiting one key benefit of the cloud.
In addition, the fault tolerance and data protection mechanisms required by these file systems increase cost by reducing data efficiency and, for certain workloads, can adversely affect system performance. Costs of alternative solutions are also driven by the need for a cluster of compute nodes just to stand up the storage cluster and by the relatively expensive block storage that must be pre-allocated whether it is used or not.

The Maginatics File System has a significant cost advantage over these systems because of the small compute footprint dedicated to the storage cluster, the use of less expensive, on-demand capacity, and the built-in reliability of object storage, which eliminates the need for local or network RAID. The following figures provide examples of the cost difference between the Maginatics File System and a leading alternative open source solution over a six-month period:

[Figure 1 chart: 6 Month Cost, 50TB Workload. Series: Maginatics File System, Other DFS (6TB/node), Other DFS (0.6TB/node). Data labels: $709,657, $85,956, $158,478.]
Figure 1: Cost comparison of the Maginatics File System and a leading open source solution including cloud, support and license expenses for a 50TB workload

[Figure 2 chart: 6 Month Cost, 15TB Workload. Series: Maginatics File System, Other DFS (6TB/node), Other DFS (0.6TB/node). Data labels: $214,352, $35,1, $51,681.]
Figure 2: Cost comparison of the Maginatics File System and a leading open source solution including cloud, support and license expenses for a 15TB workload
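
The 6TB/node and 0.6TB/node labels in the figures above hint at why node density matters for a conventional distributed file system: the less capacity each storage node holds, the more compute nodes the storage cluster needs. The following is a minimal sketch of that arithmetic in Python. It assumes decimal units (1TB = 1,000GB) and ignores replication and RAID overhead, which would raise the counts further. The function name `storage_nodes_needed` is hypothetical and is not part of any product.

```python
def storage_nodes_needed(workload_gb: int, capacity_per_node_gb: int) -> int:
    """Return the minimum number of storage nodes that can hold the workload.

    Uses integer ceiling division so a partially filled node still counts.
    Replication and RAID overhead are deliberately ignored (assumption).
    """
    if workload_gb < 0 or capacity_per_node_gb <= 0:
        raise ValueError("workload must be >= 0 and per-node capacity must be > 0")
    return -(-workload_gb // capacity_per_node_gb)


GB_PER_TB = 1000  # decimal units, an assumption for this sketch

if __name__ == "__main__":
    for workload_tb in (50, 15):             # workload sizes from Figures 1 and 2
        for per_node_gb in (6000, 600):      # 6TB/node and 0.6TB/node
            nodes = storage_nodes_needed(workload_tb * GB_PER_TB, per_node_gb)
            print(f"{workload_tb}TB at {per_node_gb}GB/node: at least {nodes} nodes")
```

Under these assumptions, a 50TB workload needs at least 9 storage nodes at 6TB/node but 84 at 0.6TB/node, which is consistent with the lower-density configuration being the more expensive one to run.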
With scale-out workloads moving to the cloud, there is a need for new storage architectures that take full advantage of its benefits, especially elasticity. The Maginatics File System addresses that need head-on.

Another inhibitor to migrating workloads to the cloud is the time required to move data to the cloud in the first place. The Maginatics File System not only provides a superior in-cloud file system, it also accelerates initial data injection thanks to its native WAN optimization capabilities and distributed architecture.

The Benefits of the Maginatics File System vs. Alternative Distributed File Systems in the Cloud

The benefits of the Maginatics File System versus alternative solutions can be summarized as follows:

No scale-performance trade-off. The agent on each worker node provides consistent, unabated access to all data in the underlying shared object storage capacity pool. Because all worker nodes have their own native connectivity to the object store yet maintain a consistent view of the namespace, adding more worker nodes, more data or both does not impact data access.

The following figures provide examples of the aggregate throughput of the Maginatics File System and a leading alternative open source solution, GlusterFS, with multiple concurrent clients. The GlusterFS cluster consists of three replicated nodes in a 4x50GB RAID0 configuration.

[Figure 3 chart: Multi-Client Write, Small Files, 4 Clients x (1,000 x 1MB). Aggregate bandwidth (MB/sec) by block size (64 KB, 256 KB, 512 KB, 1024 KB) for the Maginatics File System and GlusterFS, with the percentage advantage at each block size.]
Figure 3: Multi-client throughput comparison between the Maginatics File System and a leading open source solution, GlusterFS, for small files.
[Figure 4 chart: Multi-Client Write, Large Files, 4 Clients x (2 x 5GB x 2). Aggregate bandwidth (MB/sec) by block size (64 KB, 256 KB, 512 KB, 1024 KB) for the Maginatics File System and GlusterFS, with the percentage advantage at each block size.]
Figure 4: Multi-client throughput comparison between the Maginatics File System and a leading open source solution, GlusterFS, for large files.

Up to 98.5% data efficiency without sacrificing reliability. The inherent data durability afforded by object storage obviates the need for data striping schemes that impact performance and reduce data efficiency. For example, a public in-cloud storage cluster built with a legacy, block-based file system and replication across nodes can reduce data efficiency by more than 60%. This means that for a terabyte of raw capacity, the actual usable capacity is under 400GB. With the Maginatics File System, essentially each terabyte of raw capacity is usable. That is because of its efficient metadata-to-data storage ratio.

Elastic storage capacity. With the Maginatics File System, you can add worker nodes as needed without having to reconfigure a storage cluster to add capacity. As new compute nodes are added, they immediately see a file system that is fully consistent across all nodes and scales with the capacity of the underlying object storage.

Optimized data injection. The Maginatics File System enables organizations to accelerate data injection into the cloud via its native WAN optimization capabilities and distributed architecture. This is a major advantage versus alternative solutions, where the time needed to move data into the cloud is often an inhibitor to cloud adoption. The ability of the Maginatics File System to accelerate data over distance also accounts for the efficient access it provides to geographically distributed clients, e.g., in hybrid cloud and bursting scenarios.
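
The data efficiency figures quoted above reduce to simple arithmetic. The following minimal Python sketch works through them for one terabyte of raw capacity. It assumes decimal units (1TB = 1,000GB) and expresses efficiency in parts per thousand to keep the arithmetic exact. The function name `usable_gb` is hypothetical, and the 40% efficiency used for the legacy case is the best case implied by "more than 60%" lost, not a measured value.

```python
def usable_gb(raw_gb: int, efficiency_per_mille: int) -> int:
    """Return usable capacity in GB for a given raw capacity.

    efficiency_per_mille is data efficiency in parts per thousand,
    so 985 means 98.5% of raw capacity holds user data.
    """
    if not 0 <= efficiency_per_mille <= 1000:
        raise ValueError("efficiency must be between 0 and 1000 per mille")
    return raw_gb * efficiency_per_mille // 1000


RAW_GB = 1000  # one terabyte of raw capacity, decimal units (assumption)

# Legacy block-based cluster: at least 60% of raw capacity lost to
# replication, so 40% efficiency is an upper bound on what is usable.
legacy_upper_bound_gb = usable_gb(RAW_GB, 400)

# Object-storage-backed file system at the quoted "up to 98.5%" efficiency.
object_backed_gb = usable_gb(RAW_GB, 985)

if __name__ == "__main__":
    print(f"Legacy cluster: under {legacy_upper_bound_gb}GB usable per TB raw")
    print(f"Object-backed: up to {object_backed_gb}GB usable per TB raw")
```

On these numbers, each raw terabyte yields under 400GB of usable space in the replicated legacy cluster and up to 985GB at 98.5% efficiency.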
Conclusion

Compared to alternative in-cloud file systems, the Maginatics File System reduces expenses, eases or eliminates administrative overhead, enhances scalability and improves performance for scale-out workloads running in the cloud. By delivering the full advantages of cloud economics, agility and elasticity for running these workloads, the Maginatics File System can improve an organization's efficiency, productivity and profitability while lowering its risk profile.

XNU-87
Maginatics, Inc.
info@maginatics.com
(800) 360-1620 or (650) 265-1659
www.maginatics.com