GINORMOUS SYSTEMS April 30–May 1, 2013 Washington, D.C. REINFORCEMENT PAPERS
Disruptive Change in Storage Technologies for Big Data
Dr. Garth Gibson

When considering the size requirements of a digital storage system, two metrics come to mind: the number of files it must hold and manage, and the relevant prefix for "byte" on the total storage system, perhaps giga, tera, peta, or exa. A greater number of files does not necessarily correspond to more data. For instance, high-performance computing on seismic data entails individual files that are measured in terabytes, but key value stores deal with a huge number of very small files of just a few bytes each. Each end of this size/quantity range presents challenges to the design of storage systems and their underlying technologies. RAID pioneer Garth Gibson discusses these and other drivers of change in the storage arena, along with the solutions that are rising to meet each need. Muddying the waters is the shifting economic model for disk drives: the limits on areal density of conventional drives make NAND flash solid-state storage a potentially attractive alternative, both as cheap, even disposable, slow memory when the number of small stored items is large, and as a replacement for disk drives once the density crossover point appears in the rearview mirror.

"I can always represent [files] that are getting bigger in a small amount of metadata, do a small amount of locking and a small amount of synchronization, as long as the thing we are operating on keeps getting bigger."

Gibson lays out the dual problem, using the example of storage to support high-performance computing. "The vast number of objects, even at Los Alamos National Labs operating supercomputers, are tiny," says Gibson, although he goes on to note that the capacity of the 100M-file system remains entirely ample until the largest files, which currently measure 4 TB per file, are taken into account.
The performance issues at Los Alamos are driven by dealing with objects that are a gigabyte or larger, but the management issues are typically driven by objects that are very, very small. The time-tested approach to large-file storage is object storage, which replaces fixed-size blocks with flexible-size objects. Whereas block-based storage addresses each block individually, entailing significant metadata overhead for large files and appreciable wasted storage capacity for small files, objects allow for better management. An object-based storage device collects logical blocks into an object that contains not only the data but also its attributes. This approach relieves the storage management system of directly tracking the block assignments for each file, conferring significant advantage and making it the solution chosen by the cloud and HPC communities. "We wrap up blocks into large files and pass around pointers to these large objects to access them," says Gibson. "And we leave metadata for accessing how they are stored inside the containers where they are stored to try and minimize the amount we have to synchronize on." Pioneered by Gibson, Panasas's object-based storage for the HPC environment is implemented as a chassis of ten blades in combination with a metadata-servicing unit and networking hardware. These elements make up a site's distributed file system, which presents transparently to the customer as a single file server. As with any distributed system, fault tolerance is essential to the practical use of object-based storage. Quarter-century-old RAID (redundant array of independent disks) remains the underpinning technology in this arena, although its modern implementation in software is where the action is. "Although RAID is old and dead, it is hardware RAID that is dead, while variations in software RAID are actually very innovative right now," says Gibson. The challenge is to manage systems as they scale out.
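The contrast between per-block and per-object metadata can be sketched in a few lines. This is a hypothetical illustration only, not Panasas's implementation: the class and method names are invented, and real object-storage devices manage layout on the device itself.

```python
# Minimal sketch: block-based addressing makes the host track one metadata
# entry per fixed-size block, while an object store wraps the data plus its
# attributes behind a single handle and hides the internal layout.

BLOCK_SIZE = 4096

class BlockStore:
    """Host-side metadata grows with file size: one entry per block."""
    def __init__(self):
        self.blocks = {}        # (file_id, block_index) -> bytes
        self.file_maps = {}     # file_id -> list of block indices

    def write(self, file_id, data):
        chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        self.file_maps[file_id] = list(range(len(chunks)))
        for idx, chunk in enumerate(chunks):
            self.blocks[(file_id, idx)] = chunk

    def metadata_entries(self, file_id):
        return len(self.file_maps[file_id])

class ObjectStore:
    """Host-side metadata is constant per object: one handle, attributes attached."""
    def __init__(self):
        self.objects = {}       # object_id -> (data, attributes)

    def put(self, object_id, data, **attributes):
        self.objects[object_id] = (data, attributes)

    def get(self, object_id):
        return self.objects[object_id][0]

data = b"x" * (10 * BLOCK_SIZE)            # a "large" 10-block file
bs = BlockStore(); bs.write("f1", data)
store = ObjectStore(); store.put("f1", data, owner="hpc", replicas=2)
print(bs.metadata_entries("f1"))           # 10 entries the host must track
print(len(store.objects))                  # 1 handle; layout lives in the device
```

As the file grows, the block map grows with it, while the object handle stays a single pointer, which is the property Gibson is exploiting when he says big files need only "a small amount of metadata."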
The largest Panasas deployments currently entail 8 PB of storage and 500K metadata operations per second, with bandwidths in the GB/s range. Managed as a distributed system, Panasas's solution reliably tracks components for proper operation and performs smooth failover when problems arise. "As long as the object model works, this is primarily an issue of the distributed system," says Gibson. "How many things in there can fail? Can I keep an image of what's working and not working? And can I keep a consistency and failover strategy in place?" While not an easy set of attributes to master, Gibson sees this as attainable, and, like Google's Spanner, Panasas places high value on maintaining consistency. With his model, individual clients no longer feature hardware RAID; instead, RAID is performed over individual pieces of each file, and the RAID over these pieces is distributed through the network to the storage components. That is, each client is held responsible for creating redundancy for its own data, ensuring scalability, while the distributed nature of the system and direct writing from client to the parallel storage system provide the speed necessary for consistency, provided the network uses large buffers to accommodate nonuniform traffic flows. "I have scaled out all of the RAID computation, all of the reliability computation, and pushed it out to the client nodes, which scale out with the total amount of the system, and I flow this out at the speed of the network in parallel," says Gibson. "This scales." Reconstruction, however, poses its own set of considerations. As disk capacity grows 40% year on year, the size of the failure unit scales up commensurately, particularly if recovery is pegged at the node level rather than the disk level. Unless the reliability of individual components improves dramatically, the inevitable result is more frequent failure and a heavier load on the storage system's recovery mechanism. Being a realist, Gibson knows he must prepare for the worst and therefore assumes the need to accommodate a media error during the reconstruction process itself. "You need to be able to protect against two failures from the beginning," he says. The solution: parallelize recovery. Hardware RAID does not permit this, leading to the shift to software RAID. "The way this works for our system is that individual files pick random locations, they grow out, they get set on some stripe, they calculate a RAID code, while other files are allocated to completely different places," explains Gibson. "Over time what you get is RAID sets that are parts of the files distributed over failure domains, and they are not aligned. It is not like a RAID set is ten wide on ten devices; a RAID set is ten wide on 1000 devices, picked at random." Reconstruction occurs through parallel reading, while parallel writing across the free space persists the full corpus reliably, provided the metadata capacity remains sufficient. This latter requirement is tantamount to file sizes growing reliably larger; when data consists of a barrage of small files, the problem becomes one of metadata inundation, as described below.

"NAND flash becomes double buffering. It becomes cheap, slow memory to offset the fault-tolerance strategy that disks are currently holding. It is about dumping memory really quickly into a copy that then gets dribbled into the storage, because the capacity is still in the storage."

Before addressing the problem of small files, however, Gibson raises an issue that plagues the HPC community: the need for fast networks to move the scientific community's large files. Exascale systems are coming, raising the bar yet higher on network communications and data management.
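The declustered placement Gibson describes, ten-wide RAID sets scattered at random over a thousand devices with parity computed at the client, can be sketched as follows. This is a simplified model for illustration, not Panasas's actual layout algorithm; the stripe counts and the failed-device choice are arbitrary.

```python
import random
from functools import reduce

random.seed(0)
NUM_DEVICES, STRIPE_WIDTH, NUM_STRIPES = 1000, 10, 5000

# Each stripe's ten members are drawn at random from the full device pool,
# so no two stripes need share the same set of "RAID partners".
stripes = [random.sample(range(NUM_DEVICES), STRIPE_WIDTH)
           for _ in range(NUM_STRIPES)]

# The client computes redundancy for its own data: XOR parity over nine
# hypothetical data pieces yields the stripe's tenth member.
pieces = [bytes([i] * 8) for i in range(1, STRIPE_WIDTH)]
parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), pieces)

# When one device fails, the stripes touching it are spread across many
# surviving devices, so reconstruction reads proceed in parallel rather
# than hammering nine fixed partners.
failed = 42
affected = [s for s in stripes if failed in s]
peers = {d for s in affected for d in s} - {failed}
print(len(affected), "stripes to repair, readable in parallel from",
      len(peers), "devices")
```

With classic ten-wide RAID on ten fixed devices, only nine peers could serve reconstruction reads; here the repair load fans out across hundreds of devices, which is what makes recovery time shrink as the system scales.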
Another way of looking at system size is to consider the number of nodes, which is on track to grow to fully 1M nodes within a decade. "There is no particular reason to assume that each node is going to get more reliable, so failure rates are going to go way up," says Gibson, making effective recovery systems all the more necessary. Roughly halving the time to complete a memory dump, to 300 seconds by 2018, becomes the target. "If the failures are happening more commonly, then I have to have a shorter time period between checkpoints; and if I have a shorter time period between checkpoints, in order to keep the cycles on my computation, I have to have a fault-tolerance strategy that executes faster, which means I have to dump memory in less time." Carrying this logic forward, network speed must rise to carry memory-borne data away from points of failure and to a safe haven. "I need a moderate amount of capacity, but a phenomenal amount of bandwidth," says Gibson, who is addressing this need by using solid-state flash to rapidly accept this memory dump and serve as a form of inexpensive memory that can then trickle data out to disk on a more relaxed time scale. This hybrid NAND-flash-plus-disk approach is the most cost-effective solution for the combined bandwidth and storage requirement.

[Figure: Using NAND flash as checkpoint memory saves systems from frequent component failure. The compute cluster performs a fast write to checkpoint memory, which is then slowly written out to the disk storage devices.]

This use of NAND flash as memory increases both its cost and its value. Currently, raw solid-state NAND exceeds the cost of disk storage by a factor of ten; wrap some processing around it to make it useful in the memory capacity just described, and the cost goes up by another order of magnitude, but the economics of memory follows a different trajectory than that of storage. "An SSD device is a little computer with a microcontroller and some DRAM in there to hide the characteristics of the flash, and then you sell that component based on how smart your controller is, and right now that gives a lot of value to the customer, and there are large margins in this space," details Gibson. This makes for an appealing solution for fast-responding consumer devices, but poses a problem in the large-systems space because of the electronic properties at the materials level. "This NAND flash stuff is moving dozens of electrons through an insulator into a floating gate, and that floating gate changes the voltage you need to apply to it in order to get current flow, and you can sense how much you have done," says Gibson. "That solid-state technology works operating on large chunks, disk-like chunks, where block sizes for erase will be in megabytes, and writing will be in page-size units, and those units are going to go up with the density of the NAND flash." It is the job of the controller to make the block sizes transparent to the user, as well as to overlay the technology with a workaround for the limited number of writes that NAND flash can handle before the material degrades, as Gibson explains: "We try to hide that amount of wear on any one page by mapping and remapping things constantly." Given that the rewrite capacity for the lowest-cost, so-called triple-level cell NAND flash is roughly 500 cycles, its utilization model in at-scale systems is likely to be similar to a printer's toner cartridge; that is, a consumable storage component. The discussion thus far pertains to systems in which large files determine capacity, such as HPC.
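The constant remapping Gibson mentions, the controller's workaround for limited write endurance, can be sketched as a toy wear-leveling table. This is a deliberately minimal model with invented names, not any real flash translation layer, which must also handle erase blocks, garbage collection, and power loss.

```python
# Toy wear-leveling sketch: every rewrite of a logical page is redirected to
# the least-worn free physical page, so a "hot" logical address does not burn
# out a single physical page.

class TinyFTL:
    def __init__(self, physical_pages):
        self.map = {}                          # logical page -> physical page
        self.wear = [0] * physical_pages       # program counts per physical page
        self.free = set(range(physical_pages))

    def write(self, logical, data):
        if logical in self.map:                # old copy becomes reusable
            self.free.add(self.map[logical])
        target = min(self.free, key=lambda p: self.wear[p])
        self.free.discard(target)
        self.wear[target] += 1
        self.map[logical] = target

ftl = TinyFTL(physical_pages=8)
for _ in range(32):
    ftl.write(0, b"hot data")                  # one hot logical page, 32 rewrites
print(max(ftl.wear), min(ftl.wear))            # prints: 4 4
```

Without remapping, one physical page would absorb all 32 writes; with it, the 32 writes spread evenly, 4 per page, which is exactly how a part rated for roughly 500 cycles is stretched into a usable, if consumable, component.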
Other applications, however, inherently entail a huge number of tiny files. In finance, for instance, 20% of files measure just a couple of kilobytes, while fewer than 10% are greater than 1 MB; key value stores are even more skewed toward small files. Just as finance requires a different storage system setup than HPC, key value stores require yet another approach. Gibson considers each in turn. To satisfy the characteristic file distribution found in the finance community, Panasas configures systems with a more storage-centric use of SSDs, where NAND flash not only serves a memory function but also steps in on the storage side to accept small files, while leaving disk storage available for the behemoths. "We are combining a storage unit (in our case it is a blade) with some disks and some SSD, with variable configuration sizes," says Gibson. "And you choose those configurations based on the workloads." When file sizes become truly small, the metadata describing the data becomes appreciable, even relative to the absolute volume of raw data. "We are increasingly getting main memory full of data structures that can get to the actual data you need, when it is small and random, in a small number of fetches," says Gibson, reducing the potential to use SSDs as storage instead of memory. Consider some numbers: if the files in storage are small (1 KB) photos, as with Facebook thumbnails, 4 GB of index in memory will represent 1 TB of data. If, however, the data are tweets (168 bytes each), then it will take 24 GB of index to represent the same 1 TB. Take this down further to the scale of data deduplication hashes (a mere 32 bytes each), and the memory requirement becomes a massive 125 GB of DRAM for single-step SSD lookup. At this scale, the need for efficient index lookup is paramount.
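The arithmetic behind those figures is worth making explicit. The quoted numbers are consistent with roughly 4 bytes of in-memory index per stored item; that per-entry size is an inference from the figures, not a number stated in the talk.

```python
# Reproducing the index-size arithmetic: index memory per TB of data grows
# inversely with item size, at an assumed 4 bytes of index per item.

TB = 10**12
GB = 10**9
BYTES_PER_INDEX_ENTRY = 4   # assumption inferred from the quoted figures

for name, item_size in [("1 KB photo", 1000), ("tweet", 168), ("dedup hash", 32)]:
    entries = TB // item_size                      # items needed to fill 1 TB
    index_bytes = entries * BYTES_PER_INDEX_ENTRY  # DRAM consumed by the index
    print(f"{name}: {index_bytes / GB:.0f} GB of index per TB of data")
```

The loop reproduces the 4 GB, 24 GB, and 125 GB figures from the text, which is why shrinking the per-entry index cost, as SILT does, matters so much at key-value-store scale.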
Of the various approaches to configuring a key value store index, Gibson sees SILT (small index, large table) as the most promising, given that it requires less than one byte of in-memory index per entry, retrieves each value in a single SSD read, and functions well on systems built from low-power processors. This system brings with it the challenge of replumbing the operating system, which is otherwise unable to handle the unusually fast access rate. With these various architectural changes afoot to accommodate the scope of data, and with NAND flash taking on an increasingly prominent role, disk storage remains the economical medium of choice, at least for the time being. However, it must continue to increase in capacity to keep pace with demand. Gibson describes three disk drive technologies that are competing for acclaim in the next couple of generations of areal densification: heat-assisted magnetic recording (HAMR), bit-patterned media (BPM), and, nearest term, shingled-track disks. The heat assist of HAMR disks refers to the need to locally heat the magnetic medium to coax its extra-small bits into changing orientation. "We need to make grains of magnetic orientation smaller and smaller," says Gibson, "and use materials that resist change more strongly," so that the superparamagnetic limit is pushed further out. A microlaser on the write head that introduces enough heat to raise the temperature of the medium by a couple of hundred degrees Celsius does the trick, preparing the high-coercivity disk material to flip orientation. "Most of the disk companies are working on this, but it is taking a long time," says Gibson, who recognizes the process engineering challenge of manufacturing laser-augmented heads that travel over the disk surface with nanometer-scale tolerances. "If [magnetic-disk drive manufacturers] don't continue to improve areal density, their customers don't have any reason to buy from them at all," he says, highlighting the business challenge for traditional players on the hardware side. Bit-patterned media will be later in coming than heat-assisted technology, but Gibson sees it as the expected follow-on. In this case, nanolithography lays down cells, each capable of storing a single bit. This technology is even further out than HAMR. With engineering and fabrication challenges hampering the release dates of HAMR and BPM, respectively, Gibson anticipates the nearest term update to disk drive technology will be shingling. The current mode of laying down tracks on magnetic storage media is for each track to consume roughly 40 nm of width, with at least 5 nm between adjacent tracks to minimize crosstalk. Instead of this unshingled configuration, the new mode entails a wider swath for writing each track, but with substantial overlap between adjacent tracks. With unshingled tracks, the margins on both sides of each track must be clean, with no data laid down between tracks; in contrast, only one side of each shingled track must have a clean margin, with the read process always occurring close to that edge, as illustrated.

[Figure: Unshingled (left) vs. shingled (right) tracks; w and w' indicate the write width, g is the gap between tracks, and r is the read width for the shingled tracks.]

The same area of disk can therefore accommodate more tracks, as progress demands.
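The density gain from shingling can be estimated from the track geometry. The 40 nm track width and 5 nm gap come from the text; the 20 nm read width below is a purely illustrative assumption, since the talk does not quote one.

```python
# Illustrative track-pitch arithmetic. Unshingled tracks consume their full
# write width plus a gap; shingled tracks overlap until only the readable
# edge plus its clean margin remains.

write_width_nm = 40   # from the text
gap_nm = 5            # from the text
read_width_nm = 20    # assumed value for illustration only

unshingled_pitch = write_width_nm + gap_nm   # 45 nm per track
shingled_pitch = read_width_nm + gap_nm      # 25 nm per track under this assumption

density_gain = unshingled_pitch / shingled_pitch
print(f"~{density_gain:.1f}x more tracks in the same area")
```

Under these assumed numbers the same platter area holds roughly 1.8 times as many tracks; the real gain depends on how narrow the read head and residual margin can be made.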
Although it increases density, the overlap inherent in shingled tracks poses the conundrum of not being able to rewrite tracks at will. "It will change the system model," says Gibson. "The system model will be that the disk will not be able to rewrite individual sectors, or it will be expensive because of having to pick up an entire set of tracks and write them back down, and that will take tens or hundreds of seconds to do." To avoid this potentially fatal flaw, shingled disks, when they appear, may well be fitted with a microcontroller, as the NAND flash in SSDs is, to force writes to always proceed sequentially and to perform periodic defragmentation to eliminate holes. The technology to do this is well established in the SSD space, yet Gibson worries that the cost of implementation might be too great for shingled disks to succeed in the marketplace. "We're not going to be able to pay for it," says Gibson, where the "it" is the controller, which, recall, boosted the cost of SSDs by an order of magnitude over raw NAND flash. The technology is well understood, but it comes at a cost. "We're only keeping up with the areal density, but we're not providing you any more benefit than you expected, so you're not going to pay more. If disks stop decreasing in dollar per gigabyte, then the cost of an SSD becomes less of a disadvantage, and maybe SSDs put disks away." This alone might spur a change in application interface, although the industry is mounting a passionate resistance. "The world is going to divide up into people who treat NAND flash as slow, cheap memory and those who treat it as very fast but expensive disks; and the disk world, if it survives, will do it by making itself even bigger, even slower, and even more awkward."

Dr. Garth Gibson, Co-Founder and Chief Scientist, Panasas
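The sequential-write-plus-defragmentation scheme described for shingled disks can be sketched as a tiny log-structured zone. This is a hypothetical model, not any vendor's firmware: writes only append, a remap table points reads at the newest copy, and defragmentation rewrites the live data to reclaim holes.

```python
# Toy log-structured shingled zone: in-place rewrites are forbidden, so every
# write appends, and periodic defragmentation compacts the surviving data.

class ShingledZone:
    def __init__(self):
        self.log = []       # append-only physical layout
        self.remap = {}     # logical sector -> index of newest copy in the log

    def write(self, sector, data):
        self.remap[sector] = len(self.log)   # never rewrite in place
        self.log.append((sector, data))

    def read(self, sector):
        return self.log[self.remap[sector]][1]

    def defragment(self):
        # Collect only live copies, then rewrite the zone sequentially.
        live = [(s, self.log[i][1]) for s, i in sorted(self.remap.items())]
        self.log, self.remap = [], {}
        for sector, data in live:
            self.write(sector, data)

z = ShingledZone()
z.write(7, b"old"); z.write(7, b"new"); z.write(3, b"abc")
print(len(z.log))                  # 3 entries; the stale copy of sector 7 is a hole
z.defragment()
print(len(z.log), z.read(7))       # 2 entries left; reads still see the newest data
```

The expensive step is exactly the one Gibson flags: defragmentation rewrites whole swaths of tracks, which is why the controller, and its cost, becomes unavoidable.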
Dr. Garth Gibson's work at Panasas covers large-scale parallelism in computer systems and its implications for application performance, operating system design, fault tolerance, and data center manageability. Panasas is a scalable storage-cluster company using an object-storage architecture and providing hundreds of terabytes of high-performance storage in a single management domain. Dr. Gibson also concentrates on secondary memory system technologies; parallel and distributed file systems; and local-, storage-, and system-area networking. While working on his Ph.D., Garth co-wrote the seminal Berkeley RAID paper. He is also on the faculty at Carnegie Mellon University, where he founded the Parallel Data Laboratory. Garth also formed the Network-Attached Storage Device working group of the National Storage Industry Consortium, led storage systems research at the Data Storage Systems Center, and founded the Petascale Data Storage Institute for the Department of Energy's Scientific Discovery through Advanced Computing program. Garth's contributions to computer storage have been recognized with the 1999 IEEE Reynold B. Johnson Information Storage Award for outstanding contributions in the field of information storage, inclusion in the hall of fame of the ACM Special Interest Group for Operating Systems, and the 2012 Jean-Claude Laprie Award in Dependable Computing from the IFIP Working Group 10.4 on Dependable Computing and Fault Tolerance.
DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION A DIABLO WHITE PAPER AUGUST 2014 Ricky Trigalo Director of Business Development Virtualization, Diablo Technologies
More information1 Storage Devices Summary
Chapter 1 Storage Devices Summary Dependability is vital Suitable measures Latency how long to the first bit arrives Bandwidth/throughput how fast does stuff come through after the latency period Obvious
More informationGoogle File System. Web and scalability
Google File System Web and scalability The web: - How big is the Web right now? No one knows. - Number of pages that are crawled: o 100,000 pages in 1994 o 8 million pages in 2005 - Crawlable pages might
More informationImprove Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database
WHITE PAPER Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Executive
More informationPerformance Beyond PCI Express: Moving Storage to The Memory Bus A Technical Whitepaper
: Moving Storage to The Memory Bus A Technical Whitepaper By Stephen Foskett April 2014 2 Introduction In the quest to eliminate bottlenecks and improve system performance, the state of the art has continually
More informationObject Storage: A Growing Opportunity for Service Providers. White Paper. Prepared for: 2012 Neovise, LLC. All Rights Reserved.
Object Storage: A Growing Opportunity for Service Providers Prepared for: White Paper 2012 Neovise, LLC. All Rights Reserved. Introduction For service providers, the rise of cloud computing is both a threat
More informationData Storage Industry: Global Trends, Developments and Opportunities. Spectacular Global Growth
Data Storage Industry: Global Trends, Developments and Opportunities Spectacular Global Growth The data storage industry is currently growing at a healthy rate. The holiday pictures that you now put on
More informationAll-Flash Arrays: Not Just for the Top Tier Anymore
All-Flash Arrays: Not Just for the Top Tier Anymore Falling prices, new technology make allflash arrays a fit for more financial, life sciences and healthcare applications EXECUTIVE SUMMARY Real-time financial
More informationBusiness-centric Storage FUJITSU Hyperscale Storage System ETERNUS CD10000
Business-centric Storage FUJITSU Hyperscale Storage System ETERNUS CD10000 Clear the way for new business opportunities. Unlock the power of data. Overcoming storage limitations Unpredictable data growth
More informationWilliam Stallings Computer Organization and Architecture 7 th Edition. Chapter 6 External Memory
William Stallings Computer Organization and Architecture 7 th Edition Chapter 6 External Memory Types of External Memory Magnetic Disk RAID Removable Optical CD-ROM CD-Recordable (CD-R) CD-R/W DVD Magnetic
More informationOutline. Database Management and Tuning. Overview. Hardware Tuning. Johann Gamper. Unit 12
Outline Database Management and Tuning Hardware Tuning Johann Gamper 1 Free University of Bozen-Bolzano Faculty of Computer Science IDSE Unit 12 2 3 Conclusion Acknowledgements: The slides are provided
More informationComprehending the Tradeoffs between Deploying Oracle Database on RAID 5 and RAID 10 Storage Configurations. Database Solutions Engineering
Comprehending the Tradeoffs between Deploying Oracle Database on RAID 5 and RAID 10 Storage Configurations A Dell Technical White Paper Database Solutions Engineering By Sudhansu Sekhar and Raghunatha
More informationISTANBUL AYDIN UNIVERSITY
ISTANBUL AYDIN UNIVERSITY 2013-2014 Academic Year Fall Semester Department of Software Engineering SEN361 COMPUTER ORGANIZATION HOMEWORK REPORT STUDENT S NAME : GÖKHAN TAYMAZ STUDENT S NUMBER : B1105.090068
More information1 / 25. CS 137: File Systems. Persistent Solid-State Storage
1 / 25 CS 137: File Systems Persistent Solid-State Storage Technology Change is Coming Introduction Disks are cheaper than any solid-state memory Likely to be true for many years But SSDs are now cheap
More informationRAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29
RAID Redundant Array of Inexpensive (Independent) Disks Use multiple smaller disks (c.f. one large disk) Parallelism improves performance Plus extra disk(s) for redundant data storage Provides fault tolerant
More informationTechnology Insight Series
Evaluating Storage Technologies for Virtual Server Environments Russ Fellows June, 2010 Technology Insight Series Evaluator Group Copyright 2010 Evaluator Group, Inc. All rights reserved Executive Summary
More informationHigh Performance Server SAN using Micron M500DC SSDs and Sanbolic Software
High Performance Server SAN using Micron M500DC SSDs and Sanbolic Software White Paper Overview The Micron M500DC SSD was designed after months of close work with major data center service providers and
More informationHPC Advisory Council
HPC Advisory Council September 2012, Malaga CHRIS WEEDEN SYSTEMS ENGINEER WHO IS PANASAS? Panasas is a high performance storage vendor founded by Dr Garth Gibson Panasas delivers a fully supported, turnkey,
More informationFlash-optimized Data Progression
A Dell white paper Howard Shoobe, Storage Enterprise Technologist John Shirley, Product Management Dan Bock, Product Management Table of contents Executive summary... 3 What is different about Dell Compellent
More informationMaxDeploy Ready. Hyper- Converged Virtualization Solution. With SanDisk Fusion iomemory products
MaxDeploy Ready Hyper- Converged Virtualization Solution With SanDisk Fusion iomemory products MaxDeploy Ready products are configured and tested for support with Maxta software- defined storage and with
More informationTechnology Insight Series
HP s Information Supply Chain Optimizing Information, Data and Storage for Business Value John Webster August, 2011 Technology Insight Series Evaluator Group Copyright 2011 Evaluator Group, Inc. All rights
More informationOPTIMIZING VIDEO STORAGE AT THE EDGE OF THE NETWORK
White Paper OPTIMIZING VIDEO STORAGE AT THE EDGE OF THE NETWORK Leveraging Intelligent Content Distribution Software, Off-the-Shelf Hardware and MLC Flash to Deploy Scalable and Economical Pay-As-You-Grow
More informationData Storage Technology Update
Data Storage Technology Update Hal Woods Vice President and Chief Architect HGST Elastic Storage Platforms April 15, 2015 I have some bad news for you and good news for me You are a data hoarder, an addict
More informationStorage Options for Document Management
Storage Options for Document Management Document management and imaging systems store large volumes of data, which must be maintained for long periods of time. Choosing storage is not simply a matter of
More informationSpeed and Persistence for Real-Time Transactions
Speed and Persistence for Real-Time Transactions by TimesTen and Solid Data Systems July 2002 Table of Contents Abstract 1 Who Needs Speed and Persistence 2 The Reference Architecture 3 Benchmark Results
More informationCOS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (http://www.cs.princeton.edu/courses/cos318/)
COS 318: Operating Systems Storage Devices Kai Li Computer Science Department Princeton University (http://www.cs.princeton.edu/courses/cos318/) Today s Topics Magnetic disks Magnetic disk performance
More informationWill They Blend?: Exploring Big Data Computation atop Traditional HPC NAS Storage
Will They Blend?: Exploring Big Data Computation atop Traditional HPC NAS Storage Ellis H. Wilson III 1,2 Mahmut Kandemir 1 Garth Gibson 2,3 1 Department of Computer Science and Engineering, The Pennsylvania
More informationTechnology Update White Paper. High Speed RAID 6. Powered by Custom ASIC Parity Chips
Technology Update White Paper High Speed RAID 6 Powered by Custom ASIC Parity Chips High Speed RAID 6 Powered by Custom ASIC Parity Chips Why High Speed RAID 6? Winchester Systems has developed High Speed
More informationDELL SOLID STATE DISK (SSD) DRIVES
DELL SOLID STATE DISK (SSD) DRIVES STORAGE SOLUTIONS FOR SELECT POWEREDGE SERVERS By Bryan Martin, Dell Product Marketing Manager for HDD & SSD delltechcenter.com TAB LE OF CONTENTS INTRODUCTION 3 DOWNFALLS
More information2009 Oracle Corporation 1
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,
More informationFAS6200 Cluster Delivers Exceptional Block I/O Performance with Low Latency
FAS6200 Cluster Delivers Exceptional Block I/O Performance with Low Latency Dimitris Krekoukias Systems Engineer NetApp Data ONTAP 8 software operating in Cluster-Mode is the industry's only unified, scale-out
More informationData Center Storage Solutions
Data Center Storage Solutions Enterprise software, appliance and hardware solutions you can trust When it comes to storage, most enterprises seek the same things: predictable performance, trusted reliability
More informationAccelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software
WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications
More informationTHE CEO S GUIDE TO INVESTING IN FLASH STORAGE
THE CEO S GUIDE TO INVESTING IN FLASH STORAGE EXECUTIVE SUMMARY Flash storage is every data center s version of a supercharged sports car. Nothing beats it in speed, efficiency, and handling though it
More informationMicrosoft Windows Server Hyper-V in a Flash
Microsoft Windows Server Hyper-V in a Flash Combine Violin s enterprise-class storage arrays with the ease and flexibility of Windows Storage Server in an integrated solution to achieve higher density,
More informationComparison of NAND Flash Technologies Used in Solid- State Storage
An explanation and comparison of SLC and MLC NAND technologies August 2010 Comparison of NAND Flash Technologies Used in Solid- State Storage By Shaluka Perera IBM Systems and Technology Group Bill Bornstein
More informationMaginatics Cloud Storage Platform for Elastic NAS Workloads
Maginatics Cloud Storage Platform for Elastic NAS Workloads Optimized for Cloud Maginatics Cloud Storage Platform () is the first solution optimized for the cloud. It provides lower cost, easier administration,
More informationTaking Linux File and Storage Systems into the Future. Ric Wheeler Director Kernel File and Storage Team Red Hat, Incorporated
Taking Linux File and Storage Systems into the Future Ric Wheeler Director Kernel File and Storage Team Red Hat, Incorporated 1 Overview Going Bigger Going Faster Support for New Hardware Current Areas
More informationTechnologies Supporting Evolution of SSDs
Technologies Supporting Evolution of SSDs By TSUCHIYA Kenji Notebook PCs equipped with solid-state drives (SSDs), featuring shock and vibration durability due to the lack of moving parts, appeared on the
More informationCSCA0102 IT & Business Applications. Foundation in Business Information Technology School of Engineering & Computing Sciences FTMS College Global
CSCA0102 IT & Business Applications Foundation in Business Information Technology School of Engineering & Computing Sciences FTMS College Global Chapter 2 Data Storage Concepts System Unit The system unit
More informationAmazon Cloud Storage Options
Amazon Cloud Storage Options Table of Contents 1. Overview of AWS Storage Options 02 2. Why you should use the AWS Storage 02 3. How to get Data into the AWS.03 4. Types of AWS Storage Options.03 5. Object
More informationFlash Memory Technology in Enterprise Storage
NETAPP WHITE PAPER Flash Memory Technology in Enterprise Storage Flexible Choices to Optimize Performance Mark Woods and Amit Shah, NetApp November 2008 WP-7061-1008 EXECUTIVE SUMMARY Solid state drives
More informationPhysical Data Organization
Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor
More informationRAID Levels and Components Explained Page 1 of 23
RAID Levels and Components Explained Page 1 of 23 What's RAID? The purpose of this document is to explain the many forms or RAID systems, and why they are useful, and their disadvantages. RAID - Redundant
More informationMagFS: The Ideal File System for the Cloud
: The Ideal File System for the Cloud is the first true file system for the cloud. It provides lower cost, easier administration, and better scalability and performance than any alternative in-cloud file
More informationHow To Improve Performance On A Single Chip Computer
: Redundant Arrays of Inexpensive Disks this discussion is based on the paper:» A Case for Redundant Arrays of Inexpensive Disks (),» David A Patterson, Garth Gibson, and Randy H Katz,» In Proceedings
More informationAmazon EC2 Product Details Page 1 of 5
Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Functionality Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of
More informationIntroduction to the Mathematics of Big Data. Philippe B. Laval
Introduction to the Mathematics of Big Data Philippe B. Laval Fall 2015 Introduction In recent years, Big Data has become more than just a buzz word. Every major field of science, engineering, business,
More informationData Distribution Algorithms for Reliable. Reliable Parallel Storage on Flash Memories
Data Distribution Algorithms for Reliable Parallel Storage on Flash Memories Zuse Institute Berlin November 2008, MEMICS Workshop Motivation Nonvolatile storage Flash memory - Invented by Dr. Fujio Masuoka
More informationBest Practices for Optimizing Your Linux VPS and Cloud Server Infrastructure
Best Practices for Optimizing Your Linux VPS and Cloud Server Infrastructure Q1 2012 Maximizing Revenue per Server with Parallels Containers for Linux www.parallels.com Table of Contents Overview... 3
More informationCisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database
Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Built up on Cisco s big data common platform architecture (CPA), a
More information