The Advantages and Disadvantages of Network Computing Nodes
- Lucas Cooper
- 3 years ago
Transcription
Big Data & Scripting: storage networks and distributed file systems

In the remainder we use networks of computing nodes to enable computations on even larger datasets.
- for a computation, each node works on the part of the dataset that is locally available to it
- computing nodes hold a partial, local copy of the whole dataset
- an optimal scenario distributes the data in advance
- nodes are used in parallel for storage and computations
General setting:
- nodes are connected by a network
- each node has external memory (e.g. a hard disk)
- in addition: internal memory and computing capacity
In this part we consider only the storage and distribution of data.

Design issues for storage networks
- Space and access balance: even distribution of data to machines
- Availability: implement redundancy and tolerance for data loss
- Resource efficiency: use resources in a useful way (don't waste space)
- Access efficiency: provide fast access to stored data
- Heterogeneity: integrate different types of hardware
- Adaptivity: storage of growing amounts of data
- Locality: minimize the degree of communication for data access

Storage networks: model
- n nodes $N_1, \ldots, N_n$
- node $N_i$ has capacity $C_i$
- total capacity: $S = \sum_{i=1}^{n} C_i$, i.e. space for S blocks in total
- blocks stored on $N_i$: $F_i$ (filling state)
- nodes are connected by a network: $N_i$ can send data to $N_j$ for arbitrary i, j
- data is accessed by users from outside:
  - retrieve a set of blocks (for now)
  - retrieve the result of an operation on a set of blocks (later)

Balancing problem
Consider a simplified scenario with $C_i$ constant, i.e. all nodes have the same capacity, and distribute m blocks to n nodes subject to:
- minimize $\sum_i \left| F_i - m/n \right|$ (close to equal distribution)
- minimize $\max_i F_i$ (minimize the maximal load)
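
The two objectives can be evaluated for any concrete list of fill states. The following is a minimal sketch, assuming that the first objective is the total absolute deviation from m/n; the function name `imbalance` is hypothetical and not part of the slides.

```python
def imbalance(fill: list[int]) -> tuple[float, int]:
    """Return (total absolute deviation from m/n, maximal load) for fill states F_i."""
    m, n = sum(fill), len(fill)
    deviation = sum(abs(f - m / n) for f in fill)  # assumed form of the first objective
    return deviation, max(fill)


if __name__ == "__main__":
    print(imbalance([2, 2, 2]))  # perfectly balanced distribution
    print(imbalance([4, 1, 1]))  # one overloaded node
```

A distribution is better under both criteria when both returned values are smaller.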

Striping
- all objects are combined into a single stream of data
- divide the data into blocks $B_i$
- divide the blocks into striping units $U_i$ of k blocks each
- store striping unit i on node $N_{(i \bmod n)}$ at position $i \operatorname{div} n$
[Figure: stripe units laid out across disks D1, D2, D3, ...; rows labelled Stripe 0 and Stripe 1; each stripe unit consists of blocks.]
Advantage: the units in one stripe can be read in parallel.
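
The addressing rule of striping can be written down directly. This is a small sketch under the assumption that blocks, units and nodes are all numbered from 0; the function name `locate_block` is hypothetical.

```python
def locate_block(block_index: int, n_nodes: int, k: int) -> tuple[int, int, int]:
    """Map a block index to (node, unit position on that node, offset inside the unit)."""
    unit = block_index // k       # striping unit U_i that contains the block
    node = unit % n_nodes         # unit i is stored on node N_(i mod n)
    position = unit // n_nodes    # ... at position i div n on that node
    offset = block_index % k      # place of the block inside its unit
    return node, position, offset


if __name__ == "__main__":
    # 3 nodes, striping units of k = 2 blocks each
    for b in range(8):
        print(b, locate_block(b, n_nodes=3, k=2))
```

Because the address is a pure function of the block index, no lookup table is needed, which is the "simple addressing" property of striping.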

Striping: size of the striping unit k?
Assumptions:
- operations tend to involve adjacent blocks (example: one big file, e.g. a large CSV table, spanning several blocks)
- several data accesses happen in parallel (e.g. different users using different files)
Small k:
- high bandwidth (access in parallel)
- many parallel accesses block each other
Large k:
- low bandwidth (most files reside on a single node)
- parallel accesses (to different files) are distributed among nodes
The choice of k depends only on the access structure and the average node performance [1].
[1] Chen, Patterson, Maximizing performance in a striped disk array, 1990

Striping: advantages/disadvantages
Advantages:
- perfectly balanced data distribution
- simple addressing/storage scheme
Disadvantages:
- modifying stored data (blocks):
  - block deletion yields holes (new data goes at the end or into the holes)
  - fragmentation (additional indexing)
- adding and removing nodes (machines):
  - addition could be solved by a new striping
  - removal leads to (partial) redistribution
Solutions exist, but striping is best suited for static scenarios.

Balancing: centralized approach
Idea:
- one central address and positioning node (master)
- the master coordinates all data access and knows the state of the nodes
- new blocks are stored on the nodes with the lowest filling state
- adding/removing storage nodes is straightforward
Data access:
- the client sends the operation to the server (read/write, add, delete)
- the server answers with the address of the node to interact with
- the operation is executed between client and node
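
The role of the master can be illustrated with a toy in-memory version. This is only a sketch: the class `Master` and its methods are hypothetical names, nodes are represented by their index, and the actual data transfer between client and node is left out.

```python
class Master:
    """Hypothetical central positioning node: tracks fill states and a block directory."""

    def __init__(self, n_nodes: int) -> None:
        self.fill = [0] * n_nodes             # filling state F_i of every storage node
        self.directory: dict[str, int] = {}   # block id -> node index

    def add(self, block_id: str) -> int:
        """Choose the node with the lowest filling state and record the placement."""
        node = min(range(len(self.fill)), key=self.fill.__getitem__)
        self.fill[node] += 1
        self.directory[block_id] = node
        return node

    def read(self, block_id: str) -> int:
        """Answer with the node the client has to contact."""
        return self.directory[block_id]

    def delete(self, block_id: str) -> int:
        """Forget the block and free its slot on the node."""
        node = self.directory.pop(block_id)
        self.fill[node] -= 1
        return node


if __name__ == "__main__":
    master = Master(n_nodes=3)
    for i in range(7):
        master.add(f"b{i}")
    print(master.fill, master.read("b4"))
```

Every operation passes through the single `directory`, which makes the bottleneck of this design visible.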

Centralized approach: advantages/disadvantages
Advantages:
- optimal data distribution can be guaranteed
- operations can be synchronized
Disadvantages:
- the address and positioning node is a bottleneck
- one centralized dictionary: block id → node
(We return to access schemes later.)

Balancing: distribution by hashing
- treat nodes as bins, use a hash function h() for the distribution
- write block b to node $N_{h(b)}$
- load factor $\alpha \gg 1$ (many blocks per node)
The balls-into-bins model:
- usual assumption in hashing: $\alpha < 1$, avoid collisions
- here: $\alpha \gg 1$, achieve a balanced distribution of blocks (balls) to nodes (bins)
- optimal distribution: m/n blocks (out of m) on each node (out of n)
Question: can we guarantee that the maximum number of elements in one bin is not too large?

Balancing: distribution by hashing
When the distribution of a hash function is used directly, the fill states of the bins tend to be unbalanced.
[Figure: bin fill state for m blocks in n = 100 bins; x-axis: bin, y-axis: blocks in bin, with m/n as reference.]
Experiment: distribute the blocks to 100 bins; expected fill state: 100 blocks per bin.

Balancing: distribution by hashing
The simple case: m elements, n bins, $m > n \log n$; assumption: $h(x)$ is uniformly distributed.
Then, with high probability:
- the expected number of elements in most bins $B_i$ is $m/n$
- there is a bin with load $m/n + \Theta\left(\sqrt{m \ln(n)/n}\right)$, i.e. with additional load beyond the optimal $m/n$
In a system with some parameter n, an event X appears with high probability if $P(X) \ge 1 - \frac{1}{n^{\alpha}}$ for some constant $\alpha > 0$. Similar cases are often denoted as $P(X) = 1 - o(1)$.
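
The imbalance of direct hashing is easy to observe in a small simulation. The sketch below is not the experiment from the slides: it replaces the hash function by a seeded pseudo-random generator, and the values m = 10000 and n = 100 are assumptions chosen so that m/n = 100. It prints the maximal load next to the standard balls-into-bins deviation term $\sqrt{m \ln(n)/n}$ for comparison.

```python
import math
import random


def throw_balls(m: int, n: int, seed: int = 0) -> list[int]:
    """Place m blocks into n bins uniformly at random and return the fill states."""
    rng = random.Random(seed)  # stands in for a uniformly distributed hash function
    fill = [0] * n
    for _ in range(m):
        fill[rng.randrange(n)] += 1
    return fill


if __name__ == "__main__":
    m, n = 10_000, 100  # assumed sizes, giving m/n = 100
    fill = throw_balls(m, n)
    print("optimal load m/n :", m / n)
    print("minimal load     :", min(fill))
    print("maximal load     :", max(fill))
    print("sqrt(m ln(n)/n)  :", round(math.sqrt(m * math.log(n) / n), 1))
```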

Balancing: greedy improvement
- the expected distribution O(m/n) in each node is good
- bins with higher load can block computations and data access
Improvement: greedy(d)
- for each block, choose $d \ge 2$ nodes $N_{i_1}, \ldots, N_{i_d}$
- find $b = \arg\min_{k \in \{1, \ldots, d\}} F_{i_k}$ (break ties arbitrarily)
- place the block in $N_{i_b}$
Example: consider bin h(b) and the bins to its left and right.
Retrieval: recalculate the candidate addresses and test all of them (in parallel).
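
A compact sketch of greedy(d), including retrieval, might look as follows. The way the d candidate nodes are derived (d salted SHA-256 hashes of the block id) is an assumption made for this example, and the names `candidates`, `greedy_insert` and `lookup` are hypothetical.

```python
import hashlib
from typing import Optional


def candidates(block_id: str, n: int, d: int) -> list[int]:
    """d candidate nodes for a block, derived from d salted hashes (assumed construction)."""
    result = []
    for j in range(d):
        digest = hashlib.sha256(f"{j}:{block_id}".encode()).digest()
        result.append(int.from_bytes(digest[:8], "big") % n)
    return result


def greedy_insert(block_id: str, fill: list[int], stored: list[set], d: int = 2) -> int:
    """Place the block on the candidate node with the lowest filling state."""
    nodes = candidates(block_id, len(fill), d)
    best = min(nodes, key=fill.__getitem__)  # ties: first candidate wins
    fill[best] += 1
    stored[best].add(block_id)
    return best


def lookup(block_id: str, stored: list[set], d: int = 2) -> Optional[int]:
    """Recalculate the candidate addresses and test all of them."""
    for node in candidates(block_id, len(stored), d):
        if block_id in stored[node]:
            return node
    return None


if __name__ == "__main__":
    n = 100
    fill, stored = [0] * n, [set() for _ in range(n)]
    for i in range(10_000):
        greedy_insert(f"block-{i}", fill, stored)
    print("max load:", max(fill), "min load:", min(fill))
```

Retrieval costs at most d probes, which is the price paid for the flatter load distribution.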

Balancing: greedy improvement
Experiment: comparing the default choice and the greedy improvement.
[Figure: bin fill states for m blocks in n = 100 bins (2 alternatives in greedy); series: direct, greedy; x-axis: bin, y-axis: blocks in bin, with m/n as reference.]
Each greedy insert uses the bin among $h(b) - 1$, $h(b)$, $h(b) + 1$ with minimal fill state.

Analysis of greedy(d) [2]
Theorem (maximal load): Insert m blocks into n nodes using greedy(d). Then, with high probability, $\max_i F_i$ is $\frac{\ln(\ln(n))}{\ln(d)} + \Theta(m/n)$.
Theorem (number of overloaded bins): Let $\gamma$ be a suitable constant. If m balls are distributed into n bins using strategy greedy(d), then with probability $> 1 - \frac{1}{n}$ at most $n \cdot \exp(-d^{i})$ bins have load $> \frac{m}{n} + i + \gamma$.
Consequences:
1. the maximal load is not too extreme
2. only few bins with much more than the optimal load exist
[2] cf. Berenbrink, Czumaj, Steger, Vöcking, Balanced allocations: The heavily loaded case, 2000

Heterogeneity
Implicit assumption above: $C_i = C_j$
- all nodes have equal capacities
- a useful assumption, but not realistic
Heterogeneity: arbitrary hardware for the nodes
- in general $C_i \ne C_j$ (differing capacities)
- load balancing is more complicated
- more freedom of hardware choice, e.g. upgrading with constantly larger nodes

Heterogeneity: virtual buckets
The hashing approach can be extended to heterogeneous settings by subdividing all node capacities into virtual buckets.
- choose the largest common storage unit C as the size of a virtual bucket
- real capacities $C_i$ should be approximately multiples of C: $C_i \approx k_i C$ with $k_i \in \mathbb{N}$
- every node $N_i$ is split into $k_i$ buckets, such that $K = \sum_i k_i$ (K is the total number of buckets)
- the hash function maps blocks to $\{1, \ldots, K\}$ (buckets)
- a second mapping $m: \{1, \ldots, K\} \to \{1, \ldots, N\}$ with $|m^{-1}(i)| = k_i$ maps the K buckets to the N nodes
- the number of buckets for each node corresponds to the node size
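
The two-level mapping (block to bucket, bucket to node) could be sketched like this. Indices start at 0 instead of 1, the block hash is taken from SHA-256, and the names `build_bucket_map` and `node_for_block` are hypothetical; all of these are choices made for the example only.

```python
import hashlib


def build_bucket_map(capacities: list[int], unit: int) -> list[int]:
    """Second mapping: bucket index -> node index, with k_i = C_i / C buckets per node."""
    bucket_to_node = []
    for node, capacity in enumerate(capacities):
        k_i = round(capacity / unit)  # capacities are assumed to be approx. multiples of C
        bucket_to_node.extend([node] * k_i)
    return bucket_to_node


def node_for_block(block_id: str, bucket_to_node: list[int]) -> int:
    """First mapping: hash the block to one of the K buckets, then look up its node."""
    digest = hashlib.sha256(block_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(bucket_to_node)
    return bucket_to_node[bucket]


if __name__ == "__main__":
    # three nodes with capacities 100, 200 and 400 blocks, bucket size C = 100
    mapping = build_bucket_map([100, 200, 400], unit=100)
    print(mapping)  # node 2 owns four of the K = 7 buckets
    print(node_for_block("block-42", mapping))
```

A node with four times the capacity owns four times as many buckets and therefore receives, in expectation, four times as many blocks.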

Availability: prevent data loss
Avoid loss of data, i.e. ensure that stored data remains available.
Motivational example:
- storage network with N uniform nodes
- the probability of a node failure within one month is p
- $P(\text{node survives a month}) = 1 - p$
- $P(N \text{ nodes survive } k \text{ months}) = (1 - p)^{N \cdot k}$
- the probability that no node fails decreases exponentially in the number of nodes and in time
Failures will happen eventually:
- they cannot be avoided with fail-safe hardware
- use redundancy to handle failures
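
Plugging numbers into the survival formula shows how quickly the probability of a failure-free system shrinks. The values p = 0.01, N = 100 and k = 12 below are made-up illustration values, and the function name is hypothetical.

```python
def survival_probability(p: float, n_nodes: int, months: int) -> float:
    """Probability that none of n_nodes fails during the given number of months."""
    return (1 - p) ** (n_nodes * months)


if __name__ == "__main__":
    # assumed example: 1% monthly failure probability per node
    for n_nodes in (1, 10, 100):
        print(n_nodes, "nodes, 12 months:", survival_probability(0.01, n_nodes, 12))
```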

Availability: implementing redundancy
Basic principle:
- store additional information (more than only the given data)
- use that information to recover in case of partial data loss
Two basic approaches:
- mirroring: store data elements several times
- parity codes: create additional information to recover missing bits

Availability: redundancy by mirroring
Idea (simple version):
- for each block, store r duplicates on different nodes
- failure rate for one node: p
- probability of losing a block: $p^r$
- problem: needs $r \cdot m$ space instead of m
When a node fails: create copies of all blocks on the failed node from their duplicates.
On update of blocks: update all duplicates.
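
The trade-off of mirroring can be made concrete with a few lines of code. The placement policy used here (r consecutive nodes starting at the block index modulo n) is an assumption for illustration and is not prescribed by the slides; the function names are hypothetical as well.

```python
def place_replicas(block_index: int, n_nodes: int, r: int) -> list[int]:
    """Return r distinct nodes for a block (assumed policy: r consecutive nodes)."""
    if r > n_nodes:
        raise ValueError("cannot place more replicas than there are nodes")
    return [(block_index + j) % n_nodes for j in range(r)]


def loss_probability(p: float, r: int) -> float:
    """A block is lost only if all r nodes holding a copy fail."""
    return p ** r


if __name__ == "__main__":
    print(place_replicas(block_index=5, n_nodes=4, r=3))
    for r in (1, 2, 3):
        print("r =", r, "loss probability:", loss_probability(0.01, r),
              "space factor:", r)
```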

Availability: parity codes
- assume a string of bits $s = s_1 s_2 s_3 \ldots s_n$
- parity: $p(s) = \sum_i s_i \bmod 2$, e.g. 0
- if one bit of s is lost, e.g. $s = s_1 x s_3 \ldots s_n$: was x = 1 or x = 0?
- use the parity of the available part $s'$ (s without x): $x = 0$ if $p(s) = p(s')$, and $x = 1$ otherwise
- one additional bit allows recovering one arbitrary lost bit
- this can be extended to larger amounts of missing bits; one example: Hamming code
- store additional bits instead of duplicates and restore on data loss
- often implemented on hardware level
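
The recovery rule for a single lost bit translates into a few lines. In this sketch a lost bit is marked with `None`, which is a representation chosen for the example; the function names are hypothetical.

```python
from typing import Optional


def parity(bits: list[int]) -> int:
    """Parity p(s): sum of all bits modulo 2."""
    return sum(bits) % 2


def recover(bits_with_gap: list[Optional[int]], stored_parity: int) -> list[int]:
    """Restore exactly one lost bit (marked as None) from the stored parity."""
    missing = bits_with_gap.index(None)
    known = [b for b in bits_with_gap if b is not None]
    # if the parity of the available part equals p(s), the lost bit was 0, else 1
    x = 0 if parity(known) == stored_parity else 1
    restored = list(bits_with_gap)
    restored[missing] = x
    return restored


if __name__ == "__main__":
    s = [1, 0, 1, 1, 0]
    p = parity(s)                      # stored in addition to the data
    damaged = [1, None, 1, 1, 0]       # second bit was lost
    print(recover(damaged, p) == s)
```

One parity bit only handles a single lost bit whose position is known; locating or correcting more errors requires stronger codes such as the Hamming code mentioned above.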

Adaptivity
Capacity is constantly extended by adding nodes.
- problem: rehashing for every new node is too expensive
- idea: adaptive hash function
  - hash function with adaptive range
  - a change of the range avoids total reorganization and rearranges only a (small) portion of the input values
  - when new nodes are added, only a few blocks have to be rearranged

Adaptivity: adaptive hashing
Basic idea:
- position the nodes in a space S
- for each block, determine a position in S by a hash function
- store the block on the nearest node
- find the nearest position for an arbitrary point by binary search
Adapting to new/removed nodes:
- remove/add points in the space
- reassign neighboring blocks
Problem:
- when a node is removed, all its blocks go to the neighbor(s)
- when a node is added, it takes a huge load from its neighbors
- refine by using multiple positions for each node

Adaptivity: adaptive hashing
- use the one-dimensional ring [0, 1) as space (distance using modulo)
- assign k positions to each node i: $P^i_1, \ldots, P^i_k$
- every block is mapped to a position in [0, 1) by the hash function h
Block positioning:
- determine the hash value h(b) for block b
- assign the block to the nearest node by position: $\arg\min_i \min_j \min\{\, |h(b) - P^i_j|,\; 1 - |h(b) - P^i_j| \,\}$
Adding a node:
- create new positions for the node
- reassign blocks from neighboring positions
Removing a node:
- reassign its blocks
- remove its positions, remove the node
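
The ring scheme can be tried out with a short sketch. Several details are assumptions for this example: positions are derived from SHA-256 (for nodes from the string "name#j"), the class and method names are hypothetical, and the sketch only computes assignments without moving any data.

```python
import bisect
import hashlib


def h(key: str) -> float:
    """Hash a key to a position in [0, 1) (48 bits, so the float result is exact)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:6], "big") / 2**48


class Ring:
    """Hypothetical ring [0, 1) holding several positions per node."""

    def __init__(self) -> None:
        self.points: list[tuple[float, str]] = []  # sorted (position, node) pairs

    def add_node(self, node: str, k: int = 8) -> None:
        """Create k positions for the node; more positions model a larger capacity."""
        for j in range(k):
            bisect.insort(self.points, (h(f"{node}#{j}"), node))

    def remove_node(self, node: str) -> None:
        """Drop all positions of the node; its blocks fall to neighboring positions."""
        self.points = [pt for pt in self.points if pt[1] != node]

    def node_for(self, block_id: str) -> str:
        """Nearest node on the ring; requires at least one node on the ring."""
        x = h(block_id)
        i = bisect.bisect_left(self.points, (x, ""))  # binary search
        right = self.points[i % len(self.points)]     # successor, wrapping around
        left = self.points[i - 1]                     # predecessor, wrapping around

        def ring_distance(position: float) -> float:
            d = abs(x - position)
            return min(d, 1 - d)

        return min((left, right), key=lambda pt: ring_distance(pt[0]))[1]


if __name__ == "__main__":
    ring = Ring()
    for name in ("A", "B", "C"):
        ring.add_node(name)
    blocks = [f"block-{i}" for i in range(1000)]
    before = {b: ring.node_for(b) for b in blocks}
    ring.add_node("D")
    moved = [b for b in blocks if ring.node_for(b) != before[b]]
    print("blocks moved after adding D:", len(moved), "of", len(blocks))
```

Only blocks whose nearest position now belongs to the new node change their assignment; all other blocks stay where they are.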

Adaptivity: adaptive hashing
- the points $P^i_j$ of node i can be determined by hash functions
- for each insertion, a search for the nearest point has to be done
- until now: homogeneous setting ($C_i$ constant)
- heterogeneous settings:
  - model different sizes by additional points
  - reflect capacity by a corresponding number of points
  - using the virtual buckets approach: large number of points
More informationFile Management. Chapter 12
Chapter 12 File Management File is the basic element of most of the applications, since the input to an application, as well as its output, is usually a file. They also typically outlive the execution
More informationChapter 9: Peripheral Devices: Magnetic Disks
Chapter 9: Peripheral Devices: Magnetic Disks Basic Disk Operation Performance Parameters and History of Improvement Example disks RAID (Redundant Arrays of Inexpensive Disks) Improving Reliability Improving
More informationLecture 1: Data Storage & Index
Lecture 1: Data Storage & Index R&G Chapter 8-11 Concurrency control Query Execution and Optimization Relational Operators File & Access Methods Buffer Management Disk Space Management Recovery Manager
More informationStorage and File Structure
Storage and File Structure Chapter 10: Storage and File Structure Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization of Records in Files
More informationSmartSync Backup Efficient NAS-to-NAS backup
Allion Ingrasys Europe SmartSync Backup Efficient NAS-to-NAS backup 1. Abstract A common approach to back up data stored in a NAS server is to run backup software on a Windows or UNIX systems and back
More informationHow To Virtualize A Storage Area Network (San) With Virtualization
A New Method of SAN Storage Virtualization Table of Contents 1 - ABSTRACT 2 - THE NEED FOR STORAGE VIRTUALIZATION 3 - EXISTING STORAGE VIRTUALIZATION METHODS 4 - A NEW METHOD OF VIRTUALIZATION: Storage
More informationA survey of big data architectures for handling massive data
CSIT 6910 Independent Project A survey of big data architectures for handling massive data Jordy Domingos - jordydomingos@gmail.com Supervisor : Dr David Rossiter Content Table 1 - Introduction a - Context
More informationSistemas Operativos: Input/Output Disks
Sistemas Operativos: Input/Output Disks Pedro F. Souto (pfs@fe.up.pt) April 28, 2012 Topics Magnetic Disks RAID Solid State Disks Topics Magnetic Disks RAID Solid State Disks Magnetic Disk Construction
More informationScalable Prefix Matching for Internet Packet Forwarding
Scalable Prefix Matching for Internet Packet Forwarding Marcel Waldvogel Computer Engineering and Networks Laboratory Institut für Technische Informatik und Kommunikationsnetze Background Internet growth
More informationA Deduplication-based Data Archiving System
2012 International Conference on Image, Vision and Computing (ICIVC 2012) IPCSIT vol. 50 (2012) (2012) IACSIT Press, Singapore DOI: 10.7763/IPCSIT.2012.V50.20 A Deduplication-based Data Archiving System
More informationHow To Create A Multi Disk Raid
Click on the diagram to see RAID 0 in action RAID Level 0 requires a minimum of 2 drives to implement RAID 0 implements a striped disk array, the data is broken down into blocks and each block is written
More informationSSDs and RAID: What s the right strategy. Paul Goodwin VP Product Development Avant Technology
SSDs and RAID: What s the right strategy Paul Goodwin VP Product Development Avant Technology SSDs and RAID: What s the right strategy Flash Overview SSD Overview RAID overview Thoughts about Raid Strategies
More informationNutanix Tech Note. Failure Analysis. 2013 All Rights Reserved, Nutanix Corporation
Nutanix Tech Note Failure Analysis A Failure Analysis of Storage System Architectures Nutanix Scale-out v. Legacy Designs Types of data to be protected Any examination of storage system failure scenarios
More informationINTRODUCTION The collection of data that makes up a computerized database must be stored physically on some computer storage medium.
Chapter 4: Record Storage and Primary File Organization 1 Record Storage and Primary File Organization INTRODUCTION The collection of data that makes up a computerized database must be stored physically
More informationRAID technology and IBM TotalStorage NAS products
IBM TotalStorage Network Attached Storage October 2001 RAID technology and IBM TotalStorage NAS products By Janet Anglin and Chris Durham Storage Networking Architecture, SSG Page No.1 Contents 2 RAID
More informationGrid Computing Approach for Dynamic Load Balancing
International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-1 E-ISSN: 2347-2693 Grid Computing Approach for Dynamic Load Balancing Kapil B. Morey 1*, Sachin B. Jadhav
More informationRAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1
RAID HARDWARE On board SATA RAID controller SATA RAID controller card RAID drive caddy (hot swappable) Anne Watson 1 RAID The word redundant means an unnecessary repetition. The word array means a lineup.
More information