Topics in Computer System Performance and Reliability: Storage Systems!

Save this PDF as:
Size: px
Start display at page:

Download "Topics in Computer System Performance and Reliability: Storage Systems!"

Transcription

1 CSC 2233: Topics in Computer System Performance and Reliability: Storage Systems! Note: some of the slides in today s lecture are borrowed from a course taught by Greg Ganger and Garth Gibson at Carnegie Mellon University 1

2 Who am I? 2

3 What makes storage systems so cool? 1. Combines so many topic areas: hardware meets OS meets networking meets distributed systems meets security meets AI meets HCI 3

4 What makes storage systems so cool? 1. Combines so many topic areas 2. This is where great jobs are! Designers and implementers still needed not just testing J Continuing growth area for the future The Internet is a network, but the web is a storage system Strong existing companies: EMC, NetApp, Core competency for Internet services: Google, Microsoft, Amazon, and still support for start-ups 4

5 What makes storage systems so cool? 1. Combines so many topic areas 2. Great careers 3. Still so much room to contribute: performance actually matters here in fact, it dominates other parts of system performance in many cases and reliability too storage management wide open and, storage starting to take over computation Big data Lots is and will be happening Solid state drives and other technologies? 5

6 Amdahl s Law Speedup limited to fraction improved obvious, but fundamental, observation % reduction in BLUE yields only 45% reduction in total What does this mean for storage systems? 6

7 Technology Trends Normalized value relative to 2000" 100" 10" 1" CPU Performance" Network Bandwidth" Memory Bandwidth" Disk Bandwidth" Network Latency" Disk Latency" 2000" 2002" 2004" 2006" 2008" 2010" Year" 7

8 Consequence: storage performance dominates CPU Time I/O Time CPU Time I/O Time Example of Amdahl s Law 8

9 Storage systems: fun quotes I/O certainly has been lagging in the last decade Seymour Cray, 1976 Also, I/O needs a lot of work David Kuck, 1988 In 3 to 5 years, we will start seeing servers as peripherals to storage SUN Chief Technology Officer, 1998 Scalable I/O is perhaps the most overlooked area of high-performance computing R&D Suggested R&D topic report for

10 Logistics & Administratives Class time: Thu 10am 12pm Office hours: By appointment Class web page 10

11 Grading 30% class participation Participation in class discussions (Read all papers prior to class) Class presentation of research paper 70% class project No exams, no homework, no paper summaries 11

12 Class project Can be done in team of two or alone Start looking for a partner now! On a research project you pick I will suggest possible projects (see course web page) You can propose your own Start thinking about it soon, proposal due in ~3 weeks Output: workshop quality research paper (10-12 pages) Even better: conference quality paper Use latex template on course web page All reports will be published as tech-report 12

13 Class project Output: workshop quality research paper (10-12 pages) I will help you get there --- multiple milestones: Project proposal Related work Status reports Final report And meetings with instructor 13

14 Topic of class project Project topic must be related to the topic of the class Is it OK to have overlap with my research / my course project in another course? You cannot get academic credit for the same piece of work twice 14

15 Paper presentation Each of you will present one or two papers in class Format of the presentation: 30 min presentation of paper 5-15 min paper review Good points Bad points 10 min class discussion that you lead! Prepare questions! 15

16 Paper presentation What I do not want: A long laundry list of all things the paper did What I do want: A lecture style presentation of the paper Including background material your fellow class mates might need to understand the paper A critical discussion of the paper Strength & Weaknesses Prepare questions! 16

17 Purpose of presentation Wrong answers: To give a verbal version of the paper, cramming all its content into 30 min To impress people with your technical depth and thoroughness In fact, no one cares about these things The goal is to filter out the main points of the paper and present them well By the end, everybody in the audience should remember 2-3 takehome messages 17

18 What s on each slide? Each slide should have one basic point There should NOT be tons of text Use sentence fragments Use pictures everywhere you possibly can! A picture says more than 1000 words Saves text and thus slides Much easier to process 18

19 Rest of today: Some review 19

20 What are storage systems all about? Memory/storage hierarchy 20

21 Memory/storage hierarchies Balancing performance with cost Small memories are fast but expensive Large memories are slow but cheap Exploit locality to get the best of both worlds locality = re-use/nearness of accesses allows most accesses to use small, fast memory Capacity Performance 21

22 Example memory hierarchy values Notice the huge access time gap between DRAM and disk Where will SSDs go? 22

23 What are storage systems all about? Memory/storage hierarchy Combining many technologies to balance costs/benefits No longer the focal point of storage system design Still important though Maybe more so with new technologies arriving on the market 23

24 What are storage systems all about? Memory/storage hierarchy Combining many technologies to balance costs/benefits No longer the focal point of storage system design Still important though Maybe more so with new technologies arriving on the market Persistence Storing data for lengthy periods of time To be useful, it must also be possible to find it again later this brings in data organization, consistency, and management issues This is where the serious action is and it does relate to the memory/storage hierarchy 24

25 Why persistence is important Some statistics: Among companies who lose data in a disaster, 50% never re-open and 90% are out of business within two years Even smaller incidents can be costly Reproducing some tens of megabytes of accounting data can take several weeks and cost tens of thousands of dollars Bad PR! 25

26 What is a storage system: Big Picture Application Bob1 Bob2 Bob2 Bob3 Bob4 The storage system Application gives keeps the data objects data objects & their and returns one upon IDs to storage request (by ID) Storage System Bob1 Bob2 Bob3 Bob4 26

27 Storage Systems & Interfaces What is a Storage System? Hardware (devices, controllers, interconnect) and Software (file system, device drivers, firmware) dedicated to providing management of and access to persistent storage. One view: defined by collection of interfaces 27

28 Storage Software Interfaces Program File system Device driver I/O controller Physical Media High level of abstraction No abstraction Understands files and directories HDD understands platters, cylinders, tracks, sectors 28

29 OS sees storage as linear array of blocks OS s view of storage device Common disk block size: 512 bytes Number of blocks: device capacity / block size Common OS-to-storage requests defined by few fields R/W, block #, # of blocks, memory source/dest 29

30 OS sees storage as linear array of blocks OS s view of storage device How does the OS implement the abstraction of files and directories on top of this logical array of disk blocks? 30

31 File System Implementation File systems define a block size (e.g., 4KB) Disk space is allocated in granularity of blocks Notice the terminology clash here: block is used for different things by the file system and the disk interface and this kind of thing is common in storage systems!! Superblock Bitmap Default usage of LBN space Space to store files and directories 31

32 File System Implementation File systems define a block size (e.g., 4KB) Disk space is allocated in granularity of blocks A Master Block determines location of root directory (aka superblock) Always at a well-known disk location Often replicated across disk for reliability A free map determines which blocks are free, allocated Usually a bitmap, one bit per block on the disk Also stored on disk, cached in memory for performance Remaining disk blocks used to store files (and dirs) There are many ways to do this Superblock Bitmap Default usage of LBN space Space to store files and directories 32

33 Disk Layout Strategies Files span multiple blocks How do you allocate the blocks for a file? 1. Contiguous allocation 33

34 Contiguous Allocation Disk directory File Name Start Blk Length File A 2 3 File B 9 5 File C 18 8 File D

35 Disk Layout Strategies Files span multiple disk blocks How do you find all of the blocks for a file? 1. Contiguous allocation Like memory Fast, simplifies directory access Inflexible, causes fragmentation, needs compaction 2. Linked, or chained, structure 35

36 Linked Allocation directory File Name Start Blk File B 1 22 Last Blk

37 Disk Layout Strategies Files span multiple disk blocks How do you find all of the blocks for a file? 1. Contiguous allocation Like memory Fast, simplifies directory access Inflexible, causes fragmentation, needs compaction 2. Linked, or chained, structure Each block points to the next, directory points to the first Good for sequential access, bad for all others 3. Indexed structure (indirection, hierarchy) An index block contains pointers to many other blocks Handles random better, still good for sequential May need multiple index blocks (linked together) 37

38 Indexed Allocation: Unix Inodes Unix inodes implement an indexed structure for files Each file is represented by an inode Each inode contains 15 block pointers First 12 are direct block pointers (e.g., 4 KB data blocks) Then single, double, and triple indirect

39 Unix Inodes and Path Search Unix Inodes are not directories They describe where on the disk the blocks for a file are placed Directories are files, so inodes also describe where the blocks for directories are placed on the disk Directory entries map file names to inodes To open /one, use Master Block to find inode for / on disk and read inode into memory inode allows us to find data block for directory / Read /, look for entry for one This entry locates the inode for one Read the inode for one into memory The inode says where first data block is on disk Read that block into memory to access the data in the file 39

40 Data and Inode Placement Original Unix FS had two placement problems: 1. Data blocks allocated randomly in aging file systems Blocks for the same file allocated sequentially when FS is new As FS ages and fills, need to allocate into blocks freed up when other files are deleted Problem: Deleted files essentially randomly placed So, blocks for new files become scattered across the disk 2. Inodes allocated far from blocks All inodes at beginning of disk, far from data Traversing file name paths, manipulating files, directories requires going back and forth from inodes to data blocks Both of these problems generate many long seeks Superblock Default usage of LBN space Bitmap Inodes CSC369 Data Blocks 40

41 Cylinder Groups BSD Fast File System (FFS) addressed placement problems using the notion of a cylinder group (aka allocation groups in lots of modern FS s) Disk partitioned into groups of cylinders Data blocks in same file allocated in same cylinder group Files in same directory allocated in same cylinder group Inodes for files allocated in same cylinder group as file data blocks Superblock Cylinder Group Cylinder group organization 41

42 More FFS solutions Small blocks (1K) in orig. Unix FS caused 2 problems: Low bandwidth utilization Small max file size (function of block size) => fix using a larger block (4K) Problem: Media failures Replicate master block (superblock) Problem: Device oblivious Parameterize according to device characteristics 42

43 File Buffer Cache Applications exhibit significant locality for reading and writing files Idea: Cache file blocks in memory to capture locality Issues This is called the file buffer cache Cache is system wide, used and shared by all processes Reading from the cache makes a disk perform like memory Even a 4 MB cache can be very effective The file buffer cache competes with VM (tradeoff here) Like VM, it has limited size Need replacement algorithms 43

44 Read Ahead Many file systems implement read ahead FS predicts that the process will request next block FS goes ahead and requests it from the disk This can happen while the process is computing on previous block Overlap I/O with execution When the process requests block, it will be in cache Compliments the on-disk cache, which also is doing read ahead For sequentially accessed files, can be a big win Unless blocks for the file are scattered across the disk File systems try to prevent that, though (during allocation) 44

45 Caching Writes On a write, some applications assume that data makes it through the buffer cache and onto the disk As a result, writes are often slow even with caching Several ways to compensate for this write-behind Maintain a queue of uncommitted blocks Periodically flush the queue to disk Unreliable Battery backed-up RAM (NVRAM) As with write-behind, but maintain queue in NVRAM Expensive Log-structured file system Always write contiguously at end of previous write 45

46 Remainder of the course Other optimizations: Other file system designs: log-structured, journaling Devices: Hard disks & Solid state drives Reliability & fault tolerance Performance modeling Distributed file systems: Google & Netapp Parallel file systems: GPFS & PanFS Storage for data-intensive computing 46

CSE 120 Principles of Operating Systems

CSE 120 Principles of Operating Systems CSE 120 Principles of Operating Systems Fall 2004 Lecture 13: FFS, LFS, RAID Geoffrey M. Voelker Overview We ve looked at disks and file systems generically Now we re going to look at some example file

More information

CS 153 Design of Operating Systems Spring 2015

CS 153 Design of Operating Systems Spring 2015 CS 153 Design of Operating Systems Spring 2015 Lecture 22: File system optimizations Physical Disk Structure Disk components Platters Surfaces Tracks Arm Track Sector Surface Sectors Cylinders Arm Heads

More information

Operating Systems CSE 410, Spring 2004. File Management. Stephen Wagner Michigan State University

Operating Systems CSE 410, Spring 2004. File Management. Stephen Wagner Michigan State University Operating Systems CSE 410, Spring 2004 File Management Stephen Wagner Michigan State University File Management File management system has traditionally been considered part of the operating system. Applications

More information

COS 318: Operating Systems. File Layout and Directories. Topics. File System Components. Steps to Open A File

COS 318: Operating Systems. File Layout and Directories. Topics. File System Components. Steps to Open A File Topics COS 318: Operating Systems File Layout and Directories File system structure Disk allocation and i-nodes Directory and link implementations Physical layout for performance 2 File System Components

More information

File Systems Management and Examples

File Systems Management and Examples File Systems Management and Examples Today! Efficiency, performance, recovery! Examples Next! Distributed systems Disk space management! Once decided to store a file as sequence of blocks What s the size

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

Review. Lecture 21: Reliable, High Performance Storage. Overview. Basic Disk & File System properties CSC 468 / CSC 2204 11/23/2006

Review. Lecture 21: Reliable, High Performance Storage. Overview. Basic Disk & File System properties CSC 468 / CSC 2204 11/23/2006 S 468 / S 2204 Review Lecture 2: Reliable, High Performance Storage S 469HF Fall 2006 ngela emke rown We ve looked at fault tolerance via server replication ontinue operating with up to f failures Recovery

More information

& Data Processing 2. Exercise 2: File Systems. Dipl.-Ing. Bogdan Marin. Universität Duisburg-Essen

& Data Processing 2. Exercise 2: File Systems. Dipl.-Ing. Bogdan Marin. Universität Duisburg-Essen Folie a: Name & Data Processing 2 2: File Systems Dipl.-Ing. Bogdan Marin Fakultät für Ingenieurwissenschaften Abteilung Elektro-und Informationstechnik -Technische Informatik- Objectives File System Concept

More information

File-System Implementation

File-System Implementation File-System Implementation 11 CHAPTER In this chapter we discuss various methods for storing information on secondary storage. The basic issues are device directory, free space management, and space allocation

More information

Chapter 12 File Management

Chapter 12 File Management Operating Systems: Internals and Design Principles Chapter 12 File Management Eighth Edition By William Stallings Files Data collections created by users The File System is one of the most important parts

More information

Two Parts. Filesystem Interface. Filesystem design. Interface the user sees. Implementing the interface

Two Parts. Filesystem Interface. Filesystem design. Interface the user sees. Implementing the interface File Management Two Parts Filesystem Interface Interface the user sees Organization of the files as seen by the user Operations defined on files Properties that can be read/modified Filesystem design Implementing

More information

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System CS341: Operating System Lect 36: 1 st Nov 2014 Dr. A. Sahu Dept of Comp. Sc. & Engg. Indian Institute of Technology Guwahati File System & Device Drive Mass Storage Disk Structure Disk Arm Scheduling RAID

More information

File Management. Chapter 12

File Management. Chapter 12 Chapter 12 File Management File is the basic element of most of the applications, since the input to an application, as well as its output, is usually a file. They also typically outlive the execution

More information

Operating Systems: Internals and Design Principles. Chapter 12 File Management Seventh Edition By William Stallings

Operating Systems: Internals and Design Principles. Chapter 12 File Management Seventh Edition By William Stallings Operating Systems: Internals and Design Principles Chapter 12 File Management Seventh Edition By William Stallings Operating Systems: Internals and Design Principles If there is one singular characteristic

More information

COS 318: Operating Systems

COS 318: Operating Systems COS 318: Operating Systems File Performance and Reliability Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall10/cos318/ Topics File buffer cache

More information

Chapter 12 File Management. Roadmap

Chapter 12 File Management. Roadmap Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 12 File Management Dave Bremer Otago Polytechnic, N.Z. 2008, Prentice Hall Overview Roadmap File organisation and Access

More information

Chapter 12 File Management

Chapter 12 File Management Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 12 File Management Dave Bremer Otago Polytechnic, N.Z. 2008, Prentice Hall Roadmap Overview File organisation and Access

More information

1 File Management. 1.1 Naming. COMP 242 Class Notes Section 6: File Management

1 File Management. 1.1 Naming. COMP 242 Class Notes Section 6: File Management COMP 242 Class Notes Section 6: File Management 1 File Management We shall now examine how an operating system provides file management. We shall define a file to be a collection of permanent data with

More information

Price/performance Modern Memory Hierarchy

Price/performance Modern Memory Hierarchy Lecture 21: Storage Administration Take QUIZ 15 over P&H 6.1-4, 6.8-9 before 11:59pm today Project: Cache Simulator, Due April 29, 2010 NEW OFFICE HOUR TIME: Tuesday 1-2, McKinley Last Time Exam discussion

More information

The Classical Architecture. Storage 1 / 36

The Classical Architecture. Storage 1 / 36 1 / 36 The Problem Application Data? Filesystem Logical Drive Physical Drive 2 / 36 Requirements There are different classes of requirements: Data Independence application is shielded from physical storage

More information

The Linux Virtual Filesystem

The Linux Virtual Filesystem Lecture Overview Linux filesystem Linux virtual filesystem (VFS) overview Common file model Superblock, inode, file, dentry Object-oriented Ext2 filesystem Disk data structures Superblock, block group,

More information

Lecture 17: Virtual Memory II. Goals of virtual memory

Lecture 17: Virtual Memory II. Goals of virtual memory Lecture 17: Virtual Memory II Last Lecture: Introduction to virtual memory Today Review and continue virtual memory discussion Lecture 17 1 Goals of virtual memory Make it appear as if each process has:

More information

CHAPTER 17: File Management

CHAPTER 17: File Management CHAPTER 17: File Management The Architecture of Computer Hardware, Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint slides

More information

File System Management

File System Management Lecture 7: Storage Management File System Management Contents Non volatile memory Tape, HDD, SSD Files & File System Interface Directories & their Organization File System Implementation Disk Space Allocation

More information

ProTrack: A Simple Provenance-tracking Filesystem

ProTrack: A Simple Provenance-tracking Filesystem ProTrack: A Simple Provenance-tracking Filesystem Somak Das Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology das@mit.edu Abstract Provenance describes a file

More information

Storage Management. in a Hybrid SSD/HDD File system

Storage Management. in a Hybrid SSD/HDD File system Project 2 Storage Management Part 2 in a Hybrid SSD/HDD File system Part 1 746, Spring 2011, Greg Ganger and Garth Gibson 1 Project due on April 11 th (11.59 EST) Start early Milestone1: finish part 1

More information

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance. Agenda Enterprise Performance Factors Overall Enterprise Performance Factors Best Practice for generic Enterprise Best Practice for 3-tiers Enterprise Hardware Load Balancer Basic Unix Tuning Performance

More information

Speeding Up Cloud/Server Applications Using Flash Memory

Speeding Up Cloud/Server Applications Using Flash Memory Speeding Up Cloud/Server Applications Using Flash Memory Sudipta Sengupta Microsoft Research, Redmond, WA, USA Contains work that is joint with B. Debnath (Univ. of Minnesota) and J. Li (Microsoft Research,

More information

PIONEER RESEARCH & DEVELOPMENT GROUP

PIONEER RESEARCH & DEVELOPMENT GROUP SURVEY ON RAID Aishwarya Airen 1, Aarsh Pandit 2, Anshul Sogani 3 1,2,3 A.I.T.R, Indore. Abstract RAID stands for Redundant Array of Independent Disk that is a concept which provides an efficient way for

More information

COS 318: Operating Systems. Virtual Memory and Address Translation

COS 318: Operating Systems. Virtual Memory and Address Translation COS 318: Operating Systems Virtual Memory and Address Translation Today s Topics Midterm Results Virtual Memory Virtualization Protection Address Translation Base and bound Segmentation Paging Translation

More information

Outline. Failure Types

Outline. Failure Types Outline Database Management and Tuning Johann Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Unit 11 1 2 Conclusion Acknowledgements: The slides are provided by Nikolaus Augsten

More information

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters COSC 6374 Parallel Computation Parallel I/O (I) I/O basics Spring 2008 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network

More information

FAWN - a Fast Array of Wimpy Nodes

FAWN - a Fast Array of Wimpy Nodes University of Warsaw January 12, 2011 Outline Introduction 1 Introduction 2 3 4 5 Key issues Introduction Growing CPU vs. I/O gap Contemporary systems must serve millions of users Electricity consumed

More information

Outline. Database Management and Tuning. Overview. Hardware Tuning. Johann Gamper. Unit 12

Outline. Database Management and Tuning. Overview. Hardware Tuning. Johann Gamper. Unit 12 Outline Database Management and Tuning Hardware Tuning Johann Gamper 1 Free University of Bozen-Bolzano Faculty of Computer Science IDSE Unit 12 2 3 Conclusion Acknowledgements: The slides are provided

More information

Chapter 11: File System Implementation. Operating System Concepts with Java 8 th Edition

Chapter 11: File System Implementation. Operating System Concepts with Java 8 th Edition Chapter 11: File System Implementation 11.1 Silberschatz, Galvin and Gagne 2009 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation

More information

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters COSC 6374 Parallel I/O (I) I/O basics Fall 2012 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network card 1 Network card

More information

CPS104 Computer Organization and Programming Lecture 18: Input-Output. Robert Wagner

CPS104 Computer Organization and Programming Lecture 18: Input-Output. Robert Wagner CPS104 Computer Organization and Programming Lecture 18: Input-Output Robert Wagner cps 104 I/O.1 RW Fall 2000 Outline of Today s Lecture The I/O system Magnetic Disk Tape Buses DMA cps 104 I/O.2 RW Fall

More information

File Systems for Flash Memories. Marcela Zuluaga Sebastian Isaza Dante Rodriguez

File Systems for Flash Memories. Marcela Zuluaga Sebastian Isaza Dante Rodriguez File Systems for Flash Memories Marcela Zuluaga Sebastian Isaza Dante Rodriguez Outline Introduction to Flash Memories Introduction to File Systems File Systems for Flash Memories YAFFS (Yet Another Flash

More information

Chapter 6, The Operating System Machine Level

Chapter 6, The Operating System Machine Level Chapter 6, The Operating System Machine Level 6.1 Virtual Memory 6.2 Virtual I/O Instructions 6.3 Virtual Instructions For Parallel Processing 6.4 Example Operating Systems 6.5 Summary Virtual Memory General

More information

Chapter 11: File System Implementation. Operating System Concepts 8 th Edition

Chapter 11: File System Implementation. Operating System Concepts 8 th Edition Chapter 11: File System Implementation Operating System Concepts 8 th Edition Silberschatz, Galvin and Gagne 2009 Chapter 11: File System Implementation File-System Structure File-System Implementation

More information

Lecture 18: Reliable Storage

Lecture 18: Reliable Storage CS 422/522 Design & Implementation of Operating Systems Lecture 18: Reliable Storage Zhong Shao Dept. of Computer Science Yale University Acknowledgement: some slides are taken from previous versions of

More information

Summer Student Project Report

Summer Student Project Report Summer Student Project Report Dimitris Kalimeris National and Kapodistrian University of Athens June September 2014 Abstract This report will outline two projects that were done as part of a three months

More information

Data Storage - II: Efficient Usage & Errors

Data Storage - II: Efficient Usage & Errors Data Storage - II: Efficient Usage & Errors Week 10, Spring 2005 Updated by M. Naci Akkøk, 27.02.2004, 03.03.2005 based upon slides by Pål Halvorsen, 12.3.2002. Contains slides from: Hector Garcia-Molina

More information

GraySort on Apache Spark by Databricks

GraySort on Apache Spark by Databricks GraySort on Apache Spark by Databricks Reynold Xin, Parviz Deyhim, Ali Ghodsi, Xiangrui Meng, Matei Zaharia Databricks Inc. Apache Spark Sorting in Spark Overview Sorting Within a Partition Range Partitioner

More information

CS2510 Computer Operating Systems

CS2510 Computer Operating Systems CS2510 Computer Operating Systems HADOOP Distributed File System Dr. Taieb Znati Computer Science Department University of Pittsburgh Outline HDF Design Issues HDFS Application Profile Block Abstraction

More information

CS2510 Computer Operating Systems

CS2510 Computer Operating Systems CS2510 Computer Operating Systems HADOOP Distributed File System Dr. Taieb Znati Computer Science Department University of Pittsburgh Outline HDF Design Issues HDFS Application Profile Block Abstraction

More information

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer Disks and RAID Profs. Bracy and Van Renesse based on slides by Prof. Sirer 50 Years Old! 13th September 1956 The IBM RAMAC 350 Stored less than 5 MByte Reading from a Disk Must specify: cylinder # (distance

More information

Storing Data: Disks and Files. Disks and Files. Why Not Store Everything in Main Memory? Chapter 7

Storing Data: Disks and Files. Disks and Files. Why Not Store Everything in Main Memory? Chapter 7 Storing : Disks and Files Chapter 7 Yea, from the table of my memory I ll wipe away all trivial fond records. -- Shakespeare, Hamlet base Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Disks and

More information

Chapter 2: Computer-System Structures. Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture

Chapter 2: Computer-System Structures. Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture Chapter 2: Computer-System Structures Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture Operating System Concepts 2.1 Computer-System Architecture

More information

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen LECTURE 14: DATA STORAGE AND REPRESENTATION Data Storage Memory Hierarchy Disks Fields, Records, Blocks Variable-length

More information

TELE 301 Lecture 7: Linux/Unix file

TELE 301 Lecture 7: Linux/Unix file Overview Last Lecture Scripting This Lecture Linux/Unix file system Next Lecture System installation Sources Installation and Getting Started Guide Linux System Administrators Guide Chapter 6 in Principles

More information

Memory Hierarchy. Arquitectura de Computadoras. Centro de Investigación n y de Estudios Avanzados del IPN. adiaz@cinvestav.mx. MemoryHierarchy- 1

Memory Hierarchy. Arquitectura de Computadoras. Centro de Investigación n y de Estudios Avanzados del IPN. adiaz@cinvestav.mx. MemoryHierarchy- 1 Hierarchy Arturo Díaz D PérezP Centro de Investigación n y de Estudios Avanzados del IPN adiaz@cinvestav.mx Hierarchy- 1 The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor

More information

Trends in Enterprise Backup Deduplication

Trends in Enterprise Backup Deduplication Trends in Enterprise Backup Deduplication Shankar Balasubramanian Architect, EMC 1 Outline Protection Storage Deduplication Basics CPU-centric Deduplication: SISL (Stream-Informed Segment Layout) Data

More information

RevoScaleR Speed and Scalability

RevoScaleR Speed and Scalability EXECUTIVE WHITE PAPER RevoScaleR Speed and Scalability By Lee Edlefsen Ph.D., Chief Scientist, Revolution Analytics Abstract RevoScaleR, the Big Data predictive analytics library included with Revolution

More information

INCREASING EFFICIENCY WITH EASY AND COMPREHENSIVE STORAGE MANAGEMENT

INCREASING EFFICIENCY WITH EASY AND COMPREHENSIVE STORAGE MANAGEMENT INCREASING EFFICIENCY WITH EASY AND COMPREHENSIVE STORAGE MANAGEMENT UNPRECEDENTED OBSERVABILITY, COST-SAVING PERFORMANCE ACCELERATION, AND SUPERIOR DATA PROTECTION KEY FEATURES Unprecedented observability

More information

Computer Engineering and Systems Group Electrical and Computer Engineering SCMFS: A File System for Storage Class Memory

Computer Engineering and Systems Group Electrical and Computer Engineering SCMFS: A File System for Storage Class Memory SCMFS: A File System for Storage Class Memory Xiaojian Wu, Narasimha Reddy Texas A&M University What is SCM? Storage Class Memory Byte-addressable, like DRAM Non-volatile, persistent storage Example: Phase

More information

Big Fast Data Hadoop acceleration with Flash. June 2013

Big Fast Data Hadoop acceleration with Flash. June 2013 Big Fast Data Hadoop acceleration with Flash June 2013 Agenda The Big Data Problem What is Hadoop Hadoop and Flash The Nytro Solution Test Results The Big Data Problem Big Data Output Facebook Traditional

More information

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29 RAID Redundant Array of Inexpensive (Independent) Disks Use multiple smaller disks (c.f. one large disk) Parallelism improves performance Plus extra disk(s) for redundant data storage Provides fault tolerant

More information

Filing Systems. Filing Systems

Filing Systems. Filing Systems Filing Systems At the outset we identified long-term storage as desirable characteristic of an OS. EG: On-line storage for an MIS. Convenience of not having to re-write programs. Sharing of data in an

More information

Storing Data: Disks and Files

Storing Data: Disks and Files Storing Data: Disks and Files (From Chapter 9 of textbook) Storing and Retrieving Data Database Management Systems need to: Store large volumes of data Store data reliably (so that data is not lost!) Retrieve

More information

Distributed File Systems Part I. Issues in Centralized File Systems

Distributed File Systems Part I. Issues in Centralized File Systems Distributed File Systems Part I Daniel A. Menascé File Naming Issues in Centralized File Systems c:\courses\cs571\procs.ps (MS-DOS) /usr/menasce/courses/cs571/processes.ps (UNIX) File Structure bitstream

More information

Hypertable Architecture Overview

Hypertable Architecture Overview WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for

More information

SAN Conceptual and Design Basics

SAN Conceptual and Design Basics TECHNICAL NOTE VMware Infrastructure 3 SAN Conceptual and Design Basics VMware ESX Server can be used in conjunction with a SAN (storage area network), a specialized high speed network that connects computer

More information

Memory Management Outline. Background Swapping Contiguous Memory Allocation Paging Segmentation Segmented Paging

Memory Management Outline. Background Swapping Contiguous Memory Allocation Paging Segmentation Segmented Paging Memory Management Outline Background Swapping Contiguous Memory Allocation Paging Segmentation Segmented Paging 1 Background Memory is a large array of bytes memory and registers are only storage CPU can

More information

Lecture 16: Storage Devices

Lecture 16: Storage Devices CS 422/522 Design & Implementation of Operating Systems Lecture 16: Storage Devices Zhong Shao Dept. of Computer Science Yale University Acknowledgement: some slides are taken from previous versions of

More information

Part III Storage Management. Chapter 11: File System Implementation

Part III Storage Management. Chapter 11: File System Implementation Part III Storage Management Chapter 11: File System Implementation 1 Layered File System 2 Overview: 1/4 A file system has on-disk and in-memory information. A disk may contain the following for implementing

More information

University of Dublin Trinity College. Storage Hardware. Owen.Conlan@cs.tcd.ie

University of Dublin Trinity College. Storage Hardware. Owen.Conlan@cs.tcd.ie University of Dublin Trinity College Storage Hardware Owen.Conlan@cs.tcd.ie Hardware Issues Hard Disk/SSD CPU Cache Main Memory CD ROM/RW DVD ROM/RW Tapes Primary Storage Floppy Disk/ Memory Stick Secondary

More information

Operating Systems. Design and Implementation. Andrew S. Tanenbaum Melanie Rieback Arno Bakker. Vrije Universiteit Amsterdam

Operating Systems. Design and Implementation. Andrew S. Tanenbaum Melanie Rieback Arno Bakker. Vrije Universiteit Amsterdam Operating Systems Design and Implementation Andrew S. Tanenbaum Melanie Rieback Arno Bakker Vrije Universiteit Amsterdam Operating Systems - Winter 2012 Outline Introduction What is an OS? Concepts Processes

More information

Outline. Operating Systems Design and Implementation. Chap 1 - Overview. What is an OS? 28/10/2014. Introduction

Outline. Operating Systems Design and Implementation. Chap 1 - Overview. What is an OS? 28/10/2014. Introduction Operating Systems Design and Implementation Andrew S. Tanenbaum Melanie Rieback Arno Bakker Outline Introduction What is an OS? Concepts Processes and Threads Memory Management File Systems Vrije Universiteit

More information

DSS. High performance storage pools for LHC. Data & Storage Services. Łukasz Janyst. on behalf of the CERN IT-DSS group

DSS. High performance storage pools for LHC. Data & Storage Services. Łukasz Janyst. on behalf of the CERN IT-DSS group DSS High performance storage pools for LHC Łukasz Janyst on behalf of the CERN IT-DSS group CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Introduction The goal of EOS is to provide a

More information

A Journey through the MongoDB Internals?

A Journey through the MongoDB Internals? #nosql13 A Journey through the MongoDB Internals? Christian Amor Kvalheim Engineering Team lead, MonboDB Inc, @christkv Who Am I Driver Engineering lead at MongoDB Inc Work on the Mongo DB Node.js driver

More information

Prof. Dr. Ing. Axel Hunger Dipl.-Ing. Bogdan Marin. Operation Systems and Computer Networks Betriebssysteme und Computer Netzwerke

Prof. Dr. Ing. Axel Hunger Dipl.-Ing. Bogdan Marin. Operation Systems and Computer Networks Betriebssysteme und Computer Netzwerke Ex 2 File Systems A file is a logical collection of information and a file system is a collection of files, where the latter may also include a variety of other objects that share many of the properties

More information

CS 377: Operating Systems. Outline. A review of what you ve learned, and how it applies to a real operating system. Lecture 25 - Linux Case Study

CS 377: Operating Systems. Outline. A review of what you ve learned, and how it applies to a real operating system. Lecture 25 - Linux Case Study CS 377: Operating Systems Lecture 25 - Linux Case Study Guest Lecturer: Tim Wood Outline Linux History Design Principles System Overview Process Scheduling Memory Management File Systems A review of what

More information

The Design and Implementation of a Log-Structured File System

The Design and Implementation of a Log-Structured File System The Design and Implementation of a Log-Structured File System Mendel Rosenblum and John K. Ousterhout Electrical Engineering and Computer Sciences, Computer Science Division University of California Berkeley,

More information

Flash Performance in Storage Systems. Bill Moore Chief Engineer, Storage Systems Sun Microsystems

Flash Performance in Storage Systems. Bill Moore Chief Engineer, Storage Systems Sun Microsystems Flash Performance in Storage Systems Bill Moore Chief Engineer, Storage Systems Sun Microsystems 1 Disk to CPU Discontinuity Moore s Law is out-stripping disk drive performance (rotational speed) As a

More information

Storage in Database Systems. CMPSCI 445 Fall 2010

Storage in Database Systems. CMPSCI 445 Fall 2010 Storage in Database Systems CMPSCI 445 Fall 2010 1 Storage Topics Architecture and Overview Disks Buffer management Files of records 2 DBMS Architecture Query Parser Query Rewriter Query Optimizer Query

More information

Understanding Flash SSD Performance

Understanding Flash SSD Performance Understanding Flash SSD Performance Douglas Dumitru CTO EasyCo LLC August 16, 2007 DRAFT Flash based Solid State Drives are quickly becoming popular in a wide variety of applications. Most people think

More information

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (http://www.cs.princeton.edu/courses/cos318/)

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (http://www.cs.princeton.edu/courses/cos318/) COS 318: Operating Systems Storage Devices Kai Li Computer Science Department Princeton University (http://www.cs.princeton.edu/courses/cos318/) Today s Topics Magnetic disks Magnetic disk performance

More information

361 Computer Architecture Lecture 14: Cache Memory

361 Computer Architecture Lecture 14: Cache Memory 1 361 Computer Architecture Lecture 14 Memory cache.1 The Motivation for s Memory System Processor DRAM Motivation Large memories (DRAM) are slow Small memories (SRAM) are fast Make the average access

More information

SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011

SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011 SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications Jürgen Primsch, SAP AG July 2011 Why In-Memory? Information at the Speed of Thought Imagine access to business data,

More information

Recommended hardware system configurations for ANSYS users

Recommended hardware system configurations for ANSYS users Recommended hardware system configurations for ANSYS users The purpose of this document is to recommend system configurations that will deliver high performance for ANSYS users across the entire range

More information

The Data Placement Challenge

The Data Placement Challenge The Data Placement Challenge Entire Dataset Applications Active Data Lowest $/IOP Highest throughput Lowest latency 10-20% Right Place Right Cost Right Time 100% 2 2 What s Driving the AST Discussion?

More information

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer. Guest lecturer: David Hovemeyer November 15, 2004 The memory hierarchy Red = Level Access time Capacity Features Registers nanoseconds 100s of bytes fixed Cache nanoseconds 1-2 MB fixed RAM nanoseconds

More information

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems 1 Any modern computer system will incorporate (at least) two levels of storage: primary storage: typical capacity cost per MB $3. typical access time burst transfer rate?? secondary storage: typical capacity

More information

Storage and File Systems. Chester Rebeiro IIT Madras

Storage and File Systems. Chester Rebeiro IIT Madras Storage and File Systems Chester Rebeiro IIT Madras 1 Two views of a file system system calls protection rwx attributes Application View Look & Feel File system Hardware view 2 Magnetic Disks Chester Rebeiro

More information

CS3210: Crash consistency. Taesoo Kim

CS3210: Crash consistency. Taesoo Kim 1 CS3210: Crash consistency Taesoo Kim 2 Administrivia Quiz #2. Lab4-5, Ch 3-6 (read "xv6 book") Open laptop/book, no Internet 3:05pm ~ 4:25-30pm (sharp) NOTE Lab6: 10% bonus, a single lab (bump up your

More information

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University RAMCloud and the Low- Latency Datacenter John Ousterhout Stanford University Most important driver for innovation in computer systems: Rise of the datacenter Phase 1: large scale Phase 2: low latency Introduction

More information

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller In-Memory Databases Algorithms and Data Structures on Modern Hardware Martin Faust David Schwalb Jens Krüger Jürgen Müller The Free Lunch Is Over 2 Number of transistors per CPU increases Clock frequency

More information

HP Smart Array Controllers and basic RAID performance factors

HP Smart Array Controllers and basic RAID performance factors Technical white paper HP Smart Array Controllers and basic RAID performance factors Technology brief Table of contents Abstract 2 Benefits of drive arrays 2 Factors that affect performance 2 HP Smart Array

More information

CSE-E5430 Scalable Cloud Computing P Lecture 5

CSE-E5430 Scalable Cloud Computing P Lecture 5 CSE-E5430 Scalable Cloud Computing P Lecture 5 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 12.10-2015 1/34 Fault Tolerance Strategies for Storage

More information

Disk Storage & Dependability

Disk Storage & Dependability Disk Storage & Dependability Computer Organization Architectures for Embedded Computing Wednesday 19 November 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition,

More information

Linux Kernel Architecture

Linux Kernel Architecture Linux Kernel Architecture Amir Hossein Payberah payberah@yahoo.com Contents What is Kernel? Kernel Architecture Overview User Space Kernel Space Kernel Functional Overview File System Process Management

More information

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral Click here to print this article. Re-Printed From SLCentral RAID: An In-Depth Guide To RAID Technology Author: Tom Solinap Date Posted: January 24th, 2001 URL: http://www.slcentral.com/articles/01/1/raid

More information

Google File System. Web and scalability

Google File System. Web and scalability Google File System Web and scalability The web: - How big is the Web right now? No one knows. - Number of pages that are crawled: o 100,000 pages in 1994 o 8 million pages in 2005 - Crawlable pages might

More information

Operating Systems. RAID Redundant Array of Independent Disks. Submitted by Ankur Niyogi 2003EE20367

Operating Systems. RAID Redundant Array of Independent Disks. Submitted by Ankur Niyogi 2003EE20367 Operating Systems RAID Redundant Array of Independent Disks Submitted by Ankur Niyogi 2003EE20367 YOUR DATA IS LOST@#!! Do we have backups of all our data???? - The stuff we cannot afford to lose?? How

More information

Distributed File Systems

Distributed File Systems Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.

More information

DISK DEFRAG Professional

DISK DEFRAG Professional auslogics DISK DEFRAG Professional Help Manual www.auslogics.com / Contents Introduction... 5 Installing the Program... 7 System Requirements... 7 Installation... 7 Registering the Program... 9 Uninstalling

More information

Accelerating Server Storage Performance on Lenovo ThinkServer

Accelerating Server Storage Performance on Lenovo ThinkServer Accelerating Server Storage Performance on Lenovo ThinkServer Lenovo Enterprise Product Group April 214 Copyright Lenovo 214 LENOVO PROVIDES THIS PUBLICATION AS IS WITHOUT WARRANTY OF ANY KIND, EITHER

More information

High Performance Tier Implementation Guideline

High Performance Tier Implementation Guideline High Performance Tier Implementation Guideline A Dell Technical White Paper PowerVault MD32 and MD32i Storage Arrays THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS

More information