Lecture 3: I/O: RAID, I/O Benchmarks, UNIX File System Performance
Professor David A. Patterson
Computer Science 252, Fall 1996
DAP.F96
Summary: A Little Queuing Theory

Queue -> Server (Proc -> IOC -> Device)

Queuing models assume a state of equilibrium: input rate = output rate.

Notation:
- r: average number of arriving customers/second
- Tser: average time to service a customer (traditionally µ = 1/Tser)
- u: server utilization (0..1): u = r x Tser
- Tq: average time/customer in queue
- Tsys: average time/customer in system: Tsys = Tq + Tser
- Lq: average length of queue: Lq = r x Tq
- Lsys: average length of system: Lsys = r x Tsys

Little's Law: Length_system = rate x Time_system
(mean number of customers = arrival rate x mean time in system)
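The formulas above can be exercised with some illustrative (hypothetical) numbers: a disk server with arrival rate r = 40 requests/sec, measured service time Tser = 20 ms, and measured queue time Tq = 10 ms.

```python
# Hypothetical workload: r = 40 req/sec, Tser = 20 ms, Tq = 10 ms.
r = 40.0        # arriving customers/second
t_ser = 0.020   # average service time (seconds)
t_q = 0.010     # average time in queue (seconds)

u = r * t_ser           # utilization: 0.8, safely below 1 (equilibrium holds)
t_sys = t_q + t_ser     # time in system = queue time + service time
l_q = r * t_q           # Little's Law applied to the queue: 0.4 customers
l_sys = r * t_sys       # Little's Law applied to the whole system: 1.2

print(u, round(t_sys, 3), l_q, round(l_sys, 2))
```

Note that utilization follows directly from arrival rate and service time; only Tq requires a queuing model (or a measurement) to obtain.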
Summary: Processor Interface Issues

- Processor interface: interrupts, memory-mapped I/O
- I/O control structures: polling, interrupts, DMA, I/O controllers, I/O processors
Summary: Relationship to Processor Architecture

- I/O instructions have disappeared
- Interrupt vectors have been replaced by jump tables
- Interrupt stack replaced by shadow registers
- Interrupt types reduced in number
- Caches required for processor performance cause problems for I/O
- Virtual memory frustrates DMA
- Load/store architecture at odds with atomic operations
- Stateful processors hard to context switch
Network Attached Storage

- Decreasing disk diameters: 14" » 10" » 8" » 5.25" » 3.5" » 2.5" » 1.8" » 1.3" » ...
  High-bandwidth disk systems based on arrays of disks
- Increasing network bandwidth: 3 Mb/s » 10 Mb/s » 50 Mb/s » 100 Mb/s » 1 Gb/s » 10 Gb/s
  Networks capable of sustaining high-bandwidth transfers
- Network provides well-defined physical and logical interfaces: separate the CPU and the storage system!
- High-performance storage service on a high-speed network: network file services, OS structures supporting remote file access
Manufacturing Advantages of Disk Arrays

- Conventional disk product families: 4 disk designs (3.5", 5.25", 10", 14"), spanning low end to high end
- Disk array: 1 disk design (3.5")
Replace a Small Number of Large Disks with a Large Number of Small Disks!

                 IBM 3390 (K)   IBM 3.5" 0061   x70
  Data Capacity  20 GBytes      320 MBytes      23 GBytes
  Volume         97 cu. ft.     0.1 cu. ft.     11 cu. ft.
  Power          3 KW           11 W            1 KW
  Data Rate      15 MB/s        1.5 MB/s        120 MB/s
  I/O Rate       600 I/Os/s     55 I/Os/s       3900 I/Os/s
  MTTF           250 KHrs       50 KHrs         ??? Hrs
  Cost           $250K          $2K             $150K

Disk arrays have the potential for large data and I/O rates, high MB per cu. ft., and high MB per KW, but awful reliability.
Redundant Arrays of Disks

- Files are "striped" across multiple spindles
- Redundancy yields high data availability: disks will fail, but contents are reconstructed from data redundantly stored in the array
  - Capacity penalty to store the redundancy
  - Bandwidth penalty to update it
- Techniques:
  - Mirroring/shadowing (high capacity cost)
  - Horizontal Hamming codes (overkill)
  - Parity & Reed-Solomon codes
  - Failure prediction (no capacity overhead!): VaxSimPlus; technique is controversial
Array Reliability

- Reliability of N disks = reliability of 1 disk / N
  50,000 hours / 70 disks = 700 hours
- Disk system MTTF drops from 6 years to 1 month!
- Arrays without redundancy are too unreliable to be useful!
- Hot spares support reconstruction in parallel with access: very high media availability can be achieved
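The MTTF arithmetic above, assuming independent disk failures, is a one-liner:

```python
# MTTF of an array with no redundancy, assuming independent failures:
# N disks fail N times as often as one disk.
mttf_disk = 50_000           # hours per disk (about 6 years)
n_disks = 70
mttf_array = mttf_disk / n_disks
print(round(mttf_array))     # roughly 700 hours, i.e. about one month
```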
Redundant Arrays of Disks (RAID) Techniques

- Disk mirroring/shadowing: each disk is fully duplicated onto its "shadow"; logical write = two physical writes; 100% capacity overhead
- Parity data bandwidth array: parity computed horizontally; logically a single high-data-bandwidth disk
- High I/O rate parity array: interleaved parity blocks; independent reads and writes; logical write = 2 reads + 2 writes
- Parity + Reed-Solomon codes
Problems of Disk Arrays: Small Writes

RAID-5 small-write algorithm: 1 logical write = 2 physical reads + 2 physical writes

- New data D0' arrives for a stripe holding D0 D1 D2 D3 P
- 1. Read old data D0; 2. Read old parity P
- XOR: P' = D0 xor D0' xor P
- 3. Write new data D0'; 4. Write new parity P'
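The parity update itself is a byte-wise XOR, which is why only the target disk and the parity disk need to be touched. A minimal sketch (function and variable names are illustrative):

```python
# RAID-5 small write: new_parity = old_data XOR new_data XOR old_parity.
def raid5_small_write(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Return the new parity block; the two reads were done by the caller."""
    return bytes(od ^ nd ^ op for od, nd, op in zip(old_data, new_data, old_parity))

# Consistency check: parity recomputed from scratch matches the shortcut.
d = [b'\x0f', b'\xf0', b'\x55', b'\xaa']                 # four data blocks
parity = bytes(a ^ b ^ c ^ e for a, b, c, e in zip(*d))  # initial parity
new_d0 = b'\x3c'
new_parity = raid5_small_write(d[0], new_d0, parity)
full_parity = bytes(a ^ b ^ c ^ e for a, b, c, e in zip(new_d0, d[1], d[2], d[3]))
assert new_parity == full_parity
```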
Redundant Arrays of Disks
RAID 1: Disk Mirroring/Shadowing

- Each disk is fully duplicated onto its "shadow" (recovery group)
- Very high availability can be achieved
- Bandwidth sacrifice on write: logical write = two physical writes
- Reads may be optimized
- Most expensive solution: 100% capacity overhead
- Targeted for high-I/O-rate, high-availability environments
Redundant Arrays of Disks
RAID 3: Parity Disk

- A logical record is striped across the data disks as physical records, plus a parity disk P; parity is computed across the recovery group to protect against hard disk failures
- 33% capacity cost for parity in this configuration; wider arrays reduce capacity costs but decrease expected availability and increase reconstruction time
- Arms logically synchronized, spindles rotationally synchronized: logically a single high-capacity, high-transfer-rate disk
- Targeted for high-bandwidth applications: scientific computing, image processing
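With one parity disk per group, any single failed disk's contents are the XOR of the parity block and the surviving data blocks. A small sketch of the reconstruction step (names are illustrative):

```python
# Rebuild one lost block from the survivors plus parity, via XOR.
def reconstruct(surviving_blocks, parity_block):
    out = bytearray(parity_block)
    for blk in surviving_blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

data = [b'AB', b'CD', b'EF']                         # a 3-disk recovery group
parity = bytes(a ^ b ^ c for a, b, c in zip(*data))  # parity disk contents
# Lose data[1]; rebuild it from the other two disks plus parity.
assert reconstruct([data[0], data[2]], parity) == b'CD'
```

This is also why wider groups lengthen reconstruction: every surviving disk must be read in full to rebuild the failed one.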
Redundant Arrays of Disks
RAID 5+: High I/O Rate Parity

- A logical write becomes four physical I/Os
- Independent writes possible because of interleaved parity
- Reed-Solomon codes ("Q") for protection during reconstruction
- Targeted for mixed applications

Interleaved parity layout (disks are columns, one stripe unit per disk per stripe, logical disk addresses increasing left to right, top to bottom):

  D0   D1   D2   D3   P0
  D4   D5   D6   P1   D7
  D8   D9   P2   D10  D11
  D12  P3   D13  D14  D15
  P4   D16  D17  D18  D19
  D20  D21  D22  D23  P5
  ...
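The rotated parity placement can be captured in a few lines. This is a sketch for a 5-disk array with the parity disk rotating one position per stripe (the function name and the particular rotation convention are assumptions for illustration):

```python
# Map a logical block number to (disk, stripe) under rotated parity.
def block_location(logical_block: int, n_disks: int = 5):
    data_per_stripe = n_disks - 1
    stripe = logical_block // data_per_stripe
    parity_disk = (n_disks - 1 - stripe) % n_disks  # parity rotates each stripe
    disk = logical_block % data_per_stripe
    if disk >= parity_disk:
        disk += 1          # skip over the parity disk in this stripe
    return disk, stripe

# Stripe 0 puts parity on the last disk; stripe 1 shifts it one to the left.
assert block_location(0) == (0, 0)    # D0 on disk 0
assert block_location(7) == (4, 1)    # D7 lands after P1
```

Because parity for different stripes lives on different disks, two small writes to different stripes can often proceed in parallel, which is the point of the interleaving.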
CS 252 Administrivia

- Homework on Chapter 6 due Monday/2 at 5 PM in the 252 box, done in pairs: Exercises 6.5, 6.6, 6.7
- Project Survey #2 due Wednesday /23 in class
RA90 vs. IBM Small Disks

  RA90:            Ave. seek 18.5 ms; rotation 16.7 ms; transfer rate 2.8 MB/s; capacity 1200 MB
  IBM small disks: Ave. seek 12.5 ms; rotation 14 ms; transfer rate 2.4 MB/s; capacity 310 MB

Normal operating range of most existing systems: comparing a 4+2 array group against mirrored RA90's, across workloads from all writes to all reads.
Subsystem Organization

host -> host adapter -> array controller -> single-board disk controllers -> devices

- Array controller manages the interface to the host, DMA control, buffering, parity logic, and physical device control
- Striping software off-loaded from host to array controller: no application modifications, no reduction of host performance
- Single-board disk controllers are often piggy-backed in small-format devices
System Availability: Orthogonal RAIDs

- An array controller feeds multiple string controllers; the data recovery group (the unit of data redundancy) is formed orthogonally across the strings
- Redundant support components: fans, power supplies, controller, cables
- End-to-end data integrity: internal parity-protected data paths
System-Level Availability

- Fully dual redundant: duplicated hosts, I/O controllers, and array controllers, with recovery groups spanning the duplicates
- Goal: no single points of failure
- With duplicated paths, higher performance can be obtained when there are no failures
Review: Storage System Issues

- Historical context of storage I/O
- Storage I/O performance measures
- Secondary and tertiary storage devices
- A little queuing theory
- Processor interface issues
- I/O & memory buses
- RAID
- ABCs of UNIX file systems
- I/O benchmarks
- Comparing UNIX file system performance
- Tertiary storage possibilities
ABCs of UNIX File Systems

Key issues: file vs. raw I/O, file cache size policy, write policy, local disk vs. server disk

- File vs. raw:
  - File system access is the norm: standard policies apply
  - Raw: an alternate I/O system that bypasses the file system; used by databases
- File cache size policy:
  - % of main memory dedicated to the file cache is fixed at system generation (e.g., 10%), or
  - % of main memory for the file cache varies depending on the amount of file I/O (e.g., up to 80%)
ABCs of UNIX File Systems

Write policy: file storage should be permanent; either write immediately or flush the file cache after a fixed period (e.g., 30 seconds)

- Write through with write buffer
- Write back
- Write buffer is often confused with write back:
  - With write through with write buffer, all writes go to disk
  - With write through with write buffer, writes are asynchronous, so the processor doesn't have to wait for the disk write
  - Write back will combine multiple writes to the same page; hence it can be called write cancelling
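The distinction can be made concrete by counting disk writes for the same write stream under each policy. This is a toy sketch (the class and policy names are illustrative, not any real kernel's interface):

```python
class FileCache:
    """Count disk writes under each write policy for the same stream."""
    def __init__(self, policy):
        self.policy = policy       # "through", "through_buffer", or "back"
        self.queue = []            # pending async writes (write buffer)
        self.dirty = {}            # dirty pages: later writes to a page combine
        self.disk_writes = 0

    def write(self, page, data):
        if self.policy == "through":
            self.disk_writes += 1             # processor waits for the disk
        elif self.policy == "through_buffer":
            self.queue.append((page, data))   # async, but one disk write each
        else:                                 # write back
            self.dirty[page] = data           # write cancelling

    def flush(self):                          # e.g. the 30-second flush
        self.disk_writes += len(self.queue) + len(self.dirty)
        self.queue.clear()
        self.dirty.clear()

# Ten writes to the same page:
for policy in ("through", "through_buffer", "back"):
    c = FileCache(policy)
    for _ in range(10):
        c.write(0, b"x")
    c.flush()
    print(policy, c.disk_writes)   # through: 10, through_buffer: 10, back: 1
```

The buffer changes *when* the processor waits, not *how many* disk writes happen; only write back reduces the write count.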
ABCs of UNIX File Systems

Local vs. server: UNIX file systems have historically had different policies (and even different file systems) for local client vs. remote server

- NFS local disk allows a 30-second delay before flushing writes
- NFS server disk writes through to disk on file close
- Cache coherency problem if clients are allowed to have file caches in addition to the server file cache
  - NFS just writes through on file close; stateless protocol: periodically get new copies of file blocks
  - Other file systems use cache coherency with write back, checking state and selectively invalidating or updating
Network File Systems

- Client side: application program -> UNIX system call layer -> virtual file system interface -> either the UNIX file system (local accesses, via the block device driver) or the NFS client (remote accesses, via the network protocol stack)
- Server side: RPC/transmission protocols -> NFS file system server routines -> virtual file system interface -> UNIX file system
Typical File Server Architecture

- Single-processor file server: NFS requests arrive over Ethernet and pass through the Ethernet driver, TCP/IP protocols, NFS protocol & file processing, the Unix file system, and the disk manager & driver, all sharing one backplane bus, primary memory, and disk controller
- Limits to performance: data copying
  - Read data staged from device to primary memory
  - Copied again into network packet templates
  - Copied yet again to the network interface
- No specialization for fast processing between network and disk
AUSPEX NS5000 File Server

- Special hardware/software architecture for high-performance NFS I/O
- Functional multiprocessing over an enhanced VME backplane:
  - Ethernet processors specialized for protocol processing
  - Independent file processor: dedicated software manages the file system
  - Storage processor managing parallel SCSI channels
  - Host processor: a single-board computer running UNIX as a front end
  - Primary memory serves as I/O buffers
AUSPEX Software Architecture

- Host processor: Unix system call layer, VFS interface, NFS client, LFS client
- Ethernet processor: network interface, NFS server protocols, LFS client
- File processor: LFS server, file system server
- Storage processor: disk arrays, buffered through primary memory
- Primary data flow bypasses the host; the host sees only limited control interfaces (primary control flow)
Berkeley RAID-II Disk Array File Server

- FDDI network connection to UltraNet; HiPPI source/destination boards (TMC HiPPIS, TMC HiPPID) on the TMC IOP bus
- X-Bus board: 8-port interleaved memory (128 MByte), 8 x 8 x 32-bit crossbar, XOR engine, and VME interfaces; file server attached via the VME control bus
- ATC boards with 5 SCSI channels each connect to the disk drives
- Low-latency transfers mixed with high-bandwidth transfers; "diskless supercomputers"
5-Minute Class Break

Lecture format:
- 1 minute: review last time & motivate this lecture
- 20-minute lecture
- 3 minutes: discuss class management
- 25 minutes: lecture
- 5 minutes: break
- 25 minutes: lecture
- 1 minute: summary of today's important topics
I/O Benchmarks

- For better or worse, benchmarks shape a field
- Processor benchmarks classically aimed at response time for a fixed-size problem
- I/O benchmarks typically measure throughput, possibly with an upper limit on response times (or on 90% of response times)
- What if we fix the problem size, given a 60%/year increase in DRAM capacity?

  Benchmark   Size of Data   % Time I/O   Year
  I/OStones   1 MB           26%          1990
  Andrew      4.5 MB         4%           1988

- Not much time in I/O; not measuring disk (or even main memory)
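The problem with a fixed data size compounds quickly. A back-of-envelope sketch (the starting memory size is an illustrative assumption): with DRAM capacity growing 60%/year, a few MB of benchmark data soon fits entirely in main memory, and the "I/O" benchmark never touches disk.

```python
import math

# Years until a fixed-size benchmark's data fits in DRAM, at 60%/year
# capacity growth. Illustrative numbers: 4.5 MB of data (Andrew), 1 MB
# of memory available for it in year 0.
data_mb = 4.5
mem_mb = 1.0
growth = 1.60
years = math.ceil(math.log(data_mb / mem_mb) / math.log(growth))
print(years)   # a handful of years, and the benchmark is memory-resident
```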
I/O Benchmarks

- Alternative: self-scaling benchmarks that automatically and dynamically increase aspects of the workload to match the characteristics of the system being measured
- Measures a wide range of current & future systems
- Three self-scaling benchmarks described here:
  - Transaction processing: TPC-A, TPC-B, TPC-C
  - NFS: SPEC SFS (LADDIS)
  - Unix I/O: Willy
I/O Benchmarks: Transaction Processing

- Transaction processing (TP, or on-line TP = OLTP): changes to a large body of shared information from many terminals, with the TP system guaranteeing proper behavior on a failure
- If a bank's computer fails when a customer withdraws money, the TP system guarantees that the account is debited if the customer received the money, and that the account is unchanged if the money was not received
- Airline reservation systems & banks use TP; atomic transactions make this work
- Each transaction => 2 to 10 disk I/Os, and 5,000 to 20,000 CPU instructions per disk I/O
  - Depends on efficiency of the TP software & on avoiding disk accesses by keeping information in main memory
- Classic metric is Transactions Per Second (TPS)
  - Under what workload? How is the machine configured?
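Those per-transaction costs make CPU sizing a quick calculation. A sketch (the helper name and the 100-TPS target are illustrative):

```python
# CPU demand for OLTP from the per-transaction costs above:
# 2-10 disk I/Os per transaction, 5,000-20,000 instructions per disk I/O.
def mips_needed(tps, ios_per_txn, instrs_per_io):
    """Millions of instructions per second consumed by transaction work."""
    return tps * ios_per_txn * instrs_per_io / 1e6

# 100 TPS at the pessimistic end of both ranges:
print(mips_needed(100, 10, 20_000))   # 20 MIPS, before any other work
```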
I/O Benchmarks: Transaction Processing

- Early 1980s: great interest in OLTP
  - Expecting demand for high TPS (e.g., ATM machines, credit cards)
  - Tandem's success implied the medium-range OLTP market would expand
- Each vendor picked its own conditions for TPS claims, reporting only CPU times with widely different I/O
- Conflicting claims led to disbelief of all benchmarks => chaos
- 1984: Jim Gray of Tandem distributed a paper to Tandem employees and people in other industries to propose a standard benchmark
- Published "A measure of transaction processing power," Datamation, 1985, by Anonymous et al.
  - To indicate that this was the effort of a large group
  - To avoid delays from the legal department of each author's firm
  - Still get mail at Tandem addressed to the author
I/O Benchmarks: TP by Anon et al.

- Proposed 3 standard tests to characterize commercial OLTP:
  - TP1: OLTP test, DebitCredit, simulates ATMs
  - Batch sort
  - Batch scan
- DebitCredit: one type of transaction, 100 bytes each
  - Recorded in 3 places: account file, branch file, teller file; plus events recorded in a history file (90 days)
  - 15% of requests are for different branches
- Under what conditions? How are results reported?
I/O Benchmarks: TP by Anon et al.

- DebitCredit scalability: size of account, branch, teller, and history files is a function of throughput

    TPS      Number of ATMs   Account-file size
    10       1,000            0.1 GB
    100      10,000           1.0 GB
    1,000    100,000          10.0 GB
    10,000   1,000,000        100.0 GB

  - Each input TPS => 100,000 account records, 10 branches, 100 ATMs
  - Accounts must grow, since a person is not likely to use the bank more frequently just because the bank has a faster computer!
- Response time: 95% of transactions take < 1 second
- Configuration control: just report price (initial purchase price + 5-year maintenance = cost of ownership)
- By publishing, the benchmark is in the public domain
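The scaling rule is linear in claimed throughput, so the required database size falls straight out of the TPS claim. A sketch using the commonly cited per-TPS constants (100,000 account records, 10 branches, 100 ATMs per TPS; the function name is illustrative):

```python
# DebitCredit scaling: database size grows linearly with claimed TPS.
def debit_credit_scale(tps):
    return {
        "accounts": 100_000 * tps,   # 100-byte records => ~10 MB per TPS
        "branches": 10 * tps,
        "atms": 100 * tps,
    }

cfg = debit_credit_scale(100)
print(cfg["accounts"], cfg["atms"])   # a 100-TPS claim needs 10M accounts
```

Note this reproduces the table: 10 TPS => 1,000 ATMs and 1,000,000 hundred-byte account records, i.e. 0.1 GB.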
I/O Benchmarks: TP by Anon et al.

- Problems:
  - Often ignored the user network to terminals
  - Used a transaction generator with no think time; made sense for database vendors, but not what a customer would see
- Solution: hire an auditor to certify results
  - Auditors soon saw many variations of ways to trick the system
- Proposed a minimum compliance list (13 pages); still, DEC tried an IBM test on a different machine with poorer results than claimed by the auditor
- Created the Transaction Processing Performance Council in 1988; founders were CDC, DEC, ICL, Pyramid, Stratus, Sybase, Tandem, and Wang
- Led to TPC standard benchmarks in 1990
I/O Benchmarks: TPC Benchmarks

- TPC-A: revised version of TP1/DebitCredit
  - Arrivals: random (TPC) vs. uniform (TP1)
  - Terminals: smart vs. dumb (affects instruction path length)
  - ATM scaling: 10 terminals per TPS vs. 100
  - Branch scaling: 1 branch record per TPS vs. 10
  - Response time constraint: 90% < 2 seconds vs. 95% < 1 second
  - Full disclosure, approved by TPC
  - Complete TPS vs. response-time plots vs. a single point
- TPC-B: same as TPC-A but without terminals: batch processing of requests
  - Response time makes no sense; plots tps vs. residence time (the time a transaction resides in the system)
- Other efforts underway on complex query processing (C) and decision support (D)
TPC Results

TPC-A:
  Machine          tpsA-local   K$/tps   OS/DB            Date
  HP 852S          43           24       HPUX 7/Infmx 4   2/9
  VAX 4            4            23       VMS 5.4/Dec 6    7/9
  IBM RS6000/550   32           2        AIX 3.1/Infmx 4  /9
  Compaq SysPro    72           5        ??               /93
  SPARCserver 4    8            7        ??               /93
  HP 9 89/4        7            8        ??               /93

TPC-B:
  Machine          tpsB         K$/tps   OS/DB            Date
  HP 852S          9            5        HPUX 7/Infmx 4   2/9
  IBM RS6000/550   58           5        AIX 3.1/Infmx 4  /9
  Sun SS 49        57           8        Sun4.1/Sybase 4  /9
  Sun SS 2         52           4        Sun4.1/Sybase 4  /9
SPEC SFS/LADDIS

- Predecessor: NFSStones, a synthetic benchmark that generates a series of NFS requests from a single client to test the server; mix of reads, writes, & commands, and file sizes taken from other studies
- Problems:
  - A single client could not always stress the server
  - Files and block sizes not realistic
  - Clients had to run SunOS
SPEC SFS/LADDIS

- 1993 attempt by NFS companies to agree on a standard benchmark: Legato, Auspex, Data General, DEC, Interphase, Sun
- Like NFSStones, but:
  - Run on multiple clients & networks (to prevent bottlenecks)
  - Same caching policy in all clients
  - Reads: 85% full blocks & 15% partial blocks
  - Writes: 50% full blocks & 50% partial blocks
  - Average response time: 50 ms
  - Scaling: for every 100 NFS ops/sec, increase capacity by 1 GB
- Results: plot of server load (throughput) vs. response time & number of users
  - Assumes: 1 user => 10 NFS ops/sec
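The two scaling assumptions turn a target throughput directly into a test configuration. A sketch using the per-throughput constants above (1 GB per 100 NFS ops/sec, 1 user per 10 NFS ops/sec; the function name is illustrative):

```python
# LADDIS sizing: disk capacity and simulated users scale with throughput.
def sfs_config(nfs_ops_per_sec):
    return {
        "capacity_gb": nfs_ops_per_sec / 100,   # 1 GB per 100 ops/sec
        "users": nfs_ops_per_sec / 10,          # 1 user per 10 ops/sec
    }

print(sfs_config(1000))   # 1000 ops/sec => 10 GB of data, 100 users
```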
Example SPEC SFS Result: DEC Alpha

- 200 MHz 21064: 8 KB I-cache + 8 KB D-cache + 2 MB L2; 512 MB memory; GIGAswitch
- DEC OSF/1 v2.0
- 4 FDDI networks; 32 NFS daemons, 24 GB file size
- 88 disks, 16 controllers, 84 file systems
- Plot of average NFS response time vs. NFS throughput (nfs ops/sec): response time stays low until throughput nears the measured peak of 4870 nfs ops/sec
Willy

- UNIX file system benchmark that gives insight into I/O system behavior (Chen and Patterson, 1993)
- Self-scaling to automatically explore system size
- Examines five parameters:
  - Unique bytes touched: data size; locality via LRU; gives file cache size
  - Percentage of reads: % writes = 100% - % reads; typically 50%; 100% reads gives peak throughput
  - Average I/O request size: Bernoulli distribution, C = 1
  - Percentage of sequential requests: typically 50%
  - Number of processes: concurrency of workload (number of processes issuing I/O requests)
- Fix four parameters while varying one
- Searches the space to find high throughput
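The "fix four, vary one" exploration can be sketched in a few lines. This is a stand-in, not Willy's actual implementation: the parameter names come from the slide, but the sweep range and the measurement function are illustrative.

```python
PARAMS = ["unique_bytes", "pct_reads", "avg_request_size",
          "pct_sequential", "n_processes"]

def self_scale(measure, base):
    """For each parameter, sweep a range of values while the other four
    stay fixed at their base values; return the per-parameter curves."""
    curves = {}
    for p in PARAMS:
        curve = []
        for scale in (0.25, 0.5, 1.0, 2.0, 4.0):
            workload = dict(base)
            workload[p] = base[p] * scale
            curve.append((workload[p], measure(workload)))
        curves[p] = curve
    return curves

# Stand-in "system": throughput collapses once unique bytes exceed a 16 MB cache.
def fake_measure(w):
    return 8.0 if w["unique_bytes"] <= 16e6 else 1.0

base = {"unique_bytes": 16e6, "pct_reads": 50, "avg_request_size": 32768,
        "pct_sequential": 50, "n_processes": 1}
curves = self_scale(fake_measure, base)
assert curves["unique_bytes"][-1][1] == 1.0   # 4x the data: cache overflows
```

The knee in the unique-bytes curve is exactly how the benchmark "gives file cache size" without being told it in advance.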
Example Willy Results: DECstation 5000

                               Sprite       Ultrix
  Avg. access size             32 KB        3 KB
  Data touched (file cache)    2 MB, 5 MB   2 MB
  Data touched (disk)          36 MB        6 MB

- % reads = 50%, % sequential = 50%
- DECstation 5000, 32 MB memory
- Ultrix: fixed file cache size, write through
- Sprite: dynamic file cache size, write back (write cancelling)
Sprite's Log-Structured File System

- Large file caches are effective in reducing disk reads, so disk traffic is likely to be dominated by writes
- Write-optimized file system: the only representation on disk is the log
  - Stream out files, directories, and maps without seeks
- Advantages: speed; stripes easily across several disks; fast recovery; temporal locality; versioning
- Problems: random-access retrieval; log wrap; disk space utilization
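The core idea, every write appends to a single on-disk log while an in-memory map tracks each block's newest copy, fits in a small sketch (class and method names are illustrative):

```python
# Minimal LFS sketch: append-only log plus a map to the latest copies.
class Log:
    def __init__(self):
        self.segments = []      # the on-disk log, append-only (no seeks)
        self.block_map = {}     # (file, block) -> log offset of newest copy

    def write(self, file_id, block_no, data):
        self.block_map[(file_id, block_no)] = len(self.segments)
        self.segments.append(data)          # sequential write at the log tail

    def read(self, file_id, block_no):
        return self.segments[self.block_map[(file_id, block_no)]]

log = Log()
log.write("f", 0, b"old")
log.write("f", 0, b"new")       # versioning: the old copy stays in the log
assert log.read("f", 0) == b"new"
```

The slide's "problems" are visible here too: the superseded b"old" still occupies log space until a cleaner reclaims it (disk space utilization, log wrap), and reads must chase the map rather than a fixed block address (random-access retrieval).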
Willy: DECstation 5000, Number of Bytes Touched

- Plot of throughput (MB/sec) vs. number of MB touched, for Ultrix and Sprite
- Log-structured file system: the effective write cache of LFS is much smaller (5-8 MB) than the read cache (2 MB)
- Reads are cached while writes are not => 3 plateaus
Summary: I/O Benchmarks

- Scaling to track technological change
- TPC: price performance as a normalizing configuration feature
- Auditing to ensure no foul play
- Throughput with restricted response time is the normal measure