The University of New Mexico. Complexity Issues. Barney Maccabe Computer Science Department & Center for High Performance Computing

Similar documents
Sistemas Operativos: Input/Output Disks

I/O Considerations in Big Data Analytics

- An Essential Building Block for Stable and Reliable Compute Clusters

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Publish Cisco VXC Manager GUI as Microsoft RDS Remote App

Hybrid Software Architectures for Big

WHITE PAPER 1

VMware Virtual SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014

Amazon EC2 Product Details Page 1 of 5

The Bus (PCI and PCI-Express)

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

The Data Placement Challenge

BookKeeper. Flavio Junqueira Yahoo! Research, Barcelona. Hadoop in China 2011

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency

Petascale Software Challenges. Piyush Chaudhary High Performance Computing

Introducing Storm 1 Core Storm concepts Topology design

Highly Available Service Environments Introduction

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association

Simplifying Storage Operations By David Strom (published 3.15 by VMware) Introduction

BSC vision on Big Data and extreme scale computing

DISTRIBUTED SYSTEMS AND CLOUD COMPUTING. A Comparative Study

Scalability of web applications. CSCI 470: Web Science Keith Vertanen

Data Storage - I: Memory Hierarchies & Disks

Distribution transparency. Degree of transparency. Openness of distributed systems

OS/Run'me and Execu'on Time Produc'vity

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

STORAGE. Buying Guide: TARGET DATA DEDUPLICATION BACKUP SYSTEMS. inside

1 / 25. CS 137: File Systems. Persistent Solid-State Storage

Fault Tolerance in Hadoop for Work Migration

Windows Server Performance Monitoring

Building your Big Data Architecture on Amazon Web Services

SCALABILITY AND AVAILABILITY

Dependable Systems. 9. Redundant arrays of. Prof. Dr. Miroslaw Malek. Wintersemester 2004/05

Application-Focused Flash Acceleration

BlueArc unified network storage systems 7th TF-Storage Meeting. Scale Bigger, Store Smarter, Accelerate Everything

HyperQ Remote Office White Paper

ntier Verde: Simply Affordable File Storage No previous storage experience required

The PHI solution. Fujitsu Industry Ready Intel XEON-PHI based solution. SC Denver

Storing Data: Disks and Files

Bypassing Local Windows Authentication to Defeat Full Disk Encryption. Ian Haken

nanohub.org An Overview of Virtualization Techniques

2.1 What are distributed systems? What are systems? Different kind of systems How to distribute systems? 2.2 Communication concepts

Symmetric Multiprocessing

Enterprise Remote Control 5.6 Manual

Analisi di un servizio SRM: StoRM

Distributed Systems. REK s adaptation of Prof. Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 1

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

VMWARE WHITE PAPER 1

Assignment # 1 (Cloud Computing Security)

HRG Assessment: Stratus everrun Enterprise

JoramMQ, a distributed MQTT broker for the Internet of Things

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

VDI Appliances Accelerate and Simplify Virtual Desktop Deployment

Protocols. Packets. What's in an IP packet

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer

Building Reliable, Scalable AR System Solutions. High-Availability. White Paper

Virtual server management: Top tips on managing storage in virtual server environments

One solution to protect all Physical, Virtual and Cloud

White Paper Storage for Big Data and Analytics Challenges

The Use of Flash in Large-Scale Storage Systems.

Principles and characteristics of distributed systems and environments

XpoLog Center Suite Log Management & Analysis platform

Tier Architectures. Kathleen Durant CS 3200

Next Generation Operating Systems

Fault Tolerance & Reliability CDA Chapter 3 RAID & Sample Commercial FT Systems

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

Big Data: A Storage Systems Perspective Muthukumar Murugan Ph.D. HP Storage Division

When talking about hosting

CONFIGURATION GUIDELINES: EMC STORAGE FOR PHYSICAL SECURITY

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

Solution Guide Parallels Virtualization for Linux

Computer Organization & Architecture Lecture #19

Taking Linux File and Storage Systems into the Future. Ric Wheeler Director Kernel File and Storage Team Red Hat, Incorporated

Deployment Guide: Unidesk and Hyper- V

Technical Note. Dell PowerVault Solutions for Microsoft SQL Server 2005 Always On Technologies. Abstract

Benchmarking Hadoop & HBase on Violin

Cloud Optimize Your IT

Cloud Computing Trends

GlobalSCAPE DMZ Gateway, v1. User Guide

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

SmartSync Backup Efficient NAS-to-NAS backup

NEC Corporation of America Intro to High Availability / Fault Tolerant Solutions

Trends in Enterprise Backup Deduplication

Note that if at any time during the setup process you are asked to login, click either Cancel or Work Offline depending upon the prompt.

Whitepaper. A Guide to Ensuring Perfect VoIP Calls. blog.sevone.com info@sevone.com

Designing a Cloud Storage System

CS2510 Computer Operating Systems

CS2510 Computer Operating Systems

Use Cases for Argonaut Project. Version 1.1

GridSolve: : A Seamless Bridge Between the Standard Programming Interfaces and Remote Resources

SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011

Cloud Computing and Amazon Web Services

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

File System Management

Michael Kagan.

StreamStorage: High-throughput and Scalable Storage Technology for Streaming Data

Hadoop Size does Hadoop Summit 2013

Log Analyzer Reference

Transcription:

Complexity Issues Barney Maccabe Computer Science Department & Center for High Performance Computing

Bandwidth In the next decade is the bandwidth transferred into or out of one high end computing file system a. going down 10X or more, b. staying about the same, c. going up 10X or more, or d. your answer here, as a result of the expected increase in computational speed in its client clusters/ MPPs, and why? 2

Increased Compute Rate...... is not the only reason for the increased bandwidth requirement External sources/sinks will also create and consume data at a higher rate Growth rate of aggregate MPP memory 3

Aggregate Memory Growth 30 Aggregate Memory 2006 Red Storm 30 TB 25 20 15 10 5 1993 Intel Paragon.037 TB 1997 ASCI/Red 1.212 TB 0 1992 1994 1996 1998 2000 2002 2004 2006 4

Spindle Count In the next decade is the number of magnetic disks in one high end computing file system c. going up 10X or more, or as a result of the expected increase in computational speed in its client clusters/ MPPs, and why? It s unlikely that increases in disk density will be able to keep up with the increased demand for storage 5

Others Concurrency: the number of concurrent streams of requests is likely to increase by at least 10x more systems, more users, more remote access,... Seek Efficiency: I have no clue how the number of bytes moved per magnetic disk seek will change Failures: the number of independent failure domains may increase by 10x we don t tolerate failures very well. 6

Complexity Explain why these large increases are not going to increase the complexity of storage software significantly. Make the user deal with it 7

Development Time Trends Even if complexity is increasing, the time and effort required to achieve acceptable 9s of availability at speed MUST stay about the same. Are you relying on the development of any currently insufficient technologies, and if so, which? magic? 8

The temptation Fat layers provide opportunities for optimization they also provide lots of opportunities to do the wrong thing Fat layers are magic magic is (mostly) OK on my laptop... Getting to fat layers requires a lot of experience 9

The obligatory F1 analogy Shifting gears Falcon F1 Synchromesh Heal & toe skill Automatic Transmission Automatic Clutch enhancement Wired Magazine; March 2001 220 MPH, 17,000 RPM, 500,000 lines of code 10

Performance Transparency When performance matters, the API should accurately reflect resource costs FORTRAN reflects costs of the (an?) underlying architecture MPI reflects costs of memory distribution It s hard to encourage people to do the right thing MPI does not encourage people to overlap communication with computation Let users see through and discard layers 11

Coping with complexity Butler Lampson (Software 1, 1), simplicity: Perfection is reached, not when there is no longer anything to add, but when there is no longer anything to take away. A. Saint-Exupery Make it fast, rather than general or powerful Don't hide power Leave it to the client 12

Abstracting storage You can t fix latency, you have to tolerate it asynchronous interface (split transactions) find something else to do I/O Bandwidth needs to be manageable expose parallelism eliminate need for anything other than data movement and access control provide mechanisms to do everything else offline Mask failures 13

LWFS App Launch Client Client Authenticate Credential Authentication Server Validate Credential Credential Client Client Client Credential Capability Op + Capability Authorization Server Client Storage Server Storage Server Validate Capability Storage Server 14

Why Data isn t Compute Persistence rebooting the storage system won t clean it up Sharing performance isolation is much harder you can t space share the storage servers Interdependencies management requires the construction of global views 15