Future Prospects of Scalable Cloud Computing




Future Prospects of Scalable Cloud Computing
Keijo Heljanko
Department of Information and Computer Science, School of Science, Aalto University
keijo.heljanko@aalto.fi
7.3.2012
1/17

Future Cloud Topics
- Beyond MapReduce:
  - Pig (scripting language for MapReduce)
  - Hive (SQL engine for Hadoop)
  - PACT (MapReduce, joins, ++)
  - Spark (Berkeley, iterative algorithms)
- Future of Storage Technologies: Disk, Flash & DRAM
- Cloud Databases: BigTable & HBase
- Distributed Databases with Transactions: Google Percolator (used for Google's new search engine)
2/17

Flash Storage
- Currently one of the trends is the introduction of Flash memory
- SSDs (solid state disks) are replacing hard disks in many applications
- Flash capacity per Euro is increasing faster than hard disk capacity per Euro
- Random SSD read (and often also write) IOPS are more than 100x those of high-end hard disk read and write IOPS
3/17

Sustained Flash Write Performance
Because of the wear leveling algorithms, sustained Flash memory write speeds are often hard to benchmark and need long test runs to saturate the SSD:
http://www.xssist.com/blog/[ssd]_comparison_of_Intel_X25-E_G1_vs_Intel_X25-M_G2.htm
4/17
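To see why short runs mislead, a sustained-write benchmark has to keep writing long enough to exhaust the drive's pool of fresh blocks and trigger background erases. A minimal sketch of such a microbenchmark follows; the file path, file size, block size, and run length are arbitrary assumptions, not values from the slides:

```java
import java.io.RandomAccessFile;
import java.util.Random;

// Minimal sustained random-write microbenchmark sketch.
// Writes 4 KiB blocks at random offsets inside a preallocated test file
// and reports throughput per 10-second window; on an SSD the later
// windows typically show the slowdown caused by background block erases.
public class SustainedWriteBench {
    public static void main(String[] args) throws Exception {
        final int blockSize = 4096;                       // assumed block size
        final long fileSize = 8L * 1024 * 1024 * 1024;    // 8 GiB test file (assumption)
        final long runNanos = 30L * 60 * 1_000_000_000L;  // 30-minute run (assumption)

        byte[] block = new byte[blockSize];
        Random rnd = new Random(42);
        rnd.nextBytes(block);

        try (RandomAccessFile f = new RandomAccessFile("/tmp/flashbench.dat", "rwd")) {
            f.setLength(fileSize);
            long start = System.nanoTime(), windowStart = start;
            long bytesInWindow = 0;
            while (System.nanoTime() - start < runNanos) {
                long blockIndex = (rnd.nextLong() & Long.MAX_VALUE) % (fileSize / blockSize);
                f.seek(blockIndex * blockSize);
                f.write(block);                           // "rwd" mode forces the write to the device
                bytesInWindow += blockSize;
                long now = System.nanoTime();
                if (now - windowStart >= 10_000_000_000L) {
                    double mbPerSec = bytesInWindow / 1e6 / ((now - windowStart) / 1e9);
                    System.out.printf("window throughput: %.1f MB/s%n", mbPerSec);
                    windowStart = now;
                    bytesInWindow = 0;
                }
            }
        }
    }
}
```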

Optimizing for Flash
- One should minimize the number of bytes written to Flash
- Fewer bytes written means less wear and a longer life expectancy for the Flash
- Fewer bytes written means less frequent slow block erases, which improves the overall Flash IOPS
- Heavy sequential write workloads (e.g., database logs) can be handled efficiently by arrays of hard disks without any write endurance problems
- Operating system support such as the TRIM command, which tells the SSD which data blocks can be safely discarded, is very useful
5/17
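One common way to act on this advice is to buffer many small updates in memory and flush them as one large sequential append, the way databases handle their logs. The class and file names below are illustrative assumptions, not part of the slides:

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative group-commit sketch: small records are accumulated in a
// memory buffer and written to the log file as one large sequential
// append instead of many small random writes.
public class BatchedLogWriter implements AutoCloseable {
    private final OutputStream out;
    private final StringBuilder buffer = new StringBuilder();
    private final int flushThresholdBytes;

    public BatchedLogWriter(String path, int flushThresholdBytes) throws IOException {
        this.out = new BufferedOutputStream(new FileOutputStream(path, true)); // append mode
        this.flushThresholdBytes = flushThresholdBytes;
    }

    public synchronized void append(String record) throws IOException {
        buffer.append(record).append('\n');
        if (buffer.length() >= flushThresholdBytes) {
            flush();                                   // one big sequential write
        }
    }

    public synchronized void flush() throws IOException {
        out.write(buffer.toString().getBytes(StandardCharsets.UTF_8));
        out.flush();
        buffer.setLength(0);
    }

    @Override
    public synchronized void close() throws IOException {
        flush();
        out.close();
    }
}
```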

Future of Storage Technologies
A mantra from Jim Gray of Microsoft in his 2006 presentation http://research.microsoft.com/en-us/um/people/gray/talks/flash_is_good.ppt:
- Tape is Dead
- Disk is Tape
- Flash is Disk
- RAM Locality is King
6/17

Jim Gray on Disks (in 2006)
- Disks are cheap per unit of capacity
- A sequential full-disk read takes hours
- A random-access full-disk read would take weeks
- Thus most of the disk should be treated as a cold-storage archive
7/17

Jim Gray on Flash (in 2006)
- Lots of IOPS
- Expensive compared to disks (but improving)
- Limited write endurance
- Slow to write (compared to reading)
8/17

Jim Gray on RAM (2006 numbers)
- Flash/disk is 100,000-1,000,000 cycles away from the CPU
- RAM is 100 cycles away from the CPU
- Thus Jim Gray concludes that main-memory databases are going to become common
9/17

Some Rule-of-Thumb Numbers for Flash
For a single random-access read, the following rule-of-thumb numbers apply:
- Main memory reference: 100 ns
- Flash drive access: 100,000 ns (1,000x slower than memory)
- Disk seek: 10,000,000 ns (100,000x slower than memory)
10/17

RAMCloud
John K. Ousterhout, Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Leverich, David Mazières, Subhasish Mitra, Aravind Narayanan, Diego Ongaro, Guru M. Parulkar, Mendel Rosenblum, Stephen M. Rumble, Eric Stratmann, Ryan Stutsman: The case for RAMCloud. Commun. ACM 54(7): 121-130 (2011)
- In 2009 Facebook was running with enough RAM to fit 75% of its dataset (not counting images or video) in RAM
- Even with enough RAM cache to hit 99% of the dataset, random disk seek latency would still kill the average latency: 0.99 × 100 ns + 0.01 × 10,000,000 ns = 100,099 ns, which is far more than 100 ns!
11/17

RAMCloud - Discussing the Numbers
- With Flash disks and a 99% cache hit rate we get latencies of 0.99 × 100 ns + 0.01 × 100,000 ns = 1,099 ns, which is still about 10x more than 100 ns
- The above concerns only average latency, not throughput, but similar reasoning also applies to the aggregate IOPS numbers of RAM, Flash, and disk
- If a system can have enough memory to hold 100% of the working set in RAM instead of Flash or hard disk, many more IOPS can be served with the same number of servers
12/17
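The arithmetic on the last two slides is just an expected-value calculation over the rule-of-thumb latencies. A minimal sketch, using exactly the numbers quoted above:

```java
// Expected average access latency for a cache-plus-backing-store setup,
// using the rule-of-thumb numbers from the previous slides.
public class ExpectedLatency {
    static double averageNs(double hitRate, double ramNs, double missNs) {
        return hitRate * ramNs + (1.0 - hitRate) * missNs;
    }

    public static void main(String[] args) {
        double ramNs = 100;           // main memory reference
        double flashNs = 100_000;     // Flash drive access
        double diskNs = 10_000_000;   // disk seek

        // 99% of reads hit RAM, 1% miss to disk: 100,099 ns
        System.out.printf("RAM + disk,  99%% hits: %.0f ns%n", averageNs(0.99, ramNs, diskNs));
        // 99% of reads hit RAM, 1% miss to Flash: 1,099 ns
        System.out.printf("RAM + flash, 99%% hits: %.0f ns%n", averageNs(0.99, ramNs, flashNs));
        // 100% of the working set in RAM: 100 ns
        System.out.printf("RAM only:              %.0f ns%n", averageNs(1.0, ramNs, diskNs));
    }
}
```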

RAMCloud - Limiting Factors
- The main problem in RAMCloud-style designs is guaranteeing data persistence in case of power failure, plus quick recovery after a server crash or power outage
- The second problem is having networking hardware and an OS stack with low enough latency to keep the RAM busy
13/17

RAM vs Flash vs Disk
The RAMCloud paper contains a diagram (not reproduced here) comparing the technologies, adapted from: Andersen, D., Franklin, J., Kaminsky, M., et al., FAWN: A Fast Array of Wimpy Nodes, Proc. 22nd Symposium on Operating Systems Principles, 2009
14/17

BigTable
Described in the paper: Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber: Bigtable: A Distributed Storage System for Structured Data. ACM Trans. Comput. Syst. 26(2) (2008)
- A highly scalable, consistent, and partition-tolerant datastore
- Implemented on top of the Google File System (GFS)
- GFS provides data persistence by replication but only supports sequential writes
- The BigTable design does no random writes; only sequential writes are used
- Open source clone: Apache HBase
15/17
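For orientation, this is roughly what using the open-source clone looks like from an application. A minimal Apache HBase client sketch; the table name, column family, and row key are made-up examples, and the modern 1.x+ client API is assumed rather than the 0.92-era API that was current when these slides were written:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Minimal HBase client sketch: one write and one read against a
// hypothetical "webtable" table with a "contents" column family.
public class HBaseExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("webtable"))) {

            // Write: cells are addressed by (row key, column family:qualifier, timestamp)
            Put put = new Put(Bytes.toBytes("com.example/index.html"));
            put.addColumn(Bytes.toBytes("contents"), Bytes.toBytes("html"),
                          Bytes.toBytes("<html>...</html>"));
            table.put(put);

            // Read the same cell back
            Get get = new Get(Bytes.toBytes("com.example/index.html"));
            Result result = table.get(get);
            byte[] html = result.getValue(Bytes.toBytes("contents"), Bytes.toBytes("html"));
            System.out.println(Bytes.toString(html));
        }
    }
}
```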

BigTable
- Recall that sequential writes are much faster than random writes
- BigTable is a write-optimized design: writes are optimized over reads
- Random reads that miss the DRAM cache are slower than in a traditional RDBMS - most of the active data should be in DRAM!
- The design also minimizes the number of bytes written by using efficient logging and compression of data, which is also a good design principle for Flash SSD use
16/17
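The "sequential writes only" property comes from the log-structured write path: updates go into an in-memory sorted buffer (a memtable) and are periodically flushed as an immutable, sorted file (an SSTable) with one sequential write. Below is a toy single-node sketch of that idea, with made-up class and file names; a real BigTable/HBase implementation additionally writes a commit log for durability and compacts SSTables in the background:

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Toy log-structured write path: puts go into a sorted in-memory map
// and are flushed to disk as one sorted, immutable file using a single
// sequential write. No random writes ever touch the on-disk files.
public class ToyMemtable {
    private final TreeMap<String, String> memtable = new TreeMap<>();
    private final int flushThreshold;
    private int sstableCounter = 0;

    public ToyMemtable(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    public void put(String key, String value) throws IOException {
        memtable.put(key, value);          // in-memory update only
        if (memtable.size() >= flushThreshold) {
            flush();
        }
    }

    // Flush the whole memtable as one sorted, append-only file (an "SSTable").
    public void flush() throws IOException {
        Path path = Paths.get("sstable-" + (sstableCounter++) + ".txt");
        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(path))) {
            for (Map.Entry<String, String> e : memtable.entrySet()) {
                out.println(e.getKey() + "\t" + e.getValue());
            }
        }
        memtable.clear();                  // the file is now immutable; it is never updated in place
    }
}
```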

BigTable Summary
- BigTable is a scalable, write-optimized database design
- It uses only sequential writes for improved write performance
- For read-intensive workloads BigTable can be a good fit if most of the working set fits into DRAM
- For read-intensive working sets much larger than DRAM, traditional RDBMS systems are still a good match
- BigTable also has an impressive data scan speed, so scan-based workloads (e.g., getting MapReduce input data from BigTable) are a good match for its performance
- Some optimizations, such as aggressive compression and the use of Bloom filters, are only viable because SSTables are immutable data objects
17/17
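As an illustration of the last point, a Bloom filter built once per immutable SSTable lets a read skip files that definitely do not contain the requested key. A minimal sketch follows; the double-hashing construction and the sizes are illustrative assumptions, not the filter BigTable or HBase actually uses:

```java
import java.util.BitSet;

// Toy Bloom filter: because an SSTable never changes after it is written,
// a filter like this can be built once at flush time and consulted on every
// read to skip SSTables that cannot contain the key.
public class ToyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int numHashes;

    public ToyBloomFilter(int size, int numHashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.numHashes = numHashes;
    }

    // Derive the i-th hash from two base hashes (double hashing).
    private int hash(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | (h1 << 16);
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(hash(key, i));
        }
    }

    // False means "definitely absent"; true means "possibly present".
    public boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(hash(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ToyBloomFilter filter = new ToyBloomFilter(1 << 16, 3);
        filter.add("com.example/index.html");
        System.out.println(filter.mightContain("com.example/index.html")); // true
        System.out.println(filter.mightContain("com.example/missing"));    // almost certainly false
    }
}
```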