A Survey on Cloud Storage Systems




Team: Xiaoming, Xiaogang, Adarsh, Abhijeet, Pranav

Motivations
- No existing taxonomy of cloud storage systems
- Detailed survey for users
- Starting point for researchers

Taxonomy (category: definition; example)
- Instance storage: storage that comes with virtual machine images. Example: Amazon EC2 instance.
- Object storage: storage of binary objects provided in the form of Web services; an object can be any type of file. Example: Amazon Simple Storage Service (S3).
- Block storage: virtual block devices that can be attached to VM instances and used like local disks. Example: Amazon Elastic Block Store (EBS).
- Semi-structured data storage: database service for storing semi-structured data with high availability, high scalability, and high performance. Example: Amazon SimpleDB.
- Relational database: relational database servers on VM instances in clouds. Example: Amazon Relational Database Service (RDS).
- Distributed file system: distributed storage provided through file system interfaces with high availability and high scalability. Example: Google File System.
- Online drive/folder service: storage space provided in the form of a virtual drive or folder on the Internet. Example: Microsoft SkyDrive.
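As a rough illustration, the taxonomy above can be held in a small lookup structure. This is only a sketch: the class name `StorageCategory`, the list `TAXONOMY`, and the helper `example_for` are hypothetical names chosen for this example, and the entries are simply transcribed from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageCategory:
    name: str
    definition: str
    example: str


# Entries transcribed from the taxonomy slide (names shortened to lowercase keys).
TAXONOMY = [
    StorageCategory("instance", "Storage that comes with virtual machine images",
                    "Amazon EC2 instance"),
    StorageCategory("object", "Storage of binary objects provided as Web services",
                    "Amazon Simple Storage Service (S3)"),
    StorageCategory("block", "Virtual block devices attachable to VM instances",
                    "Amazon Elastic Block Store (EBS)"),
    StorageCategory("semi-structured data", "Database service for semi-structured data",
                    "Amazon SimpleDB"),
    StorageCategory("relational database", "Relational database servers on VM instances",
                    "Amazon Relational Database Service (RDS)"),
    StorageCategory("distributed file system", "Storage behind file system interfaces",
                    "Google File System"),
    StorageCategory("online drive/folder", "Virtual drive or folder on the Internet",
                    "Microsoft SkyDrive"),
]


def example_for(category_name: str) -> str:
    """Return the example service for a category (case-insensitive match)."""
    wanted = category_name.strip().lower()
    for category in TAXONOMY:
        if category.name == wanted:
            return category.example
    raise KeyError(category_name)
```

For instance, `example_for("Block")` looks up the block storage entry and returns its example service.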

Commercial Cloud Providers
Vendor | Instance | Object | Block | Semi-structured data | Relational database | Distributed file system | Online folder/drive
Amazon | EC2 | S3 | EBS | SimpleDB | RDS | N/A |
Microsoft | Azure VM | Azure Blob | Azure Drive | Azure Table | SQL Azure | N/A | SkyDrive/Mesh
Google | N/A | Google Storage for Developers | N/A | BigTable | N/A | Google File System |
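A minimal sketch of how the provider matrix could be queried by category. The dictionary `PROVIDERS` and the function `vendors_offering` are hypothetical names for this example; cells that the slide marks N/A, or does not list at all, are represented as `None`, which is an assumption about how to read the flattened table.

```python
from typing import Dict, List, Optional

# One row per vendor; None means "N/A or not listed on the slide" (an assumption).
PROVIDERS: Dict[str, Dict[str, Optional[str]]] = {
    "Amazon": {
        "instance": "EC2", "object": "S3", "block": "EBS",
        "semi-structured data": "SimpleDB", "relational database": "RDS",
        "distributed file system": None, "online folder/drive": None,
    },
    "Microsoft": {
        "instance": "Azure VM", "object": "Azure Blob", "block": "Azure Drive",
        "semi-structured data": "Azure Table", "relational database": "SQL Azure",
        "distributed file system": None, "online folder/drive": "SkyDrive/Mesh",
    },
    "Google": {
        "instance": None, "object": "Google Storage for Developers", "block": None,
        "semi-structured data": "BigTable", "relational database": None,
        "distributed file system": "Google File System", "online folder/drive": None,
    },
}


def vendors_offering(category: str) -> List[str]:
    """Return the vendors (sorted by name) that list a service in the given category."""
    return sorted(vendor for vendor, row in PROVIDERS.items() if row.get(category))
```

Calling `vendors_offering("block")` would list the vendors with a block storage service in this reading of the table.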

Commercial Cloud Providers

Windows Azure Blob
- Distributed storage for large items; each item can be at most 50 GB.
- Blob storage is organized into containers: each container consists of blobs, and each blob is made up of blocks.
- All access to Azure Blob is through an HTTP REST interface.

SQL Azure
- Provides web-facing database functionality as a utility service.
- Clients connect to the cloud-based database using the TDS protocol.
- Queries are written in Transact-SQL.
- Applications and tools already in use with existing relational databases work seamlessly with SQL Azure.

Windows Azure Table
- Provides structured storage for maintaining service state.
- Structured storage takes the form of tables; each table contains a set of entities, and each entity is made up of a set of named properties.
- Supports LINQ, ADO.NET Data Services, and REST.
- An Azure Table can be thought of as a fancy spreadsheet: the state of an entity is stored in the columns of the spreadsheet.
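The container, blob, and block hierarchy described for Azure Blob can be sketched as a small in-memory model. This is not the Azure SDK or its REST API: `Container`, `Blob`, and `append_block` are hypothetical names, and reading the 50 GB limit as binary gigabytes is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# 50 GB per-blob limit quoted on the slide; binary gigabytes are assumed here.
MAX_BLOB_BYTES = 50 * 1024 ** 3


@dataclass
class Blob:
    name: str
    blocks: List[bytes] = field(default_factory=list)

    @property
    def size(self) -> int:
        """Total size of the blob, i.e. the sum of its block sizes."""
        return sum(len(block) for block in self.blocks)

    def append_block(self, data: bytes) -> None:
        """Add one block, refusing it if the blob would exceed the size limit."""
        if self.size + len(data) > MAX_BLOB_BYTES:
            raise ValueError("blob would exceed the 50 GB limit")
        self.blocks.append(data)


@dataclass
class Container:
    name: str
    blobs: Dict[str, Blob] = field(default_factory=dict)

    def blob(self, name: str) -> Blob:
        """Return the named blob, creating an empty one on first use."""
        if name not in self.blobs:
            self.blobs[name] = Blob(name)
        return self.blobs[name]
```

A caller would obtain a blob with `Container("photos").blob("a.jpg")` and then build it up block by block with `append_block`.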

Commercial Cloud Providers

Amazon Elastic Block Store (EBS)
- Off-instance storage that persists independently of the life of an instance.
- Storage volumes behave like raw, unformatted block devices.
- Volumes range from 1 GB to 1 TB and can be mounted on EC2 instances.

Amazon S3
- Object storage designed to make web-scale computing easier for developers.
- Users store persistent data organized in buckets and objects.
- Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.
- An unlimited number of objects can be stored, each containing 1 byte to 5 GB of data.

Amazon Relational Database Service (RDS)
- Provides cost-effective and resizable capacity.
- Applications and tools already in use with existing MySQL databases work seamlessly with Amazon RDS.

Amazon SimpleDB
- Non-relational database that offloads the work of database administration.
- Users can focus on application development without worrying about infrastructure provisioning, high availability, or software maintenance.
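The size limits quoted above can be expressed as simple checks, together with a toy bucket that holds objects by key. This sketch does not call any AWS API: `valid_s3_object_size`, `valid_ebs_volume_size`, and `Bucket` are hypothetical names, the limits are the ones stated on the slide, and treating GB and TB as binary units is an assumption.

```python
from typing import Dict

GB = 1024 ** 3  # binary gigabyte (assumed unit)


def valid_s3_object_size(num_bytes: int) -> bool:
    """Check the object size range quoted on the slide: 1 byte to 5 GB."""
    return 1 <= num_bytes <= 5 * GB


def valid_ebs_volume_size(size_gb: int) -> bool:
    """Check the volume size range quoted on the slide: 1 GB to 1 TB."""
    return 1 <= size_gb <= 1024


class Bucket:
    """Toy in-memory bucket: objects are byte strings addressed by key."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        if not valid_s3_object_size(len(data)):
            raise ValueError("object size outside the 1 byte to 5 GB range")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]
```

Under these assumptions an empty object is rejected by `put`, and a 1025 GB volume request fails `valid_ebs_volume_size`.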


Commercial Cloud Providers: Use Cases
- Web application with relational data: SQL Azure or Amazon RDS can be used.
- Parallel processing applications, storage for data analysis, and backup and recovery (for example, financial modeling at a bank or new drug development at a pharmaceutical company): Azure Blob or Amazon S3 can be used to store intermediate data.
- Scalable web applications, gaming applications, and metadata indexing (for example, an online ticket system or a news video site): Azure Table or Amazon SimpleDB can be used.
- Applications that require a database, a file system, or raw block-level access: Amazon EBS or Azure Drive can be used.
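The use-case mapping above amounts to a lookup from workload type to storage category and candidate services. A minimal sketch follows; the workload keys, the table `USE_CASES`, and the function `suggest_storage` are hypothetical names invented for this example, while the service names come from the slide.

```python
from typing import Dict, List, Tuple

# workload key -> (storage category, services named on the slide)
USE_CASES: Dict[str, Tuple[str, List[str]]] = {
    "relational web application": (
        "relational database", ["SQL Azure", "Amazon RDS"]),
    "parallel processing or backup": (
        "object storage", ["Azure Blob", "Amazon S3"]),
    "scalable web application": (
        "semi-structured data storage", ["Azure Table", "Amazon SimpleDB"]),
    "raw block-level access": (
        "block storage", ["Amazon EBS", "Azure Drive"]),
}


def suggest_storage(workload: str) -> Tuple[str, List[str]]:
    """Return (category, candidate services) for a known workload key."""
    try:
        return USE_CASES[workload]
    except KeyError:
        raise ValueError(f"no suggestion recorded for workload: {workload!r}")
```

A call such as `suggest_storage("raw block-level access")` returns the block storage category with the two services the slide names for it.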

Academic Cloud Systems
System | Instance | Object | Block | Semi-structured data | Distributed file system
Eucalyptus | VM | S3 | EBS | N/A | N/A
Nimbus | VM | Cumulus | N/A | N/A | N/A
OpenNebula | VM | N/A | N/A | N/A | N/A
OpenStack | VM | OpenStack object storage | N/A | N/A | N/A
Hadoop | N/A | N/A | N/A | HBase | Hadoop Distributed File System (HDFS)

Academic Cloud Systems

Eucalyptus
[Figure: Eucalyptus storage architecture. Labels: SOAP/REST-based tools, Walrus, Cluster A (Storage Controller), Cluster B (Storage Controller)]
- S3 storage is mainly used for VM images.
- A typical configuration contains one server per cluster.

Academic Cloud Systems

Nimbus
- The Cumulus storage service is used for VM images.
- Cumulus can be configured to use various backends.

OpenNebula
- Two ways to manage VM images: shared (NFS) and non-shared (SSH).

Academic Cloud Systems

OpenStack
- OpenStack object storage is used for VM image management.
- Uses disk blocks directly instead of file systems.

Hadoop
- The HDFS interface is not fully compatible with the POSIX standard, nor is the system optimized for file I/O.
- HBase is built on top of HDFS.

Conclusions and Future Work
- Virtualized I/O performance of cloud services is not yet comparable to that of local disks.
- Academic cloud systems do not yet provide a rich set of services.
- Future work: performance tests for commercial services.
- Future work: more investigation of design and implementation details.
- Future work: include emerging services from other providers.