Windows Azure Storage Essential Cloud Storage Services http://www.azureusergroup.com



Similar documents
Cloud Computing with Windows Azure using your Preferred Technology

Introduction to Azure: Microsoft s Cloud OS

Introduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson,Nelson Araujo, Dennis Gannon, Wei Lu, and

Windows Azure Storage Scaling Cloud Storage Andrew Edwards Microsoft

INTRODUCING WINDOWS AZURE

Storing and Processing Sensor Networks Data in Public Clouds

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Microsoft Lab Of Things - Week6 Tuesday -

WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS

Microsoft Azure Data Technologies: An Overview

Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming. by Dibyendu Bhattacharya

Google File System. Web and scalability

INTRODUCING WINDOWS AZURE

Technical Overview Simple, Scalable, Object Storage Software

A programming model in Cloud: MapReduce

Patterns for Cloud Computing. Simon Guest Senior Director, Technical Strategy Microsoft Corporation

Distributed File Systems

Windows Azure Data Services (basics) 55093A; 3 Days

WINDOWS AZURE DATA MANAGEMENT

Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation

Hypertable Architecture Overview

MS 10978A Introduction to Azure for Developers

Where We Are. References. Cloud Computing. Levels of Service. Cloud Computing History. Introduction to Data Management CSE 344

Cloud Computing at Google. Architecture

SharePoint & Azure: Digital Asset Management

Course 10978A Introduction to Azure for Developers

Service Level Agreement for Windows Azure operated by 21Vianet

INTRODUCING WINDOWS AZURE

Cloud storage with Apache jclouds

SWIFT. Page:1. Openstack Swift. Object Store Cloud built from the grounds up. David Hadas Swift ATC. HRL 2012 IBM Corporation

Distributed Systems. Tutorial 12 Cassandra

Cloud Computing with Azure PaaS for Educational Institutions

Design and Evolution of the Apache Hadoop File System(HDFS)

ASP.NET Multi-Tier Windows Azure Application Using Storage Tables, Queues, and Blobs

Apache Cassandra for Big Data Applications

This module provides an overview of service and cloud technologies using the Microsoft.NET Framework and the Windows Azure cloud.

SQL 2016 and SQL Azure

Recovery Principles in MySQL Cluster 5.1

Cloud Computing with Microsoft Azure

Data storing and data access

INTRODUCING THE WINDOWS AZURE PLATFORM

Data Management in the Cloud

Microsoft Introduction to Azure for Developers

An Approach to Implement Map Reduce with NoSQL Databases

The Google File System

10978A: Introduction to Azure for Developers

INDIA September 2011 virtual techdays

Lecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl

Apache Hadoop FileSystem and its Usage in Facebook

Scaling Analysis Services in the Cloud

WHITEPAPER SECURITY APPROACHES AND SECURITY TECHNOLOGIES IN INTEGRATION CLOUD

Introduction to Azure for Developers

2.1.5 Storing your application s structured data in a cloud database

Java DB Performance. Olav Sandstå Sun Microsystems, Trondheim, Norway Submission ID: 860

Cloud Powered Mobile Apps with Azure

Windows Azure Security

Hadoop & its Usage at Facebook

How to Choose Between Hadoop, NoSQL and RDBMS

Developing Microsoft Azure Solutions

CAT: Azure SQL DB Premium Deep Dive and Mythbuster

Developing Microsoft Azure Solutions 20532A; 5 days

f...-. I enterprise Amazon SimpIeDB Developer Guide Scale your application's database on the cloud using Amazon SimpIeDB Prabhakar Chaganti Rich Helms

MySQL Storage Engines

Microsoft SQL Server 2012: What to Expect

Diagram 1: Islands of storage across a digital broadcast workflow

SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011

Enterprise Data Integration for Microsoft Dynamics CRM

CRM Magic with Data Migration & Integration

How To Manage A Network On A Server On A Microsoft Server On An Ipad Or Ipad (For A Supermicroserve) On A Network (For An Ipa) On An Uniden (For Microsoft) Server On Your

GeoGrid Project and Experiences with Hadoop

Xiaoming Gao Hui Li Thilina Gunarathne

Cloud Computing Trends

Hadoop & its Usage at Facebook

THE WINDOWS AZURE PROGRAMMING MODEL

Database Fundamentals

Hadoop Distributed File System. Dhruba Borthakur June, 2007

Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications

Building Scalable Applications Using Microsoft Technologies

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

MS 20487A Developing Windows Azure and Web Services

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems

FAWN - a Fast Array of Wimpy Nodes

SQL Azure vs. SQL Server

Scaling Database Performance in Azure

Flight Workflow User's Guide. Release

Release Notes LS Retail Data Director August 2011

Developing Microsoft Azure Solutions 20532B; 5 Days, Instructor-led

RSA SecurID Ready Implementation Guide

Transcription:

Windows Azure Storage Essential Cloud Storage Services http://www.azureusergroup.com David Pallmann, Neudesic

Windows Azure Windows Azure is the foundation of Microsoft s Cloud Platform It is an Operating System for the Cloud and provides Essential Services for the Cloud Virtualized Computation Scalable Storage Automatic Management Developer SDK

Cloud Storage is part of Windows Azure

Windows Azure Storage The goal is to allow users and applications to: Access their data efficiently from anywhere at anytime Store data for any length of time Scale to store any amount of data Be confident that the data will not be lost Pay for only what they use/store

Windows Azure Storage Storage that is Durable Scalable (capacity and throughput) Highly Available Security Performance Efficient Rich Data Abstractions Service communication: queues, locks, Large user data items: blobs, blocks, Service state: tables, caches, Simple and Familiar Programming Interfaces REST Accessible and ADO.NET

Windows Azure Storage Account User creates a globally unique storage account name Receive a 256 bit secret key when creating account Provides security for accessing the store Use secret key to create a HMAC SHA256 signature for each request Use signature to authenticate request at server

SETTING UP CLOUD STORAGE

Fundamental Data Abstractions Blobs Provide a simple interface for storing named files along with metadata for the file Tables Provide structured storage. A Table is a set of entities, which contain a set of properties Queues Provide reliable storage and delivery of messages for an application

Blob Storage Concepts Account Container Blob sally pictures movies IMG001.JPG IMG002.JPG MOV1.AVI

Storage Account And Blob Containers Storage Account An account can have many Blob Containers Container A container is a set of blobs Sharing policies are set at the container level Public READ or Private Associate Metadata with Container Metadata is <name, value> pairs Up to 8KB per container List the blobs in a container Account sally Container pictures movies Blob IMG001.JPG IMG002.JPG MOV1.AVI

Blob Namespace Blob URL http://<account>.blob.core.windows.net/<container>/<blobname> Example: Account sally Container music BlobName rock/rush/xanadu.mp3 URL: http://sally.blob.core.windows.net/music/rock/rush/xanadu.mp3

Blob Features And Functions Store Large Objects (up to 50 GB each) Standard REST PUT/GET Interface http://<account>.blob.core.windows.net/<container>/<blobname> PutBlob Inserts a new blob or overwrites the existing blob GetBlob Get whole blob or by starting offset, length DeleteBlob Support for Continuation on Upload Associate Metadata with Blob Metadata is <name, value> pairs Set/Get with or separate from blob data bits Up to 8KB per blob

Continuation On Upload Scenario Want to upload a large multi GB file into the cloud If upload fails in the middle, need an efficient way to resume the upload from where it failed

Block Id 1 Block Id 2 Block Id 3 Block Id N Uploading A Blob Via Blocks Uploading a Large Blob 10 GB Movie blobname = TheBlob.wmv ; PutBlock(blobName, blockid1, block1bits); PutBlock(blobName, blockid2, block2bits); PutBlock(blobName, blockidn, blocknbits); PutBlockList(blobName, blockid1,,blockidn); Benefit: Efficient continuation and retry Parallel and out of order upload of blocks TheBlob.wmv Windows Azure Storage

Block Id 2 Block Id 3 Block Id 4 Block Id 1 Block Id 3 Block Id 4 Block Id 2 Block Id 4 PutBlockList Example BlobName = ExampleBlob.wmv Example Uploading Blocks Out of Order Same Block IDs Unused Blocks Sequence of Operations PutBlock(BlockId1) PutBlock(BlockId3) PutBlock(BlockId4) PutBlock(BlockId2) PutBlock(BlockId4) Committed and readable version of blob PutBlockList(BlockId2, BlockId3, BlockId4)

Blob As A List Of Blocks Blob Consists of a List of Blocks Properties of Blocks Each Block defined by a Block ID Up to 64 Bytes, scoped by Blob Name Blocks are immutable A block is up to 64MB Do not have to be same size

BlockList Operations PutBlockList for a Blob Provide the list of blocks to comprise the readable version of the blob If multiple blocks are uploaded with same Block ID Last committed block wins Blocks not used will be garbage collected GetBlockList for a Blob Returns the list of blocks that represent the readable (committed) version of the blob Block ID and Size of Block is returned for each block

Blob Storage

Summary Of Windows Azure Blobs Easy to use REST Put/Get/Delete interface Can read from any Offset, Length of Blob Conditional Put and Get Blob Max Blob size 50 GB using PutBlock and PutBlockList 64 MB using PutBlob Blocks provide continuation for blob upload Put Blob/BlockList == Replace Blob for CTP Can replace an existing blob with new blob/blocks

Future Windows Azure Blob Support Update Blob Ability to replace, add, or remove blocks from a blob Append Blob Ability to append a block to a blob Copy Blob Ability to copy an existing blob to a new blob name

Fundamental Data Abstractions Blobs Provide a simple interface for storing named files along with metadata for the file Tables Provide structured storage. A Table is a set of entities, which contain a set of properties Queues Provide reliable storage and delivery of messages for an application

Windows Azure Tables Provides Structured Storage Massively Scalable Tables Billions of entities (rows) and TBs of data Automatically scales to thousands of servers as traffic grows Highly Available Can always access your data Durable Data is replicated at least 3 times Familiar and Easy to use Programming Interfaces ADO.NET Data Services.NET 3.5 SP1.NET classes and LINQ REST - with any platform or language

Table Data Model Table A Storage Account can create many tables Table name is scoped by Account Data is stored in Tables A Table is a set of Entities (rows) An Entity is a set of Properties (columns) Entity Two key properties that together are the unique ID of the entity in the Table PartitionKey enables scalability RowKey uniquely identifies the entity within the partition

Partition Key And Partitions Every Table has a Partition Key It is the first property (column) of your Table Used to group entities in the Table into partitions A Table Partition All entities in a Table with the same partition key value Partition Key is exposed in the programming model Allows application to control the granularity of the partitions and enable scalability

Partition Example Partition Key Document Name Row Key Version Property 3 Modification Time.. Property N Description Examples Doc V1.0 8/2/2007.. Committed version Examples Doc V2.0.1 9/28/2007 Alice s working version FAQ Doc V1.0 5/2/2007 Committed version FAQ Doc V1.0.1 7/6/2007 Alice s working version FAQ Doc V1.0.2 8/1/2007 Sally s working version Partition 1 Partition 2 Table Partition - all entities in table with same partition key value Application controls granularity of partition

Purpose Of The Partition Performance and Entity Locality Entities in the same partition will be stored together Efficient querying and cache locality Table Scalability We monitor the usage patterns of partitions Automatically load balance partitions Each partition can be served by a different storage node Scale to meet the traffic needs of your table

Performance And Scalability Partition Key Document Name Row Key Version Property 3 Modification Time.. Property N Description Examples Doc V1.0 8/2/2007.. Committed version Examples Doc V2.0.1 9/28/2007 Alice s working version FAQ Doc V1.0 5/2/2007 Committed version FAQ Doc V1.0.1 7/6/2007 Alice s working version FAQ Doc V1.0.2 8/1/2007 Sally s working version Partition 1 Partition 2 Efficient retrieval of all of the versions of FAQ Doc Since we are accessing a single partition The two partitions can be served from different servers to scale out access

Choosing A Partition Key Use a PartitionKey that is common in your queries If Partition Key is part of Query Fast access to retrieve entities within a single partition If Partition Key is not specified in a Query Then every partition has to be scanned

Query With And Without PartionKey Partition Key Document Name Row Key Version Property 3 Modification Time.. Property N Description Examples Doc V1.0 8/2/2007.. Committed version Examples Doc V2.0.1 9/28/2007 Alice s working version FAQ Doc V1.0 5/2/2007 Committed version FAQ Doc V1.0.1 7/6/2007 Alice s working version FAQ Doc V1.0.2 8/1/2007 Sally s working version Partition 1 Partition 2 Getting all entities with (PartitionKey == FAQ Doc ) is fast Access single partition Get all docs with (ModifiedTime < 6/01/2007) is more expensive Have to traverse all partitions

Choosing A Partition Key Use a PartitionKey that is common in your queries If Partition Key is part of Query Fast access to retrieve entities within a single partition If Partition Key is not specified in a Query Then every partition has to be scanned Spread out load across partitions Partition Key allows you to control what goes into your Table partitions More partitions makes it easier to automatically balance load At the tradeoff of Entity Locality

Primary Key For Entity Partition Key Document Name Row Key Version Property 3 Modification Time.. Property N Description Examples Doc V1.0 8/2/2007.. Committed version Examples Doc V2.0.1 9/28/2007 Alice s working version FAQ Doc V1.0 5/2/2007 Committed version FAQ Doc V1.0.1 7/6/2007 Alice s working version FAQ Doc V1.0.2 8/1/2007 Sally s working version Primary Key for the Entity is the composite of PartitionKey Unique ID of the partition the entity belongs to within the Table RowKey Unique ID for the entity within the partition Partition 1 Partition 2

Entities And Properties Each Entity can have up to 255 properties Mandatory Properties for every Entity in Table Partition Key Row Key All entities have a system maintained version No fixed schema for rest of properties Each property is stored as a <name, typed value> pair No schema stored for a table 2 entities within the same table can have different properties

Property Types Supported Partition Key and Row Key String (up to 64KB) Property Types String (up to 64KB) Binary (up to 64KB) Bool DateTime GUID Int Int64 Double

Table Programming Model Provide familiar and easy to use interfaces Leverage your.net expertise Table Entities are accessed as objects via ADO.NET Data Services.NET 3.5 SP1 LINQ Language Integrated Query RESTful access to table and entities Insert/Update/Delete Entities over the Table Query over Tables Get back a list of structured Entities

Example Table Definition Example using ADO.NET Data Services Table Entities are represented as Class Objects [DataServiceKey("PartitionKey", "RowKey")] public class Customer { // Partition key Customer Last name public string PartitionKey { get; set; } // Row Key Customer First name public string RowKey { get; set; } } // User defined properties here public DateTime CustomerSince { get; set; } public double Rating { get; set; } public string Occupation { get; set; }

Create Customers Table Every Account has a master table called Tables It is used to keep track of the tables in your account To use a table it has to be inserted into Tables [DataServiceKey("TableName")] public class TableStorageTable { public string TableName { get; set; } } // serviceuri is http://<account>.table.core.windows.net/ DataServiceContext context = new DataServiceContext(serviceUri); TableStorageTable table = new TableStorageTable("Customers"); context.addobject("tables", table); DataServiceResponse response = context.savechanges();

Table Storage

Query A Table LINQ DataServiceContext context = new DataServiceContext( http://myaccount.table.core.windows.net ); var customers = from o in context.createquery<customer>( Customers ) where o.partitionkey == Lee select o; foreach (Customer customer in customers) { } GET http://myaccount.table.core.windows.net/customers? $filter= PartitionKey eq Lee

Summary Of Windows Azure Tables Built to provide Massively Scalable, Highly Available and Durable Structured Storage Automatic Load Balancing and Scaling of Tables Partition Key is exposed to the application Familiar and Easy to use LINQ and REST programming interfaces ADO.Net Data Services Not a relational database No joins, no maintenance of foreign keys, etc

Future Windows Azure Table Support At CTP Single Index Query and retrieve results sorted by PartitionKey and RowKey Future Support for Secondary Indexes Query and retrieve results sorted by other properties Single Entity Transactions Atomically Insert, Update, or Delete a Single Entity Entity Groups Atomic transactions across multiple entities within same partition

Fundamental Data Abstractions Blobs Provide a simple interface for storing named files along with metadata for the file Tables Provide structured storage. A Table is a set of entities, which contain a set of properties Queues Provide reliable storage and delivery of messages for an application

LB Web + Worker Queue Example n m Web Role Worker Role Cloud Storage (blob, table, queue)

Windows Azure Queues Provide reliable message delivery Simple, asynchronous work dispatch Programming semantics ensure that a message can be processed at least once Queues are Highly Available, Durable and Performance Efficient Access is provided via REST

Account, Queues And Messages An Account can create many Queues Queue Name is scoped by the Account A Queue contains Messages No limit on number of messages stored in a Queue But a Message is stored for at most a week http://<account>.queue.core.windows.net/<queuename> Messages Message Size <= 8 KB To store larger data, store data in blob/entity storage, and the blob/entity name in the message

Queue Programming API Queues Create/Clear/Delete Queues Inspect Queue Length Messages Enqueue (QueueName, Message) Dequeue (QueueName, Invisibility Time T) Returns the Message with a MessageID Makes the Message Invisible for Time T Delete(QueueName, MessageID)

Dequeue And Delete Messages Producers P 2 Consumers C 1 1. Dequeue(Q, 30 sec) msg 1 4 3 2 1 P 1 C 2 2. Dequeue(Q, 30 sec) msg 2

Dequeue And Delete Messages Producers Consumers P 2 1 C 1 1. Dequeue(Q, 30 sec) msg 1 5. C 1 crashed P 1 4 3 2 1 6. msg1 visible 30 seconds after Dequeue 2 C 2 2. Dequeue(Q, 30 sec) msg 2 3. C2 consumed msg 2 4. Delete(Q, msg 2) 7. Dequeue(Q, 30 sec) msg 1 Benefit: Insures that every message can be processed at least once

Queue Storage

Summary Of Windows Azure Queues Provide reliable message delivery Allows Messages to be retrieved and processed at least once No limit on number of messages stored in a Queue Message size is <=8KB

DURABILITY, AVAILABILITY AND SCALABILITY OF WINDOWS AZURE STORAGE

Storage Durability All data is replicated at least 3 times Replicas are spread out over different fault and upgrade domains in same data center or geo-distribution and geo-replication All of Storage (Blobs, Tables and Queues) is built on this replication layer Dynamic replication to maintain a healthy number of replicas Recover from a lost/unresponsive Drive or Node Recover from data bit rot Data continuously scanned against bit rot

Availability And Scalability Efficient Failover Data served immediately elsewhere within data center from available replicas Automatic Load Balancing of Hot Data Monitor the usage patterns and load balance access to Blob Containers, Table Partitions and Queues Distribute access to the hot data over the data center according to traffic Caching of Hot Blobs, Entities and Queues Hot Blobs are cached to scale out access to them Hot Entity and Queue data pages are cached and served from memory

Windows Azure Storage Summary Essential Storage for the Cloud Durable, Scalable, Highly Available, Security, Performance Efficient Familiar and Easy to Use Programming Interfaces REST Accessible, LINQ and ADO.NET Rich Data Abstractions Service communication: queues, locks, Large user data items: blobs, blocks, Service state: tables, caches,

Resources Azure portal: http://www.azure.com Samples: Azure SDK, Cloud Storage sample User group: http://www.azureusergroup.com Blog: http://www.azurefeeds.com Forums: Azure Forums on MSDN Azure Storage Explorer: http://www.codeplex.com/azurestorageexplorer

QUESTIONS?