Windows Azure as a Platform as a Service (PaaS)

1 DEUTSCH-FRANZÖSISCHE SOMMERUNIVERSITÄT FÜR NACHWUCHSWISSENSCHAFTLER 2011 / UNIVERSITÉ D'ÉTÉ FRANCO-ALLEMANDE POUR JEUNES CHERCHEURS 2011 (German-French Summer University for Young Researchers 2011)
  CLOUD COMPUTING: HERAUSFORDERUNGEN UND MÖGLICHKEITEN / DÉFIS ET OPPORTUNITÉS (Cloud Computing: Challenges and Opportunities)
  Windows Azure as a Platform as a Service (PaaS)
  Jared Jackson, Microsoft Research

2 Before we begin - Some Results
  [Two pie charts: "Favorite Ice Cream" and "Ice Cream Consumption". Source: International Ice Cream Association (makeicecream.com)]

3 Windows Azure Overview

4 Web Application Model Comparison
  Ad Hoc Application Model
    Machines Running IIS / ASP.NET
    Machines Running Windows Services
    Machines Running SQL Server

5 Web Application Model Comparison
  Ad Hoc Application Model              Windows Azure Application Model
  Machines Running IIS / ASP.NET        Web Role Instances
  Machines Running Windows Services     Worker Role Instances
  Machines Running SQL Server           Azure Storage (Blob / Queue / Table), SQL Azure

6 Key Components
  Fabric Controller
    Manages hardware and virtual machines for service
  Compute
    Web Roles: Web application front end
    Worker Roles: Utility compute
    VM Roles: Custom compute role; you own and customize the VM
  Storage
    Blobs: Binary objects
    Tables: Entity storage
    Queues: Role coordination
  SQL Azure
    SQL in the cloud

8 Key Components - Fabric Controller
  Think of it as an automated IT department
  Cloud layer on top of:
    Windows Server 2008
    A custom version of Hyper-V called the Windows Azure Hypervisor
  Allows for automated management of virtual machines
  Its job is to provision, deploy, monitor, and maintain applications in data centers
  Applications have a shape and a configuration
  The configuration definition describes the shape of a service:
    Role types
    Role VM sizes
    External and internal endpoints
    Local storage
  The configuration settings configure a service:
    Instance count
    Storage keys
    Application-specific settings

9 Key Components - Fabric Controller
  Manages nodes and edges in the fabric (the hardware)
    Power-on automation devices
    Routers / Switches
    Hardware load balancers
    Physical servers
    Virtual servers
  State transitions
    Current State
    Goal State
    Does what is needed to reach and maintain the goal state
  It's a perfect IT employee!
    Never sleeps
    Doesn't ever ask for a raise
    Always does what you tell it to do in the configuration definition and settings

10 Creating a New Project

11 Windows Azure Compute

12 Key Components - Compute - Web Roles
  Web Front End
    Cloud web server
    Web pages
    Web services
  You can create the following types:
    ASP.NET web roles
    ASP.NET MVC 2 web roles
    WCF service web roles
    Worker roles
    CGI-based web roles

13 Key Components - Compute - Worker Roles
  Utility compute
  Windows Server 2008
  Background processing
  Each role can define an amount of local storage
    Protected space on the local drive, considered volatile storage
  May communicate with outside services
    Azure Storage
    SQL Azure
    Other Web services
  Can expose external and internal endpoints

14 Suggested Application Model Using queues for reliable messaging

15 Scalable, Fault Tolerant Applications
  Queues are the application glue
    Decouple parts of the application, making them easier to scale independently
    Resource allocation: different priority queues and backend servers
    Mask faults in worker roles (reliable messaging)

16 Key Components - Compute - VM Roles
  Customized Role
    You own the box
  How it works:
    Download the Guest OS to Server 2008 Hyper-V
    Customize the OS as you need to
    Upload the differences VHD
    Azure runs your VM role using the Base OS + Differences VHD

17 Application Hosting

18 Grokking the service model
  Imagine white-boarding out your service architecture with boxes for nodes and arrows describing how they communicate
  The service model is the same diagram written down in a declarative format
  You give the Fabric the service model and the binaries that go with each of those nodes
  The Fabric can provision, deploy and manage that diagram for you
    Find a hardware home
    Copy and launch your app binaries
    Monitor your app and the hardware
    In case of failure, take action. Perhaps even relocate your app
  At all times, the diagram stays whole

19 Automated Service Management
  Provide code + service model
  Platform identifies and allocates resources, deploys the service, manages service health
  Configuration is handled by two files
    ServiceDefinition.csdef
    ServiceConfiguration.cscfg

20 Service Definition

21 Service Configuration

22 GUI Double click on Role Name in Azure Project

23 Deploying to the cloud
  We can deploy from the portal or from a script
  VS builds two files:
    Encrypted package of your code
    Your config file
  You must create an Azure account, then a service, and then you deploy your code
  Can take up to 20 minutes (which is better than six months)

24 Service Management API
  REST-based API to manage your services
  X509 certs for authentication
  Lets you create, delete, change, upgrade, swap, ...
  Lots of community and MSFT-built tools around the API
    Easy to roll your own

25 The Secret Sauce - The Fabric
  The Fabric is the brain behind Windows Azure.
  1. Process service model
     1. Determine resource requirements
     2. Create role images
  2. Allocate resources
  3. Prepare nodes
     1. Place role images on nodes
     2. Configure settings
     3. Start roles
  4. Configure load balancers
  5. Maintain service health
     1. If role fails, restart the role, based on policy
     2. If node fails, migrate the role, based on policy

26 Storage

27 Durable Storage, At Massive Scale
  Blob: Massive files, e.g. videos, logs
  Drive: Use standard file system APIs
  Tables: Non-relational, but with few scale limits (use SQL Azure for relational data)
  Queues: Facilitate loosely-coupled, reliable systems

28 Blob Features and Functions
  Store large objects (up to 1 TB in size)
  Can be served through the Windows Azure CDN service
  Standard REST interface
    PutBlob: inserts a new blob or overwrites the existing blob
    GetBlob: gets the whole blob or a specific range
    DeleteBlob
    CopyBlob
    SnapshotBlob
    LeaseBlob

29 Two Types of Blobs Under the Hood
  Block Blob
    Targeted at streaming workloads
    Each blob consists of a sequence of blocks
      Each block is identified by a Block ID
    Size limit 200 GB per blob
  Page Blob
    Targeted at random read/write workloads
    Each blob consists of an array of pages
      Each page is identified by its offset from the start of the blob
    Size limit 1 TB per blob

30 Windows Azure Drive
  Provides a durable NTFS volume for Windows Azure applications to use
    Use existing NTFS APIs to access a durable drive
      Durability and survival of data on application failover
    Enables migrating existing NTFS applications to the cloud
  A Windows Azure Drive is a Page Blob
    Example: mount a Page Blob as X:\
    All writes to the drive are made durable to the Page Blob
      Drive made durable through standard Page Blob replication
    Drive persists as a Page Blob even when not mounted

31 Windows Azure Tables
  Provides structured storage
    Massively scalable tables
      Billions of entities (rows) and TBs of data
      Can use thousands of servers as traffic grows
    Highly available and durable
      Data is replicated several times
  Familiar and easy-to-use API
    WCF Data Services and OData
      .NET classes and LINQ
    REST with any platform or language

32 Windows Azure Queues
  Queues are performance-efficient, highly available, and provide reliable message delivery
    Simple, asynchronous work dispatch
    Programming semantics ensure that a message is processed at least once
  Access is provided via REST

33 Storage Partitioning
  Understanding partitioning is key to understanding performance
  Every data object has a partition key
    Different for each data type (blobs, entities, queues)
    Controls entity locality
  Partition key is the unit of scale
    A partition can be served by a single server
    System load balances partitions based on traffic pattern
  Load balancing can take a few minutes to kick in
    Can take a couple of seconds for a partition to become available on a different server
  "Server Busy" means one of two things:
    The system is load balancing to meet your traffic needs
    Single partition limits have been reached
  Use exponential backoff on "Server Busy"

34 Partition Keys In Each Abstraction
  Blobs: Container name + Blob name
    Every blob and its snapshots are in a single partition
      Container Name   Blob Name
      image            annarbor/bighouse.jpg
      image            foxborough/gillette.jpg
      video            annarbor/bighouse.jpg
  Entities: TableName + PartitionKey
    Entities with the same PartitionKey value are served from the same partition
      PartitionKey (CustomerId)   RowKey (RowKind)        Name           CreditCardNumber      OrderTotal
      1                           Customer-John Smith     John Smith     xxxx-xxxx-xxxx-xxxx
      1                           Order 1                                                      $
      2                           Customer-Bill Johnson   Bill Johnson   xxxx-xxxx-xxxx-xxxx
      2                           Order 3                                                      $10.00
  Messages: Queue name
    All messages for a single queue belong to the same partition
      Queue      Message
      jobs       Message1
      jobs       Message2
      workflow   Message1

35 Scalability Targets
  Storage Account
    Capacity: up to 100 TB
    Transactions: up to a few thousand requests per second
    Bandwidth: up to a few hundred megabytes per second
  Single Blob Partition
    Throughput up to 60 MB/s
  Single Queue/Table Partition
    Up to 500 transactions per second
  To go above these numbers, partition between multiple storage accounts and partitions
  When a limit is hit, the app will see "503 Server Busy": applications should implement exponential backoff
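A minimal sketch of the exponential backoff recommended above, assuming the storage call signals "503 Server Busy" by throwing an exception. `ServerBusyException`, `BackoffRetry`, and all parameter values are hypothetical names chosen for illustration; they are not part of the Azure storage client library.

```csharp
using System;
using System.Threading;

// Hypothetical exception type standing in for an HTTP 503 "server busy" response.
public class ServerBusyException : Exception { }

public static class BackoffRetry
{
    // Runs `operation`, retrying on ServerBusyException. The wait starts at
    // baseDelay and doubles per attempt, capped at maxDelay, plus random jitter.
    public static T Execute<T>(Func<T> operation, int maxAttempts,
                               TimeSpan baseDelay, TimeSpan maxDelay)
    {
        var random = new Random();
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (ServerBusyException)
            {
                if (attempt >= maxAttempts) throw;
                double ms = Math.Min(maxDelay.TotalMilliseconds,
                                     baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1));
                // Jitter keeps many clients from retrying in lockstep.
                ms += random.Next(0, (int)(ms / 2) + 1);
                Thread.Sleep(TimeSpan.FromMilliseconds(ms));
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        int calls = 0;
        // Simulated storage call: busy twice, then succeeds.
        string result = BackoffRetry.Execute(() =>
        {
            calls++;
            if (calls < 3) throw new ServerBusyException();
            return "ok";
        }, maxAttempts: 5, baseDelay: TimeSpan.FromMilliseconds(10), maxDelay: TimeSpan.FromSeconds(1));
        Console.WriteLine("{0} after {1} calls", result, calls); // prints "ok after 3 calls"
    }
}
```

The cap matters because load balancing can take minutes to kick in; without it the doubling delay would quickly exceed any useful wait.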

36 Partitions and Partition Ranges
  PartitionKey (Category)   RowKey (Title)             Timestamp   ReleaseDate
  Action                    Fast & Furious                         2009
  Action                    The Bourne Ultimatum
  Animation                 Open Season
  Animation                 The Ant Bully                          2006
  Comedy                    Office Space                           1999
  SciFi                     X-Men Origins: Wolverine
  War                       Defiance                               2008
  [Figure: the table split into two partition ranges, Action to Animation and Comedy to War]

37 Key Selection: Things to Consider
  Scalability
    Distribute load as much as possible
    Hot partitions can be load balanced
    PartitionKey is critical for scalability
  Query Efficiency & Speed
    Avoid frequent large scans
    Parallelize queries
    Point queries are most efficient
  Entity group transactions
    Transactions across a single partition
    Transaction semantics and reduced round trips

38 Expect Continuation Tokens. Seriously!
  A query can return a continuation token in any of these cases:
    Maximum of 1000 rows in a response
    At the end of a partition range boundary
    Maximum of 5 seconds to execute the query
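The loop below is a sketch of what "expect continuation tokens" means for client code, under the assumption that a query returns a page of rows plus an opaque token. `Page<T>`, `ReadAll`, and the simulated `fetch` delegate are hypothetical and only stand in for a real table query.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical page of results: some rows plus an opaque token
// (null when there is nothing more to fetch).
public class Page<T>
{
    public List<T> Rows = new List<T>();
    public string ContinuationToken;
}

public static class ContinuationDemo
{
    // Keeps requesting pages until the service stops returning a token.
    // A page may be short or even empty and still carry a token.
    public static List<T> ReadAll<T>(Func<string, Page<T>> fetchPage)
    {
        var all = new List<T>();
        string token = null;
        do
        {
            Page<T> page = fetchPage(token);
            all.AddRange(page.Rows);
            token = page.ContinuationToken;
        } while (token != null);
        return all;
    }

    public static void Main()
    {
        // Simulated table of 2500 rows served at most 1000 rows at a time.
        const int total = 2500, pageSize = 1000;
        Func<string, Page<int>> fetch = token =>
        {
            int start = token == null ? 0 : int.Parse(token);
            int end = Math.Min(start + pageSize, total);
            var page = new Page<int>();
            for (int i = start; i < end; i++) page.Rows.Add(i);
            page.ContinuationToken = end < total ? end.ToString() : null;
            return page;
        };
        Console.WriteLine(ReadAll(fetch).Count); // prints 2500
    }
}
```

The loop stops on a missing token, never on a short or empty page, because a page can end early at a partition range boundary or at the 5-second limit.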

39 Tables Recap
  Select a PartitionKey and RowKey that help scale
    Efficient for frequently used queries
    Supports batch transactions
    Distributes load
  Avoid "append only" patterns
    Distribute by using a hash etc. as prefix
  Always handle continuation tokens
    Expect continuation tokens for range queries
  "OR" predicates are not optimized
    Execute the queries that form the "OR" predicates as separate queries
  Implement a back-off strategy for retries
    Server busy means either:
      Partitions are being load balanced to meet traffic needs
      Load on a single partition has exceeded the limits
  WCF Data Services
    Use a new context for each logical operation
    AddObject/AttachTo can throw an exception if the entity is already being tracked
    A point query throws an exception if the resource does not exist. Use IgnoreResourceNotFoundException
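One possible reading of "distribute by using a hash etc. as prefix" is sketched here. It assumes the natural key is a timestamp, which on its own would send every insert to the last partition. The bucket count, the `NN_` prefix format, and the choice of MD5 as a stable hash are all illustrative assumptions.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PartitionKeys
{
    // Prefixes a naturally increasing key (e.g. a timestamp) with a bucket
    // number derived from a stable hash, so consecutive keys spread over
    // `buckets` partitions instead of all landing at the end of one range.
    public static string WithHashPrefix(string naturalKey, int buckets)
    {
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(naturalKey));
            // First 4 bytes as an unsigned integer, reduced to a bucket index.
            int bucket = (int)(BitConverter.ToUInt32(hash, 0) % (uint)buckets);
            return bucket.ToString("D2") + "_" + naturalKey;
        }
    }

    public static void Main()
    {
        string[] timestamps = { "2011-07-18T10:00:00", "2011-07-18T10:00:01", "2011-07-18T10:00:02" };
        foreach (string ts in timestamps)
            Console.WriteLine(WithHashPrefix(ts, 16));
    }
}
```

The cost of this design is that a time-range query must now be issued once per bucket and merged on the client, which fits the advice to parallelize queries.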

40 Queues - Their Unique Role in Building Reliable, Scalable Applications
  Want roles that work closely together, but are not bound together
    Tight coupling leads to brittleness
    This can aid in scaling and performance
  A queue can hold an unlimited number of messages
    Messages must be serializable as XML
    Limited to 8 KB in size
    Commonly use the work ticket pattern
  Why not simply use a table?

41 Queue Terminology

42 Message Lifecycle
  [Diagram: a Web Role adds messages to the Queue with PutMessage (POST); Worker Roles read them with GetMessage (with a timeout) and delete them with RemoveMessage (DELETE)]
  Sample response:
    HTTP/ OK
    Transfer-Encoding: chunked
    Content-Type: application/xml
    Date: Tue, 09 Dec :04:30 GMT
    Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0
    <?xml version="1.0" encoding="utf-8"?>
    <QueueMessagesList>
      <QueueMessage>
        <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
        <InsertionTime>Mon, 22 Sep :29:20 GMT</InsertionTime>
        <ExpirationTime>Mon, 29 Sep :29:20 GMT</ExpirationTime>
        <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
        <TimeNextVisible>Tue, 23 Sep :29:20GMT</TimeNextVisible>
        <MessageText>PHRlc3Q+dG...dGVzdD4=</MessageText>
      </QueueMessage>
    </QueueMessagesList>

43 Truncated Exponential Back Off Polling
  Consider a backoff polling approach
    Each empty poll increases the interval by 2x
    A successful poll sets the interval back to 1
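The polling rule above can be sketched as a small state holder. The class name `PollingBackoff` and the 1-second and 30-second bounds are assumptions for illustration; the slide only fixes the doubling and the reset.

```csharp
using System;

// Tracks the polling interval: doubles after each empty poll up to a cap
// (the "truncation"), resets to the minimum after a successful poll.
public class PollingBackoff
{
    private readonly TimeSpan min;
    private readonly TimeSpan max;
    public TimeSpan Current { get; private set; }

    public PollingBackoff(TimeSpan min, TimeSpan max)
    {
        this.min = min;
        this.max = max;
        Current = min;
    }

    public void OnEmptyPoll()
    {
        long doubled = Current.Ticks * 2;
        Current = doubled > max.Ticks ? max : TimeSpan.FromTicks(doubled);
    }

    public void OnMessageReceived()
    {
        Current = min;
    }
}

public static class Program
{
    public static void Main()
    {
        var backoff = new PollingBackoff(TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(30));
        // Simulated poll results: true = a message was returned.
        bool[] polls = { false, false, false, false, false, false, true, false };
        foreach (bool gotMessage in polls)
        {
            if (gotMessage) backoff.OnMessageReceived(); else backoff.OnEmptyPoll();
            Console.WriteLine("next poll in {0}s", backoff.Current.TotalSeconds);
        }
        // Intervals printed: 2, 4, 8, 16, 30, 30, 1, 2
    }
}
```

In a worker role the loop would sleep for `Current` between polls, so an idle queue costs few transactions while a busy one is drained at full speed.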

44 Removing Poison Messages
  Producers: P1, P2. Consumers: C1, C2.
  1. C1: GetMessage(Q, 30 s) returns msg 1
  2. C2: GetMessage(Q, 30 s) returns msg 2

45 Removing Poison Messages
  Producers: P1, P2. Consumers: C1, C2.
  1. C1: GetMessage(Q, 30 s) returns msg 1
  2. C2: GetMessage(Q, 30 s) returns msg 2
  3. C2 consumed msg 2
  4. C2: DeleteMessage(Q, msg 2)
  5. C1 crashed
  6. msg 1 visible 30 s after Dequeue
  7. C2: GetMessage(Q, 30 s) returns msg 1

46 Removing Poison Messages
  Producers: P1, P2. Consumers: C1, C2.
  1. C1: Dequeue(Q, 30 sec) returns msg 1
  2. C2: Dequeue(Q, 30 sec) returns msg 2
  3. C2 consumed msg 2
  4. C2: Delete(Q, msg 2)
  5. C1 crashed
  6. msg 1 visible 30 s after Dequeue
  7. C2: Dequeue(Q, 30 sec) returns msg 1
  8. C2 crashed
  9. msg 1 visible 30 s after Dequeue
  10. C1 restarted
  11. C1: Dequeue(Q, 30 sec) returns msg 1
  12. DequeueCount > threshold
  13. C1: Delete(Q, msg 1)
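The sequence above can be simulated in memory to show the dequeue-count check. Everything here is a stand-in: `Message`, the threshold of 3, and the `Queue<Message>` that plays the role of the Azure queue are assumptions, and in the real service the dequeue count is tracked by the queue itself.

```csharp
using System;
using System.Collections.Generic;

public class Message
{
    public string Body;
    public int DequeueCount;
}

public static class PoisonDemo
{
    const int MaxDequeueCount = 3; // assumed threshold

    // Stand-in for real work; the "bad" message always crashes its consumer.
    static void Process(Message m)
    {
        if (m.Body == "bad") throw new InvalidOperationException("cannot process");
    }

    public static void Main()
    {
        var queue = new Queue<Message>();
        queue.Enqueue(new Message { Body = "good" });
        queue.Enqueue(new Message { Body = "bad" });
        var poison = new List<Message>();

        while (queue.Count > 0)
        {
            Message m = queue.Dequeue();   // GetMessage: message turns invisible
            m.DequeueCount++;
            if (m.DequeueCount > MaxDequeueCount)
            {
                poison.Add(m);             // set aside for inspection, then delete
                continue;
            }
            try
            {
                Process(m);                // success: message is deleted, not re-queued
            }
            catch (Exception)
            {
                queue.Enqueue(m);          // consumer failed: message becomes visible again
            }
        }
        Console.WriteLine("poison messages: {0}, dequeue count: {1}",
                          poison.Count, poison[0].DequeueCount);
        // prints "poison messages: 1, dequeue count: 4"
    }
}
```

Without the threshold check the "bad" message would circulate forever, crashing one consumer after another.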

47 Queues Recap
  Make message processing idempotent
    No need to deal with failures
  Do not rely on order
    Invisible messages result in out-of-order processing
  Use the dequeue count to remove poison messages
    Enforce a threshold on a message's dequeue count
  Use a blob to store message data, with a reference in the message
    Messages > 8 KB
    Batch messages
    Garbage collect orphaned blobs
  Use the message count to scale
    Dynamically increase/reduce workers

48 Windows Azure Storage Takeaways Blobs Drives Tables Queues

49 A Quick Exercise
  Then let's look at some code and some tools

50 Code AccountInformation.cs
  public class AccountInformation
  {
      private static string storageKey = "thisisnotmykey";
      private static string accountName = "jjstore";
      private static StorageCredentialsAccountAndKey credentials;
      // Lazily creates the shared storage credentials
      internal static StorageCredentialsAccountAndKey Credentials
      {
          get
          {
              if (credentials == null)
                  credentials = new StorageCredentialsAccountAndKey(accountName, storageKey);
              return credentials;
          }
      }
  }

51 Code BlobHelper.cs
  public class BlobHelper
  {
      private static string defaultContainerName = "school";
      private CloudBlobClient client = null;
      private CloudBlobContainer container = null;
      private void InitContainer()
      {
          if (client == null)
              client = new CloudStorageAccount(AccountInformation.Credentials, false).CreateCloudBlobClient();
          container = client.GetContainerReference(defaultContainerName);
          container.CreateIfNotExist();
          // Make the blobs in the container publicly readable
          BlobContainerPermissions permissions = container.GetPermissions();
          permissions.PublicAccess = BlobContainerPublicAccessType.Container;
          container.SetPermissions(permissions);
      }
  }

52 Code BlobHelper.cs
  public void WriteFileToBlob(string filePath)
  {
      if (client == null || container == null)
          InitContainer();
      FileInfo file = new FileInfo(filePath);
      CloudBlob blob = container.GetBlobReference(file.Name);
      blob.Properties.ContentType = GetContentType(file.Extension);
      blob.UploadFile(file.FullName);
      // Or, if you want to write a string, replace the last line with:
      //     blob.UploadText(someString);
      // and make sure you set the content type to the appropriate MIME type (e.g. "text/plain")
  }

53 Code BlobHelper.cs
  public string GetBlobText(string blobName)
  {
      if (client == null || container == null)
          InitContainer();
      CloudBlob blob = container.GetBlobReference(blobName);
      try
      {
          return blob.DownloadText();
      }
      catch (Exception)
      {
          // The blob probably does not exist or there is no connection available
          return null;
      }
  }

54 Application Code - Blobs
  private void SaveToCloudButton_Click(object sender, RoutedEventArgs e)
  {
      StringBuilder buff = new StringBuilder();
      // The third column name was lost in transcription; "email" is assumed
      buff.AppendLine("lastname,firstname,email,birthday,nativelanguage,favoriteicecream,yearsinphd,graduated");
      foreach (AttendeeEntity attendee in attendees)
      {
          buff.AppendLine(attendee.ToCsvString());
      }
      blobHelper.WriteStringToBlob("summerschoolattendees.txt", buff.ToString());
  }
  The blob is now available at: Or in this case:

55 Code - TableEntities
  using Microsoft.WindowsAzure.StorageClient;
  public class AttendeeEntity : TableServiceEntity
  {
      public string FirstName { get; set; }
      public string LastName { get; set; }
      public string Email { get; set; }   // name lost in transcription; "Email" is assumed
      public DateTime Birthday { get; set; }
      public string FavoriteIceCream { get; set; }
      public int YearsInPhD { get; set; }
      public bool Graduated { get; set; }
  }

56 Code - TableEntities
  // "Email" is an assumed name; the original identifier was lost in transcription
  public void UpdateFrom(AttendeeEntity other)
  {
      FirstName = other.FirstName;
      LastName = other.LastName;
      Email = other.Email;
      Birthday = other.Birthday;
      FavoriteIceCream = other.FavoriteIceCream;
      YearsInPhD = other.YearsInPhD;
      Graduated = other.Graduated;
      UpdateKeys();
  }
  public void UpdateKeys()
  {
      PartitionKey = "SummerSchool";
      RowKey = Email;
  }

57 Code TableHelper.cs
  public class TableHelper
  {
      private CloudTableClient client = null;
      private TableServiceContext context = null;
      private Dictionary<string, AttendeeEntity> allAttendees = null;
      private string tableName = "Attendees";
      private CloudTableClient Client
      {
          get
          {
              if (client == null)
                  client = new CloudStorageAccount(AccountInformation.Credentials, false).CreateCloudTableClient();
              return client;
          }
      }
      private TableServiceContext Context
      {
          get
          {
              if (context == null)
                  context = Client.GetDataServiceContext();
              return context;
          }
      }
  }

58 Code TableHelper.cs
  private void ReadAllAttendees()
  {
      allAttendees = new Dictionary<string, AttendeeEntity>();
      CloudTableQuery<AttendeeEntity> query =
          Context.CreateQuery<AttendeeEntity>(tableName).AsTableServiceQuery();
      try
      {
          foreach (AttendeeEntity attendee in query)
          {
              // "Email" is an assumed name; the original identifier was lost in transcription
              allAttendees[attendee.Email] = attendee;
          }
      }
      catch (Exception)
      {
          // No entries in table - or other exception
      }
  }

59 Code TableHelper.cs
  // The parameter name was lost in transcription; "email" is assumed
  public void DeleteAttendee(string email)
  {
      if (allAttendees == null)
          ReadAllAttendees();
      if (!allAttendees.ContainsKey(email))
          return;
      AttendeeEntity attendee = allAttendees[email];
      // Delete from the cloud table
      Context.DeleteObject(attendee);
      Context.SaveChanges();
      // Delete from the memory cache
      allAttendees.Remove(email);
  }

60 Code TableHelper.cs
  // The parameter name was lost in transcription; "email" is assumed
  public AttendeeEntity GetAttendee(string email)
  {
      if (allAttendees == null)
          ReadAllAttendees();
      if (allAttendees.ContainsKey(email))
          return allAttendees[email];
      return null;
  }
  Remember that this only works for tables (or queries on tables) that easily fit in memory
  This is one of many design patterns for working with tables

61 Pseudo Code TableHelper.cs
  public void UpdateAttendees(List<AttendeeEntity> updatedAttendees)
  {
      foreach (AttendeeEntity attendee in updatedAttendees)
      {
          UpdateAttendee(attendee, false);
      }
      Context.SaveChanges(SaveChangesOptions.Batch);
  }
  public void UpdateAttendee(AttendeeEntity attendee)
  {
      UpdateAttendee(attendee, true);
  }
  // "Email" is an assumed name; the original identifier was lost in transcription
  private void UpdateAttendee(AttendeeEntity attendee, bool saveChanges)
  {
      // Load the cache first, as the other helper methods do
      if (allAttendees == null)
          ReadAllAttendees();
      if (allAttendees.ContainsKey(attendee.Email))
      {
          AttendeeEntity existingAttendee = allAttendees[attendee.Email];
          existingAttendee.UpdateFrom(attendee);
          Context.UpdateObject(existingAttendee);
      }
      else
      {
          Context.AddObject(tableName, attendee);
          // Keep the memory cache in sync with the table
          allAttendees[attendee.Email] = attendee;
      }
      if (saveChanges)
          Context.SaveChanges();
  }

62 Application Code - Cloud Tables
  private void SaveButton_Click(object sender, RoutedEventArgs e)
  {
      // Write to table
      tableHelper.UpdateAttendees(attendees);
  }
  That's it! Now your tables are accessible using REST service calls or any cloud storage tool.

63 Tools - Fiddler2

64 Best Practices

65 Picking the Right VM Size
  Having the correct VM size can make a big difference in costs
  Fundamental choice: fewer, larger VMs vs. many smaller instances
  If you scale better than linearly across cores, larger VMs could save you money
    It is pretty rare to see linear scaling across 8 cores
  More instances may provide better uptime and reliability (more failures are needed to take your service down)
  The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you

66 Using Your VM to the Maximum
  Remember:
    1 role instance == 1 VM running Windows
    1 role instance != one specific task for your code
  You're paying for the entire VM, so why not use it?
  Common mistake: splitting code into multiple roles, each not using much CPU
  Balance using up the CPU against having free capacity in times of need
  There are multiple ways to use your CPU to the fullest

67 Exploiting Concurrency
  Spin up additional processes, each with a specific task or as a unit of concurrency
    May not be ideal if the number of active processes exceeds the number of cores
  Use multithreading aggressively
  In networking code, correct usage of NT I/O Completion Ports lets the kernel schedule the precise number of threads
  In .NET 4, use the Task Parallel Library
    Data parallelism
    Task parallelism
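A short sketch of the two Task Parallel Library styles named above, using only calls available in .NET 4 (`Parallel.For` and `Task.Factory.StartNew`). The workload, squaring and summarizing integers, is an arbitrary illustration.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    public static void Main()
    {
        // Data parallelism: the same operation applied to every element.
        int[] inputs = Enumerable.Range(1, 1000).ToArray();
        long[] squares = new long[inputs.Length];
        Parallel.For(0, inputs.Length, i =>
        {
            squares[i] = (long)inputs[i] * inputs[i];
        });
        Console.WriteLine(squares.Sum()); // prints 333833500

        // Task parallelism: different operations running concurrently.
        Task<int> countEven = Task.Factory.StartNew(() => inputs.Count(x => x % 2 == 0));
        Task<int> max = Task.Factory.StartNew(() => inputs.Max());
        Task.WaitAll(countEven, max);
        Console.WriteLine("{0} even values, max {1}", countEven.Result, max.Result);
        // prints "500 even values, max 1000"
    }
}
```

Each iteration writes to its own array slot, so the parallel loop needs no locking.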

68 Finding Good Code Neighbors
  Typically code falls into one or more of these categories:
    Memory intensive
    CPU intensive
    Network I/O intensive
    Storage I/O intensive
  Find code that is intensive with different resources to live together
  Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code

69 Scaling Appropriately
  Monitor your application and make sure you're scaled appropriately (not over-scaled)
  Spinning VMs up and down automatically is good at large scale
  Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  Being too aggressive in spinning down VMs can result in poor user experience
  Trade-off between the risk of failure or poor user experience from lacking excess capacity and the cost of idling VMs
  [Chart: Performance vs. Cost]

70 Storage Costs
  Understand an application's storage profile and how storage billing works
  Make service choices based on your app profile
    E.g. SQL Azure has a flat fee while Windows Azure Tables charges per transaction
    Service choice can make a big cost difference based on your app profile
  Caching and compressing help a lot with storage costs

71 Saving Bandwidth Costs
  Bandwidth costs are a huge part of any popular web app's billing profile
  Saving bandwidth costs often leads to savings in other places
    Sending fewer things over the wire often means getting fewer things from storage
    Sending fewer things means your VM has time to do other tasks
  All of these tips have the side benefit of improving your web app's performance and user experience

72 Compressing Content
  1. Gzip all output content
     All modern browsers can decompress on the fly
     Compared to Compress, Gzip has much better compression and freedom from patented algorithms
  2. Trade off compute costs for storage size
  3. Minimize image sizes
     Use Portable Network Graphics (PNGs)
     Crush your PNGs
     Strip needless metadata
     Make all PNGs palette PNGs
  [Diagram: Uncompressed Content -> Gzip, Minify JavaScript, Minify CSS, Minify Images -> Compressed Content]
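As a sketch of the "trade compute for size" point, the standard `System.IO.Compression.GZipStream` class can compress content before it is stored or sent. The sample CSV text and the helper names `Compress` and `Decompress` are illustrative assumptions.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipDemo
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // The GZipStream must be closed before reading the buffer,
            // otherwise the final block and trailer are missing.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] data)
    {
        using (var input = new GZipStream(new MemoryStream(data), CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            input.CopyTo(output);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        // Repetitive text, such as CSV or HTML, compresses well.
        var sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) sb.AppendLine("lastname,firstname,favoriteicecream");
        byte[] raw = Encoding.UTF8.GetBytes(sb.ToString());
        byte[] packed = Compress(raw);
        Console.WriteLine("{0} bytes -> {1} bytes", raw.Length, packed.Length);
        Console.WriteLine(Decompress(packed).Length == raw.Length); // prints True
    }
}
```

Already-compressed formats such as PNG or JPEG gain little from a second gzip pass, which is why the slide treats images separately.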

73 Best Practices Summary
  Doing less is the key to saving costs
  Measure everything
  Know your application profile in and out

74 Research Examples in the Cloud on another set of slides

75 Map Reduce on Azure
  Elastic MapReduce on Amazon Web Services has traditionally been the only option for Map Reduce jobs on the web
    Hadoop implementation
    Hadoop has a long history and has been improved for stability
    Originally designed for cluster systems
  Microsoft Research this week is announcing a project code-named Daytona for Map Reduce jobs on Azure
    Designed from the start to use cloud primitives
    Built-in fault tolerance
    REST-based interface for writing your own clients

76 Project Daytona - Map Reduce on Azure

77 Questions and Discussion
  Thank you for hosting me at the Summer School

78

79 BLAST (Basic Local Alignment Search Tool)
  The most important software in bioinformatics
  Identifies similarity between bio-sequences
  Computationally intensive
    Large number of pairwise alignment operations
    A BLAST run can take 700 ~ 1000 CPU hours
  Sequence databases are growing exponentially
    GenBank doubled in size in about 15 months

80 It is easy to parallelize BLAST
  Segment the input
    Segment processing (querying) is pleasingly parallel
  Segment the database (e.g., mpiBLAST)
    Needs special result reduction processing
  Large volume of data
    A normal BLAST database can be as large as 10 GB
    With 100 nodes, the peak storage bandwidth could reach 1 TB
    The output of BLAST is usually x larger than the input

81 Parallel BLAST engine on Azure
  Query-segmentation data-parallel pattern
    Split the input sequences
    Query partitions in parallel
    Merge results together when done
  Follows the general suggested application model
    Web Role + Queue + Worker
  With three special considerations
    Batch job management
    Task parallelism on an elastic Cloud
  Wei Lu, Jared Jackson, and Roger Barga, AzureBlast: A Case Study of Developing Science Applications on the Cloud, in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010

82 A simple Split/Join pattern
  [Diagram: Splitting task -> BLAST tasks (in parallel) -> Merging task]
  Leverage the multiple cores of one instance
    Argument "-a" of NCBI-BLAST
    1, 2, 4, 8 for small, medium, large, and extra large instance sizes
  Task granularity
    Large partitions: load imbalance
    Small partitions: unnecessary overheads
      NCBI-BLAST overhead
      Data transfer overhead
    Best practice: use test runs to profile, and set the partition size to mitigate the overhead
  Value of visibilityTimeout for each BLAST task
    Essentially an estimate of the task run time
    Too small: repeated computation
    Too large: an unnecessarily long wait in case of instance failure
    Best practice: estimate the value based on the number of base pairs in the partition and test runs
    Watch out for the 2-hour maximum limitation
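The splitting step and the visibility-timeout estimate can be sketched as follows. This is not the AzureBLAST implementation: the method names, the seconds-per-base rate, and the safety factor are hypothetical, and the rate would have to come from the test runs the slide recommends.

```csharp
using System;
using System.Collections.Generic;

public static class QuerySegmentation
{
    // Splits the input sequences into partitions of at most partitionSize each.
    public static List<List<string>> Split(IList<string> sequences, int partitionSize)
    {
        if (partitionSize <= 0) throw new ArgumentOutOfRangeException("partitionSize");
        var partitions = new List<List<string>>();
        for (int i = 0; i < sequences.Count; i += partitionSize)
        {
            var partition = new List<string>();
            for (int j = i; j < Math.Min(i + partitionSize, sequences.Count); j++)
                partition.Add(sequences[j]);
            partitions.Add(partition);
        }
        return partitions;
    }

    // Rough visibility timeout: measured seconds per base (from test runs)
    // times the partition's total length, with a safety margin,
    // capped at the 2-hour maximum.
    public static TimeSpan EstimateVisibilityTimeout(List<string> partition,
                                                     double secondsPerBase, double safetyFactor)
    {
        long bases = 0;
        foreach (string s in partition) bases += s.Length;
        double seconds = bases * secondsPerBase * safetyFactor;
        return TimeSpan.FromSeconds(Math.Min(seconds, 2 * 60 * 60));
    }

    public static void Main()
    {
        var sequences = new List<string>();
        for (int i = 0; i < 250; i++) sequences.Add("ACGTACGTAC");
        List<List<string>> partitions = Split(sequences, 100);
        Console.WriteLine(partitions.Count); // prints 3 (100 + 100 + 50 sequences)
        Console.WriteLine(EstimateVisibilityTimeout(partitions[0], 0.05, 2.0));
    }
}
```

Each partition would become one queue message for a BLAST task, with the estimated timeout passed when the worker dequeues it.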

83 Task size vs. Performance
    Benefit of the warm cache effect
    100 sequences per partition is the best choice
  Instance size vs. Performance
    Super-linear speedup with larger worker instances
    Primarily due to the memory capacity
  Task Size/Instance Size vs. Cost
    Extra-large instances generated the best and most economical throughput
    Fully utilize the resource

84 [Architecture diagram]
  Web Role: Web Portal, Web Service, Job registration
  Job Management Role: Job Scheduler, Scaling Engine
  Global dispatch queue feeding the Worker instances (Splitting task, BLAST tasks, Merging task)
  Database updating Role
  Azure Table: Job Registry
  Azure Blob: NCBI databases, BLAST databases, temporary data, etc.

85 ASP.NET program hosted by a web role instance
    Submit jobs
    Track a job's status and logs
  Authentication/authorization based on Live ID
  The accepted job is stored in the job registry table
    Fault tolerance: avoid in-memory state
  [Diagram: Job Portal (Web Portal, Web Service), Job registration, Job Registry, Job Scheduler, Scaling Engine]

86 R. palustris as a platform for H2 production
  Eric Schadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
  Blasted ~5,000 proteins (700K sequences)
    Against all NCBI non-redundant proteins: completed in 30 min
    Against ~5,000 proteins from another strain: completed in less than 30 sec
  AzureBLAST significantly saved computing time

87 Discovering Homologs
  Discover the interrelationships of known protein sequences
  All-against-all query
    The database is also the input query
    The protein database is large (4.2 GB)
      9,865,668 sequences in total to be queried
      Theoretically, 100 billion sequence comparisons!
  Performance estimation
    Based on sample runs on one extra-large Azure instance
    Would require 3,216,731 minutes (6.1 years) on one desktop
  One of the biggest BLAST jobs as far as we know
    Experiments at this scale are usually infeasible for most scientists

88 Allocated a total of ~4000 instances
    475 extra-large VMs (8 cores per VM), four datacenters: US (2), Western and North Europe
  8 deployments of AzureBLAST
    Each deployment has its own co-located storage service
  Divide 10 million sequences into multiple segments
    Each segment is submitted to one deployment as one job for execution
    Each segment consists of smaller partitions
  When the load is imbalanced, redistribute it manually

89 Total size of the output result is ~230 GB
  The total number of hits is 1,764,579,487
  Started on March 25th; the last task completed on April 8th (10 days of compute)
    But based on our estimates, real working instance time should be 6~8 days
  Look into the log data to analyze what took place

90 A normal log record should be:
    3/31/2010 6:14 RD00155D3611B0 Executing the task
    3/31/2010 6:25 RD00155D3611B0 Execution of task is done, it took 10.9mins
    3/31/2010 6:25 RD00155D3611B0 Executing the task
    3/31/2010 6:44 RD00155D3611B0 Execution of task is done, it took 19.3mins
    3/31/2010 6:44 RD00155D3611B0 Executing the task
    3/31/2010 7:02 RD00155D3611B0 Execution of task is done, it took mins
  Otherwise, something is wrong (e.g., a task failed to complete):
    3/31/2010 8:22 RD00155D3611B0 Executing the task
    3/31/2010 9:50 RD00155D3611B0 Executing the task
    3/31/2010 :12 RD00155D3611B0 Execution of task is done, it took 82 mins

91 North Europe Data Center: 34,256 tasks processed in total
  All 62 compute nodes lost tasks and then came back in groups. This is an Update Domain
    ~6 nodes in one group
    ~30 mins

92 West Europe Datacenter: 30,976 tasks completed before the job was killed
  35 nodes experienced blob writing failures at the same time
  A reasonable guess: the Fault Domain is working

93 MODISAzure: Computing Evapotranspiration (ET) in The Cloud
  "You never miss the water till the well has run dry" (Irish Proverb)

94 Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, i.e. evaporation through plant membranes.
  Penman-Monteith (1964):
  $$ET = \frac{\Delta R_n + \rho_a c_p \, \delta q \, g_a}{\left(\Delta + \gamma \left(1 + g_a / g_s\right)\right) \lambda_v}$$
  $ET$ = Water volume evapotranspired (m$^3$ s$^{-1}$ m$^{-2}$)
  $\Delta$ = Rate of change of saturation specific humidity with air temperature (Pa K$^{-1}$)
  $\lambda_v$ = Latent heat of vaporization (J/g)
  $R_n$ = Net radiation (W m$^{-2}$)
  $c_p$ = Specific heat capacity of air (J kg$^{-1}$ K$^{-1}$)
  $\rho_a$ = Dry air density (kg m$^{-3}$)
  $\delta q$ = Vapor pressure deficit (Pa)
  $g_a$ = Conductivity of air (inverse of $r_a$) (m s$^{-1}$)
  $g_s$ = Conductivity of plant stoma, air (inverse of $r_s$) (m s$^{-1}$)
  $\gamma$ = Psychrometric constant ($\gamma \approx 66$ Pa K$^{-1}$)
  Lots of inputs: big data reduction
  Some of the inputs are not so simple
  Estimating resistance/conductivity across a catchment can be tricky

95 Climate classification: ~1 MB (1 file)
  FLUXNET curated sensor dataset: 30 GB (960 files)
  Vegetative clumping: ~5 MB (1 file)
  NCEP/NCAR: ~100 MB (4K files)
  NASA MODIS imagery source archives: 5 TB (600K files)
  20 US years = 1 global year
  FLUXNET curated field dataset: 2 KB (1 file)

96 Data collection (map) stage
    Downloads requested input tiles from NASA ftp sites
    Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
  Reprojection (map) stage
    Converts source tile(s) to intermediate result sinusoidal tiles
    Simple nearest neighbor or spline algorithms
  Derivation reduction stage
    First stage visible to the scientist
    Computes ET in our initial use
  Analysis reduction stage
    Optional second stage visible to the scientist
    Enables production of science analysis artifacts such as maps, tables, virtual sensors
  [Pipeline diagram: Source Imagery Download Sites, Download Queue, Data Collection Stage, Reprojection Queue, Reprojection Stage, Source Metadata, Reduction #1 Queue, Derivation Reduction Stage, Reduction #2 Queue, Analysis Reduction Stage, Scientific Results Download; Scientists submit to the Request Queue through the AzureMODIS Service Web Role Portal]

97 MODISAzure Service is the Web Role front door
  Receives all user requests
  Queues request to appropriate Download, Reprojection, or Reduction Job Queue
Service Monitor is a dedicated Worker Role
  Parses all job requests into tasks, recoverable units of work
  Execution status of all jobs and tasks persisted in Tables
[Diagram labels: <PipelineStage> Request; MODISAzure Service (Web Role); <PipelineStage>Job Queue; Service Monitor (Worker Role); Persist; Parse & Persist; Dispatch; <PipelineStage>JobStatus; <PipelineStage>TaskStatus; <PipelineStage>Task Queue]
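The split of a job request into recoverable tasks can be sketched as follows. This is an in-memory illustration only: the dictionaries stand in for the JobStatus and TaskStatus tables and the deque stands in for the task queue. The field names, task ID format, and tile names are assumptions made for the example, not the service's actual schema.

```python
from collections import deque

job_status = {}       # stand-in for the <PipelineStage>JobStatus table
task_status = {}      # stand-in for the <PipelineStage>TaskStatus table
task_queue = deque()  # stand-in for the <PipelineStage>Task Queue


def parse_and_persist(job_id, tiles):
    """Split one job request into per-tile tasks and record their status."""
    job_status[job_id] = {"state": "Queued", "task_count": len(tiles)}
    task_ids = []
    for index, tile in enumerate(tiles):
        task_id = f"{job_id}-{index}"  # hypothetical ID scheme
        # Persist status before dispatch so the task can be recovered later
        task_status[task_id] = {"job_id": job_id, "tile": tile, "state": "Queued"}
        task_queue.append(task_id)
        task_ids.append(task_id)
    return task_ids


created = parse_and_persist("job1", ["h08v05", "h09v05"])
print(created, len(task_queue))
```

Persisting each task record before enqueueing it is what makes a task a recoverable unit: if a worker dies, the status table still says which tasks are outstanding.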

98 All work actually done by a Worker Role
  Dequeues tasks created by the Service Monitor
  Retries failed tasks 3 times
  Maintains all task status
  Sandboxes science or other executable
  Marshals all storage from/to Azure blob storage to/from local Azure Worker instance files
[Diagram labels: Service Monitor (Worker Role); Parse & Persist; <PipelineStage>TaskStatus; <PipelineStage>Task Queue; Dispatch; GenericWorker (Worker Role); <Input>Data Storage]
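A worker loop with bounded retries might look like the sketch below. It assumes that "retries failed tasks 3 times" means one initial attempt plus up to three retries; the slide does not say which reading is intended. The queue, the status records, and the `execute` callback are in-memory stand-ins with hypothetical names, not the actual GenericWorker code.

```python
from collections import deque

MAX_RETRIES = 3  # assumption: retries allowed after the first attempt


def run_worker(task_queue, task_status, execute):
    """Drain the queue, re-queueing failed tasks until retries run out."""
    while task_queue:
        task_id = task_queue.popleft()
        record = task_status[task_id]
        try:
            execute(record)
            record["state"] = "Succeeded"
        except Exception as error:
            retries = record.get("retries", 0)
            record["last_error"] = str(error)
            if retries < MAX_RETRIES:
                record["retries"] = retries + 1
                record["state"] = "Retrying"
                task_queue.append(task_id)  # put it back for another attempt
            else:
                record["state"] = "Failed"


attempts = {}


def demo_execute(record):
    """Hypothetical task body: one tile is flaky, one never succeeds."""
    tile = record["tile"]
    attempts[tile] = attempts.get(tile, 0) + 1
    if tile == "always-fails" or (tile == "flaky" and attempts[tile] < 3):
        raise RuntimeError(f"simulated failure on {tile}")


status = {
    "t1": {"tile": "ok"},
    "t2": {"tile": "flaky"},
    "t3": {"tile": "always-fails"},
}
queue = deque(status)  # enqueue the task IDs in insertion order
run_worker(queue, status, demo_execute)
print({task: record["state"] for task, record in status.items()})
```

Because the status record is updated on every attempt, the same table serves both the retry logic and the job monitoring described on the slide.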

99 ReprojectionJobStatus: Each entity specifies a single reprojection job request
ReprojectionTaskStatus: Each entity specifies a single reprojection task (i.e. a single tile)
ScanTimeList: Query this table to get the list of satellite scan times that cover a target tile
SwathGranuleMeta: Query this table to get geo-metadata (e.g. boundaries) for each swath tile
[Diagram labels: Reprojection Request; Job Queue; Persist; Service Monitor (Worker Role); Parse & Persist; Task Queue; Dispatch; GenericWorker (Worker Role); Points to; Reprojection Data Storage; Swath Source Data Storage]
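The two lookups described for ScanTimeList and SwathGranuleMeta can be pictured with plain Python lists and dictionaries standing in for the tables. Everything concrete here is hypothetical: the field names, the tile name, the timestamps, and the bounding-box layout are invented for illustration and do not reflect the real table schema.

```python
# Stand-in for ScanTimeList: which scan times cover which target tile
scan_time_list = [
    {"tile": "h08v05", "scan_time": "2011-07-01T18:05"},
    {"tile": "h08v05", "scan_time": "2011-07-01T19:45"},
    {"tile": "h09v05", "scan_time": "2011-07-01T18:05"},
]

# Stand-in for SwathGranuleMeta: geo-metadata keyed by scan time
swath_granule_meta = {
    "2011-07-01T18:05": {"north": 40.0, "south": 30.0, "west": -120.0, "east": -100.0},
    "2011-07-01T19:45": {"north": 42.0, "south": 31.0, "west": -125.0, "east": -105.0},
}


def scan_times_for_tile(table, tile):
    """Return the sorted scan times whose swaths cover the target tile."""
    return sorted(entity["scan_time"] for entity in table if entity["tile"] == tile)


def swath_metadata_for_tile(tile):
    """Join the two lookups: scan times first, then each swath's metadata."""
    return {
        scan_time: swath_granule_meta[scan_time]
        for scan_time in scan_times_for_tile(scan_time_list, tile)
    }


times = scan_times_for_tile(scan_time_list, "h08v05")
metadata = swath_metadata_for_tile("h08v05")
print(times)
print(metadata)
```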

100 Total: $1420
Computational costs driven by data scale and need to run reduction multiple times
Storage costs driven by data scale and 6 month project duration
Small with respect to the people costs even at graduate student rates!
Data Collection Stage: $50 upload, $450 storage; GB, 60K files, 10 MB/sec, 11 hours, <10 workers
Reprojection Stage: $420 cpu, $60 download; 400 GB, 45K files, 3500 hours, workers
Derivation Reduction Stage: $216 cpu, $1 download, $6 storage; 5-7 GB, 5.5K files, 1800 hours, workers
Analysis Reduction Stage: $216 cpu, $2 download, $9 storage; <10 GB, ~1K files, 1800 hours, workers
[Diagram: same pipeline as slide 96, annotated with the per-stage costs and data volumes listed above]

Introduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson,Nelson Araujo, Dennis Gannon, Wei Lu, and

Introduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson,Nelson Araujo, Dennis Gannon, Wei Lu, and Introduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson,Nelson Araujo, Dennis Gannon, Wei Lu, and Jaliya Ekanayake Range in size from edge facilities

More information

Enabling Science in the Cloud: A Remote Sensing Data Processing Service for Environmental Science Analysis

Enabling Science in the Cloud: A Remote Sensing Data Processing Service for Environmental Science Analysis Enabling Science in the Cloud: A Remote Sensing Data Processing Service for Environmental Science Analysis Catharine van Ingen 1, Jie Li 2, Youngryel Ryu 3, Marty Humphrey 2, Deb Agarwal 4, Keith Jackson

More information

Windows Azure = Managed for You Standalone Servers Applications Runtimes Database Operating System Virtualization Server Storage Networking Efficiency IaaS PaaS SaaS Control+Cost Developer 1) Choose image,

More information

Cloud Computing for Research Roger Barga Cloud Computing Futures, Microsoft Research

Cloud Computing for Research Roger Barga Cloud Computing Futures, Microsoft Research Cloud Computing for Research Roger Barga Cloud Computing Futures, Microsoft Research Trends: Data on an Exponential Scale Scientific data doubles every year Combination of inexpensive sensors + exponentially

More information

Windows Azure Storage Essential Cloud Storage Services http://www.azureusergroup.com

Windows Azure Storage Essential Cloud Storage Services http://www.azureusergroup.com Windows Azure Storage Essential Cloud Storage Services http://www.azureusergroup.com David Pallmann, Neudesic Windows Azure Windows Azure is the foundation of Microsoft s Cloud Platform It is an Operating

More information

Cloud Computing with Windows Azure. beat schwegler microsoft western europe beatsch@microsoft.com

Cloud Computing with Windows Azure. beat schwegler microsoft western europe beatsch@microsoft.com Cloud Computing with Windows Azure beat schwegler microsoft western europe beatsch@microsoft.com why? cheaper. risk mitigation. agility. what? elastic compute. scalable storage. network topology. how?

More information

Introduction to Azure: Microsoft s Cloud OS

Introduction to Azure: Microsoft s Cloud OS Introduction to Azure: Microsoft s Cloud OS DI Andreas Schabus Technology Advisor Microsoft Österreich GmbH aschabus@microsoft.com www.codefest.at Version 1.0 Agenda Cloud Computing Fundamentals Windows

More information

Windows Azure and private cloud

Windows Azure and private cloud Windows Azure and private cloud Joe Chou Senior Program Manager China Cloud Innovation Center Customer Advisory Team Microsoft Asia-Pacific Research and Development Group 1 Agenda Cloud Computing Fundamentals

More information

Assignment # 1 (Cloud Computing Security)

Assignment # 1 (Cloud Computing Security) Assignment # 1 (Cloud Computing Security) Group Members: Abdullah Abid Zeeshan Qaiser M. Umar Hayat Table of Contents Windows Azure Introduction... 4 Windows Azure Services... 4 1. Compute... 4 a) Virtual

More information

Cloud Computing Trends

Cloud Computing Trends UT DALLAS Erik Jonsson School of Engineering & Computer Science Cloud Computing Trends What is cloud computing? Cloud computing refers to the apps and services delivered over the internet. Software delivered

More information

THE WINDOWS AZURE PROGRAMMING MODEL

THE WINDOWS AZURE PROGRAMMING MODEL THE WINDOWS AZURE PROGRAMMING MODEL DAVID CHAPPELL OCTOBER 2010 SPONSORED BY MICROSOFT CORPORATION CONTENTS Why Create a New Programming Model?... 3 The Three Rules of the Windows Azure Programming Model...

More information

Google Cloud Platform The basics

Google Cloud Platform The basics Google Cloud Platform The basics Who I am Alfredo Morresi ROLE Developer Relations Program Manager COUNTRY Italy PASSIONS Community, Development, Snowboarding, Tiramisu' Reach me alfredomorresi@google.com

More information

WINDOWS AZURE EXECUTION MODELS

WINDOWS AZURE EXECUTION MODELS WINDOWS AZURE EXECUTION MODELS Windows Azure provides three different execution models for running applications: Virtual Machines, Web Sites, and Cloud Services. Each one provides a different set of services,

More information

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series www.cumulux.com

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series www.cumulux.com ` CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS Review Business and Technology Series www.cumulux.com Table of Contents Cloud Computing Model...2 Impact on IT Management and

More information

WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS

WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS Managing and analyzing data in the cloud is just as important as it is anywhere else. To let you do this, Windows Azure provides a range of technologies

More information

Scaling Analysis Services in the Cloud

Scaling Analysis Services in the Cloud Our Sponsors Scaling Analysis Services in the Cloud by Gerhard Brückl gerhard@gbrueckl.at blog.gbrueckl.at About me Gerhard Brückl Working with Microsoft BI since 2006 Windows Azure / Cloud since 2013

More information

Microsoft Lab Of Things - Week6 Tuesday -

Microsoft Lab Of Things - Week6 Tuesday - Microsoft Lab Of Things - Week6 Tuesday - Kookmin University 1 Objectives and what to study Azure Storage concepts Azure Storage development Blob Table Queue 2 Objectives Understand Azure Storage Services

More information

Amazon EC2 Product Details Page 1 of 5

Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Functionality Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of

More information

WINDOWS AZURE DATA MANAGEMENT

WINDOWS AZURE DATA MANAGEMENT David Chappell October 2012 WINDOWS AZURE DATA MANAGEMENT CHOOSING THE RIGHT TECHNOLOGY Sponsored by Microsoft Corporation Copyright 2012 Chappell & Associates Contents Windows Azure Data Management: A

More information

Amazon Cloud Storage Options

Amazon Cloud Storage Options Amazon Cloud Storage Options Table of Contents 1. Overview of AWS Storage Options 02 2. Why you should use the AWS Storage 02 3. How to get Data into the AWS.03 4. Types of AWS Storage Options.03 5. Object

More information

HIGH-SPEED BRIDGE TO CLOUD STORAGE

HIGH-SPEED BRIDGE TO CLOUD STORAGE HIGH-SPEED BRIDGE TO CLOUD STORAGE Addressing throughput bottlenecks with Signiant s SkyDrop 2 The heart of the Internet is a pulsing movement of data circulating among billions of devices worldwide between

More information

Developing Microsoft Azure Solutions

Developing Microsoft Azure Solutions Course 20532A: Developing Microsoft Azure Solutions Page 1 of 7 Developing Microsoft Azure Solutions Course 20532A: 4 days; Instructor-Led Introduction This course is intended for students who have experience

More information

1. Comments on reviews a. Need to avoid just summarizing web page asks you for:

1. Comments on reviews a. Need to avoid just summarizing web page asks you for: 1. Comments on reviews a. Need to avoid just summarizing web page asks you for: i. A one or two sentence summary of the paper ii. A description of the problem they were trying to solve iii. A summary of

More information

Developing Microsoft Azure Solutions 20532A; 5 days

Developing Microsoft Azure Solutions 20532A; 5 days Lincoln Land Community College Capital City Training Center 130 West Mason Springfield, IL 62702 217-782-7436 www.llcc.edu/cctc Developing Microsoft Azure Solutions 20532A; 5 days Course Description This

More information

Enterprise Architectures for Large Tiled Basemap Projects. Tommy Fauvell

Enterprise Architectures for Large Tiled Basemap Projects. Tommy Fauvell Enterprise Architectures for Large Tiled Basemap Projects Tommy Fauvell Tommy Fauvell Senior Technical Analyst Esri Professional Services Washington D.C Regional Office Project Technical Lead: - Responsible

More information

INTRODUCING WINDOWS AZURE

INTRODUCING WINDOWS AZURE INTRODUCING WINDOWS AZURE DAVID CHAPPELL OCTOBER 2010 SPONSORED BY MICROSOFT CORPORATION CONTENTS An Overview of Windows Azure... 2 Compute... 4 Storage... 5 Fabric Controller... 7 Content Delivery Network...

More information

Storing and Processing Sensor Networks Data in Public Clouds

Storing and Processing Sensor Networks Data in Public Clouds UWB CSS 600 Storing and Processing Sensor Networks Data in Public Clouds Aysun Simitci Table of Contents Introduction... 2 Cloud Databases... 2 Advantages and Disadvantages of Cloud Databases... 3 Amazon

More information

SharePoint 2013 on Windows Azure Infrastructure David Aiken & Dan Wesley Version 1.0

SharePoint 2013 on Windows Azure Infrastructure David Aiken & Dan Wesley Version 1.0 SharePoint 2013 on Windows Azure Infrastructure David Aiken & Dan Wesley Version 1.0 Overview With the Virtual Machine and Virtual Networking services of Windows Azure, it is now possible to deploy and

More information

Data Management in the Cloud

Data Management in the Cloud Data Management in the Cloud Ryan Stern stern@cs.colostate.edu : Advanced Topics in Distributed Systems Department of Computer Science Colorado State University Outline Today Microsoft Cloud SQL Server

More information

Dennis Gannon Cloud Computing Futures extreme Computing Group Microsoft Research

Dennis Gannon Cloud Computing Futures extreme Computing Group Microsoft Research Dennis Gannon Cloud Computing Futures extreme Computing Group Microsoft Research 2 Cloud Concepts Data Center Architecture The cloud flavors: IaaS, PaaS, SaaS Our world of client devices plus the cloud

More information

Service Level Agreement for Windows Azure operated by 21Vianet

Service Level Agreement for Windows Azure operated by 21Vianet Service Level Agreement for Windows Azure operated by 21Vianet Last updated: November 2015 1. Introduction This Service Level Agreement for Windows Azure (this SLA ) is made by 21Vianet in connection with,

More information

Web Application Deployment in the Cloud Using Amazon Web Services From Infancy to Maturity

Web Application Deployment in the Cloud Using Amazon Web Services From Infancy to Maturity P3 InfoTech Solutions Pvt. Ltd http://www.p3infotech.in July 2013 Created by P3 InfoTech Solutions Pvt. Ltd., http://p3infotech.in 1 Web Application Deployment in the Cloud Using Amazon Web Services From

More information

Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at

Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at distributing load b. QUESTION: What is the context? i. How

More information

ASP.NET Multi-Tier Windows Azure Application Using Storage Tables, Queues, and Blobs

ASP.NET Multi-Tier Windows Azure Application Using Storage Tables, Queues, and Blobs ASP.NET Multi-Tier Windows Azure Application Using Storage Tables, Queues, and Blobs Rick Anderson Tom Dykstra Summary: This tutorial series shows how to create a multi-tier ASP.NET MVC 4 web application

More information

Hypertable Architecture Overview

Hypertable Architecture Overview WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for

More information

AppDev OnDemand Cloud Computing Learning Library

AppDev OnDemand Cloud Computing Learning Library AppDev OnDemand Cloud Computing Learning Library A full year of access to our cloud computing courses, plus future course releases included free! The AppDev OnDemand Cloud Computing Learning Library includes

More information

Cloud Computing with Windows Azure using your Preferred Technology

Cloud Computing with Windows Azure using your Preferred Technology Cloud Computing with Windows Azure using your Preferred Technology Sumit Chawla Program Manager Architect Interoperability Technical Strategy Microsoft Corporation Agenda Windows Azure Platform - Windows

More information

Kentico CMS 6.0 Performance Test Report. Kentico CMS 6.0. Performance Test Report February 2012 ANOTHER SUBTITLE

Kentico CMS 6.0 Performance Test Report. Kentico CMS 6.0. Performance Test Report February 2012 ANOTHER SUBTITLE Kentico CMS 6. Performance Test Report Kentico CMS 6. Performance Test Report February 212 ANOTHER SUBTITLE 1 Kentico CMS 6. Performance Test Report Table of Contents Disclaimer... 3 Executive Summary...

More information

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013 Petabyte Scale Data at Facebook Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013 Agenda 1 Types of Data 2 Data Model and API for Facebook Graph Data 3 SLTP (Semi-OLTP) and Analytics

More information

Running a Workflow on a PowerCenter Grid

Running a Workflow on a PowerCenter Grid Running a Workflow on a PowerCenter Grid 2010-2014 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)

More information

Azure VM Performance Considerations Running SQL Server

Azure VM Performance Considerations Running SQL Server Azure VM Performance Considerations Running SQL Server Your company logo here Vinod Kumar M @vinodk_sql http://blogs.extremeexperts.com Session Objectives And Takeaways Session Objective(s): Learn the

More information

Workload Characterization and Analysis of Storage and Bandwidth Needs of LEAD Workspace

Workload Characterization and Analysis of Storage and Bandwidth Needs of LEAD Workspace Workload Characterization and Analysis of Storage and Bandwidth Needs of LEAD Workspace Beth Plale Indiana University plale@cs.indiana.edu LEAD TR 001, V3.0 V3.0 dated January 24, 2007 V2.0 dated August

More information

Cloud Computing at Google. Architecture

Cloud Computing at Google. Architecture Cloud Computing at Google Google File System Web Systems and Algorithms Google Chris Brooks Department of Computer Science University of San Francisco Google has developed a layered system to handle webscale

More information

AzureBlast: A Case Study of Developing Science Applications on the Cloud

AzureBlast: A Case Study of Developing Science Applications on the Cloud AzureBlast: A Case Study of Developing Science Applications on the Cloud Wei Lu Cloud Computing Futures Microsoft Research weilu@microsoft.com Jared Jackson Cloud Computing Futures Microsoft Research jaredj@microsoft.com

More information

Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming. by Dibyendu Bhattacharya

Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming. by Dibyendu Bhattacharya Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming by Dibyendu Bhattacharya Pearson : What We Do? We are building a scalable, reliable cloud-based learning platform providing services

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Cloud Computing I (intro) 15 319, spring 2010 2 nd Lecture, Jan 14 th Majd F. Sakr Lecture Motivation General overview on cloud computing What is cloud computing Services

More information

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes

More information

Feature Comparison. Windows Server 2008 R2 Hyper-V and Windows Server 2012 Hyper-V

Feature Comparison. Windows Server 2008 R2 Hyper-V and Windows Server 2012 Hyper-V Comparison and Contents Introduction... 4 More Secure Multitenancy... 5 Flexible Infrastructure... 9 Scale, Performance, and Density... 13 High Availability... 18 Processor and Memory Support... 24 Network...

More information

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle Agenda Introduction Database Architecture Direct NFS Client NFS Server

More information

Chapter 27 Aneka Cloud Application Platform and Its Integration with Windows Azure

Chapter 27 Aneka Cloud Application Platform and Its Integration with Windows Azure Chapter 27 Aneka Cloud Application Platform and Its Integration with Windows Azure Yi Wei 1, Karthik Sukumar 1, Christian Vecchiola 2, Dileban Karunamoorthy 2 and Rajkumar Buyya 1, 2 1 Manjrasoft Pty.

More information

Getting Started with Sitecore Azure

Getting Started with Sitecore Azure Sitecore Azure 3.1 Getting Started with Sitecore Azure Rev: 2015-09-09 Sitecore Azure 3.1 Getting Started with Sitecore Azure An Overview for Sitecore Administrators Table of Contents Chapter 1 Getting

More information

Windows Azure Data Services (basics) 55093A; 3 Days

Windows Azure Data Services (basics) 55093A; 3 Days Lincoln Land Community College Capital City Training Center 130 West Mason Springfield, IL 62702 217-782-7436 www.llcc.edu/cctc Windows Azure Data Services (basics) 55093A; 3 Days Course Description This

More information

Scalability of web applications. CSCI 470: Web Science Keith Vertanen

Scalability of web applications. CSCI 470: Web Science Keith Vertanen Scalability of web applications CSCI 470: Web Science Keith Vertanen Scalability questions Overview What's important in order to build scalable web sites? High availability vs. load balancing Approaches

More information

Traditional v/s CONVRGD

Traditional v/s CONVRGD Traditional v/s CONVRGD Traditional Virtualization Stack Converged Virtualization Infrastructure with HCE/HSE Data protection software applications PDU Backup Servers + Virtualization Storage Switch HA

More information

MySQL Cluster 7.0 - New Features. Johan Andersson MySQL Cluster Consulting johan.andersson@sun.com

MySQL Cluster 7.0 - New Features. Johan Andersson MySQL Cluster Consulting johan.andersson@sun.com MySQL Cluster 7.0 - New Features Johan Andersson MySQL Cluster Consulting johan.andersson@sun.com Mat Keep MySQL Cluster Product Management matthew.keep@sun.com Copyright 2009 MySQL Sun Microsystems. The

More information

Chapter 4 Cloud Computing Applications and Paradigms. Cloud Computing: Theory and Practice. 1

Chapter 4 Cloud Computing Applications and Paradigms. Cloud Computing: Theory and Practice. 1 Chapter 4 Cloud Computing Applications and Paradigms Chapter 4 1 Contents Challenges for cloud computing. Existing cloud applications and new opportunities. Architectural styles for cloud applications.

More information

Benchmarking Hadoop & HBase on Violin

Benchmarking Hadoop & HBase on Violin Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages

More information

Bosch Video Management System High Availability with Hyper-V

Bosch Video Management System High Availability with Hyper-V Bosch Video Management System High Availability with Hyper-V en Technical Service Note Bosch Video Management System Table of contents en 3 Table of contents 1 Introduction 4 1.1 General Requirements

More information

In-Memory Databases MemSQL

In-Memory Databases MemSQL IT4BI - Université Libre de Bruxelles In-Memory Databases MemSQL Gabby Nikolova Thao Ha Contents I. In-memory Databases...4 1. Concept:...4 2. Indexing:...4 a. b. c. d. AVL Tree:...4 B-Tree and B+ Tree:...5

More information

Data Warehouse in the Cloud Marketing or Reality? Alexei Khalyako Sr. Program Manager Windows Azure Customer Advisory Team

Data Warehouse in the Cloud Marketing or Reality? Alexei Khalyako Sr. Program Manager Windows Azure Customer Advisory Team Data Warehouse in the Cloud Marketing or Reality? Alexei Khalyako Sr. Program Manager Windows Azure Customer Advisory Team Data Warehouse we used to know High-End workload High-End hardware Special know-how

More information

Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.

Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful. Architectures Cluster Computing Job Parallelism Request Parallelism 2 2010 VMware Inc. All rights reserved Replication Stateless vs. Stateful! Fault tolerance High availability despite failures If one

More information

Enterprise Storage Solution for Hyper-V Private Cloud and VDI Deployments using Sanbolic s Melio Cloud Software Suite April 2011

Enterprise Storage Solution for Hyper-V Private Cloud and VDI Deployments using Sanbolic s Melio Cloud Software Suite April 2011 Enterprise Storage Solution for Hyper-V Private Cloud and VDI Deployments using Sanbolic s Melio Cloud Software Suite April 2011 Executive Summary Large enterprise Hyper-V deployments with a large number

More information

Experience with Server Self Service Center (S3C)

Experience with Server Self Service Center (S3C) Experience with Server Self Service Center (S3C) Juraj Sucik, Sebastian Bukowiec IT Department, CERN, CH-1211 Genève 23, Switzerland E-mail: juraj.sucik@cern.ch, sebastian.bukowiec@cern.ch Abstract. CERN

More information

Novel Systems. Extensible Networks

Novel Systems. Extensible Networks Novel Systems Active Networks Denali Extensible Networks Observations Creating/disseminating standards hard Prototyping/research Incremental deployment Computation may be cheap compared to communication

More information

System Protection for Hyper-V Whitepaper

System Protection for Hyper-V Whitepaper Whitepaper Contents 1. Introduction... 2 Documentation... 2 Licensing... 2 Hyper-V requirements... 2 Definitions... 3 Considerations... 3 2. About the BackupAssist Hyper-V solution... 4 Advantages... 4

More information

Hyper-V Protection. User guide

Hyper-V Protection. User guide Hyper-V Protection User guide Contents 1. Hyper-V overview... 2 Documentation... 2 Licensing... 2 Hyper-V requirements... 2 2. Hyper-V protection features... 3 Windows 2012 R1/R2 Hyper-V support... 3 Custom

More information

CAT: Azure SQL DB Premium Deep Dive and Mythbuster

CAT: Azure SQL DB Premium Deep Dive and Mythbuster CAT: Azure SQL DB Premium Deep Dive and Mythbuster Ewan Fairweather Senior Program Manager Azure Customer Advisory Team Tobias Ternstrom Principal Program Manager Data Platform Group Cloud & Enterprise

More information

Achta's IBAN Validation API Service Overview (achta.com)

Achta's IBAN Validation API Service Overview (achta.com) Tel: 00 353 (0) 14773295 e: info@achta.com Achta's IBAN Validation API Service Overview (achta.com) Summary At Achta we have built a secure, scalable and cloud based API for SEPA. One of our core offerings

More information

Cloud computing - Architecting in the cloud

Cloud computing - Architecting in the cloud Cloud computing - Architecting in the cloud anna.ruokonen@tut.fi 1 Outline Cloud computing What is? Levels of cloud computing: IaaS, PaaS, SaaS Moving to the cloud? Architecting in the cloud Best practices

More information

Application Migration Best Practices. Gregory Shepard Senior Consultant InCycle Software

Application Migration Best Practices. Gregory Shepard Senior Consultant InCycle Software Application Migration Best Practices Gregory Shepard Senior Consultant InCycle Software We Help Organizations Get to the Next Level ALM MVPs and ALM consultants in six locations Application Migration Best

More information

System Center 2012 Suite SYSTEM CENTER 2012 SUITE. BSD BİLGİSAYAR Adana

System Center 2012 Suite SYSTEM CENTER 2012 SUITE. BSD BİLGİSAYAR Adana 2013 System Center 2012 Suite SYSTEM CENTER 2012 SUITE BSD BİLGİSAYAR Adana Configure and manage apps, services, computers, and VMs... 1 Operations Manager... 3 Configuration Manager... 4 Endpoint Protection...

More information

A Comparison of Clouds: Amazon Web Services, Windows Azure, Google Cloud Platform, VMWare and Others (Fall 2012)

A Comparison of Clouds: Amazon Web Services, Windows Azure, Google Cloud Platform, VMWare and Others (Fall 2012) 1. Computation Amazon Web Services Amazon Elastic Compute Cloud (Amazon EC2) provides basic computation service in AWS. It presents a virtual computing environment and enables resizable compute capacity.

More information

Hadoop & Spark Using Amazon EMR

Hadoop & Spark Using Amazon EMR Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?

More information

CitusDB Architecture for Real-Time Big Data

CitusDB Architecture for Real-Time Big Data CitusDB Architecture for Real-Time Big Data CitusDB Highlights Empowers real-time Big Data using PostgreSQL Scales out PostgreSQL to support up to hundreds of terabytes of data Fast parallel processing

More information

Performance Test Report KENTICO CMS 5.5. Prepared by Kentico Software in July 2010

Performance Test Report KENTICO CMS 5.5. Prepared by Kentico Software in July 2010 KENTICO CMS 5.5 Prepared by Kentico Software in July 21 1 Table of Contents Disclaimer... 3 Executive Summary... 4 Basic Performance and the Impact of Caching... 4 Database Server Performance... 6 Web

More information

CloudCenter Full Lifecycle Management. An application-defined approach to deploying and managing applications in any datacenter or cloud environment

CloudCenter Full Lifecycle Management. An application-defined approach to deploying and managing applications in any datacenter or cloud environment CloudCenter Full Lifecycle Management An application-defined approach to deploying and managing applications in any datacenter or cloud environment CloudCenter Full Lifecycle Management Page 2 Table of

More information

Alfresco Enterprise on AWS: Reference Architecture

Alfresco Enterprise on AWS: Reference Architecture Alfresco Enterprise on AWS: Reference Architecture October 2013 (Please consult http://aws.amazon.com/whitepapers/ for the latest version of this paper) Page 1 of 13 Abstract Amazon Web Services (AWS)

More information

day 1 2 Windows Azure Platform Overview... 2 Windows Azure Compute... 3 Windows Azure Storage... 3 day 2 5

day 1 2 Windows Azure Platform Overview... 2 Windows Azure Compute... 3 Windows Azure Storage... 3 day 2 5 Developers Workshop presented by MVP & v-tsp Damir Dobrić Chief Architect and Managing Developer daenet GmbH, Frankfurt / Main day 1 2 Windows Azure Platform Overview... 2 Windows Azure Compute... 3 Windows

More information

HDFS Users Guide. Table of contents

HDFS Users Guide. Table of contents Table of contents 1 Purpose...2 2 Overview...2 3 Prerequisites...3 4 Web Interface...3 5 Shell Commands... 3 5.1 DFSAdmin Command...4 6 Secondary NameNode...4 7 Checkpoint Node...5 8 Backup Node...6 9

More information

SiteCelerate white paper

SiteCelerate white paper SiteCelerate white paper Arahe Solutions SITECELERATE OVERVIEW As enterprises increases their investment in Web applications, Portal and websites and as usage of these applications increase, performance

More information

ProTrack: A Simple Provenance-tracking Filesystem

ProTrack: A Simple Provenance-tracking Filesystem ProTrack: A Simple Provenance-tracking Filesystem Somak Das Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology das@mit.edu Abstract Provenance describes a file

More information
