Università degli studi di Roma Tor Vergata
Facoltà di Ingegneria
Amazon Web Services
Luca Silvestri
silvestri@ing.uniroma2.it
IaaS Providers
IaaS Providers Features
- E. Casalicchio, L. Silvestri, "Architectures for autonomic service management in cloud-based systems," Computers and Communications (ISCC), 2011 IEEE Symposium on, pp. 161-166, June 28 2011 - July 1 2011, Kerkyra (Corfù), Greece
Amazon Web Services
- Compute
  - Amazon Elastic Compute Cloud (EC2)
  - Amazon Elastic MapReduce
  - Auto Scaling
  - Elastic Load Balancing
- Storage
  - Simple Storage Service (S3)
  - Elastic Block Store (EBS)
- Database
  - Amazon SimpleDB
  - Amazon Relational Database Service (RDS)
  - Amazon DynamoDB
- Messaging
  - Simple Queue Service (SQS)
  - Simple Notification Service (SNS)
- Networking & Content Delivery
  - Amazon Route 53
  - Amazon Virtual Private Cloud (VPC)
  - Amazon CloudFront
- Deployment & Management
  - Amazon CloudWatch
  - AWS Elastic Beanstalk
  - AWS Identity and Access Management (IAM)
Amazon EC2
- Provides resizable compute capacity in the cloud
- Allows capacity to be increased or decreased (instances started or stopped) within minutes
- Pay-per-use on an hourly basis
- From one to thousands of server instances can be launched simultaneously
- Guarantees complete control over instances
  - root SSH access
  - GUI, command line tools, APIs
- Offers advanced services
  - Elastic Block Store
  - Elastic Load Balancer
  - CloudWatch + AutoScaling
  - Elastic IP
  - Amazon Elastic Beanstalk
EC2 Locations
- Regions
  - geographically dispersed
  - consist of one or more Availability Zones
  - Current regions: US East (Northern Virginia), US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), South America (Sao Paulo)
  - Special region: AWS GovCloud
- Availability Zones
  - distinct locations in the same Region
  - engineered to be insulated from failures in other Availability Zones
  - used to protect applications from the failure of a single location
- Load Balancing
  - allowed only between Availability Zones in the same Region
  - not supported between different Regions
EC2 Instance Types
- On-Demand Instances
  - billed per hour with no long-term commitments
- Reserved Instances
  - one-time payment to reserve an instance for 1 or 3 years
  - significant discount on the hourly usage charge
- Spot Instances
  - enable users to bid for unused EC2 capacity
  - the Spot Price fluctuates periodically depending on supply of and demand for Spot Instance capacity
EC2 Instances
- 1 EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
EC2 Prices (US East)
EC2 Prices (EU)
EC2 Additional Services Prices
- Load Balancing
EC2 Interface
- AWS Management Console
- Command Line Tools
  - Java-based command-line client
- AWS SDKs (available for Java, PHP and .NET)
- Third Party Libraries
- Query and SOAP APIs
Amazon Elastic MapReduce
- Enables processing of vast amounts of data
  - Applications: web indexing, data mining, log file analysis, data warehousing, financial analysis, scientific simulation, ...
- Utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon EC2 and S3
  - Apache Hadoop is an open source Java software framework that supports data-intensive distributed applications running on large clusters of commodity hardware
- Hadoop implementation of the MapReduce framework
  - data in a job flow is subdivided into smaller chunks so that they can be processed in parallel (map function)
  - processed data are recombined into the final solution (reduce function)
- Allows you to implement data processing applications in many languages including Java, Perl, Ruby, Python, PHP, R, or C++
Database
- Amazon RDS
  - Relational DB (MySQL or Oracle DB Engine)
  - Automatic management (software patching, backup) and monitoring
  - For the MySQL DB Engine, you can also associate one or more Read Replicas
- Amazon SimpleDB
  - highly available and flexible non-relational data store
  - automatically creates multiple geographically distributed copies of each data item you store
- Amazon DynamoDB
  - a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability
  - automatically spreads the data and traffic for the table over a sufficient number of servers to handle the request capacity specified by the customer and the amount of data stored, while maintaining consistent, fast performance
  - all data items are stored on Solid State Drives (SSDs) and are automatically replicated across multiple Availability Zones in a Region
  - integration with Elastic MapReduce
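The map and reduce steps described for Elastic MapReduce can be illustrated with a minimal, single-process sketch in plain Python. It does not use Hadoop or any AWS API; the function names `map_chunk`, `reduce_pairs` and `word_count` are hypothetical, and word counting is chosen only as a familiar example of the pattern.

```python
from collections import defaultdict


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_pairs(pairs):
    """Reduce step: recombine the mapped pairs by summing the counts per word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(chunks):
    """Run the map step on each chunk, then reduce all emitted pairs.

    On a real cluster each chunk would be mapped by a different worker;
    here the chunks are processed one after another (an assumption made
    to keep the sketch self-contained).
    """
    mapped = []
    for chunk in chunks:
        mapped.extend(map_chunk(chunk))
    return reduce_pairs(mapped)


if __name__ == "__main__":
    print(word_count(["cloud web cloud", "web services cloud"]))
```

Because each chunk is mapped independently of the others, the map step is the part that can be spread across many machines.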
Amazon Simple Queue Service
- Message queuing service that enables asynchronous message-based communication between distributed components of an application
- When a message is received, it becomes "locked" while being processed
- If the message processing fails, the lock will expire and the message will be available again
Amazon Route 53
- Highly available and scalable DNS web service
- Answers DNS queries with low latency by using a global network of DNS servers
  - queries are automatically routed to the nearest DNS server
- Designed to automatically scale to handle very large query volumes without any human intervention
- Pricing
  - Hosted Zones
    - $0.50 per hosted zone / month for the first 25 hosted zones
    - $0.10 per hosted zone / month for additional hosted zones
  - Queries
    - $0.50 per million queries: first 1 billion queries / month
    - $0.25 per million queries: over 1 billion queries / month
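As a worked example of the tiered Route 53 pricing listed above, the following sketch computes a monthly bill from the number of hosted zones and queries. The function name `route53_monthly_cost` is hypothetical, and the rates are the ones quoted on the slide, which may not match current prices.

```python
def route53_monthly_cost(hosted_zones, queries):
    """Monthly cost in USD using the slide's rates (an assumption, not current prices)."""
    # Hosted zones: $0.50 each for the first 25, $0.10 for each additional one.
    zone_cost = 0.50 * min(hosted_zones, 25) + 0.10 * max(hosted_zones - 25, 0)

    # Queries: $0.50 per million up to 1 billion, $0.25 per million beyond that.
    billion = 1_000_000_000
    first_tier = min(queries, billion)
    second_tier = max(queries - billion, 0)
    query_cost = 0.50 * first_tier / 1_000_000 + 0.25 * second_tier / 1_000_000

    return round(zone_cost + query_cost, 2)


if __name__ == "__main__":
    # 30 hosted zones and 1.2 billion queries in one month
    print(route53_monthly_cost(30, 1_200_000_000))
```

For 30 hosted zones and 1.2 billion queries, the zone charge is 25 × 0.50 + 5 × 0.10 = 13.00 and the query charge is 1000 × 0.50 + 200 × 0.25 = 550.00, for a total of 563.00 USD.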
- ... (Amazon S3 bucket, Amazon EC2 instance) or this could be an external origin server.
- Pricing
  - Regional Data Transfer ($0.020-$0.250/GB)
  - Requests ($0.010-$0.022 per 10000 HTTP, HTTPS requests)
Elastic Beanstalk
- Allows applications to be deployed and managed in the AWS cloud, leveraging AWS services such as EC2, S3, SNS, Elastic Load Balancing, and Auto Scaling
- Deployment
  - Users upload a WAR file containing a Java Web Application
  - Elastic Beanstalk handles the provisioning of a load balancer and the deployment of the WAR file to one or more EC2 instances running the Apache Tomcat application server
- Features
  - deploy new application versions to running environments
  - e-mail notifications through SNS when application health changes or application servers are added or removed
  - Auto Scaling and Load Balancing parameters fully customizable through the AWS Management Console
- Pricing: no additional charge for Elastic Beanstalk; the user pays only for the underlying AWS resources that the application consumes.
Amazon Elastic Block Store
- Offers persistent storage for EC2 instances
- Provides off-instance storage that persists independently of the life of an instance
- EBS volume sizes range from 1 GB to 1 TB
- EBS volumes can be used as instance boot partitions or attached to running instances as standard block devices
- A volume can only be attached to one instance at a time, but many volumes can be attached to a single instance
- EBS volumes can be attached only to instances in the same Availability Zone
- EBS volumes are automatically replicated within the same Availability Zone to avoid data loss
- EBS provides the ability to create point-in-time snapshots of volumes that can be stored using S3
Elastic IP & Virtual Private Cloud
- Elastic IP
  - addresses are not associated with a particular instance but with a user account
  - the user controls an Elastic IP address until it is explicitly released
  - allows instance or Availability Zone failures to be masked by quickly remapping the Elastic IP address to another instance/load balancer
- Virtual Private Cloud
  - enables enterprises to connect their existing infrastructure to a set of isolated AWS compute resources via a Virtual Private Network (VPN) connection
Amazon CloudWatch
- Provides monitoring for AWS cloud resources and applications
- CloudWatch is a metric repository
  - AWS services put metrics in the repository
  - users retrieve statistics based on those metrics
CloudWatch concepts
- Metric
  - a time-ordered set of data points
  - the PutMetricData API allows users to create custom metrics
- Statistics
  - metric data aggregations over specified periods of time
  - available statistics: Minimum, Maximum, Sum, Average, SampleCount
  - can be retrieved with the GetMetricStatistics API
- Period
  - length of time associated with a specific CloudWatch statistic
  - expressed in seconds, ranging from 60 (one minute) to 1209600 (two weeks)
- Alarm
  - watches a single metric over a specified time period
  - performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods
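To make the relationship between metrics, periods and statistics concrete, here is a small local sketch that aggregates data points into the five statistics named above. It does not call PutMetricData or GetMetricStatistics; the function name `statistics` and the `(timestamp_seconds, value)` input shape are assumptions made for illustration.

```python
def statistics(datapoints, period):
    """Aggregate (timestamp_seconds, value) pairs into per-period statistics.

    Returns a dict keyed by the start time of each period. The period bounds
    are the ones quoted on the slide: 60 to 1209600 seconds.
    """
    if not 60 <= period <= 1209600:
        raise ValueError("period must be between 60 and 1209600 seconds")

    # Group values by the period their timestamp falls into.
    buckets = {}
    for timestamp, value in datapoints:
        start = timestamp - timestamp % period
        buckets.setdefault(start, []).append(value)

    # Compute the five statistics for every non-empty period.
    result = {}
    for start in sorted(buckets):
        values = buckets[start]
        result[start] = {
            "Minimum": min(values),
            "Maximum": max(values),
            "Sum": sum(values),
            "Average": sum(values) / len(values),
            "SampleCount": len(values),
        }
    return result


if __name__ == "__main__":
    cpu = [(0, 40.0), (30, 60.0), (60, 80.0)]
    for start, stats in statistics(cpu, 60).items():
        print(start, stats)
```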
CloudWatch metrics
- EC2 metrics
  - CPUUtilization
  - DiskReadOps/DiskWriteOps
  - DiskReadBytes/DiskWriteBytes
  - NetworkIn/NetworkOut
- Elastic Load Balancing metrics
  - Latency
  - RequestCount
  - HealthyHostCount/UnHealthyHostCount
  - Count of HTTP response codes (2xx, 3xx, 4xx, 5xx) generated by the load balancer or back-end instances
CloudWatch Interface
- Command Line Tools
- Libraries (Java, PHP, Python, Ruby, Android, iOS, Windows and .NET)
- Query API
  - HTTP/HTTPS GET or POST requests
- AWS Management Console
CloudWatch Alarms & Auto Scaling
- An alarm watches a single metric over a time period and performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods
- Possible states: OK, ALARM, INSUFFICIENT_DATA
- When an alarm changes its state an action is invoked
  - notification through Amazon SNS
  - Auto Scaling policy
- Example
  - Threshold = 3
  - minimum breach = 3 periods
Auto Scaling
- Allows EC2 capacity to be scaled up or down automatically according to user-defined conditions
- Enabled by Amazon CloudWatch
  - uses CloudWatch alarms
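The alarm behaviour described above (a threshold that must be breached for a number of consecutive periods, three possible states, and an action invoked on each state change) can be sketched as a small state machine. This is a simplified local model, not the CloudWatch implementation: the names `alarm_state` and `Alarm`, the "greater than" comparison, and the use of `None` for a missing data point are all assumptions.

```python
def alarm_state(values, threshold, periods):
    """Evaluate the most recent `periods` values (one value per period)."""
    recent = values[-periods:]
    if len(recent) < periods or any(v is None for v in recent):
        return "INSUFFICIENT_DATA"
    if all(v > threshold for v in recent):
        return "ALARM"
    return "OK"


class Alarm:
    """Tracks one metric and calls `on_change(new_state)` when the state changes."""

    def __init__(self, threshold, periods, on_change):
        self.threshold = threshold
        self.periods = periods
        self.on_change = on_change  # e.g. send a notification or run a scaling policy
        self.history = []
        self.state = "INSUFFICIENT_DATA"

    def observe(self, value):
        self.history.append(value)
        new_state = alarm_state(self.history, self.threshold, self.periods)
        if new_state != self.state:
            self.state = new_state
            self.on_change(new_state)
        return self.state


if __name__ == "__main__":
    # Threshold = 3, minimum breach = 3 periods, as in the slide's example
    alarm = Alarm(threshold=3, periods=3, on_change=lambda s: print("state ->", s))
    for value in [4, 5, 2, 4, 4, 4, 1]:
        alarm.observe(value)
```

In this model a single period above the threshold is not enough to trigger the action; the value has to stay above it for all three evaluated periods.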
Auto Scaling Policies
- Auto Scaling policies define the action to take when an alarm state changes
- For every monitored event 2 policies should be defined
  - a scale-up policy
  - a scale-down policy
- A policy can be created using the PutScalingPolicy API with the following parameters:
  - AdjustmentType: possible values are ChangeInCapacity, ExactCapacity, PercentChangeInCapacity
  - Cooldown: amount of time after a scaling activity completes before any further trigger-related scaling activities can start
  - PolicyName
  - ScalingAdjustment: the number of instances by which to scale (positive or negative)
Elastic Load Balancing
- Automatically distributes incoming traffic across multiple EC2 instances
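The effect of the three AdjustmentType values and of the Cooldown parameter from the Auto Scaling Policies slide can be sketched as follows. This is a local illustration, not a call to PutScalingPolicy: the function names, the `min_size`/`max_size` bounds, and the choice to truncate percentage changes toward zero are assumptions.

```python
def apply_policy(current, adjustment_type, scaling_adjustment, min_size=1, max_size=10):
    """Return the new instance count after applying one scaling policy."""
    if adjustment_type == "ChangeInCapacity":
        desired = current + scaling_adjustment
    elif adjustment_type == "ExactCapacity":
        desired = scaling_adjustment
    elif adjustment_type == "PercentChangeInCapacity":
        # Assumption: fractional instances are truncated toward zero.
        desired = current + int(current * scaling_adjustment / 100)
    else:
        raise ValueError("unknown adjustment type: " + adjustment_type)
    # Keep the result inside the (hypothetical) group size bounds.
    return max(min_size, min(max_size, desired))


def can_scale(now, last_activity_end, cooldown):
    """True once `cooldown` seconds have passed since the last scaling activity."""
    return now - last_activity_end >= cooldown


if __name__ == "__main__":
    print(apply_policy(4, "ChangeInCapacity", 1))
    print(apply_policy(4, "ExactCapacity", 2))
    print(apply_policy(4, "PercentChangeInCapacity", 50))
    print(can_scale(now=100, last_activity_end=50, cooldown=60))
```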
ELB features
- Detects unhealthy instances within a pool and automatically reroutes traffic to healthy instances
- Can be enabled across multiple Availability Zones within a Region
  - NOT between Availability Zones in different Regions
- Uses a Least Loaded balancing policy
- Supports sticky sessions
  - load balancer-generated HTTP cookies (browser-based session lifetime)
  - application-generated HTTP cookies (application-specific session lifetimes)
- Supports HTTPS
- Enables the client to define an application health check for the instances through the following parameters
  - HealthyThreshold, Interval, Target, Timeout, UnhealthyThreshold
- Provides APIs to add/remove instances
  - RegisterInstancesWithLoadBalancer
  - DeregisterInstancesWithLoadBalancer
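The combination of health checks and least-loaded balancing listed above can be sketched with a small local model. It is not the ELB implementation: the class `InstanceHealth`, the function `pick_least_loaded`, and the rule that a status flips only after a run of consecutive contrary check results are assumptions loosely modelled on the threshold parameters listed above.

```python
class InstanceHealth:
    """Tracks the health status of one back-end instance."""

    def __init__(self, healthy_threshold, unhealthy_threshold):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = True
        self.streak = 0  # consecutive results that contradict the current status

    def record(self, check_passed):
        """Record one health-check result and return the resulting status."""
        if check_passed == self.healthy:
            self.streak = 0
            return self.healthy
        self.streak += 1
        limit = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self.streak >= limit:
            self.healthy = not self.healthy
            self.streak = 0
        return self.healthy


def pick_least_loaded(active_requests, health):
    """Choose the healthy instance with the fewest active requests."""
    candidates = [name for name in active_requests if health[name].healthy]
    if not candidates:
        raise RuntimeError("no healthy instances available")
    return min(candidates, key=lambda name: active_requests[name])


if __name__ == "__main__":
    health = {"a": InstanceHealth(2, 2), "b": InstanceHealth(2, 2)}
    health["b"].record(False)
    health["b"].record(False)  # second consecutive failure marks "b" unhealthy
    print(pick_least_loaded({"a": 5, "b": 1}, health))
```

Here instance "b" has fewer active requests, but traffic still goes to "a" because "b" has failed two checks in a row.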
Autonomic Cloud Architecture
Autonomic Cloud Implementation
Example: MediaWiki on EC2
- To test the AWS Auto Scaling capabilities we deployed MediaWiki on Amazon EC2
  - MediaWiki is a free, open source wiki package written in PHP
- We populated the DB with a dump from Wikipedia
- We replayed traffic from a real Wikipedia workload trace, suitably scaled down
Example: Testbed setup
- 1-10 Amazon EC2 m1.small instances
  - 32 bit Linux VMs with 1 EC2 Compute Unit and 1.7 GB memory
  - Each VM replicates the front-end of the MediaWiki web application
  - Apache 2.2.16 is used as application server
- 1 Amazon EC2 m1.large instance
  - 64 bit Linux VM with 4 EC2 Compute Units (2 cores) and 7.5 GB memory
  - MySQL 5.1.52 is used to implement the back-end tier
  - The system is dimensioned to guarantee that the centralized DB never becomes the system bottleneck
- 1 Amazon Elastic Load Balancer
- 1 EC2 m1.small instance as workload generator
- All components run in the same Availability Zone
  - the effects of network latency are reduced to a minimum
Example: Auto Scaling Policies
- Utilization-based, one alarm (UT-1AL)
  - add 1 instance if average CPU utilization > 62%
  - remove 1 instance if average CPU utilization < 50%
- Utilization-based, two alarms (UT-2AL)
  - add 2 instances if utilization > 70%, 1 if utilization > 62%
  - remove 1 instance if utilization < 50%, 2 if utilization < 25%
- Latency-based, one alarm (LAT-1AL)
  - add 1 instance if latency (average response time seen by the ELB) is > 0.2 seconds
  - remove 1 instance if average CPU utilization < 50%
- Latency-based, two alarms (LAT-2AL)
  - add 2 instances if latency > 0.5 sec, 1 if latency > 0.2 sec
  - remove 1 instance if utilization < 50%, 2 if utilization < 25%
EC2 problems
- ELB bugs
  - problems with start/stop instances, better to use launch/terminate
  - if an instance crashes it remains forever in unhealthy status
  - unhealthy instances are not automatically replaced
- CloudWatch problems
  - metric variation over a time interval is missing
  - request count considers only the requests processed by the load balancer (system throughput behind the ELB)
  - a metric giving the number of requests that arrived at the load balancer is missing
- General problems
  - no real-time billing
  - performance level of a single VM is quite variable
  - load balancing policy cannot be customized
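The four policies in the Auto Scaling Policies example (UT-1AL, UT-2AL, LAT-1AL, LAT-2AL) can be written as simple decision functions that return how many instances to add (positive) or remove (negative). This is only a sketch of the thresholds quoted on the slide: the function names are hypothetical, CPU utilization is taken in percent and latency in seconds, and checking the scale-up condition before the scale-down condition is an assumption for the case where both would apply.

```python
def ut_1al(cpu):
    """Utilization-based, one alarm."""
    if cpu > 62:
        return 1
    if cpu < 50:
        return -1
    return 0


def ut_2al(cpu):
    """Utilization-based, two alarms."""
    if cpu > 70:
        return 2
    if cpu > 62:
        return 1
    if cpu < 25:
        return -2
    if cpu < 50:
        return -1
    return 0


def lat_1al(latency, cpu):
    """Latency-based scale-up, utilization-based scale-down, one alarm each."""
    if latency > 0.2:
        return 1
    if cpu < 50:
        return -1
    return 0


def lat_2al(latency, cpu):
    """Latency-based scale-up, utilization-based scale-down, two alarms each."""
    if latency > 0.5:
        return 2
    if latency > 0.2:
        return 1
    if cpu < 25:
        return -2
    if cpu < 50:
        return -1
    return 0


if __name__ == "__main__":
    print(ut_1al(65), ut_2al(75), lat_1al(0.3, 40), lat_2al(0.1, 20))
```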