Outline PHD Dissertation Proposal Defense Wes J. Lloyd Colorado State University, Fort Collins, Colorado USA Research Problem Challenges Approaches & Gaps Research Goals Research Questions & Experiments Research Contributions Preliminary Results 2 Traditional Application Deployment Single Server Data Business Logging Spatial DB rdbms DODB / NOSQL Services Services Services Tracking DB Object Store App Server Apache Tomcat Infrastructure as a Service (IaaS) Cloud Computing Server partitioning of multi core servers Hardware virtualization Service isolation Resource elasticity Research Problem 3 Research Problem 4 1
Problem Statement Autonomicdeployment of multi tier applications to IaaS clouds Component composition Collocation and interference of components Scaling infrastructure to meet demand Research Problem 5 Research Problem 6 Application Component Composition Application Components App Server rdbms r/o File Server Log Server Load Balancer rdbms write Dist. cache Application Stack Component Deployment PERFORMANCE Virtual Machine () Images App Server rdbms write File Server Log Server Image 1 Image 2 rdbms r/o Load Balancer... Image n 7 Problems & Challenges 8 2
Bell s Number Provisioning Variation n = number of application components Model Database File Server Log Server Application Stack 1 : 1..n components Component Deployment # of Configurations config 1 M F D L... deployments config 2 M D F L 5 6 7 52 203 877 config n 8 4,140 M L D F 9 21,147 k= # of possible configs n k 4 15 n... Request(s) to launch s Ambiguous Mapping PERFORMANCE Physical Host s Share PM CPU / Disk / Network Physical Host Physical Host Physical Host Physical Host Physical Host s Reserve PM Memory Blocks Problems & Challenges 9 Problems & Challenges 10 Infrastructure Management Scale Services Tune Application Parameters Tune Virtualization Parameters Application Servers Service Requests Load Balancer Load Balancer distributed cache rdbms nosql data stores Problems & Challenges 11 12 3
Virtual Machine () Placement as Bin Packing Problem Related Work Bins= physical machines (PMs) Items= virtual machines (s) Dimensions CPU time RAM, hard disk size, # cores Disk read/write throughput Network read/write throughput PM capacities vary dynamically resource utilization varies NP-Hard Multivariateperformance models Regression models Machine learning Feedback loop control Hybrid approaches Formal approaches Integer linear programming Case based reasoning Approaches & Gaps 13 Approaches & Gaps 14 Gaps in Related Work Existing approaches do not consider imagecomposition Complementary component placements Interference among components Minimization of resources (# s) Load balancing of physical resources Performance models ignore Disk I/O Network I/O and component location Why Gaps Exist Public clouds Research is cost prohibitive Users concerned with performance not in control Private clouds: systems still evolving Performance models (large problem space) Virtualization misunderstood or overlooked Approaches & Gaps 15 Approaches & Gaps 16 4
Research Goals RG1: Support component composition RG2: Support virtual infrastructure management Determine and execute placement Scale infrastructure for application demand 17 Research Goals 18 Performance Objectives Primary: Maximize applicationthroughputthroughput Secondary: Minimize resource cost (# of s) Minimize modeling time Support high responsiveness to change in application demand application demand Research Goals 19 20 5
Methodology Evaluation CSIP: USDA NRCS platform for model services Models as multi tier application surrogates RUSLE2 Soil erosion model WEPS Wind Erosion Prediction System Hydrology models: SWAT, AgES Other models: STIR, SCI Eucalyptus IaaS cloud(s) Amazon EC2 compatible XEN & K hypervisors Component Composition RQ1 RQ2 Infrastructure Management RQ3 RQ4 RQ5 Research Questions & Methodology 21 Research Questions & Methodology 22 RQ1: Which independent variables best help model application performance (throughput) to guide autonomic component composition? Total (all s) resource utilization CPU time, disk I/O, network I/O, Individual and PM resource utilization Component and location Configuration: number of cores, RAM, hypervisor type (K, XEN...) RQ1: Which independent variables best help model application performance (throughput) to guide autonomic component composition? Methodology Total Exploratory (all s) resource performance utilization modeling CPU time, disk I/O, network I/O, Individual Investigate /PM independent resource utilization variables Investigate modeling techniques Component Multiple / linear location regression (MLR) Configuration: Artificial neural number networks of cores, (ANNs) RAM, hypervisor Others type (K, XEN...) Research Questions & Methodology 23 Research Questions & Methodology 24 6
RQ2: Can component resource classifications and behavioral rules predict performance of component compositions? Support simplification of the search space Support applications with large # of n k components 4 15 Bell s number 5 52 6 203 7 877 8 4,140 9 21,147 n... RQ2: Can component resource classifications and behavioral rules predict performance of component compositions? Methodology Investigate Support simplification autonomic component of the search composition space approach(es) Support applications with large # of n k components 4 15 Performance modeling 5 52 e.g. Bell s number: Heuristics to classify 6 203 Componentresource utilization 7 877 Component dependencies 8 4,140 9 21,147 n... Research Questions & Methodology 25 Research Questions & Methodology 26 Evaluation Metrics: Component Composition Composition i performance Average throughput of configurations Resource packing density # components/# s for compositions Derivation speed Average wall clock time to produce compositions RQ3: Does performance of component compositions change when scaled up? Single provisioned application s Multiply provisioned application s Investigate collocation of new s Intelligent vs. ad hoc placement Load balance physical resources Research Questions & Methodology 27 Research Questions & Experiments 28 7
RQ3: Do performance rankings of component compositions change when scaled up? Single Methodology provisioned application s Scale up compositions and benchmark Multiply performance provisioned application s Investigate collocation of new s Intelligent Investigate vs. ad hoc impact placement of placement for infrastructure scaling Load balance physical resources RQ4: How rapidly can s be launched in response to application demand? Determine upper bound of launch speed Devise workarounds to improve performance prelaunch and suspension Reserve RAM, other resources multiplexed Enforced caching of data on PMs Reassign duties of existing s Others? Research Questions & Methodology 29 Research Questions & Methodology 30 RQ4: How rapidly can s be launched in response to application demand? Determine upper bound of launch speed Devise Methodology workarounds to improve performance Benchmark prelaunch and launch suspension performance and investigate Reserve RAM, other potential resources improvements multiplexed Enforced caching of data on PMs Reassign duties of existing s Others? RQ5: Which independent variables best support application performance modeling for autonomic infrastructure management? Virtual infrastructure Number of s (1 to n) per application RAM, # cores location data Application specific parameters Number of worker threads Number of database connections Number of app server concurrent connections Research Questions & Methodology 31 Research Questions & Methodology 32 8
RQ5: Which independent variables best support application performance modeling for autonomic infrastructure management? Methodology When resources are scaled Explore Virtual infrastructure autonomic infrastructure management approach(es) Number of s (1 to n) per application RAM allocation Performance core allocation model based Feedback Location of control s Hybrid Application approach specific parameters Number of worker threads Number of database connections Number of app server concurrent connections Evaluation Metrics: Infrastructure Management Percentage of service requests completed Responsiveness Max supported load acceleration without dropping requests Adaptation time Time window with dropped requests Failure recovery time Research Questions & Methodology 33 Research Questions & Methodology 34 Expected Contributions (1/2) Novel, intelligent approaches for IaaS cloud Application deployment Infrastructure management Move IaaS infrastructure management beyond simple management of pools 35 Contributions 36 9
Expected Contributions (2/2) Autonomic component composition (RQ1, RQ2) Autonomic infrastructure management (RQ3, RQ4, RQ5) Improve application performance modeling For component composition (RQ1) New independent variables (RQ1) Heuristics (RQ2) For infrastructure management (RQ5) Support load balancing of physical resources Non goals Support for stochastic applications Only applications with stable resource utilization characteristics supported External interference From non application s Hot spot detection Contributions 37 Contributions 38 Component Compositions Rusle 2 Model Fastest Compositions SC2 SC11 SC 6 M D L M D F L M F D L F Service Isolation Overhead 39 XEN K Preliminary Results 40 10
Component Compositions SC2 M D F L Rusle 2 Model Fastest Compositions SC 6 M D F L SC11 Determining fastest compositions Not intuitive Testing / Service prediction Isolation required Overhead Service Isolation Was not fastest Adds overhead M F D L Results in maximum hosting costs (# of s) Component Composition Resource Utilization Diversity XEN K Preliminary Results 41 Preliminary Results 42 Component Composition Resource Utilization Diversity Resource utilization varies Component/ placements memory size allocations Hypervisor type K/XEN Testing required to identify resource utilization Intuition is insufficient Preliminary Results 43 Preliminary Results 44 11
Preliminary Results 45 Preliminary Results 46 Scaling Infrastructure requires tuning application parameters in response to available resources. Preliminary Results 47 48 12