Performance Management for Cloud-based Applications
STC 2012
Agenda
- Context
- Problem Statement
- Cloud Architecture
- Key Performance Challenges in Cloud
- Challenges & Recommendations
Context

Cloud Computing has gained significance mainly because it reduces CapEx and OpEx. Characteristics such as Elasticity, On-demand resource provisioning and Pay-per-Use drive organizations to migrate some of their applications, data and infrastructure to Cloud architectures. However, organizations remain wary of potential challenges such as application performance management in Cloud:
- How is performance management different for applications in Cloud compared to that in current architectures?
- What are the typical performance management challenges for applications in different Cloud Service Models (IaaS, PaaS and SaaS) and Deployment Models (i.e., Public and Private Clouds)?
- What are the ways and means to overcome performance management challenges in Cloud?

The objective of this paper is to highlight the performance management challenges for Cloud-based applications and to suggest recommendations and best practices to address them.
Generic Cloud Architecture

The following diagram represents a typical Cloud architecture and its components.

[Diagram: stack of layers: Virtual Machines (VMs); Virtualization Layer; Host Operating System; Physical Servers (CPU, Memory, Storage, Network)]

The performance of any given IT system (Client/Server, Multi-tiered, Mainframe, SOA, etc.) depends on three key aspects:
- Performance of the application (includes application code, application design, software/middleware/database and external systems)
- Hardware (H/W) infrastructure
- Optimal software (S/W) configuration settings for the given H/W

Compared with traditional architectures, the additional layers in a Cloud architecture are the Virtualization Layer and the Virtual Machine layer.
Typical Performance Management Challenges in Cloud

Virtualization Layer
- The hypervisor layer adds overhead due to resource virtualization
- Timekeeping issues affect time-based performance metrics

Shared Physical Environment
- A bursty load on one application robs resources from other applications sharing the hardware infrastructure

Stateful Workloads
- n-way session replication across VMs impacts performance and scalability

Elasticity & Scalability
- Elasticity is not a substitute for application scalability
- The application should be scalable first
Performance Management Challenges & Recommendations

Category: Virtualization / Hypervisor Layer

Challenge: Time measurement is difficult in a virtualized environment because the clock of a VM is not synchronized with other VMs or even with the physical host, as VMs get scheduled and de-scheduled based on workload demand.
Recommendation / Best Practice:
- This is currently being addressed by different hypervisor vendors (such as VMware, Microsoft and IBM).
- Architects and developers should be aware of this when designing routines that capture latency at the application code level.

Challenge: Virtualizing a physical NIC into multiple virtual NICs results in more concurrent network traffic, which reduces the bandwidth available to each application.
Recommendation / Best Practice:
- A few VMs should be assigned dedicated physical NICs, depending on the criticality of the workload and its performance SLA.
- The number of physical NICs on hardware shared by VMs should be sized appropriately.
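One way developers might act on the timekeeping caveat when capturing latency in application code is to time intervals with a monotonic clock rather than by subtracting wall-clock timestamps. The sketch below is illustrative only: `measure_latency` is a hypothetical helper name, not something from the slides. Python's `time.perf_counter()` is not stepped backwards by a clock correction, but the measured interval still includes any time the VM spent de-scheduled, so this reduces one source of error rather than removing it.

```python
import time


def measure_latency(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds).

    Uses time.perf_counter(), a monotonic high-resolution clock, so the
    elapsed value cannot go negative if the wall clock is adjusted while
    the call is in flight. Time during which the VM was de-scheduled is
    still counted in the interval.
    """
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    total, seconds = measure_latency(sum, range(1_000_000))
    print(f"sum={total} elapsed={seconds:.6f}s")
```

Comparing timestamps taken on different VMs remains unreliable under this approach; it only helps for intervals measured inside a single process.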
Category: Shared Physical Environment

Challenge: A sudden and unpredictable load on any application or workload may demand more computing resources than planned. Because of Elasticity, those resources are taken away from other workloads, impacting their performance SLAs.
Recommendation / Best Practice:
- Cloud vendors (Public/Private) who manage the underlying hardware infrastructure should have a complete understanding of the tenants, applications and workloads sharing that infrastructure, their load patterns, their respective capacity requirements (both minimum and maximum) and their performance SLAs.
- The Cloud Consumer or Cloud Integrator should insist on obtaining the VM configurations (virtual and physical resources) and the resource sharing model of the VMs (Shared/Dedicated/Shared-Cap) for early verification and validation.
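The minimum and maximum capacity requirements mentioned above lend themselves to a simple sanity check. The following is a minimal sketch, assuming each workload declares a minimum and maximum vCPU need and the host exposes a known number of physical cores; the `Workload` class, its field names and `check_host_capacity` are hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    min_vcpus: int  # capacity the workload needs at all times
    max_vcpus: int  # capacity the workload may burst to


def check_host_capacity(workloads, physical_cores):
    """Compare summed min/max vCPU demand against the host's physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    total_min = sum(w.min_vcpus for w in workloads)
    total_max = sum(w.max_vcpus for w in workloads)
    return {
        # True when every workload's minimum can be met simultaneously
        "guaranteed_fits": total_min <= physical_cores,
        # True when all workloads could burst at once without contention
        "peak_fits": total_max <= physical_cores,
        # Values above 1.0 mean simultaneous bursts would contend for cores
        "peak_overcommit_ratio": total_max / physical_cores,
    }


if __name__ == "__main__":
    tenants = [Workload("billing", 2, 8), Workload("reports", 2, 6)]
    print(check_host_capacity(tenants, physical_cores=8))
```

A ratio above 1.0 is not necessarily a problem if the workloads' bursts rarely coincide, which is why the slides also ask for load patterns and not only capacity figures.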
Category: Stateful Workloads

Challenge: For stateful workloads, session management and session replication across multiple VMs are costly due to n-way replication (store and retrieval operations).
Recommendation / Best Practice:
- Employ distributed caching solutions such as Oracle Coherence, Memcached or WebSphere eXtreme Scale, which perform intelligent replication and avoid unnecessary n-way replication, with faster session archival and retrieval.
- Ensure that the amount of data stored in sessions is kept to a minimum.

Category: Elasticity vs. Application Scalability

Challenge: Elasticity benefits are realized if and only if a given application is scalable first.
Recommendation / Best Practice:
- Standard performance engineering activities such as monitoring, performance tuning and application scalability assessment should be carried out even for applications that are to be migrated to Cloud.
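The cost difference between n-way replication and a bounded-backup scheme can be seen with simple arithmetic. The sketch below assumes a simplified model in which every session update is copied either to all other nodes (n-way) or to a fixed number of backup nodes; the function name and the `backups` parameter are hypothetical, and real caching products differ in how they place and count copies.

```python
def replica_writes_per_update(nodes, backups=None):
    """Remote copies written for one session update.

    backups=None models full n-way replication: every other node gets a copy.
    backups=k models a cache that keeps at most k backup copies per session.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if backups is None:
        return nodes - 1
    if backups < 0:
        raise ValueError("backups must not be negative")
    # Cannot place more backups than there are other nodes
    return min(backups, nodes - 1)


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, replica_writes_per_update(n), replica_writes_per_update(n, backups=1))
```

Under this model the n-way cost grows with every VM that Elasticity adds, while the bounded-backup cost stays constant, which is one reason replication overhead can cancel out the benefit of scaling out.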
Category: IaaS

Challenge: The Cloud Consumer is provided only the required computing resources and hence has control over the OS and the applications deployed on top of it, but not over the underlying hardware infrastructure.
Recommendation / Best Practice: Architects in the Cloud Consumer group need to understand and review:
- The mapping between the virtual resources and the physical resources of the VMs
- The VM profile in terms of resource sharing (Shared, Dedicated or Shared with Cap)
- The rules defined for resource management of the VMs (ideally defined by the Cloud Provider's administrator)
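Because an IaaS consumer controls the guest OS, one indirect way to observe contention on shared hardware is CPU steal time. The sketch below is an assumption-laden illustration, not a method from the slides: it assumes a Linux guest whose hypervisor reports steal time, and it takes two snapshots of the aggregate `cpu` line of `/proc/stat` as plain strings. `steal_fraction` is a hypothetical helper name.

```python
def steal_fraction(before, after):
    """Fraction of CPU time reported as 'steal' between two /proc/stat snapshots.

    Each argument is the aggregate line that starts with 'cpu ' in /proc/stat.
    Field order after the label:
    user nice system idle iowait irq softirq steal guest guest_nice
    """
    def parse(line):
        fields = line.split()
        if not fields or fields[0] != "cpu" or len(fields) < 9:
            raise ValueError("expected an aggregate 'cpu' line with a steal field")
        return [int(value) for value in fields[1:]]

    deltas = [b - a for a, b in zip(parse(before), parse(after))]
    # guest and guest_nice are left out of the total because Linux already
    # counts them inside user and nice
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return deltas[7] / total


if __name__ == "__main__":
    first = "cpu 100 0 100 700 0 0 0 100 0 0"
    second = "cpu 200 0 200 1300 0 0 0 300 0 0"
    print(f"steal fraction: {steal_fraction(first, second):.2f}")
```

A persistently high value suggests the VM is waiting for physical CPU, which is a prompt to review the resource sharing profile with the provider rather than proof of a specific cause.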
Category: PaaS

Challenge:
- The consumer has no visibility into what happens below the platform.
- The consumer does not have access to modify or tune platform-specific configuration to suit the application.
- Performance bottleneck identification needs profiling tools such as JProbe, JProfiler or .NET profilers, which are agent-based tools that require the agent to be attached to the platform's runtime.
- Platform-specific performance monitoring can be done using the pre-packaged monitoring capabilities of the platform, if and only if that capability is provided to the consumer.
- Usage of enterprise monitoring tools such as dynaTrace, HP Diagnostics, HP SiteScope and CA Introscope is restricted by the platform's support and compatibility.

Recommendation / Best Practice:
- Clearly define contractual agreements with the PaaS vendor regarding the provision of OS-level and hardware-level performance metrics and the performance of the underlying infrastructure.
- Design the application to have a performance metrics logging feature for critical routines within the application code.
- Review in advance the support provided by various platform vendors (Google, Force.com) for monitoring tools.
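The recommendation to build performance metrics logging into critical routines can be sketched with a small decorator. This is a minimal illustration using only the Python standard library; the decorator name `log_timing`, the logger name `perf` and the log line format are assumptions, and a real platform may dictate where such logs can be written.

```python
import functools
import logging
import time

logger = logging.getLogger("perf")


def log_timing(func):
    """Log the elapsed time of each call to func, even when it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            logger.info("perf routine=%s elapsed_ms=%.3f", func.__name__, elapsed_ms)
    return wrapper


@log_timing
def lookup_order(order_id):
    # Placeholder for a critical routine; the body is illustrative only
    return {"order_id": order_id, "status": "shipped"}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    lookup_order(42)
```

Because the timing is emitted through application logs, it does not depend on attaching a profiling agent to the platform's runtime.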
Category: SaaS

Challenge: Consumers have no control over application code, platform or hardware infrastructure; hence application performance management is completely dependent on the Cloud Provider.
Recommendation / Best Practice:
- Clearly define the contractual agreement and penalty clauses with the Cloud Provider for end-to-end application performance SLAs.
Thank You

The contents of this document are proprietary and confidential to Infosys Technologies Ltd. and may not be disclosed in whole or in part at any time, to any third party without the prior written consent of Infosys Ltd. © 2012 Infosys Ltd. All rights reserved. Copyright in the whole and any part of this document belongs to Infosys Technologies Ltd. This work may not be used, sold, transferred, adapted, abridged, copied or reproduced in whole or in part, in any manner or form, or in any media, without the prior written consent of Infosys Technologies Ltd.