Intelligent Tracking of Performance Storms in Complex Cloud Infrastructures


WHITE PAPER

Intelligent Tracking of Performance Storms in Complex Cloud Infrastructures

by Jagan Jagannathan, Founder and CTO, Xangati

© 2014 Xangati, Inc. All rights reserved.

As enterprises, service providers, healthcare organizations, government agencies and educational institutions adopt virtual and cloud infrastructures and migrate their data centers to them, management solutions have not kept up: they do not provide the fine-grained, relevant information that this dynamic, complex and volatile environment demands. Critical resources and applications in such environments are shared and are therefore subject to spontaneous storms that degrade the performance of applications and the experience of end users. This white paper explores the common storms affecting virtualization and cloud environments and the key Infrastructure Performance Management (IPM) requirements for intelligently identifying, capturing and managing them.

Identifying Performance Storms in Cloud Infrastructures

Performance storms are created by unintended toxic interactions among cross-silo shared resources in the converged data center. A storm entangles multiple objects (VMs, hosts, end users, applications and so on), even if they are otherwise unrelated. The entanglement often has a dramatically adverse effect on infrastructure performance. Some of the most common performance storms are:

Storage storms typically occur when applications unknowingly and excessively share a datastore, which causes storage performance to deteriorate, often dramatically and spontaneously.

Memory storms usually occur when multiple VMs try to share an insufficient amount of memory or, in other cases, when one VM hogs memory and does not leave enough for the others, even with ballooning in place.

CPU storms typically occur when there aren't enough CPU cycles or virtual CPUs to go around, so that some VMs get more processing resources and some get less.

Network storms usually occur when too many VMs attempt to communicate at the same time on a specific interface, or when a few VMs hog a specific interface with traffic, limiting the ability of other VMs to send or receive data.
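All four storm types share one shape: several VMs draw on a single shared resource, and either their combined demand exceeds what the resource can supply or one VM takes a disproportionate share. The following is a minimal sketch of that idea, not a description of any product's detection logic. The `Sample` type, the `find_contention` function, the `hog_share` cutoff and the sample numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One VM's demand on a shared resource during a single second."""
    vm: str
    demand: float  # same unit as the resource capacity (e.g. IOPS, MHz, MB, Mbit/s)


def find_contention(samples, capacity, hog_share=0.5):
    """Return (is_storm, hogs) for one second of samples on one shared resource.

    is_storm: True when total demand exceeds what the resource can supply.
    hogs: VMs whose individual demand is at least `hog_share` of the capacity
          (an assumed cutoff, not a standard value).
    """
    total = sum(s.demand for s in samples)
    hogs = [s.vm for s in samples if s.demand >= hog_share * capacity]
    return total > capacity, hogs


# Hypothetical one-second snapshot of a datastore rated at 1000 IOPS.
snapshot = [Sample("vm-a", 700.0), Sample("vm-b", 250.0), Sample("vm-c", 200.0)]
is_storm, hogs = find_contention(snapshot, capacity=1000.0)
print(is_storm, hogs)  # True ['vm-a']
```

The same function covers both causes named in the list above: a single hog shows up in `hogs`, while many modest VMs oversubscribing the resource yield a storm with an empty `hogs` list.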

Legacy Infrastructure (Pre-Cloud Era) Can't Deal With Performance Storms

With current performance management solutions, cloud performance storms can take anywhere from several hours to days or even weeks to isolate, analyze and resolve, according to a recent IT survey we conducted with ZK Research. Why does it take that long? There are two important reasons. First, existing solutions have, at best, a real-time fidelity of multiple minutes, which is fundamentally incompatible with performance storms that may start and finish within seconds. Second, current solutions focus only on silo-specific metrics that do little more than generate alerts. Unfortunately, alerts identify only the effects of storms; they leave the all-important and often daunting root cause analysis for administrators to figure out on their own.

Identifying Causes of Performance Storms

Even in the best-run cloud infrastructures, performance storms are part of the new reality, and one must be able to accurately identify, track and resolve these disruptive and spontaneous occurrences in a timely and effective manner. To get to the root cause of the problem, you need:

1. Insight into real-time (second-by-second) interactions;
2. Visibility into both consumptive and interactive object behaviors; and
3. Integration with capacity management.

#1 Real-time (Second-by-Second) Insight Into Interactions

Because the cloud is constantly in flux, it is critical to visualize interactions on a real-time (second-by-second) basis in order to capture everything that occurs within the environment. Equally important is tracking these interactions at scale. Given the scale, complexity and behavior of cloud infrastructures, this live, continuous and scalable monitoring and management is essential to accurately identify performance storms and can only be achieved through an in-memory analytics architecture.
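The gap between multi-minute fidelity and storms that last only seconds can be illustrated with a small, self-contained sketch. The latency trace, the 50 ms threshold and all names below are hypothetical and chosen only for illustration; this is not a description of how any particular product computes its metrics.

```python
from statistics import mean

# Hypothetical datastore latency trace: 300 one-second samples (5 minutes).
# Baseline is 5 ms; a 10-second storm pushes latency to 200 ms.
latency_ms = [5.0] * 300
for second in range(120, 130):
    latency_ms[second] = 200.0

THRESHOLD_MS = 50.0  # assumed alert threshold, for illustration only

# Coarse view: one averaged data point for the whole 5-minute interval.
five_minute_avg = mean(latency_ms)

# Fine view: every second that individually breaches the threshold.
storm_seconds = [t for t, v in enumerate(latency_ms) if v > THRESHOLD_MS]

print(f"5-minute average: {five_minute_avg:.1f} ms")
print(f"average breaches threshold: {five_minute_avg > THRESHOLD_MS}")
print(f"seconds breaching threshold: {len(storm_seconds)} "
      f"(from t={storm_seconds[0]} to t={storm_seconds[-1]})")
```

In this made-up trace the five-minute average is 11.5 ms, well under the threshold, so a tool that keeps only the average would record nothing unusual. The per-second view shows ten consecutive seconds at 40 times the baseline.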
An in-memory analytics architecture allows the system to track and analyze what is happening at a precise moment, second by second, rather than averaging data over a five- or ten-minute period. The analytics architecture enables visualization of the multitude of simultaneous, fine-grained interactions that are responsible for surges or spikes in performance. In effect, it provides the critical context and understanding needed to identify the trends and patterns that characterize storms. How else would you find the source of a datastore latency storm unless you know which VMs are actually using that datastore at that exact moment in time?

#2 Visibility into Both Consumptive and Interactive Object Behaviors

To see what is causing a performance storm, you need visibility not only into how objects consume cloud resources but also into how objects interact with one another within the infrastructure; the latter is much more critical for determining the cause of the problem. Consumptive, silo-specific alerts (using a combination of system-learned and best-practice thresholds) point to the effects of performance storms, such as an impacted application or VM, while interactional, cross-silo alerts give the details that enable one to accurately identify and resolve the source of the problem.

To deliver these interactional alerts and reveal the toxic (heavy resource usage) interactions that may be occurring between different objects, you must have a cross-silo analysis of the infrastructure that cuts across the network, server and storage tiers, as well as applications and end users, to provide context. Furthermore, this analysis needs to scale so that you can easily view the distant and proximate areas of impact for a given storm, as well as the source of contention and the resources affected. Only by visualizing and analyzing the cross-silo interactions can you accurately identify the trends and patterns of interactions that are causing the storm.

#3 Integration with Capacity Management

The most common culprit for performance storms is conservative provisioning or under-provisioning of the cloud, either for cost reasons or because capacity requirements are poorly understood or unknown.
Considering this, it seems logical to integrate performance and capacity management. Yet today's virtualization management solutions do not, perhaps because they are unable to connect the two; they ignore the intrinsic connection and treat capacity management as a completely separate and distinct entity. Xangati uniquely believes that infrastructure performance management must expressly inform capacity analytics; otherwise, you can't identify the impact of performance storms and their intensity on capacity utilization and saturation. This linkage leads to recommendations on how to resolve the problems that cause storms, typically by either increasing resource capacity or by targeted resource load balancing.

To operate your cloud efficiently and effectively, you need the right infrastructure performance management solution to tackle the highly disruptive and hard-to-detect performance storms that are intrinsic to your cloud. The three capabilities discussed in this paper allow you to effectively monitor and manage your infrastructure. To summarize, they are: (1) real-time, live and continuous insights, which you need to instantly recognize spontaneous and transient storms; (2) cross-silo visibility into interactional metrics, which you need to identify the root causes of storms instead of just chasing their effects (a.k.a. consumptive metric alerts); and (3) linkage between performance and capacity management, to appropriately add or reallocate infrastructure resources to mitigate or avoid future storms.

About Xangati

Xangati is the recognized leader for cloud and workload performance management solutions. Over 300 customers among enterprises, government agencies, healthcare organizations, educational systems and cloud providers use Xangati's solutions to gain unprecedented performance management of their massive, heterogeneous and consumer-scale cloud and VDI environments. Xangati's solutions, built on patented technology, proactively track the health of key IT metrics that impact the performance of applications and users, accurately diagnose the cause of any performance bottleneck and recommend remedial action when a bottleneck is discovered.
Organizations like eBay, Comcast, British Gas, Guess, Colliers International, Univita Health, DTCC, Harvard University and the US Army use the Xangati Management Dashboard suite of solutions, with its massively scalable live and continuous recording ability, to ensure their business-critical applications perform at optimal levels. Xangati is headquartered in Silicon Valley and can be found online at www.xangati.com.