E-Guide MANAGING AND MONITORING HYBRID CLOUD RESOURCE POOLS: 3 STEPS TO ENSURE OPTIMUM APPLICATION PERFORMANCE

Working with individual applications in a hybrid cloud can be complex, but Quality of Experience can be ensured. This expert E-Guide walks through a three-step process for making sure that worker productivity isn't impacted by performance issues.

WORKING WITH INDIVIDUAL APPLICATIONS FOR HYBRID CLOUD MANAGEMENT

Tom Nolle

As enterprises get their arms around working with hybrid cloud architectures, the issue of managing individual cloud applications in a hybrid resource pool quickly comes to the forefront. Clouds decouple application components from specific server and storage resources to create a "pool" that can be employed to maximize utilization and reduce costs. A company can contribute its own servers to the pool to create a private cloud, and it can also use one or more public cloud providers to host non-mission-critical applications or to act as an overflow or failover resource in case of problems with its own data centers.

Managing and monitoring hybrid cloud resource pools to make sure that worker productivity isn't impacted by performance issues is a three-step process:

1. Set a measurable Quality of Experience (QoE) goal.
2. Organize your monitoring/management resources to fully isolate problems.
3. Target remedial action at the real problem source, even in a virtual/cloud world.

The most important step in hybrid cloud management and monitoring is measuring QoE response time at the user's point of connection. This measurement can be obtained from the local application component, the client device or the local network connection. What is important is the ability to measure total application response time and packet-loss rate. These pieces of information will be used in conjunction with data from other monitoring sources to fix problems.

Where the user accesses applications on smart devices, tools on the client device will probably be able to provide response-time data. The IT shop has access to the data in the device because, at best, the device has mobile device management (MDM) features it can draw on or, at worst, the application on the device can include the response-time data in the message flow it generates to the application. If that doesn't work, it's always possible to measure response time with
monitors or probes. No matter how you get the data, the key point is that if you know the acceptable response times and the point at which operations become unacceptable, you have a baseline for resource management and monitoring.

TAKING INVENTORY OF AVAILABLE MANAGEMENT DATA

The second step in QoE management/monitoring is to take inventory of the data available from network and cloud providers. As before, the objective is to see what delay and packet-loss information is available and what resources or network connections the data represents. Expect considerable variation here among providers, even in terms of how the data is reported and interpreted, so be prepared to do some work reducing all the information to common metrics. There is no general solution because every network and cloud provider will have somewhat different information and formats, so there's no option but to do a bit of customization to create the needed data elements.

On the network side, it's good to assess whether standard network tools like ping and traceroute will work. Both of these tools provide basic reachability and response-time data, and traceroute adds hop data, on the path between two points. Looking into the network from the user connection point to the application, they can help find network delays or unusual packet routings that can
indicate an internal network problem. In order for these tools to work, however, they have to be supported both at the application side of the connection and in the network connection itself. Things like Network Address Translation (NAT) and load balancing can affect the value of ping/traceroute results, so testing this during a project pilot is important.

AVAILABLE MANAGEMENT DATA FROM CLOUD PROVIDERS

On the cloud side, cloud providers' data varies depending on the type of "X as a Service" offered. Many Platform as a Service (PaaS) and Software as a Service (SaaS) providers will be able to offer some data on the cloud-to-network interface and on internal application resources because they are using cloud operating system and middleware components that often have management interfaces. With Infrastructure as a Service (IaaS), the application's software platform is provided as part of the user's machine image, which means tools must be built into the machine image to be available. Application monitoring and management tools, and even some network tools, can often be incorporated into application middleware and deployed on IaaS services to gain better visibility and control. But it's important to check with the cloud provider to ensure that the tools will work on virtual
resources. In many cases, simple applications that echo packets to measure response time can also be added if ping/traceroute isn't satisfactory or supported.

Unfortunately, there are no real standards for cloud management, even to the extent of defining what information is available. IT has to pick tools based on its familiarity and needs, which generally means any tool that can be integrated with the application image and uses available APIs (on PaaS) will work.

When all of the data is assembled and converted into a common format, fault isolation and remediation normally begin when user response times rise more than a predetermined amount. The first step is to determine whether there is unusual packet loss, since lost packets not only have to be retransmitted, they often reset the flow-control protocols and thus may reduce connection performance. Packet loss is most easily detected by looking for retransmissions or flow-control changes, either in the application's network middleware or in the client device. If losses can be eliminated as a cause, the next step is to look for network delay, followed by processing delay.

Packet loss is typically a result of congestion en route, so fixing packet loss will often involve either rerouting packets or increasing network performance. In either case, it will be necessary to work with the network provider(s) to resolve the problem. Packet delays in the network are usually associated with an excess
of hops between routers along the path, a sign of an inefficient route. If packet routing changes because of a network problem, it will likely be restored to normal in time, but persistent problems with excessive route hops may indicate the provider isn't able to provide an efficient connection to cloud resources. With VPN services, it's often possible to reroute VPN connections to reduce hops, but with Internet services, the only option may be to change ISPs.

When neither packet loss nor packet routing delay is at fault, the only remaining variable is processing time, which can be affected by the load on the server used to run the application, the storage used and the application design. Issues with cloud application performance can be traced to colliding resource requirements from other cloud users, inadequate resources allocated to the application in the cloud contract, failure of the provider to meet the service-level agreement (SLA), or simply an excess of demand on the application. It may be possible to tune cloud resource allocation through the cloud interface or to launch multiple instances of the application to improve response time. Launching multiple instances will mean adding a form of load balancing to the application, something best handled by the cloud provider or cloud-hosted software.

Professionals in the IT and network operations areas understand problem isolation and resolution processes where resources are dedicated to
applications. With some care, those same principles can support hybrid resource pools and ensure application QoE in the cloud.

TOM NOLLE is a strategic egghead -- someone who first wants to know the truth, no matter what it is, and then wants to explain it in a way that reaches everyone who cares to know it. He's an analyst in telecommunications, media and technology, and a former software architect who now works to blend technology detail and business reality.
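
As a closing illustration, the isolation order described in the article (check for packet loss first, then network delay, then processing delay, and only once response time has risen more than a predetermined amount) can be sketched in Python. This is a minimal sketch, not a method the article prescribes: the `Sample` and `Baseline` types, their field names and every threshold value are hypothetical, and it assumes the data from the client, network and cloud providers has already been reduced to these common metrics.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One normalized measurement for an application (hypothetical fields)."""
    response_ms: float      # total response time at the user's point of connection
    retransmit_rate: float  # fraction of packets retransmitted, 0.0 to 1.0
    network_rtt_ms: float   # round-trip delay measured on the network path
    hop_count: int          # router hops reported by a traceroute-style probe


@dataclass
class Baseline:
    """Acceptable values fixed when the QoE goal was set (illustrative only)."""
    response_ms: float
    max_rise_ms: float          # the "predetermined amount" that triggers isolation
    max_retransmit_rate: float
    max_network_rtt_ms: float
    max_hop_count: int


def isolate(sample: Sample, base: Baseline) -> str:
    """Return the first suspected problem source, checked in the article's order."""
    # Isolation only starts once response time rises past the agreed margin.
    if sample.response_ms - base.response_ms <= base.max_rise_ms:
        return "ok"
    # Step 1: unusual packet loss, seen here as retransmissions.
    if sample.retransmit_rate > base.max_retransmit_rate:
        return "packet-loss"
    # Step 2: network delay, including an excess of hops on the route.
    if (sample.network_rtt_ms > base.max_network_rtt_ms
            or sample.hop_count > base.max_hop_count):
        return "network-delay"
    # Step 3: nothing else explains the rise, so suspect processing time.
    return "processing-delay"


# Assumed example values; real baselines come from your own QoE goal.
base = Baseline(response_ms=400.0, max_rise_ms=200.0,
                max_retransmit_rate=0.01, max_network_rtt_ms=80.0,
                max_hop_count=12)
print(isolate(Sample(900.0, 0.002, 45.0, 10), base))
```

With these assumed numbers the sample is slow but shows neither loss nor network delay, so the sketch points at processing time, which is where the article suggests looking at server load, storage and resource allocation. A "packet-loss" or "network-delay" result would instead mean working with the network provider.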

FREE RESOURCES FOR TECHNOLOGY PROFESSIONALS

TechTarget publishes targeted technology media that address your need for information and resources for researching products, developing strategy and making cost-effective purchase decisions. Our network of technology-specific Web sites gives you access to industry experts, independent content and analysis and the Web's largest library of vendor-provided white papers, webcasts, podcasts, videos, virtual trade shows, research reports and more, drawing on the rich R&D resources of technology providers to address market trends, challenges and solutions. Our live events and virtual seminars give you access to vendor-neutral, expert commentary and advice on the issues and challenges you face daily. Our social community IT Knowledge Exchange allows you to share real-world information in real time with peers and experts.

WHAT MAKES TECHTARGET UNIQUE?

TechTarget is squarely focused on the enterprise IT space. Our team of editors and network of industry experts provide the richest, most relevant content to IT professionals and management. We leverage the immediacy of the Web, the networking and face-to-face opportunities of events and virtual events, and the ability to interact with peers, all to create compelling and actionable information for enterprise IT professionals across all industries and markets.