PLA 7 WAYS TO USE LOG DATA FOR PROACTIVE PERFORMANCE MONITORING. [ WhitePaper ]

Over the past decade, the value of log data for monitoring and diagnosing complex networks has become increasingly obvious, and many operations teams have changed their IT practices as a result. These advances, however, have been hampered by the limitations of existing log search and analysis tools. New technologies are now driving a more effective approach to using log data: instead of gathering and examining information about the past, log analytics can focus on the present and even the future. This white paper outlines seven examples of how this can be accomplished using the SevOne Performance Log Appliance (PLA).

SEVONE PLA: SUCCESSFULLY ADDRESSING TODAY'S ATTITUDES AND PRACTICES

Created for high-volume processing, storing and indexing of log data, the SevOne PLA does more than make log details accessible to complex, after-the-fact search queries. Algorithms identify patterns of log activity and create a picture of what normal behavior looks like. When log entries vary from that baseline, the PLA sends an alert. Operations staff can then use the PLA web interface to drill down to the relevant logs to see what changed and why.

SevOne PLA also interfaces with the SevOne performance monitoring platform. For the first time, operations staff can automatically correlate polled performance metrics on networks, servers, applications, storage and more with the corresponding log details on device or application actions and changes in state.

The need for a solution like SevOne PLA is clear. The attitudes and practices of network operations groups have been evolving for years. In fact, 59 percent of IT respondents in a December 2014 report by Enterprise Management Associates said they consider log analytics a strategic, not merely tactical, effort. These strategic users were twice as likely as tactical users to say that log data is the most important of all network management data sources. Large enterprise organizations, in particular, cited log data as "the first place we turn to" when dealing with infrastructure monitoring issues.

The evolution of IT attitudes and practices is reflected in improved log analysis tools, especially search tools with sophisticated proprietary query languages, and in more muscular analytics applications. Yet two major problems remain with even the best of these tools: they are after-the-fact responses to network issues, and users have to know what they're looking for in order to frame the search queries. These limitations, coupled with the sheer volume of log data and of devices that produce logs, mean it typically takes hours to sift through log data to identify the causes of infrastructure problems.

Fortunately, a combination of new technologies can overcome these limitations:

- A scalable architecture using distributed, parallel processing to handle the data volumes.
- Algorithms to analyze the normal behavior (baseline) of devices and applications.
- An alerting feature that detects variations from the baseline, or first occurrences of unique logs, and sends warnings to operations staff.
- A user interface that uses identified key performance metrics and indexing to simplify navigating log messages, without the need to learn complex query languages.
- Automatic correlation of performance metric changes with the related log data, for faster discovery of root causes.

The SevOne Performance Log Appliance (PLA) provides exactly these capabilities. It also links tightly with the SevOne infrastructure monitoring platform. This allows users to highlight performance anomalies and, with one mouse click, bring up the corresponding PLA log data in the same window.

Here are seven ways to use this powerful new approach to log analytics:

1. LEVERAGING NEW LOG FIELDS

One online shopping site wanted to track how long it took for shoppers' search queries to return results. Minimizing that time boosts the number of transactions and customer satisfaction, and improves the site's scoring by Google Analytics.

The first step was to add a duration field to the Apache custom log format. This allowed the log to track how long it took for the application server to respond to a web server query, and enabled the operations team to see, at any given moment, the duration of specific queries.

Next, they took this new log entry further with the SevOne PLA, which analyzes log data to identify patterns and creates a picture of what constitutes normal behavior (in this case, for search queries). This allowed them to categorize variations from this baseline into several groups, including response times under 5 seconds, under 30 seconds, more than 60 seconds, and so on. From here, they used the PLA to create alerts triggered by these variations.

The results were immediate. The operations team quickly tracked down a group of long-running search queries and re-coded them for faster responses. Moving forward, they plan to correlate the log data with performance metrics from server CPUs, memory and disk storage, with the goal of reconfiguring these resources to optimize search times.
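As a rough illustration of the categorization step, the sketch below parses access-log lines that carry an extra duration field and sorts each request into a response-time bucket. It assumes the duration was added with Apache's `%D` directive (time taken to serve the request, in microseconds) as the last field on the line. The sample lines, the "under 60 s" bucket, and the function names are hypothetical, and this is not a description of how the PLA works internally.

```python
from collections import Counter

# Upper bounds in seconds with labels. The 5 s, 30 s and 60 s cut-offs follow
# the text; the exact bucket layout is an illustrative assumption.
BUCKETS = [(5, "under 5 s"), (30, "under 30 s"), (60, "under 60 s")]
SLOWEST = "60 s or more"


def duration_seconds(line):
    """Return the request duration in seconds, or None if the line is malformed.

    Assumes the last whitespace-separated field is Apache's %D value
    (microseconds).
    """
    fields = line.split()
    if not fields or not fields[-1].isdigit():
        return None
    return int(fields[-1]) / 1_000_000


def bucket_for(seconds):
    """Map a duration in seconds to the label of its response-time bucket."""
    for upper, label in BUCKETS:
        if seconds < upper:
            return label
    return SLOWEST


def summarize(lines):
    """Count requests per bucket, skipping lines without a usable duration."""
    counts = Counter()
    for line in lines:
        seconds = duration_seconds(line)
        if seconds is not None:
            counts[bucket_for(seconds)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines: common log format plus a trailing %D field.
    sample = [
        '203.0.113.7 - - [12/May/2015:10:01:02 +0000] "GET /search?q=shoes HTTP/1.1" 200 5120 1200000',
        '203.0.113.8 - - [12/May/2015:10:01:05 +0000] "GET /search?q=boots HTTP/1.1" 200 4096 42000000',
        '203.0.113.9 - - [12/May/2015:10:01:09 +0000] "GET /search?q=hats HTTP/1.1" 200 2048 75000000',
    ]
    for label, count in summarize(sample).items():
        print(label, count)
```

A per-bucket count like this is the kind of figure an alert could be attached to, for example when the slowest bucket stops being empty.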

2. RECEIVING AUTOMATIC ALERTS FOR FIRST-TIME LOG EVENTS

Being able to get an automatic alert each time a never-before-seen log message code appears can be a huge advantage. Often, these messages act like a canary in the coal mine, warning of a change that could be the forerunner of a big problem. This capability is called "first value occurrence" in the SevOne PLA.

One example of the importance of first value occurrence is its ability to illuminate obscure or rare log messages in a population of hundreds or thousands of Cisco routers and switches. Each of these devices is capable of sending hundreds of unique message types. It's impractical to create alerts for each of these, because the resulting flood of log-based alerts would be more confusing than illuminating. Furthermore, because many of these messages are rare, they aren't the kind of data that is routinely searched for in log analytics tools.

Because the PLA builds an accurate picture of normal behavior, it knows when a previously unknown router message appears, and sends an alert. By identifying a first-time message code like "low memory resource," the PLA proactively tells operations teams there is an anomaly. This kind of information is actionable because it tells the team, right away, what changed and where. That change can then be correlated with other log messages and with performance metrics.

3. MONITORING SPIKES AND DROPS IN APPLICATION MESSAGE VOLUMES

Most applications and devices have a regular pattern in the number of log messages they send during a given amount of time. A firewall might generate a thousand messages per second; a router might generate a thousand per day. Spikes or drops in these patterns can reveal underlying problems. For example, one operations group configured a backup that ended up going through a firewall instead of remaining, as intended, on the LAN. The number of firewall connections soared and the firewall crashed because it wasn't configured for that many connections.

The SevOne PLA can baseline these message flows and instantly alert when behavior departs from the norm. For example, after a software upgrade, a server cluster's logs showed a jump in API error messages, in one instance from 100 messages per second to over 2,000. The PLA detected the spike and sent an alert. Operations staff called up the relevant log data in the PLA user interface and saw that certain software processes had been thrown out of alignment by a simple misconfiguration in the upgrade.
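The two alerting ideas described so far, first-time message codes and departures from a normal message rate, can be sketched in a few lines. This is only an illustration under stated assumptions, not the PLA's algorithm: the class names, the rolling window, the threshold of three standard deviations, and the sample message codes and counts are all hypothetical choices.

```python
from collections import deque
from statistics import mean, pstdev


class FirstOccurrenceTracker:
    """Flags a message code the first time it is seen."""

    def __init__(self):
        self._seen = set()

    def observe(self, code):
        """Return True if this code has never been observed before."""
        if code in self._seen:
            return False
        self._seen.add(code)
        return True


class VolumeBaseline:
    """Compares each interval's message count with a rolling baseline."""

    def __init__(self, window=60, threshold=3.0, min_history=10):
        self._counts = deque(maxlen=window)
        self._threshold = threshold      # allowed deviation, in standard deviations
        self._min_history = min_history  # intervals needed before alerting

    def observe(self, count):
        """Return 'spike', 'drop' or None for the latest interval count."""
        verdict = None
        if len(self._counts) >= self._min_history:
            avg = mean(self._counts)
            # Floor the spread so a perfectly flat history still tolerates noise.
            spread = max(pstdev(self._counts), 1.0)
            if count > avg + self._threshold * spread:
                verdict = "spike"
            elif count < avg - self._threshold * spread:
                verdict = "drop"
        # Unusual intervals are kept out of the baseline so they do not skew it.
        if verdict is None:
            self._counts.append(count)
        return verdict


if __name__ == "__main__":
    codes = FirstOccurrenceTracker()
    for code in ["LINK-3-UPDOWN", "LINK-3-UPDOWN", "SYS-2-MALLOCFAIL"]:
        if codes.observe(code):
            print("first occurrence:", code)

    baseline = VolumeBaseline()
    for count in [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 2000, 0]:
        verdict = baseline.observe(count)
        if verdict:
            print(verdict, count)
```

Excluding anomalous intervals from the baseline is a deliberate simplification: it keeps one bad minute from widening the tolerance, but it also means a permanent change in traffic level would keep alerting until the baseline is reset.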

In addition, with the SevOne PLA, it's possible to be even more fine-grained by identifying and tracking individual fields within the log messages. So within a surge of messages, users can discover, for example, that a given SNMP daemon is spiking because of too-frequent polling. This level of detail could be buried in the volume of other messages and difficult to find without the PLA alerting capability.

4. KNOWING WHEN LOG DATA DRIES UP

One of the simplest and most basic questions in log analytics is one that, until now, was very difficult to answer: when does a device suddenly stop sending log data and, therefore, stop providing current information about its state? Most log tools today, including sophisticated search products, can't answer this question because they don't provide a baseline of what is normal. IT staff can only find out by manually, continuously and laboriously checking each device.

With baselining and alerting features, the SevOne PLA treats this problem as just another message volume alert. It identifies the baseline message volume it receives over a given time from a specific device. If the volume drops by a set amount or drops to zero, the PLA sends an alert. Critical devices and servers can no longer fall silent about their state without being noticed.

5. MONITORING VOIP CALL QUALITY

Understanding a VoIP telephony environment and its call quality scores is not typically seen as a use case for log data. However, some operations groups are already exploring such applications. One organization was using a popular log search tool to monitor the call logs of their Cisco call managers. The search tool let them create a report that used various log statistics and data, such as mean opinion scores, R-values (a score designed to express the subjective quality of speech) and jitter (the variation in the delay of received packets), to understand the quality of VoIP calls at a given moment or over a span of time.

But that isolated result didn't show the trend of quality issues over time, because the tool lacked a way to baseline the log data from the call managers. The only way to do that was to manually and tediously compare separate call quality log reports for comparable time periods, from day to day or week to week.
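The period-over-period comparison described above is mechanical enough to automate in outline. The sketch below groups jitter readings by weekday and hour, then reports the slots where the current week's average exceeds the previous week's by more than a chosen ratio. The record layout (timestamp, jitter in milliseconds), the 1.5 ratio, the sample readings, and the function names are assumptions for illustration; real call manager records would need their own parsing.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def slot_averages(samples):
    """Average jitter per (weekday, hour) slot.

    `samples` is an iterable of (datetime, jitter_ms) pairs.
    """
    slots = defaultdict(list)
    for when, jitter_ms in samples:
        slots[(when.weekday(), when.hour)].append(jitter_ms)
    return {slot: mean(values) for slot, values in slots.items()}


def degraded_slots(previous, current, ratio=1.5):
    """Return the slots whose current average jitter exceeds ratio * previous."""
    before = slot_averages(previous)
    now = slot_averages(current)
    return sorted(
        slot
        for slot, value in now.items()
        if slot in before and value > ratio * before[slot]
    )


if __name__ == "__main__":
    # Hypothetical readings for two consecutive weeks.
    last_week = [
        (datetime(2015, 5, 4, 9, 15), 8.0),
        (datetime(2015, 5, 4, 9, 45), 10.0),
        (datetime(2015, 5, 5, 14, 0), 12.0),
    ]
    this_week = [
        (datetime(2015, 5, 11, 9, 20), 21.0),
        (datetime(2015, 5, 11, 9, 50), 19.0),
        (datetime(2015, 5, 12, 14, 5), 13.0),
    ]
    # Slots are (weekday, hour) pairs, with Monday as weekday 0.
    print(degraded_slots(last_week, this_week))
```

Comparing like-for-like slots (the same hour on the same weekday) avoids flagging the ordinary difference between a busy Monday morning and a quiet weekend.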

The SevOne PLA analyzes the incoming data to identify the relevant baseline behaviors automatically. Using these baselines, the PLA can alert operations teams to things like a gradual increase in jitter, or recurring jitter spikes at certain times or on certain days. Log data that previously could only be used as a forensic tool, to analyze quality data at an isolated moment, is becoming an operational tool for monitoring that data in real time. Coupling this event-based data with performance metrics, such as CPU and memory utilization and drive space availability, creates a picture of resource dependencies and guides actions to optimize these resources and preserve quality objectives.

6. MANAGING CONFIGURATION AND POLICY CHANGES

On a platform like the SevOne PLA, log data can be used to manage not only unexpected changes, but also the wealth of deliberately planned changes to infrastructure: a bug fix, an OS update, a server upgrade, a new forwarding rule or policy change, a new application or a cluster reconfiguration, for example. In this way, log data becomes a way to measure and manage these changes: to confirm that a change has achieved its performance objectives, or to identify how a change may have triggered a cascade of unexpected issues.

The SevOne PLA also enables operations staff to create a fine-grained before-and-after comparison of log events when launching a network change. They can capture when the configuration change occurs in different network devices, and even see what specific changes have occurred. For example, adding a new QoS scheme and changing out a given access list can each be separately captured, seen and analyzed.

The impact of this data is even greater when it's coupled with the SevOne performance monitoring platform, which polls devices and applications via an array of protocols to collect and baseline performance statistics such as network or CPU utilization. These metrics create a view of the overall health of the network or a service, before and after a planned change. Correlating these metrics with the associated PLA log data shows whether, and by how much, the change is improving or degrading overall performance.
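A before-and-after comparison of this kind can be approximated by counting message types on either side of the change time. The sketch below is a hedged illustration, not the PLA's method: the (timestamp, message type) tuples, the sample type names, and `compare_around_change` are hypothetical.

```python
from collections import Counter
from datetime import datetime


def compare_around_change(events, change_time):
    """Split events at change_time and summarize how message types shifted.

    `events` is an iterable of (datetime, message_type) pairs. Returns a dict
    with the types seen only after the change, the types that disappeared,
    and the count delta for types present on both sides.
    """
    events = list(events)  # allow generators, since we iterate twice
    before = Counter(kind for when, kind in events if when < change_time)
    after = Counter(kind for when, kind in events if when >= change_time)
    shared = sorted(set(before) & set(after))
    return {
        "new": sorted(set(after) - set(before)),
        "gone": sorted(set(before) - set(after)),
        "delta": {kind: after[kind] - before[kind] for kind in shared},
    }


if __name__ == "__main__":
    change = datetime(2015, 5, 12, 2, 0)  # hypothetical change window
    events = [
        (datetime(2015, 5, 12, 1, 10), "config-saved"),
        (datetime(2015, 5, 12, 1, 40), "acl-deny"),
        (datetime(2015, 5, 12, 2, 5), "acl-deny"),
        (datetime(2015, 5, 12, 2, 6), "acl-deny"),
        (datetime(2015, 5, 12, 2, 7), "qos-policy-applied"),
    ]
    print(compare_around_change(events, change))
```

In practice the two windows would be of equal length so that the deltas are comparable; the sample data here is too small for that to matter.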

7. USING LOG DATA AS A BASIS FOR CAPACITY PLANNING

By combining log data with performance metrics, users can forecast future growth in network activity and usage more accurately. Those projections can then become the basis for network changes and upgrades to handle that growth.

Log data makes it possible to capture user activity at a granular level. Using this information, the SevOne PLA can baseline behaviors for the average number of users, or for the peak number of users. Performance metrics then reveal how much of the network's resources and processes are associated with each.

This combination of capabilities makes it possible to do things like deconstruct online shopping activities once users press the checkout button. Operations teams can see how long the transaction takes and measure that against the backend CPU load and other metrics. By bringing together detailed, event-based log data with overall performance metrics, they can also see the load a single customer puts on the infrastructure, and the effect on the overall health of an application.

Operations staff can then project future demand and load based on adoption rate and the increase in user numbers. These projections are no longer based on best guesses, but on the measurable historical trend of real users who are creating the actual load on the network infrastructure. The result is greater accuracy, cost-effectiveness and confidence in capacity planning decisions.
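The projection step can be illustrated with a simple linear trend. The sketch below fits a least-squares line to monthly peak user counts, extends it forward, and converts the projected users into load using a per-user cost. Both the linear growth model and the per-user CPU figure are assumptions chosen for illustration; the white paper does not specify a forecasting method, and real adoption curves may not be linear.

```python
def fit_line(ys):
    """Least-squares slope and intercept for points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def project(ys, periods_ahead):
    """Extrapolate the fitted line periods_ahead steps past the last observation."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + periods_ahead)


if __name__ == "__main__":
    peak_users = [1000, 1100, 1200, 1300]  # hypothetical monthly peaks from log data
    cpu_per_user = 0.02                    # hypothetical % CPU per concurrent user
    users_in_6_months = project(peak_users, 6)
    print(round(users_in_6_months), "projected peak users")
    print(round(users_in_6_months * cpu_per_user, 1), "% projected CPU load")
```

The per-user cost is where the log data and the performance metrics meet: user counts come from the logs, while the CPU figure would be derived from polled metrics over the same periods.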

A NEW ERA IN LEVERAGING LOG ANALYTICS

The above examples embody a new approach that recognizes log data as a valuable real-time resource for network operations. Event-based logs have immediacy and detail about activities; they disclose state changes; and they capture the history of these records. All of these qualities are needed for proactive management of today's complex IT infrastructures.

Until now, these qualities could only be partially exploited. The SevOne PLA brings a scalable architecture to store, index, analyze and baseline vast amounts of log data. As a result, operations staff no longer have to search for problems: an alerting engine detects variances and anomalies and sends a warning. In addition, it links with the SevOne infrastructure monitoring platform to illuminate changes in performance metrics with the corresponding event-level log data.

About SevOne

SevOne provides the world's most scalable infrastructure performance monitoring platform to the world's most connected companies. The patented SevOne Cluster™ architecture leverages distributed computing to scale infinitely and collect millions of objects. It provides real-time reporting down to the second and provides the insight needed to prevent outages. SevOne customers include seven of today's 13 largest banks, enterprises, CSPs, MSPs and MSOs. SevOne is backed by Bain Capital Ventures. More information can be found at www.sevone.com. Follow SevOne on Twitter at @SevOneInc.

[ www.sevone.com | blog.sevone.com | info@sevone.com ] SEV_WP_05_2015