Wait, How Many Metrics? Monitoring at Quantcast

"Count what is countable, measure what is measurable, and what is not measurable, make measurable."
Galileo Galilei

Quantcast offers free direct audience measurement for hundreds of millions of web destinations and powers the ability to target any online audience across tens of billions of events per day. Operating at this scale requires Quantcast to have an expansive and reliable monitoring platform. In order to stay on top of its operations, Quantcast collects billions of monitoring metrics per day. The Ganglia infrastructure we've developed lets us collect these metrics from a wide variety of sources and deliver them to several different kinds of consumers; analyze, report and alert on them in real time; and provide our product teams with a platform for performing their own analysis and reporting.

The monitoring infrastructure at Quantcast collects and stores about 2 million unique metrics from a number of data centers around the world, for a total of almost 12 billion metrics per day, all of which are made available to our monitoring and visualization tools within seconds of their collection. This infrastructure rests on Ganglia's Atlas-like shoulders.

Some of the sources that generate these metrics are:

- operating system metrics (CPU/memory/disk utilization)
- application metrics (queries per second)
- infrastructure metrics (network hardware throughput, UPS power utilization)
- derived metrics (SLA compliance)
- business metrics (spend and revenue)

We have a similarly broad spectrum of consumers of monitoring data:

- alerting tools (Nagios)
- business analysis of historical trends
- performance analysis by product teams

Ganglia gives us a system that can listen to all of these sources, serve all of these consumers, and be performant, reliable, and quick to update.

Reporting, Analysis, and Alerting

Alerting Integration

Like many companies, we use Nagios to monitor our systems and alert us when something is not working correctly. Instead of running a Nagios agent on each of our hosts, we use Ganglia to report system and application metrics back to our central monitoring servers. Not only does this reduce system overhead, but it also reduces the proliferation of protocols and applications running over our WAN.

Nagios runs a Perl script called check_qcganglia, which has access to every metric in our RRD files stored on ramdisk. This allows us to have individual host checks as well as health checks for entire clusters and grids using the Ganglia summary data. Another great benefit is that we can configure checks and alerts on the application-level data that our developers put into Ganglia. This also allows us to alert on aggregated business metrics like spend and revenue.

We have also implemented a custom Nagios alerting script which, based on the configured alert type, determines which Ganglia graphs would be useful for an operator to see right away and attaches those graphs as images to the outbound alert. Typically these include CPU and memory utilization graphs for the individual host as well as application graphs for the host, cluster, and grid. These help the on-call operator immediately assess the impact of any given alert on the overall performance of the system.

Holt-Winters Aberrance Detection

Because Ganglia is built on RRD Tool, we're able to leverage some of its most powerful (if intimidating) features as well. One of these is its ability to do Holt-Winters forecasting and aberrance detection. The Holt-Winters algorithm works on data that is periodic and generally consistent over time. It derives an upper and lower bound from historical data and uses that to make predictions about current and future data.

We have several metrics that are well-suited for Holt-Winters aberrance detection, such as HTTP requests per second. This varies significantly over the course of a single day and each day of the week, but from one Monday to the next, our traffic is pretty consistent. If we see a large swing of traffic from one minute to the next, it typically indicates a problem.

We use Nagios and RRD Tool to monitor the Holt-Winters forecast for our traffic and alert if the traffic varies outside of the expected range. This allows us to see and respond to network and application problems very quickly, using dynamically-derived thresholds that are always up to date.

Below is an example of an aberrance. We took a datacenter offline for maintenance, and this triggered aberrance detection.

[Figure: aberrance triggered by taking a datacenter offline for maintenance]

Here's an example of an unexpected aberrance alert.

[Figure: an unexpected aberrance alert]
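
To make the idea of a forecast with upper and lower bounds concrete, here is a minimal Python sketch of additive Holt-Winters smoothing with a deviation band. It is an illustration under stated assumptions, not RRD Tool's implementation or the check we run from Nagios: the function name, the parameter defaults, and the `min_band` floor are all hypothetical choices made for this example.

```python
def find_aberrations(series, period, alpha=0.5, beta=0.1, gamma=0.3,
                     delta=3.0, min_band=1.0):
    """Return the indices of samples that fall outside the forecast band.

    series:   evenly spaced samples (for example, requests per second)
    period:   number of samples in one seasonal cycle
    delta:    band half-width as a multiple of the smoothed deviation
    min_band: hypothetical floor for the band half-width, so that a
              perfectly regular history does not yield a zero-width band
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods of data")

    # Seed the model from the first period: flat trend, seasonal offsets
    # relative to that period's mean, and no known deviation yet.
    level = sum(series[:period]) / period
    trend = 0.0
    seasonal = [y - level for y in series[:period]]
    deviation = [0.0] * period

    flagged = []
    for t in range(period, len(series)):
        slot = t % period
        y = series[t]

        # One-step-ahead forecast and the band around it.
        forecast = level + trend + seasonal[slot]
        band = max(delta * deviation[slot], min_band)
        if abs(y - forecast) > band:
            flagged.append(t)

        # Exponential smoothing updates (additive Holt-Winters).
        new_level = alpha * (y - seasonal[slot]) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal[slot] = gamma * (y - new_level) + (1 - gamma) * seasonal[slot]
        deviation[slot] = gamma * abs(y - forecast) + (1 - gamma) * deviation[slot]
        level = new_level
    return flagged
```

Fed a repeating daily pattern in which one sample suddenly drops to half its usual value, the function reports that sample's index first. Samples right after a real aberration tend to be flagged as well, because the smoothed level needs a few steps to recover; a production check would typically require several violations within a window before alerting.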

Ganglia as an Application Platform

Since a gmond daemon is running on every machine in the company, we encourage our application teams to use Ganglia as the pathway for reporting performance data about their applications. This has several benefits:

- Easy integration for performance monitoring. The Ganglia infrastructure is already configured, so the application developer doesn't have to do any special work to make use of it. Especially in concert with the json2gmetrics tool described below, it's easy for an application to generate and report any metrics the developers think would be useful to see. Also, the developer doesn't have to worry about opening up firewall ports, parsing or rotating log files, or running other services devoted to monitoring, such as JMX.
- Powerful analysis tools. By submitting metrics to Ganglia in this way, application developers get easy access to very powerful analysis tools (namely the Ganglia web interface). This is particularly useful when troubleshooting problems with an application that correlate with operating system metrics, e.g. attributing a drop in requests per second to a machine swapping the application out of memory.
- Simple and flexible alerting. Our Ganglia infrastructure is tied in with our alerting system. Once an alert is configured, an application can trigger an alert state by generating a specific metric with a sentinel value. This lets us centralize alert generation and event correlation; for instance, we can prevent an application from generating alerts when the system it runs on will be in a known maintenance window. This is much better than the alternative of each application having to re-invent the wheel of monitoring for an invalid state and sending its own emails.

Best Practices

Hard-won experience has given us some best practices for using Ganglia. Following these has helped us scale Ganglia up to a truly impressive level while retaining high reliability and real-time performance.

Using tmpfs to handle high IOPS

Writing the sheer number of metrics we collect directly to a spinning disk would be impossible. To solve the IOPS problem, we write all our metrics to a ramdisk (Linux tmpfs) and then back them up to a spinning disk with a cron job. Since we have multiple gmetad hosts in separate locations, we're able to protect against an outage at either location. Also, our cron job runs frequently enough that if the tmpfs is cleared (for instance, if the server reboots) our window of lost data is small.
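
The backup step described above can be sketched as a small Python script that cron runs on a schedule. This is an illustrative sketch, not our actual job: the function name and any paths passed to it are hypothetical, and it assumes that a file-level copy of each RRD is good enough for recovery purposes.

```python
import os
import shutil
import tempfile


def backup_rrds(src_root, dst_root):
    """Copy every .rrd file under src_root (the tmpfs) to the same
    relative path under dst_root (the disk). Returns the number of
    files copied."""
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            if not name.endswith(".rrd"):
                continue
            src = os.path.join(dirpath, name)
            # Copy to a temporary file in the destination directory and
            # rename it into place, so an interrupted run never leaves a
            # truncated backup under the real file name.
            fd, tmp = tempfile.mkstemp(dir=dst_dir, suffix=".tmp")
            os.close(fd)
            try:
                shutil.copy2(src, tmp)
                os.replace(tmp, os.path.join(dst_dir, name))
            except Exception:
                os.unlink(tmp)
                raise
            copied += 1
    return copied
```

A restore after a reboot is the same copy in the opposite direction, run before gmetad starts. One limitation of this sketch is that a file copied while gmetad is updating it may capture a partially written update; how much that matters depends on how the backups are used.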

Sharding/instancing

Another way we deal with our large number of metrics is to split up our gmetad collectors into several instances. We logically divided our infrastructure into several different groups, such as webservers, internal systems, the map-reduce cluster, and realtime bidding. Each group has its own gmetad instance, and the only place all the graphs come together is on web dashboards that embed the graphs from each instance. This gives us the best of both worlds: a seamless interface to a large volume of data, with the added reliability and performance of multiple instances.

Tools

We've developed several tools to let us get more use out of Ganglia.

snmp2ganglia

snmp2ganglia is a daemon we wrote that translates SNMP OIDs into Ganglia metrics. It takes a configuration file that maps a given OID (and whatever adjustments it might need) into a Ganglia metric, with a name, value, type, and units label. The configuration file also associates each OID with one or more hosts, such as network routers or UPS devices. On a configurable schedule, snmp2ganglia polls each host for the OIDs associated with it and delivers those values to Ganglia as metrics. We make use of the capability to "spoof" a metric's source IP and DNS name to create a pseudo-host for each device.

json2gmetrics

json2gmetrics is a Perl script that wraps the Ganglia::Gmetric library, to make it easy to import a lot of metrics from almost any source. The script takes a JSON string with a list of mappings of the form "name: <metric>, value: <value>", etc. With this tool, it's trivial for any program or system to generate Ganglia metrics, and each import requires only one fork (as opposed to one fork per metric with the built-in gmetric tool).

gmond plugins

We make extensive use of gmond-style plugins, especially for operating system metrics. For instance, some of the plugins we've written collect:

- SMART counters about disk health, temperature, etc.
- network interface traffic (packets and bytes) and errors
- CPU and memory consumed by each user account
- mail queue length and statistics
- service uptimes

These metrics are a powerful tool for troubleshooting and analyzing system performance, especially when using the Ganglia web UI to correlate systems that might seem unrelated.

RRD management scripts

We have a small army of scripts that work with RRD files:

- Adding new RRAs to existing files, such as a new MAX or MIN RRA
- Adjusting the parameters of existing RRAs
- Merging data from multiple files into one coherent RRD
- Smoothing transient spikes in RRD data that throw off average calculations (particularly common with SNMP OID counters)

Drawbacks

Although Ganglia is both powerful and flexible, it does have its weak points. Each of these drawbacks describes a challenge we had to overcome to run Ganglia at our scale.

Necessity of sharding

As our infrastructure grew, we discovered that a single gmetad process just couldn't keep up with the number of metrics we were trying to collect. Updates would be skipped because the gmetad process couldn't write them to the RRD files in time, and the CPU and memory pressure on a single machine was overwhelming.

We solved this problem by scaling horizontally: setting up a series of monitoring servers, each of which runs a gmetad process that collects some specific part of the whole set of metrics. This way, we can run as many monitoring servers as we need to reach the scale of our operations. The downside of this solution is significantly increased management cost; we have to manage and coordinate multiple monitoring servers and track which metrics are being collected where. We've mitigated this problem somewhat by setting up the Ganglia web UI on each server to redirect requests to the appropriate server for each category of metrics, so end-users see a unified system that hides the sharding.

RRD data consolidation

RRD files have a defining property: they keep an unbounded amount of time-series data in a constant amount of space.
They achieve this by consolidating older data with a consolidation function (such as MIN, MAX, or AVERAGE) to make room for newer data at full resolution.

For example, an RRD file might be configured to keep a datapoint at 15-second intervals for an hour. This means keeping 240 unique values (60 minutes/hour * 4 datapoints/minute). A whole day of data at this resolution would require 5760 datapoints! However, instead of keeping all of those datapoints, this RRD file might instead be configured to keep just one averaged datapoint per hour for the whole day. As new updates are processed, a whole hour's worth of 15-second datapoints would be averaged to get a single datapoint that represents the entire hour. The file would contain high-resolution data for the last hour and a consolidated representation of that data for the rest of the day.

For Quantcast, we often need to analyze events in a very precise time window. If we saw a CPU load spike for just one minute, for instance, RRD Tool's data consolidation would average that spike with the rest of the data for that hour into a single value that did not reflect the spike; the precise data about the event would be lost.

Coordination over a WAN

Quantcast is a global operation, so we needed to make some improvements to Ganglia to get it to a global scale. One of these improvements involved the process of collecting data from a gmond daemon. When a gmetad process collects metrics from a gmond, the data is emitted as an XML tree with a node for each metric. This winds up being a whole lot of highly-compressible plaintext data. In order to save bandwidth over our WAN links, we patched gmond to emit a compressed version of that tree, and we also patched gmetad to work with compressed streams. This substantially reduced the amount of monitoring data we were sending over the WAN links at each site, which left us with more room for business-critical data.

Excessive IOPS for RRD updates

Whenever a gmetad process writes a metric to an RRD file, the update process requires several accesses to the same file: one read in the header area, one write in the header area, and at least one update in the data area. RRD Tool attempts to improve the speed of this process by issuing fadvise/madvise system calls to prevent the filesystem from doing normally clever things that interfere with this specific access pattern.
Since we write our RRD files into a tmpfs, the fadvise/madvise calls were just slowing things down. We wrote an LD_PRELOAD library to hook those calls and negate them (return 0). This allowed us to collect a much larger number of metrics with each gmetad shard, and also significantly reduced CPU usage by the gmetad process.
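
Returning to the RRD data consolidation drawback described earlier in this section, the loss of a short spike is easy to show with numbers. The following Python sketch is only an illustration of the arithmetic, not RRD Tool code; the function names and the sample load values are made up for the example.

```python
def average(values):
    """AVERAGE-style consolidation of one bucket of samples."""
    return sum(values) / len(values)


def consolidate(samples, per_bucket, cf):
    """Reduce samples to one value per bucket of per_bucket samples,
    using the consolidation function cf (for example average or max)."""
    if per_bucket <= 0 or len(samples) % per_bucket != 0:
        raise ValueError("samples must divide evenly into buckets")
    return [
        cf(samples[i:i + per_bucket])
        for i in range(0, len(samples), per_bucket)
    ]


# One hour at 15-second resolution is 240 samples. Assume a steady load
# of 1.0 with a one-minute spike (4 samples) at 16.0.
hour = [1.0] * 240
hour[100:104] = [16.0] * 4

print(consolidate(hour, 240, average))  # [1.25]
print(consolidate(hour, 240, max))      # [16.0]
```

Averaged over the hour, the spike to 16.0 shows up as 1.25, which is barely distinguishable from normal load. A MAX RRA, like the ones our management scripts can add to existing files, preserves the height of the spike, but still not when within the hour it occurred or how long it lasted.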

Conclusions

Ganglia is an incredibly powerful and flexible tool, but at its heart it's designed to collect performance monitoring data about the machines in a single high-performance computing cluster and display them as a series of graphs in a web interface. Quantcast has taken that basic tool and stretched it to its limit: growing it to an exceptionally large scale, extending it across a global infrastructure, and scouring the hidden corners of Ganglia's potential to find new tools and functionality.

Most importantly, though, we've made Ganglia into a platform upon which other services rely. From Nagios alerts and rich emails, to integration with application development, it's become a cornerstone of our production operations.

Maintaining Non-Stop Services with Multi Layer Monitoring

Maintaining Non-Stop Services with Multi Layer Monitoring Maintaining Non-Stop Services with Multi Layer Monitoring Lahav Savir System Architect and CEO of Emind Systems lahavs@emindsys.com www.emindsys.com The approach Non-stop applications can t leave on their

More information

PANDORA FMS NETWORK DEVICES MONITORING

PANDORA FMS NETWORK DEVICES MONITORING NETWORK DEVICES MONITORING pag. 2 INTRODUCTION This document aims to explain how Pandora FMS can monitor all the network devices available in the market, like Routers, Switches, Modems, Access points,

More information

Scaling Graphite Installations

Scaling Graphite Installations Scaling Graphite Installations Graphite basics Graphite is a web based Graphing program for time series data series plots. Written in Python Consists of multiple separate daemons Has it's own storage backend

More information

A SURVEY ON AUTOMATED SERVER MONITORING

A SURVEY ON AUTOMATED SERVER MONITORING A SURVEY ON AUTOMATED SERVER MONITORING S.Priscilla Florence Persis B.Tech IT III year SNS College of Engineering,Coimbatore. priscillapersis@gmail.com Abstract This paper covers the automatic way of server

More information

PANDORA FMS NETWORK DEVICE MONITORING

PANDORA FMS NETWORK DEVICE MONITORING NETWORK DEVICE MONITORING pag. 2 INTRODUCTION This document aims to explain how Pandora FMS is able to monitor all network devices available on the marke such as Routers, Switches, Modems, Access points,

More information

Deploying the BIG-IP LTM with the Cacti Open Source Network Monitoring System

Deploying the BIG-IP LTM with the Cacti Open Source Network Monitoring System DEPLOYMENT GUIDE Deploying the BIG-IP LTM with the Cacti Open Source Network Monitoring System Version 1.0 Deploying F5 with Cacti Open Source Network Monitoring System Welcome to the F5 and Cacti deployment

More information

SiteCelerate white paper

SiteCelerate white paper SiteCelerate white paper Arahe Solutions SITECELERATE OVERVIEW As enterprises increases their investment in Web applications, Portal and websites and as usage of these applications increase, performance

More information

STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM

STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM Albert M. K. Cheng, Shaohong Fang Department of Computer Science University of Houston Houston, TX, 77204, USA http://www.cs.uh.edu

More information

How To Set Up Foglight Nms For A Proof Of Concept

How To Set Up Foglight Nms For A Proof Of Concept Page 1 of 5 Foglight NMS Overview Foglight Network Management System (NMS) is a robust and complete network monitoring solution that allows you to thoroughly and efficiently manage your network. It is

More information

Edge Configuration Series Reporting Overview

Edge Configuration Series Reporting Overview Reporting Edge Configuration Series Reporting Overview The Reporting portion of the Edge appliance provides a number of enhanced network monitoring and reporting capabilities. WAN Reporting Provides detailed

More information

How To Use Mindarray For Business

How To Use Mindarray For Business Minder Network Performance Monitoring Monitor everything about your Network performance Discover, visualize and monitor your complete IT Infrastructure in less than an hour. Mindarray s Minder is a powerful

More information

HPCC Monitoring and Reporting (Technical Preview) Boca Raton Documentation Team

HPCC Monitoring and Reporting (Technical Preview) Boca Raton Documentation Team HPCC Monitoring and Reporting (Technical Preview) Boca Raton Documentation Team HPCC Monitoring and Reporting (Technical Preview) Boca Raton Documentation Team Copyright 2015 HPCC Systems. All rights reserved

More information

pt360 FREE Tool Suite Networks are complicated. Network management doesn t have to be.

pt360 FREE Tool Suite Networks are complicated. Network management doesn t have to be. pt360 FREE Tool Suite Networks are complicated. Network management doesn t have to be. pt360 FREE Tool Suite - At a Glance PacketTrap Networks November, 2009 PacketTrap's pt360 FREE Tool Suite consolidates

More information

Enterprise IT is complex. Today, IT infrastructure spans the physical, the virtual and applications, and crosses public, private and hybrid clouds.

Enterprise IT is complex. Today, IT infrastructure spans the physical, the virtual and applications, and crosses public, private and hybrid clouds. ENTERPRISE MONITORING & LIFECYCLE MANAGEMENT Unify IT Operations Enterprise IT is complex. Today, IT infrastructure spans the physical, the virtual and applications, and crosses public, private and hybrid

More information

SapphireIMS 4.0 BSM Feature Specification

SapphireIMS 4.0 BSM Feature Specification SapphireIMS 4.0 BSM Feature Specification v1.4 All rights reserved. COPYRIGHT NOTICE AND DISCLAIMER No parts of this document may be reproduced in any form without the express written permission of Tecknodreams

More information

Violin Symphony Abstract

Violin Symphony Abstract Violin Symphony Abstract This white paper illustrates how Violin Symphony provides a simple, unified experience for managing multiple Violin Memory Arrays. Symphony facilitates scale-out deployment of

More information

White Paper. The Ten Features Your Web Application Monitoring Software Must Have. Executive Summary

White Paper. The Ten Features Your Web Application Monitoring Software Must Have. Executive Summary White Paper The Ten Features Your Web Application Monitoring Software Must Have Executive Summary It s hard to find an important business application that doesn t have a web-based version available and

More information

Remote Network Monitoring Software for Managed Services Providers

Remote Network Monitoring Software for Managed Services Providers http://www.packettrap.com Remote Network Monitoring Software for Managed Services Providers PacketTrap MSP provides a cost-effective way for you to offer enterprise-class server, application, and network

More information

CAREN NOC MONITORING AND SECURITY

CAREN NOC MONITORING AND SECURITY CAREN CAREN Manager: Zarlyk Jumabek uulu 1-2 OCTOBER 2014 ALMATY, KAZAKHSTAN Copyright 2010 CAREN / Doc ID : PS01102014 / Address : Chui ave, 265a, Bishkek, The Kyrgyz Republic Tel: +996 312 900275 website:

More information

Application Discovery Manager User s Guide vcenter Application Discovery Manager 6.2.1

Application Discovery Manager User s Guide vcenter Application Discovery Manager 6.2.1 Application Discovery Manager User s Guide vcenter Application Discovery Manager 6.2.1 This document supports the version of each product listed and supports all subsequent versions until the document

More information

Livrable L13.3. Nature Interne Date livraison 12/07/2012. Titre du Document Energy management system and energy consumption efficiency - COEES Code v1

Livrable L13.3. Nature Interne Date livraison 12/07/2012. Titre du Document Energy management system and energy consumption efficiency - COEES Code v1 Propriétés du Document Source du Document FUI-10-COMPATIBLE ONE Titre du Document Energy management system and energy consumption efficiency - COEES Code v1 Module(s) Responsable Auteur(s) / contributeur(s)

More information

Ganglia & Nagios. Maciej Lasyk 11. Sesja Linuksowa Wrocław, 2014-04-06 1/25. Maciej Lasyk, Ganglia & Nagios

Ganglia & Nagios. Maciej Lasyk 11. Sesja Linuksowa Wrocław, 2014-04-06 1/25. Maciej Lasyk, Ganglia & Nagios Ganglia & Nagios Maciej Lasyk 11. Sesja Linuksowa Wrocław, 2014-04-06 1/25 Ganglia.. what? Ganglia cluster / group of neurons found outside the central nervous system 2/25 Just a little about monitoring

More information

WhatsUp Gold v11 Features Overview

WhatsUp Gold v11 Features Overview WhatsUp Gold v11 Features Overview This guide provides an overview of the core functionality of WhatsUp Gold v11, and introduces interesting features and processes that help users maximize productivity

More information

PEPPERDATA IN MULTI-TENANT ENVIRONMENTS

PEPPERDATA IN MULTI-TENANT ENVIRONMENTS ..................................... PEPPERDATA IN MULTI-TENANT ENVIRONMENTS technical whitepaper June 2015 SUMMARY OF WHAT S WRITTEN IN THIS DOCUMENT If you are short on time and don t want to read the

More information

mbits Network Operations Centrec

mbits Network Operations Centrec mbits Network Operations Centrec The mbits Network Operations Centre (NOC) is co-located and fully operationally integrated with the mbits Service Desk. The NOC is staffed by fulltime mbits employees,

More information

Monitor Solution Best Practice v3.2 part of Symantec Server Management Suite

Monitor Solution Best Practice v3.2 part of Symantec Server Management Suite Summary When implementing Monitoring and Alerting part of Server Management suite. The following items are areas that should be reviewed. It is important to start to monitor what you need and add/build

More information

Network Management and Monitoring Software

Network Management and Monitoring Software Page 1 of 7 Network Management and Monitoring Software Many products on the market today provide analytical information to those who are responsible for the management of networked systems or what the

More information

Vistara Lifecycle Management

Vistara Lifecycle Management Vistara Lifecycle Management Solution Brief Unify IT Operations Enterprise IT is complex. Today, IT infrastructure spans the physical, the virtual and applications, and crosses public, private and hybrid

More information

Network Monitoring with SNMP

Network Monitoring with SNMP Network Monitoring with SNMP This paper describes how SNMP is used in WhatsUp- Professional and provides specific examples on how to configure performance, active, and passive monitors. Introduction SNMP

More information

Monitoring Microsoft Exchange to Improve Performance and Availability

Monitoring Microsoft Exchange to Improve Performance and Availability Focus on Value Monitoring Microsoft Exchange to Improve Performance and Availability With increasing growth in email traffic, the number and size of attachments, spam, and other factors, organizations

More information

One software solution to monitor your entire network, including devices, applications traffic and availability.

One software solution to monitor your entire network, including devices, applications traffic and availability. One software solution to monitor your entire network, including devices, applications traffic and availability. About Britannic Expert Integrators We are award winning specialists in IP communications,

More information

SEE-GRID-SCI. www.see-grid-sci.eu. SEE-GRID-SCI USER FORUM 2009 Turkey, Istanbul 09-10 December, 2009

SEE-GRID-SCI. www.see-grid-sci.eu. SEE-GRID-SCI USER FORUM 2009 Turkey, Istanbul 09-10 December, 2009 SEE-GRID-SCI Grid Site Monitoring tools developed and used at SCL www.see-grid-sci.eu SEE-GRID-SCI USER FORUM 2009 Turkey, Istanbul 09-10 December, 2009 V. Slavnić, B. Acković, D. Vudragović, A. Balaž,

More information

DEPLOYMENT GUIDE Version 1.0. Deploying the BIG-IP LTM with the Nagios Open Source Network Monitoring System

DEPLOYMENT GUIDE Version 1.0. Deploying the BIG-IP LTM with the Nagios Open Source Network Monitoring System DEPLOYMENT GUIDE Version 1.0 Deploying the BIG-IP LTM with the Nagios Open Source Network Monitoring System Deploying F5 with Nagios Open Source Network Monitoring System Welcome to the F5 and Nagios deployment

More information

The new services in nagios: network bandwidth utility, email notification and sms alert in improving the network performance

The new services in nagios: network bandwidth utility, email notification and sms alert in improving the network performance The new services in nagios: network bandwidth utility, email notification and sms alert in improving the network performance Mohammad Ali Arsyad bin Mohd Shuhaimi Hang Tuah Jaya, 76100 Durian Tunggal,

More information

Network management tools: Torrus, Gerty, Mooxu

Network management tools: Torrus, Gerty, Mooxu Network management tools: Torrus, Gerty, Mooxu DENOG3 Frankfurt am Main, October 20 th 2011 Stanislav Sinyagin ssinyagin@k-open.com Copyright notice This work is licensed under the Creative Commons Attribution-Share

More information

SystemWatch SM. Remote Network Monitoring

SystemWatch SM. Remote Network Monitoring SystemWatch SM Remote Network Monitoring SystemWatch Network Monitoring Service: Real World Experience Sony s SystemWatch network monitoring service uses device and networking monitoring software in conjunction

More information

Best of Breed of an ITIL based IT Monitoring. The System Management strategy of NetEye

Best of Breed of an ITIL based IT Monitoring. The System Management strategy of NetEye Best of Breed of an ITIL based IT Monitoring The System Management strategy of NetEye by Georg Kostner 5/11/2012 1 IT Services and IT Service Management IT Services means provisioning of added value for

More information

Network Management Deployment Guide

Network Management Deployment Guide Smart Business Architecture Borderless Networks for Midsized organizations Network Management Deployment Guide Revision: H1CY10 Cisco Smart Business Architecture Borderless Networks for Midsized organizations

More information

IT Services Monitoring with

IT Services Monitoring with IT Services Monitoring with OmniAnalyser Maintaining IT Services with Monitoring and Analysis Modern business environments, being significantly dependant on IT services, face the growing challenge of understanding

More information

SCF/FEF Evaluation of Nagios and Zabbix Monitoring Systems. Ed Simmonds and Jason Harrington 7/20/2009

SCF/FEF Evaluation of Nagios and Zabbix Monitoring Systems. Ed Simmonds and Jason Harrington 7/20/2009 SCF/FEF Evaluation of Nagios and Zabbix Monitoring Systems Ed Simmonds and Jason Harrington 7/20/2009 Introduction For FEF, a monitoring system must be capable of monitoring thousands of servers and tens

More information

Introduction to Network Monitoring and Management

Introduction to Network Monitoring and Management Introduction to Network Monitoring and Management Jonathan Brewer Network Startup Resource Center jon@nsrc.org These materials are licensed under the Creative Commons Attribution-NonCommercial 4.0 International

More information

Network Monitoring Comparison

Network Monitoring Comparison Network Monitoring Comparison vs Network Monitoring is essential for every network administrator. It determines how effective your IT team is at solving problems or even completely eliminating them. Even

More information

Monitoring HTCondor with Ganglia

Monitoring HTCondor with Ganglia Monitoring HTCondor with Ganglia Ganglia Overview Scalable distributed monitoring for HPC clusters Two daemons gmond every host; collects and send metrics gmetad single host; persists metrics from local

More information

Elevating Data Center Performance Management

Elevating Data Center Performance Management Elevating Data Center Performance Management Data Center innovation reduces operating expense, maximizes employee productivity, and generates new sources of revenue. However, many I&O teams lack proper

More information

Server & Application Monitor
