OpenNMS reporting generation tool. Juan Pedro Escalona, DevOps. Southampton, UK - 2014
Juan Pedro Escalona: DevOps / Systems Administrator with over 6 years' experience administering different operating systems, network systems and anything that is thrown at my desk. At present, I work in the QA Automation & Performance group, providing tools to automate processes. Monitoring lover, with a background in several Open Source monitoring solutions. Six months ago, these words changed me: "Hi Juan Pedro. Have you met OpenNMS? Let me introduce you..." (Manuel Villarejo)
Agenda Introduction Load Test Processes Automating processes progress Reporting Tool in Load test processes Python opennms Library Demo Liveview Results Q&A
Introduction: Over 1000 servers in production. Exact copy of the production environment for testing. Releases every 2 weeks. A Load Test is mandatory. An average of 10 Load Tests every week.
Load Test Processes: Prepare Load Test, Launch Load Test, Report Load Test
Load Test Processes
Prepare Load Test: Set the initial configuration to replicate production status (e.g. refresh DBs).
Launch Load Test: Most load generators and monitors are compatible only with Windows. Webload & Neoload + 25 Load Generators (Windows / Linux).
Report Load Test: OpenNMS (key server metrics and graphs); Webload / Neoload report.
Automating Processes Progress
Prepare Load Test: It's complex, but for the time being we use scripts to do it. Orchestration: check dependencies + call scripts.
Launch Load Test: Generate load templates based on load profiles. Run/stop and control the status of the Load Generator Consoles.
Report Load Test: Schedule report generation, or trigger it when the load test has finished. Aggregate different servers with different types of graphs in the same report. Provide a single view to check the status of all the nodes we're testing.
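The orchestration idea on this slide (check dependencies, then call scripts) can be sketched as a topological ordering of preparation steps. This is only a minimal sketch, not the tool's code: the step names and the `run_step` callable are hypothetical, and a real setup would presumably call the existing scripts at that point.

```python
from graphlib import TopologicalSorter

# Hypothetical preparation steps, each mapped to the steps it depends on.
PREP_STEPS = {
    "stop_services": set(),
    "refresh_dbs": {"stop_services"},
    "start_services": {"refresh_dbs"},
    "warm_caches": {"start_services"},
}


def run_in_order(steps, run_step):
    """Run every step after its dependencies and return the order used."""
    # static_order() raises graphlib.CycleError if the dependencies loop.
    order = list(TopologicalSorter(steps).static_order())
    for name in order:
        run_step(name)  # a real orchestrator would call the matching script here
    return order


if __name__ == "__main__":
    run_in_order(PREP_STEPS, lambda name: print(f"running {name}"))
```

Passing the runner in as a callable keeps the dependency check separate from how each script is actually launched.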
Automating Processes Progress > Report
At the beginning: Time: 10-20 minutes. No server health key metrics. Site Catalyst; Webload Report.
Automating Processes Progress > Report
Spreadsheet reports: Time: from 1 hour to days. OpenNMS: health statistics. Screenshot + cut + paste (1 node). Manual.
Automating Processes Progress > Report
KSC Reports: Time: 30 minutes to several hours. Join different servers + graphs. Save as HTML. Manual.
Automating Processes Progress > Report
Report generation is not automatic. It is not possible to join several KSC Reports into one. KSC Report configuration is not easy. Metrics lose resolution over time (RRD).
Agenda Introduction Load Test Processes Automating processes progress Reporting Tool in Load test processes Python opennms Library Demo Liveview Improvements & Results Q&A
Reporting Tool in Load Test Processes
Key features:
Support for multiple environments (a.k.a. multiple OpenNMS instances).
Easy configuration for building new Server Reports (hosts + graphs).
Easy report generation: a few clicks + a few minutes.
Liveview: lets you aggregate different Server Reports in a single view with just one click.
Reporting Tool in Load Test Processes
Architecture and components:
Python OpenNMS Client module: retrieves info from OpenNMS. Contributed by Jason Viloria.
Memcache: caches all info obtained from OpenNMS.
Django Application: provides the UI and automates report generation (Celery).
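The caching component can be pictured as a cache-aside lookup: ask the cache first, and query OpenNMS only on a miss or after expiry. The sketch below is an assumption-laden illustration, not the tool's implementation: `TTLCache` is a hypothetical in-process stand-in for Memcache, and `fetch` stands for whatever call retrieves data from OpenNMS.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for Memcache with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value, calling fetch() only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()  # hypothetical: the slow call to OpenNMS
        self._entries[key] = (now + self._ttl, value)
        return value
```

A short time-to-live keeps node lists and graph definitions reasonably fresh while sparing the OpenNMS instance repeated identical queries.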
Python OpenNMS library
Retrieve the list of available nodes in OpenNMS instances.
Assets: node label, node id, categories.
Node Resources: JMX, SNMP.
Graphs: Node Level (CPU, Memory); Interface Level (Net Interface Bandwidth, Disk IO).
Caching.
Python opennms library Get a node by label
Python opennms library Get a node by id
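The slides do not reproduce the library's own function names, so the following is only a sketch of how lookups by label and by id might map onto the OpenNMS REST interface. It assumes the usual `/rest/nodes` paths; the base URL and both helper names are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical OpenNMS instance; replace with a real base URL.
BASE_URL = "http://opennms.example.org:8980/opennms"


def node_by_label_url(label, base_url=BASE_URL):
    """Build a URL that filters the node list by its label."""
    return f"{base_url}/rest/nodes?{urlencode({'label': label})}"


def node_by_id_url(node_id, base_url=BASE_URL):
    """Build a URL that addresses a single node by its numeric id."""
    return f"{base_url}/rest/nodes/{int(node_id)}"


if __name__ == "__main__":
    print(node_by_label_url("web01"))
    print(node_by_id_url(42))
```

Only URL construction is shown; issuing the authenticated HTTP request and decoding the response is left to whichever HTTP client the project already uses.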
Python opennms library Get node IP Interfaces
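As a rough illustration of what retrieving IP interfaces involves, the sketch below extracts addresses from a JSON document shaped like a response from `/rest/nodes/<id>/ipinterfaces`. The payload shape (an `ipInterface` list with `ipAddress` fields) is an assumption that may differ between OpenNMS versions, and the sample data and helper name are made up.

```python
import json

# Made-up sample payload; the field names are assumed, not taken from the slides.
SAMPLE_RESPONSE = json.dumps({
    "ipInterface": [
        {"ipAddress": "10.0.0.11", "snmpPrimary": "P"},
        {"ipAddress": "192.168.1.11", "snmpPrimary": "N"},
    ]
})


def ip_addresses(payload):
    """Return the IP addresses listed in an ipinterfaces JSON payload."""
    data = json.loads(payload)
    interfaces = data.get("ipInterface", [])
    if isinstance(interfaces, dict):
        # Tolerate a single interface serialised as an object instead of a list.
        interfaces = [interfaces]
    return [iface["ipAddress"] for iface in interfaces]


if __name__ == "__main__":
    print(ip_addresses(SAMPLE_RESPONSE))
```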
Python opennms library Get node graphs
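Node graphs in OpenNMS are typically served as PNG images from a graph URL, so fetching one mostly means building the right query string. The sketch below assumes the `graph/graph.png` endpoint with `resourceId`, `report`, `start` and `end` (epoch milliseconds) parameters. The base URL, the helper name and the example report name `netsnmp.loadavg` are assumptions that depend on the instance's configuration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Hypothetical OpenNMS instance; replace with a real base URL.
BASE_URL = "http://opennms.example.org:8980/opennms"


def node_graph_url(node_id, report, start, end, base_url=BASE_URL):
    """Build the URL of a node-level graph image for a time window."""

    def to_millis(moment):
        # Use timezone-aware datetimes; naive ones are read as local time.
        return int(moment.timestamp() * 1000)

    query = urlencode({
        "resourceId": f"node[{node_id}].nodeSnmp[]",  # node-level SNMP resource
        "report": report,
        "start": to_millis(start),
        "end": to_millis(end),
    })
    return f"{base_url}/graph/graph.png?{query}"


if __name__ == "__main__":
    begin = datetime(2013, 8, 1, tzinfo=timezone.utc)
    print(node_graph_url(12, "netsnmp.loadavg", begin, begin + timedelta(hours=1)))
```

Passing the load test's start and end times as the window is what lets a report show exactly the period under test.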
Demo
Demo > Report Archives
Demo > Add new node
Demo > Add new graph
Demo > Add new Server Report
Demo > New Report
Demo > New Recurrent Report
Demo > Review report
Liveview
Improvements & Results
Improvements & Results
[Charts: Time Spent per Stage (Report Generation, OpenNMS graphs, Analysis); Overall Reporting Time, 2012 vs. 2013 (data labels 9 and 3)]
Improvements & Results
[Chart: Load Test Executions, Load Tests vs. Capacity, Jan to Oct]
Executions x 3.5 since January: January 2013 (1st Big Event): 30 executions; August 2013 (2nd Big Event): 105 executions.
Results Analysis: 2x analysis time; 1/3 overall report time.
Improvements & Results
At the beginning: no server key metrics.
Spreadsheets: key metrics (1 node); manual process; much time.
KSC Reports: key metrics (all); save time; still manual.
Reporting Tool: key metrics (all); save huge time; automatic.
We did it
Thanks! Questions?