Analyzing large flow data sets using modern open-source data search and visualization tools. FloCon 2014. Max Putas




Transcription:

Analyzing large flow data sets using modern open-source data search and visualization tools FloCon 2014 Max Putas

About me
Operations Engineer - DevOps
BS, MS, and CAS in Telecommunications
Work/research interests:
  System automation
  Efficiency improvement
  System and network monitoring
  Traffic/service analysis
  Open-source software

Common tools for analysis
Scripts: Bash, Perl, Python
  Learning curve, time-intensive
  GnuPlot for graphing/visualization
Application-specific tools
  SiLK, Apache Chainsaw, Wireshark
Splunk - EXPEN$IVE
Excel :-(

General model
Raw Data -> Transform -> Store -> Visualize - Search - Analyse

Components
Binary SiLK data -> rwfilter -> rwcut -> CSV file -> Logstash -> Elasticsearch -> Kibana
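The first half of this chain (binary SiLK data through rwfilter and rwcut into a CSV file) could be driven from a script roughly as follows. This is a minimal sketch, not the presenter's actual commands: the date range, the field list, and the output path are assumptions chosen for illustration, and the exact rwfilter selection switches depend on the local SiLK setup.

```python
import shlex
import subprocess


def build_silk_export(start_date, end_date, csv_path):
    """Return (rwfilter_argv, rwcut_argv) for exporting flows to CSV.

    All values here are illustrative assumptions; adjust the selection
    switches and field list to the local SiLK repository.
    """
    rwfilter = [
        "rwfilter",
        f"--start-date={start_date}",
        f"--end-date={end_date}",
        "--type=all",
        "--proto=0-255",      # pass every protocol
        "--pass=stdout",      # stream matching binary records to rwcut
    ]
    rwcut = [
        "rwcut",
        "--fields=sip,dip,sport,dport,protocol,packets,bytes,stime,etime,duration",
        "--delimited=,",      # comma-separated, no column padding
        f"--output-path={csv_path}",
    ]
    return rwfilter, rwcut


def as_shell(rwfilter, rwcut):
    """Render the two commands as a single shell pipeline string."""
    return " | ".join(shlex.join(cmd) for cmd in (rwfilter, rwcut))


def run_export(rwfilter, rwcut):
    """Run rwfilter piped into rwcut; requires SiLK to be installed."""
    first = subprocess.Popen(rwfilter, stdout=subprocess.PIPE)
    second = subprocess.Popen(rwcut, stdin=first.stdout)
    first.stdout.close()  # let rwfilter see a closed pipe if rwcut exits
    second.wait()
    first.wait()
    return second.returncode


if __name__ == "__main__":
    f, c = build_silk_export("2014/01/13", "2014/01/14", "/tmp/silk-data.csv")
    print(as_shell(f, c))
```

Running the module only prints the pipeline; `run_export` is left for the caller to invoke on a host where SiLK is available.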


Components
Logstash: About
Can act as an agent, server, or both
Single jar file, only depends on Java
Very young project
  Started in late 2010
  First official book released last year (2013)

Components
Logstash: Plugins
Input (36): File
Filter (40): CSV, Date, GeoIP
Output (46): Elasticsearch
...

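Components
Logstash: Configuration

input {
  # Read the CSV file produced by rwcut from the first line onward
  file {
    path => "/tmp/silk-data.csv"
    start_position => "beginning"
    type => "silkcsv"
  }
}
filter {
  # Use the flow start time as the event timestamp.
  # Assumes each line has already been split into named fields such as stime.
  date {
    type => "silkcsv"
    match => [ "stime", "yyyy/MM/dd'T'HH:mm:ss.SSS" ]
    add_tag => [ "dated" ]
  }
}
output {
  elasticsearch {
    host => "localhost"
  }
}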
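To make the transform step concrete, the following sketch mimics in plain Python what the CSV and date filters do to one rwcut line: split it into named fields, then parse `stime` into a timestamp and tag the event. It is not Logstash code. The column order and the sample record are assumptions for illustration, and `strptime` parses fractional seconds into microseconds where the Logstash pattern uses milliseconds.

```python
import csv
import io
from datetime import datetime

# Assumed column order of the CSV file (not given in the slides).
COLUMNS = ["sip", "dip", "sport", "dport", "protocol",
           "packets", "bytes", "stime", "etime", "duration"]

# Python equivalent of the pattern yyyy/MM/dd'T'HH:mm:ss.SSS
SILK_TIME = "%Y/%m/%dT%H:%M:%S.%f"


def parse_rows(text):
    """Turn CSV text into a list of event dicts, dating those that parse."""
    events = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        event = dict(zip(COLUMNS, (value.strip() for value in row)))
        try:
            event["@timestamp"] = datetime.strptime(event["stime"], SILK_TIME)
            event["tags"] = ["dated"]
        except (KeyError, ValueError):
            # Header rows and malformed lines stay undated.
            event["tags"] = []
        events.append(event)
    return events


if __name__ == "__main__":
    sample = ("10.0.0.1,192.168.1.5,51234,80,6,10,840,"
              "2014/01/13T10:15:30.123,2014/01/13T10:15:31.623,1.500\n")
    for event in parse_rows(sample):
        print(event["@timestamp"], event["tags"])
```

The same check is a quick way to confirm a timestamp pattern against real rwcut output before loading a large file.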

Components
Elasticsearch: About
Built on Apache Lucene (indexing/search library)
Java
RESTful API
Distributed, scalable architecture
  Nodes can find each other through discovery
JSON-based
"Big data" focus

Components
Elasticsearch: Data storage
Index - document database
  Document types
  Fields
  Type mappings
Shards - pieces of the index
  More shards, better indexing performance across the cluster
Replicas - how many copies of each shard
  More replicas, better search performance and redundancy
[Diagram: 1 index with 3 shards and 2 replicas, spread across Node 1 and Node 2]
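The shard and replica counts are ordinary index settings. As a rough sketch, the request body for creating an index with the layout described above could be built like this; `number_of_shards` and `number_of_replicas` are the standard Elasticsearch setting names, while the index name `silk-flows` is a hypothetical choice.

```python
import json

INDEX_NAME = "silk-flows"  # hypothetical index name, not from the slides


def index_settings(shards, replicas):
    """Build the JSON body for an index-creation request."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }


def total_shard_copies(shards, replicas):
    """Each primary shard plus its replicas must be stored somewhere."""
    return shards * (1 + replicas)


if __name__ == "__main__":
    body = index_settings(3, 2)
    # This body would be sent with an HTTP PUT to /<index name>
    print("PUT /" + INDEX_NAME)
    print(json.dumps(body, indent=2))
    print("shard copies to place:", total_shard_copies(3, 2))
```

With 3 shards and 2 replicas the cluster has to hold 9 shard copies in total, which is the number to keep in mind when sizing nodes.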

Components
Elasticsearch: Performance
Lab setup: 6-core CPU : 16GB RAM : SATA HD
Indexing performance: 4000/s
Double the number of shards and machines
  ~2x index performance increase
Double the number of replicas
  ~2x search performance increase
Can take full advantage of SSDs

Components
Elasticsearch: Type Mapping

...
"dip" : { "type" : "ip" },
"dport" : { "type" : "integer" },
...
"duration" : { "type" : "float" },
...
"etime" : { "type" : "date", "format" : "yyyy/MM/dd'T'HH:mm:ss.SSS" },
...
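A type mapping matters because CSV values arrive as strings. The sketch below shows the kind of coercion the mapping implies, using only the four fields visible in the excerpt above. It is an illustration in plain Python, not how Elasticsearch implements mappings, and the sample record is made up.

```python
import ipaddress
from datetime import datetime

# Field types copied from the mapping excerpt; other fields stay strings.
MAPPING = {"dip": "ip", "dport": "integer", "duration": "float", "etime": "date"}

# Python equivalent of the pattern yyyy/MM/dd'T'HH:mm:ss.SSS
SILK_TIME = "%Y/%m/%dT%H:%M:%S.%f"


def coerce(field, value):
    """Convert one string value to the type its mapping entry names."""
    kind = MAPPING.get(field, "string")
    if kind == "ip":
        return str(ipaddress.ip_address(value))  # raises ValueError if invalid
    if kind == "integer":
        return int(value)
    if kind == "float":
        return float(value)
    if kind == "date":
        return datetime.strptime(value, SILK_TIME)
    return value


def coerce_record(record):
    """Apply coerce() to every field of a flow record."""
    return {field: coerce(field, value) for field, value in record.items()}


if __name__ == "__main__":
    raw = {"dip": "192.168.1.5", "dport": "80",
           "duration": "1.500", "etime": "2014/01/13T10:15:31.623"}
    print(coerce_record(raw))
```

Typed fields are what make range queries on ports, durations, and times behave numerically rather than as string comparisons.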

Kibana

Components
Kibana: Features
Pure JavaScript: connects directly to Elasticsearch
  A reverse proxy will be necessary to limit access
Graphing/visualization: histograms, scatter plots, pie charts, ranked lists, maps, and line graphs
Statistics: trends, min, mean, and max
Real-time search: simultaneous queries, sortable results, filters, field drill-down, and derived (faceted) queries

Components
Development
The developers of Kibana and Logstash were recently hired by Elasticsearch
There is a possibility of even tighter integration in the future

More possibilities
Logs
  Web, database, e-mail, and DNS servers
  Firewalls, IDS/IPS, switches, and routers
  Syslog and Windows events
Monitoring alerts: SNMP
Performance metrics
Others?
  If it's textual and log-like, it'll probably work
  Custom plugins are possible
Gather related data to correlate events in Kibana or through the Elasticsearch API

More possibilities
Parsing
Problem? Regex complexity
[0-9]+-(?:0?[1-9]|1[0-2])-(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]) (?:2[0123]|[01][0-9]):(?:[0-5][0-9]):(?:(?:[0-5][0-9]|60)(?:[.,][0-9]+)?), (?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))

More possibilities
Parsing
Logstash provides built-in parsing ("grok") rules:
HTTPDATE %{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{INT}
Common Apache log format:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
Complete rule:
APACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)
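A grok rule is essentially a regular expression assembled from named sub-patterns. As a rough illustration, the sketch below hand-expands the rule above into a Python regex with named groups. The sub-patterns are simplified stand-ins for the real grok definitions (for example `\S+` instead of IPORHOST), so it accepts more input than grok would.

```python
import re

# Simplified hand expansion of the APACHELOG rule; group names follow
# the field names used in the grok rule.
APACHELOG = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?:(?P<verb>\w+) (?P<request>\S+)(?: HTTP/(?P<httpversion>[0-9.]+))?'
    r'|(?P<rawrequest>[^"]*))" '
    r'(?P<response>\d+) (?:(?P<bytes>\d+)|-)'
)


def parse_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = APACHELOG.match(line)
    if match is None:
        return None
    # Drop groups that did not take part in the match (e.g. rawrequest).
    return {name: value for name, value in match.groupdict().items()
            if value is not None}


if __name__ == "__main__":
    line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
            '"GET /apache_pb.gif HTTP/1.0" 200 2326')
    print(parse_line(line))
```

All captured values are strings, just as in grok, so numeric fields such as `response` and `bytes` still need a type conversion or a mapping downstream.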

DEMO

References and resources
http://logstash.net
http://www.elasticsearch.org
http://www.elasticsearch.org/overview/kibana/
Try Kibana yourself: http://demo.kibana.org
Debug grok parsing rules: http://grokdebug.herokuapp.com
SiLK Kibana 3 demo video: https://vimeo.com/71393353
Contact: max.putas@gmail.com

?