Log Analysis with the ELK Stack (Elasticsearch, Logstash and Kibana) Gary Smith, Pacific Northwest National Laboratory




A Little Context
- The Five Golden Principles of Security:
  - Know your system.
  - Principle of Least Privilege.
  - Defense in Depth.
  - Protection is key but detection is a must.
  - Know your enemy.

In the Days of Chinook
- For most of Chinook's lifetime, the MSC used the "free" version of Splunk to review the syslogs.
- Splunk Inc. has an interesting licensing model that's sort of like an all-you-can-eat buffet where you pay by the pound:
  - The more you ingest, the more you pay.
  - If you ingest < 500MB of logs a day, Splunk is "free".
  - If you go over that limit too many times, Splunk will continue to index your logs but you can't view them until you pay them $$$ or you reinstall Splunk.
- Consequently, I was always fiddling with Syslog-NG's rules to keep the cruft out and keep the daily log data < 500MB.

The Dawning of Cascade
- When the talk about what would later be known as Cascade started ramping up, I started looking at a replacement for Splunk because I knew that I would not be able to keep under the 500MB limit with two supercomputers in operation.
- The price for a commercial license for Splunk for the amount of log data the MSC's systems produced would be prohibitive.

Alternatives to Splunk
- I looked at a lot of alternatives to Splunk. These are just some of them:
  - Graylog2
  - Nxlog
  - Octopussy
  - Logscape
  - ELSA
  - LOGanalyzer
  - Logalyzer
  - Logwatcher
  - loghound
  - logreport
  - Logsurfer
  - PHP-Syslog-NG

Alternatives to Splunk (continued)
- Some wouldn't build. Some wouldn't work.
- Some were slow.
- Some had an abysmal user interface.
- Almost all of them had a MySQL, PostgreSQL, or similar relational database backend for storage and retrieval.

The Solution: ELK Stack [Elasticsearch, Logstash, Kibana]
- Elasticsearch: Indexing, storage and retrieval engine
- Logstash: Log input slicer and dicer and output writer
- Kibana: Data displayer

Early Experiments With The Stack
- In the early stages of the ELK stack, the pieces didn't play well together.
- Early versions of Logstash needed specific versions of Elasticsearch and those weren't the latest ones.
- This caused some problems because Kibana wanted the latest version of Elasticsearch.
- So I tried a couple of alternatives to ELK.

EFK
- EFK => Elasticsearch Fluentd Kibana
- This worked pretty well.
- Good points:
  - With Fluentd, you install it, point it at Elasticsearch, point your syslogs at Fluentd and you're good to go.
- Bad points:
  - There's not much you can do to extend Fluentd to do things with the syslog events coming in.

ERK
- ERK => Elasticsearch Rsyslogd Kibana
- There's an Rsyslogd plugin that takes syslog events and sends them to Elasticsearch.
- Good points:
  - Much like Fluentd, you install it, point it at Elasticsearch and point your syslogs at Rsyslogd and you're good to go.
- Bad points:
  - The plugin requires the very latest version of Rsyslogd, so you have to build the latest version of Rsyslogd and the plugin.
  - Then, you have to maintain the version of Rsyslogd and the plugin since it's two major revisions above what's available in RHEL.

Finally, One Big Happy Family
- The dysfunctional aspects of the ELK stack got worked out.
- Now the members of the ELK stack play well together after being unified with help from the Elasticsearch people.

Components of The ELK Stack [Elasticsearch, Logstash, Kibana]

Logstash
- Logstash was developed by Jordan Sissel when he was a system administrator at DreamHost.
- Jordan needed something that could handle a peak of 20,000 messages per second.
- Logstash is easy to set up, scalable, and easy to extend.

Logstash Hosts
- In most cases there are two broad classes of Logstash hosts:
  - Hosts running the Logstash agent as an event forwarder that sends your application, service, and host logs to a central Logstash server.
  - Central Logstash hosts running some combination of archiver, indexer, search, storage, and web interface software, which receive, process, and store your logs.

Logstash Basic Configuration File
- A basic configuration file for Logstash has 3 sections:
  - input
  - filter
  - output

The Input Section
- Inputs are the mechanism for passing log data to Logstash. Some of the more useful, commonly used ones are:
  - file: reads from a file on the filesystem, much like the UNIX command "tail -f"
  - syslog: listens on the well-known port 514 for syslog messages and parses them according to the RFC 3164 format
  - lumberjack: processes events sent in the lumberjack protocol, used by the shipper that is now called logstash-forwarder

The Filter Section
- Filters are the workhorses for processing inputs in the Logstash chain.
- They are often combined with conditionals in order to perform a certain action on an event if it matches particular criteria.
- Some useful filters:
  - grok: parses arbitrary text and structures it.
    - Grok is currently the best way in Logstash to parse unstructured log data into something structured and queryable.
    - With 120 patterns shipped built into Logstash, it's more than likely you'll find one that meets your needs!
  - mutate: allows you to do general mutations to fields. You can rename, remove, replace, and modify fields in your events.
  - drop: drops an event completely, for example, debug events.
  - geoip: adds information about the geographical location of IP addresses (and displays amazing charts in Kibana).
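
As a rough illustration of what grok and a conditional drop do to an event, the sketch below uses Python's re module with named groups to split one RFC 3164-style syslog line into fields. It is not Logstash code: the pattern, the field names (timestamp, host, program, pid, message) and the sample line are assumptions made up for this example, and the patterns shipped with Logstash (for instance SYSLOGBASE) are more thorough.

```python
import re

# Hypothetical pattern approximating a classic syslog line.
# Named groups play the role of grok's %{PATTERN:field} captures.
SYSLOG_RE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = SYSLOG_RE.match(line)
    return match.groupdict() if match else None


def keep_event(event):
    """Mimic a drop filter inside a conditional: discard debug messages."""
    return event is not None and "DEBUG" not in event["message"]


sample = "Mar  3 14:07:55 node42 sshd[2211]: Failed password for root from 10.0.0.9"
event = parse_syslog(sample)
print(event["program"], event["pid"], keep_event(event))
```

Once a line has been broken into named fields like this, each field can be searched on its own, which is what makes the parsed events queryable downstream.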

The Output Section
- Outputs are the final phase of the Logstash pipeline.
- An event may pass through multiple outputs during processing, but once all outputs are complete, the event has finished its execution.
- Some commonly used outputs include:
  - elasticsearch: If you're planning to save your data in an efficient, convenient and easily queryable format, Elasticsearch is the way to go. Period. Yes, we're biased :)
  - file: writes event data to a file on disk.
  - statsd: a service which "listens for statistics, like counters and timers, sent over UDP and sends aggregates to one or more pluggable backend services". If you're already using statsd, this could be useful for you!

Elasticsearch: Hard made Easy
- Elasticsearch is a powerful indexing and search tool.
- The Elasticsearch team says, "Elasticsearch is a response to the claim, 'Search is hard'".
- Elasticsearch is easy to set up, has search and index data available RESTfully as JSON over HTTP and is easy to scale and extend.
- It's released under the Apache 2.0 license and is built on top of Apache's Lucene project.

Elasticsearch: How it works
- Elasticsearch is a text indexing search engine.
- The best metaphor to describe Elasticsearch is the index of a book.
- You flip to the back of a book, look up a word and then find the reference page.
- This means that rather than searching text strings directly, Elasticsearch creates an index from incoming text and performs searches on the index rather than the content.
- As a result, it is very fast.
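
The book-index metaphor can be made concrete with a toy inverted index. This is only a sketch in plain Python, assuming whitespace tokenization and exact word matches; the function names and sample log lines are invented for the example, and the real engine (Lucene) also handles text analysis, relevance scoring and on-disk storage.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, *words):
    """Return ids of documents containing every word (an AND query)."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()


logs = {
    1: "sshd failed password for root",
    2: "cron session opened for root",
    3: "sshd accepted publickey for gary",
}
index = build_index(logs)
print(sorted(search(index, "sshd", "root")))
```

A lookup touches only the entries for the query words, never the full text of every document, which is where the speed comes from.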

Elasticsearch Configuration
- Elasticsearch is started with a default cluster name of "elasticsearch" and a random node name based on characters from the X-Men.
- A new random node name is selected each time Elasticsearch is restarted if one has not been chosen.
- If you want to customize your cluster and node names to ensure unique names, edit /etc/elasticsearch/elasticsearch.yml.
- This is Elasticsearch's YAML-based configuration file.

The Kibana Web Interface
- The Kibana web interface is a customizable dashboard that you can extend and modify to suit your environment.
- It allows the querying of events, creation of tables and graphs as well as sophisticated visualizations.
- The Kibana web interface uses the Apache Lucene query syntax to allow you to make queries.
- You can search any of the fields contained in a Logstash event, for example, message, syslog_program, etc.
- You can use Boolean logic with AND, OR, NOT as well as fuzzy searches and wildcard searches.
- You can:
  - Build complex queries (including saving them and displaying the results as a new panel)
  - Graph and visualize data
  - Produce tables and display data on maps and charts.

Troubleshooting: Is It Running?
- How do you tell if Elasticsearch is running?
- Do this:
    curl "http://10.0.0.1:9200/_status?pretty=true"
- This will return a page that contains a variety of information about the state and status of your Elasticsearch server.

Troubleshooting: Are Logstash And Elasticsearch Working Together?
- How can you check to see if Logstash is getting events to Elasticsearch and they are getting indexed?
- Do this:
    curl "http://localhost:9200/_search?q=type:syslog&pretty=true"
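
The same check can be scripted. The following is a minimal sketch, assuming Python 3 and an Elasticsearch node reachable at the given base URL; the helper names build_search_url and count_hits are made up for this example. Only the URL construction runs here, since count_hits needs a live server, and the shape of the hits total it returns depends on the Elasticsearch version.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base, query):
    """Build a URI search URL; a '?' separates the path from the parameters."""
    params = urllib.parse.urlencode({"q": query, "pretty": "true"})
    return f"{base}/_search?{params}"


def count_hits(base, query):
    """Run the search and return the hits total as reported by the server."""
    url = build_search_url(base, query)
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)["hits"]["total"]


url = build_search_url("http://localhost:9200", "type:syslog")
print(url)
```

Letting urlencode assemble the query string escapes the colon as %3A and avoids hand-typing the separators. A non-zero hit count for type:syslog suggests events are arriving and being indexed.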

Troubleshooting: Syntax Checking Your Logstash Configuration File
- After you've written or modified your Logstash configuration file, how do you know it's syntactically correct before you put it into production?
- Do this:
    /opt/logstash/bin/logstash agent -f logstash.conf --configtest

Getting Rid Of Old Data
- One of the things I could never figure out with Splunk is "How do I expire old data out of Splunk?"
- What about Elasticsearch? Can I expire old data out of Elasticsearch and keep only what's recent and relevant?
- As it turns out, like most things in the Linux-Sphere, there's more than one way to do it.
- I have a daily cron job that runs a Perl script that deletes data older than 31 days out of Elasticsearch.
- There is also a Python program called Curator, by Aaron Mildenstein, that helps you curate, or manage, your Elasticsearch indices like a museum curator manages the exhibits and collections on display.
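
The Perl script itself is not shown in the talk. As a hedged sketch of the selection step such a job might perform, the Python below assumes events are stored in daily indices named with Logstash's default logstash-YYYY.MM.DD scheme; the function name, the sample index names and the fixed date are invented for the example.

```python
from datetime import date, datetime, timedelta


def expired_indices(names, today, keep_days=31, prefix="logstash-"):
    """Return index names whose date suffix is more than keep_days old."""
    cutoff = today - timedelta(days=keep_days)
    old = []
    for name in names:
        if not name.startswith(prefix):
            continue  # not a daily Logstash index, leave it alone
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, leave it alone
        if day < cutoff:
            old.append(name)
    return old


names = ["logstash-2015.01.01", "logstash-2015.02.15", "kibana-int"]
print(expired_indices(names, today=date(2015, 3, 1)))
```

Each name returned could then be removed with an HTTP DELETE request against that index. With daily indices, expiring old data means deleting whole indices rather than individual documents, which is also the kind of housekeeping Curator automates.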

Debugging Grok Patterns
- Remember those patterns I was using in the grok filter to parse out the fields in a syslog event? How did I come up with those?
- I used the Grok Debugger at http://grokdebug.herokuapp.com/ to work out the construction of the pattern.
- The Grok Debugger is an interactive web page that facilitates rapid creation of patterns based on sample input.

The Grok Debugger In Action

References
- http://www.elastic.co/
- http://logstash.net/docs/latest
- https://www.elastic.co/products/kibana
- https://github.com/elastic/curator/wiki
- http://www.fluentd.org/
- http://www.rsyslog.com/
- http://grokdebug.herokuapp.com/

Questions?
Gary Smith
Information System Security Officer, Molecular Science Computing
Pacific Northwest National Laboratory, Richland, WA
gary.smith@pnnl.gov