Finding the needle in the haystack with ELK

Similar documents
Using elasticsearch, logstash and kibana to create realtime dashboards

Wie man aus langweiligen Logdateien Gold gewinnen kann

Logging on a Shoestring Budget

Mobile Analytics. mit Elasticsearch und Kibana. Dominik Helleberg

LOG- UND EVENTMANAGEMENT

Powering Monitoring Analytics with ELK stack

Log management with Logstash and Elasticsearch. Matteo Dessalvi

LOG- UND EVENTMANAGEMENT MIT LOGSTASH UND GRAPHITE

Log management with Graylog2 Lennart Koopmann, FrOSCon Mittwoch, 29. August 12

Log Analysis with the ELK Stack (Elasticsearch, Logstash and Kibana) Gary Smith, Pacific Northwest National Laboratory

Real-time Data Analytics mit Elasticsearch. Bernhard Pflugfelder inovex GmbH

Andrew Moore Amsterdam 2015

Analyzing large flow data sets using. visualization tools. modern open-source data search and. FloCon Max Putas

Processing millions of logs with Logstash

Using Logstash and Elasticsearch analytics capabilities as a BI tool

Log managing at PIC. A. Bruno Rodríguez Rodríguez. Port d informació científica Campus UAB, Bellaterra Barcelona. December 3, 2013

Log Management with Open-Source Tools. Risto Vaarandi SEB Estonia

Introduction. Background

Information Retrieval Elasticsearch

Efficient Management of System Logs using a Cloud Radoslav Bodó, Daniel Kouřil CESNET. ISGC 2013, March 2013

April 8th - 10th, 2014 LUG14 LUG14. Lustre Log Analyzer. Kalpak Shah. DataDirect Networks. ddn.com DataDirect Networks. All Rights Reserved.

Log Management with Open-Source Tools. Risto Vaarandi rvaarandi 4T Y4H00 D0T C0M

Using NXLog with Elasticsearch and Kibana. Using NXLog with Elasticsearch and Kibana

Log infrastructure & Zabbix. logging tools integration

Data Discovery and Systems Diagnostics with the ELK stack. Rittman Mead - BI Forum 2015, Brighton. Robin Moffatt, Principal Consultant Rittman Mead

Bernd Ahlers Michael Friedrich. Log Monitoring Simplified Get the best out of Graylog2 & Icinga 2

WHITE PAPER Redefining Monitoring for Today s Modern IT Infrastructures

Comparative Analysis of Open-Source Log Management Solutions for Security Monitoring and Network Forensics

Reliable log data transfer

A Basic Introduction to DevOps Tools

Developing an Application Tracing Utility for Mule ESB Application on EL (Elastic Search, Log stash) Stack Using AOP

Modern Web development and operations practices. Grig Gheorghiu VP Tech Operations Nasty Gal

Building a logging pipeline with Open Source tools. Iñigo Ortiz de Urbina Cazenave

How To Use Elasticsearch

A New Approach to Network Visibility at UBC. Presented by the Network Management Centre and Wireless Infrastructure Teams

FUJITSU Software ServerView Cloud Monitoring Manager V1 Introduction

A Year of HTCondor Monitoring. Lincoln Bryant Suchandra Thapa

Blackboard Open Source Monitoring

NetFlow Analytics for Splunk

An Open Source NoSQL solution for Internet Access Logs Analysis

Log management with Graylog2 Lennart Koopmann, Kieker Days Mittwoch, 5. Dezember 12

CloudStack Metering Working with the Usage Data. Tariq Iqbal Senior

MALAYSIAN PUBLIC SECTOR OPEN SOURCE SOFTWARE (OSS) PROGRAMME. COMPARISON REPORT ON NETWORK MONITORING SYSTEMS (Nagios and Zabbix)

Performance Analysis and Capacity Planing

XpoLog Center Suite Log Management & Analysis platform

SCF/FEF Evaluation of Nagios and Zabbix Monitoring Systems. Ed Simmonds and Jason Harrington 7/20/2009

Scaling Graphite Installations

Cymon.io. Open Threat Intelligence. 29 October 2015 Copyright 2015 esentire, Inc. 1

IceWarp to IceWarp Server Migration

Sisense. Product Highlights.

Scalable Architecture on Amazon AWS Cloud

A Comparative Analysis of Open-Source Log Management Solutions for Security Monitoring and Network Forensics. Risto Vaarandi Paweł Niziński

There are numerous ways to access monitors:

The syslog-ng Premium Edition 5F2

Graylog2 Lennart Koopmann, OSDC /

logstash The Book Log management made easy James Turnbull

Topics. CIT 470: Advanced Network and System Administration. Why Monitoring? Why Monitoring? Historical Monitoring Processes. Historical Monitoring

Application Performance Monitoring of a scalable Java web-application in a cloud infrastructure

DiskPulse DISK CHANGE MONITOR

logstash The Book Log management made easy James Turnbull

@tobiastrelle. codecentric AG 1

SEO - Access Logs After Excel Fails...

syslog-ng: from log collection to processing and information extraction

Monitoring Linux and Windows Logs with Graylog Collector. Bernd Ahlers Graylog, Inc.

Creating Big Data Applications with Spring XD

the missing log collector Treasure Data, Inc. Muga Nishizawa

W3Perl A free logfile analyzer

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack

Centralized logging system based on WebSockets protocol

CI Pipeline with Docker

ZABBIX. An Enterprise-Class Open Source Distributed Monitoring Solution. Takanori Suzuki MIRACLE LINUX CORPORATION October 22, 2009

Figure 29. Filtering options for connection type and the amount of last items to show Figure 30. User Activity View Figure 31.

1. INTERFACE ENHANCEMENTS 2. REPORTING ENHANCEMENTS

syslog-ng: nyers adatból Big Data

APPLICATION PROGRAMMING INTERFACE

FileNet System Manager Dashboard Help

Streamlining Infrastructure Monitoring and Metrics in IT- DB-IMS

CIT 470: Advanced Network and System Administration. Topics. Why Monitoring? System Monitoring

Spoilt for Choice Which Integration Framework to choose? Mule ESB. Integration. Kai Wähner

MySQL Enterprise Monitor

Watch your Flows with NfSen and NFDUMP 50th RIPE Meeting May 3, 2005 Stockholm Peter Haag

Configuring Security for FTP Traffic

Suricata 2.0, Netfilter and the PRC

Configuration Information

FireEye App for Splunk Enterprise

Clusters in the Cloud

Introduction Installation firewall analyzer step by step installation Startup Syslog and SNMP setup on firewall side firewall analyzer startup

Zabbix 3.0. The Simple, the Powerful and the Shiny by Zabbix SIA

DevOoops Increase awareness around DevOps infra security. Gianluca

How Comcast Built An Open Source Content Delivery Network National Engineering & Technical Operations

Search Big Data with MySQL and Sphinx. Mindaugas Žukas

pt360 FREE Tool Suite Networks are complicated. Network management doesn t have to be.

Logentries Insights: The State of Log Management & Analytics for AWS

This presentation explains how to monitor memory consumption of DataStage processes during run time.

Federated SQL on Hadoop and Beyond: Leveraging Apache Geode to Build a Poor Man's SAP HANA. by Christian

Load and Performance Load Testing. RadView Software October

How To Set Up Foglight Nms For A Proof Of Concept

Performance and Health Monitoring and Analysis of Hive Scales Portal Web Application

Transcription:

Finding the needle in the haystack with ELK Elasticsearch for Incident Handlers and Forensic Analysts S by Christophe@Vandeplas.com

Whoami S Working for the Belgian Government my own company S Incident Handling S Malware analysis S Forensics (network + system) S Open Source minded S Creator of MISP Malware Information Sharing Platform S Creator of pystemon pastebin monitoring tool S Core organizer of the FOSDEM conference for many years S Contact me: christophe@vandeplas.com

Finding the needle in the haystack with ELK Elasticsearch for Incident Handlers and Forensic Analysts S by Christophe@Vandeplas.com

image by James Lumb

What tools do you use? S Text logs S notepad S Grep S awk / sed / cut S MS Excel / OOo Calc

image by velorichard.wordpress.com

Optimizing S grep -F log.txt S zgrep -F log.txt S zgrep -f patterns.txt -F log.txt S find "$LOGS_DIR" -iname "*.gz" -print0 parallel --gnu -0 -n1 -P8 zgrep -f patterns.txt F > result-all.txt S Fast for single search, however no column lookup!

Optimizing S MySQL / MS Access S Splunk S free = 500MB/day S ELSA Enterprise Log Search and Archive S Limitation of the # of columns S ${COMMERCIAL_TOOL}

Trick for Splunk Addicts S Limit is 500 MB /day S 3 license violations allowed per month S Set the date to 00:01 AM S Index as much as possible 24h/day for 3 days (while loops are your friend) S Enjoy searching

Trick for all = ELK S Elasticsearch Logstash Kibana S Index as much as you want S No limit on volume, speed or position of the moon S Open Source, Free to use, commercial support logstash kibana

Configurations S https://github.com/cvandeplas/elk-forensics S Repository with Logstash and Kibana configurations S Mactime, BlueCoat, Mail IMSS, IWSVA, IIS, SuperTimeline, Plaso, S http://christophe.vandeplas.com/2014/06/setting-upsingle-node-elk-in-20-minutes.html S Our focus today: S Forensics and Incident Handling S Batch-Import

How does it work? S

Trick for all = ELK S Elasticsearch Logstash Kibana S Index as much as you want S No limit on volume, speed or position-of-the-moon-licensing S Open Source, Free to use, commercial support logstash kibana

Inputs S Inputs & codecs S collectd, drupal_dblog, elasticsearch, eventlog, exec, file, ganglia, gelf, gemfire, generator, graphite, heroku, imap, invalid_input, irc, jmx, log4j, lumberjack, pipe, puppet_facter, rabbitmq, rackspace, redis, relp, s3, snmptrap, sqlite, sqs, stdin, stomp, syslog, tcp, twitter, udp, unix, varnishlog, websocket, wmi, xmpp, zenoss, zeromq S cloudtrail, collectd, compress_spooler, dots, edn, edn_lines, fluent, graphite, json, json_lines, json_spooler, line, msgpack, multiline, netflow, noop, oldlogstashjson, plain, rubydebug, spool S Outputs S Filters

Input Example S I usually don t use file as input S Keeps a reference to the position in the file S TCP socket is the easiest for me S ncat log01.lab.internal 18001 < logfile.log!

Outputs S Inputs & codecs S Outputs S boundary, circonus, cloudwatch, csv, datadog, datadog_metrics, elasticsearch, elasticsearch_http, elasticsearch_river, email, exec, file, ganglia, gelf, gemfire, google_bigquery, google_cloud_storage, graphite, graphtastic, hipchat, http, irc, jira, juggernaut, librato, loggly, lumberjack, metriccatcher, mongodb, nagios, nagios_nsca, null, opentsdb, pagerduty, pipe, rabbitmq, rackspace, redis, redmine, riak, riemann, s3, sns, solr_http, sqs, statsd, stdout, stomp, syslog, tcp, udp, websocket, xmpp, zabbix, zeromq S Filters

Output Example

Filters S Inputs & codecs S Outputs S Filters S advisor, alter, anonymize, checksum, cidr, cipher, clone, collate, csv, date, dns, drop, elapsed, elasticsearch, environment, extractnumbers, fingerprint, gelfify, geoip, grep, grok, grokdiscovery, i18n, json, json_encode, kv, metaevent, metrics, multiline, mutate, noop, prune, punct, railsparallelrequest, range, ruby, sleep, split, sumnumbers, syslog_pri, throttle, translate, unique, urldecode, useragent, uuid, wms, wmts, xml, zeromq

Filter Example

Filter Example

Grok S Named regular expressions to match patterns/extract data. S Logstash ships with lots of patterns! https://github.com/elasticsearch/logstash/tree/master/patterns S Test app: http://grokdebug.herokuapp.com

Testing complex Groks

Data Enrichment with Filters S Extract fields: csv, grok, kv! S Extract date! S Modify using mutate! S Enrich with S Geoip S User-agent S Urldecode S Translate S

Geoip

Geoip

User-Agent

User-Agent

Translate

Translate

Ruby as last resort * There might be a better way to do this, but ruby and I are not really friends yet

Data Enrichment with Filters S Extract fields: csv, grok, kv! S Extract date! S Modify using mutate! S Enrich with S Geoip S User-agent S Urldecode S Translate S

Trick for all = ELK S Elasticsearch Logstash Kibana S Index as much as you want S No limit on volume, speed or season-licensing S Open Source, Free to use, commercial support logstash kibana

Elasticsearch S Wikipedia: Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable fulltext search engine with a RESTful web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License. S Very very fast S Adding an node = easier than extremely easy

Elasticsearch S Be cautious S No security by default S Auto-discovery, auto-distribution if other node is present S Elastic HQ plugin S cd /usr/share/elasticsearch/bin! S./plugin -install royrusso/elasticsearch-hq!

Trick for all = ELK S Elasticsearch Logstash Kibana S Index as much as you want S No limit on volume, speed or horoscope-licensing S Open Source, Free to use, commercial support logstash kibana

Kibana S Fancy GUI S Extremely easy to build up a dashboard S Gives good overview over data S Powerful, but limited in capability S For more: write a python script or use REST API

DO NOT PRESS THIS BUTTON

Search syntax S Apache Lucene Search syntax S title:foo title:"foo bar S title:"foo bar AND body:"quick fox S (title:"foo bar" AND body:"quick fox") OR title:fox S title:foo -title:bar S title:foo*bar S time_taken:[10000 TO 999999999] http://www.lucenetutorial.com/lucene-query-syntax.html

Load dashboards

Filter

Performance S

Performance goals S Focus Incident Handling and Forensics S Max speed of indexing S Max speed of searching S During indexation search may be slow S No need for redundancy S So don t use this advice for operations-live-production

Performance Logstash S Memory setting: (/etc/default/elasticsearch) S LS_HEAP_SIZE="500m"! S Command line flag: S -w or filterworkers AMOUNT_OF_CORES (default: 1)! S Each extra filter slows it down S Grok aka regex = slow S Prefer csv, kv S Use the least possible wildcards (* or +)! S Geoip = slow but very practical S User-agent = slow, often practical

Performance Elasticsearch S Memory setting (/etc/default/elasticsearch) S ES_HEAP_SIZE=12g => set to half of RAM (max 32 GB) S Disable redundancy (/etc/elasticsearch/elasticsearch.yml) S index.number_of_replicas: 0! S Shards for number of nodes (/etc/elasticsearch/elasticsearch.yml) S index.number_of_shards: 1 S Increase memory buffer for search S indices.memory.index_buffer_size: 50%!

Perf. Elasticsearch Indexes S Open Index = memory usage + disk usage Closed Index = disk usage, so close index when not needed S Per case new indexes Similar logs in the same index, but use a field host to differentiate investigations S system timelines: logstash-%{[case]}-%{[type]} S mail logs: logstash-%{[case]}-%{[type]}-%{+yyyy.mm} S proxy logs: logstash-%{[case]}-%{[type]}-%{+yyyy.mm.dd} S curl -XPOST 'localhost:9200/logstash-${case}*/_close' curl -XPOST 'localhost:9200/logstash-${case}*/_open'!

Performance Kibana S Each block/graph is extra search S So 10 graphs equals 10 simultaneous searches 1. First select small date/time window 2. Test your search on small data set 3. Add filters 4. Zoom out on date/time 5. Dig deeper

Keep in mind S Logstash is (relatively) SLOW S Finished? Close the index, do NOT delete it S Or save JSON to files (output plugin Logstash), re-index them later S Node++ = Speed++

Forensic analysis S

Plaso S Plaso = the new log2timeline and more S log2timeline.py win7-64-nfury-10.3.58.6.dump /path/to/ disk/image S psort.py -o elastic win7-64-nfury-10.3.58.6.dump

ELK-forensics S https://github.com/cvandeplas/elk-forensics S Logstash configs S Kibana dashboards S Mactime, Log2timeline csv, BlueCoat, Mail IMSS, IWSVA, IIS S More to come

Other interesting projects using Elasticsearch S Moloch Open Source large scale IPv4 full PCAP capturing, indexing and database system. https://github.com/aol/moloch S Mozdef PoC automate IH process and facilitate realtime activities - https://github.com/jeffbryner/mozdef S Suricata Exports data in EVE format (JSON). Great to visualize malware activity from sandbox

Places to be? https://github.com/cvandeplas/elk-forensics http://www.elasticsearch.org/overview/elkdownloads/ http://logstash.net/ https://groups.google.com/forum/#!forum/logstash-users S