Using Logstash and Elasticsearch analytics capabilities as a BI tool

Pashalis Korosoglou, Pavlos Daoglou, Stefanos Laskaridis, Dimitris Daskopoulos
Aristotle University of Thessaloniki, IT Center
Outline

- Technical stuff (Logstash, Elastic, Kibana, Ansible)
- Motivation for monitoring
- Software licenses
- Other use cases
- Summary and next steps
Logstash

- Written in JRuby
- Applicable well beyond log files
- Plethora of core and community-contributed plugins:
  - I/O plugins
  - Filtering plugins
  - Codecs
- "Take this message and parse/compute/save stuff on the wire"
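The plugin types above fit together in a single pipeline configuration file. A minimal sketch (the stdin input and kv-based parsing are illustrative assumptions, not our production setup):

```
# Minimal Logstash pipeline: read an event from stdin,
# split "key: value" lines into fields, pretty-print the result.
input {
  stdin { }
}

filter {
  kv {
    field_split => "\n"   # one key/value pair per line
    value_split => ":"    # "key: value" separator
  }
}

output {
  stdout { codec => rubydebug }  # print the parsed event for inspection
}
```

In production the stdout output would typically be replaced (or complemented) by an elasticsearch output.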
A very simple pipe example

Input message:

    serviceuri: node03.domain.gr
    hostname: node03.domain.gr
    serviceflavour: service_i
    sitename: SITENAME
    metricStatus: OK
    metricName: org.ldap.freshness
    summarydata: OK: freshness=70s, entries=1
    gatheredat: nagios.domain.gr
    timestamp: 2015-06-08T19:42:31Z
    nagiosName: org.ldap.freshness
    servicetype: service_i
    eot

Parsed event:

    {
        "@timestamp"     => "2015-06-08T19:42:31.000Z",
        "hostname"       => "node03.domain.gr",
        "serviceflavour" => "service_i",
        "sitename"       => "SITENAME",
        "metricstatus"   => "OK",
        "metricname"     => "org.ldap.freshness",
        "freshness"      => 70,
        "entries"        => 1,
        "gatheredat"     => "nagios.domain.gr",
        "probe"          => "org.ldap.freshness",
        "servicetype"    => "service_i"
    }
Logstash forwarders & Lumberjack

- Logstash-forwarder is a lightweight forwarding service
  - Keeps track of the offset within each log file
  - Failure resistant
  - Supports multiple file inputs
- Lumberjack is a collection service
  - Basically one of many input plugins available
  - Uses zlib for compression
  - Secure transmission of logs via OpenSSL
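A sketch of the two ends (hostnames, ports and certificate paths are hypothetical): logstash-forwarder ships selected files over the Lumberjack protocol, and Logstash receives them via the lumberjack input plugin.

```
# logstash-forwarder (client side), e.g. /etc/logstash-forwarder.conf
{
  "network": {
    "servers": [ "logstash.example.org:5043" ],
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [ "/var/log/flexlm/*.log" ],
      "fields": { "type": "flexlm" }
    }
  ]
}
```

```
# Logstash (server side): lumberjack input plugin
input {
  lumberjack {
    port            => 5043
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key         => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}
```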
Architecture Overview
Architecture Overview Logstash forwarder(s) configuration
ElasticSearch (Elastic)

- Distributed data store with near-real-time search capabilities
- Built on top of Apache Lucene
- Exposes an HTTP RESTful API (e.g. for querying)
- Multitenant architecture
- Highly available
  - Shard replication
- Supports 3rd-party plugins (e.g. HQ, head, etc.)
- Apache 2.0 license
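Everything is driven through that REST API. A hypothetical search against a default logstash-* index (host, index and field names are assumptions), counting OK probe results per service flavour with an ES 1.x terms aggregation:

```
curl -XGET 'http://localhost:9200/logstash-*/_search?pretty' -d '
{
  "query": { "match": { "metricstatus": "OK" } },
  "aggs": {
    "per_flavour": { "terms": { "field": "serviceflavour" } }
  }
}'
```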
Elastic, RDBMS & Hadoop concepts

- ES document -> row in an RDBMS table
- ES index -> RDBMS database
  - A collection of documents
- ES mapping -> RDBMS schema definition
- ES shards -> Hadoop splits
  - Each shard is actually a Lucene index
  - An ES index splits into shards
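To illustrate the mapping/schema analogy, a hypothetical ES 1.x mapping for license-checkout documents (index, type and field names are illustrative, not our actual schema):

```
PUT /licenses/_mapping/checkout
{
  "checkout": {
    "properties": {
      "username":     { "type": "string", "index": "not_analyzed" },
      "feature":      { "type": "string", "index": "not_analyzed" },
      "elapsed.time": { "type": "integer" },
      "@timestamp":   { "type": "date" }
    }
  }
}
```

`not_analyzed` keeps usernames and feature names as single terms, so aggregations group on the whole value instead of tokenized fragments.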
Elastic, RDBMS & Hadoop concepts

- Replication:
  - 5 primary shards by default
  - 1 replica for each shard
  - Replicas can't be assigned to the same node as their primary shard
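These defaults can be overridden per index at creation time; a minimal sketch against the ES 1.x REST API (the index name is hypothetical):

```
PUT /licenses
{
  "settings": {
    "number_of_shards":   5,
    "number_of_replicas": 1
  }
}
```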
Kibana

- Node.js frontend for Elastic
- Allows (realtime) visualisation of data
- Flexible interface
  - One can add, remove, move and modify rows and graphs
- Allows different search queries
- Allows save, import, export and share operations for dashboards
Software @ Auth

- More than 20 annual contracts signed (+ a few perpetual)
- The majority relies on the FlexLM service
- Expenditures/year: ~100K
- Use cases:
  - Which departments use software X?
  - Which departments use software X for educational or research purposes?
  - How often is software X's Y component used?
Software @ Auth

The problem(s) with flex logs:

    23:29:06 (deamon) TIMESTAMP 6/3/2015
    0:36:51 (deamon) OUT: "feature" someone@somewhere
    0:39:04 (deamon) IN: "feature" someone@somewhere
    0:54:47 (deamon) DENIED: "feature" someone@somewhere (Licensed number of users already reached. (-4,342))
    0:54:47 (deamon) UNSUPPORTED: "feature" (PORT_AT_HOST_PLUS ) someone@somewhere (License server system does not support this feature. (-18,327))
    0:54:47 (deamon) OUT: "feature" someone@somewhere (2 licenses)
    1:08:08 (deamon) IN: "feature" someone@somewhere (2 licenses)
    1:08:31 (deamon) OUT: "feature" someone@somewhere
    1:10:09 (deamon) IN: "feature" someone@somewhere
    1:13:43 (deamon) UNSUPPORTED: "feature" (PORT_AT_HOST_PLUS ) someone@somewhere (License server system does not support this feature. (-18,327))
    3:16:44 (lmgrd) TIMESTAMP 6/4/2015
Software @ Auth

Our solution (via logstash filtering):

    {
        "_type": "deamon",
        "_source": {
            "message": "19:07:17 (deamon) IN: \"feature\" someone@somewhere",
            "@version": "1",
            "@timestamp": "2015-06-08T16:07:17.000Z",
            "host": "tracker01",
            "tags": [ "taskterminated", "elapsed", "elapsed.match" ],
            "feature": "\"feature\"",
            "username": "someone",
            "hostname": "somewhere",
            "elapsed.time": 67,
            "elapsed.timestamp_start": "2015-06-08T16:06:10.000Z"
        }
    }
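A reconstructed sketch of the kind of filtering involved, pairing each OUT (check-out) with the matching IN (check-in) via the elapsed filter; the grok pattern and the choice of unique_id_field are illustrative assumptions, not our exact production filters:

```
filter {
  # Parse the FlexLM daemon lines: time, action, feature, user@host
  grok {
    match => { "message" => "%{TIME:time} \(%{WORD:vendor}\) %{WORD:action}: %{QS:feature} %{USERNAME:username}@%{HOSTNAME:hostname}" }
  }
  # Tag check-outs as task starts and check-ins as task ends ...
  if [action] == "OUT" { mutate { add_tag => [ "taskstarted" ] } }
  if [action] == "IN"  { mutate { add_tag => [ "taskterminated" ] } }
  # ... so the elapsed filter can compute how long each license was held
  elapsed {
    start_tag       => "taskstarted"
    end_tag         => "taskterminated"
    unique_id_field => "username"
  }
}
```

The elapsed filter is what produces the `elapsed.time` and `elapsed.timestamp_start` fields seen in the event above.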
Software @ Auth (screenshots)
Software @ Auth (screenshots)
Software @ Auth

- Decisions on which contracts to renew, and with how many seat licenses, depend on our accounting/monitoring data
- Actual scenarios/decisions:
  - Renew the contract for software X but reduce the number of floating licenses
  - Renew the annual license for software X but don't renew component Y
Other Use Cases?

- Web services
- Accounting
- Resources usage
- Environmental monitoring
- Logins and brute-force attempts
- Performance metrics
- Any log file (?)
Web services

    filter {
      if [type] == "httpd" {
        grok {
          match => { "message" => "%{COMBINEDAPACHELOG}" }
        }
        geoip {
          source => "clientip"
        }
        mutate {
          convert => { "bytes" => "integer" }
        }
        date {
          match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
        }
      }
    }
Web services
Accounting (local HPC resource)
Accounting (local HPC resource)
Logins (successful and attacks)
Resources Usage
Re-playing

- Log files are still kept in central syslog
- Scratch Elastic completely and everything is reproducible ad hoc:
  - Filters (via Ansible)
  - Log files (via central syslog)
Reporting

- The Elastic API is not reachable from outside
- What if we want to send reports to our users?
- Using the phantomjs framework and rasterize.js we can generate custom weekly, monthly or annual reports in PDF format and share them with our users
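For instance, rasterize.js takes a URL, an output file and a paper size, so a dashboard can be rendered headlessly to PDF (the dashboard URL and output name below are hypothetical):

```
phantomjs rasterize.js \
  'http://kibana.example.org/#/dashboard/licenses' \
  weekly-report.pdf A4
```

A cron job around this command is enough to mail out periodic reports.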
Summary

- The wealth sometimes hidden away in our log files is enormous
- ELK should not be considered a replacement for central logging
  - Rather, it's best treated as an addition to an existing stack
- ELK has helped us in:
  - Indexing data from log files
  - Searching through log files
  - Visualizing data and gaining useful business insight
Next steps

- Performance monitoring via Nagios/Icinga probes & metrics
- Combination with the Hadoop stack
  - Safekeeping cold data
  - Performing combined aggregated queries
  - Lambda (λ) architecture prototype
- Upgrade Elastic and Kibana to 1.5.x
- Apply data retention policies and use Elastic's repository features for long-term storage
Questions?

support@.gr