Finding the needle in the haystack with ELK
Elasticsearch for Incident Handlers and Forensic Analysts
by Christophe@Vandeplas.com

Whoami
- Working for the Belgian Government, my own company
  - Incident Handling
  - Malware analysis
  - Forensics (network + system)
- Open Source minded
  - Creator of MISP, the Malware Information Sharing Platform
  - Creator of pystemon, a pastebin monitoring tool
  - Core organizer of the FOSDEM conference for many years
- Contact me: christophe@vandeplas.com

image by James Lumb
What tools do you use?
- Text logs
  - notepad
  - grep
  - awk / sed / cut
  - MS Excel / OOo Calc

image by velorichard.wordpress.com
Optimizing
- grep -F PATTERN log.txt
- zgrep -F PATTERN log.txt
- zgrep -f patterns.txt -F log.txt
- find "$LOGS_DIR" -iname "*.gz" -print0 | parallel --gnu -0 -n1 -P8 zgrep -f patterns.txt -F > result-all.txt
- Fast for a single search, but no column lookup!
(PATTERN is a placeholder for the fixed string you are searching for.)

Optimizing
- MySQL / MS Access
- Splunk
  - free = 500 MB/day
- ELSA (Enterprise Log Search and Archive)
  - Limitation on the number of columns
- ${COMMERCIAL_TOOL}

Trick for Splunk Addicts
- Limit is 500 MB/day
- 3 license violations allowed per month
- Set the date to 00:01 AM
- Index as much as possible, 24h/day for 3 days (while loops are your friend)
- Enjoy searching

Trick for all = ELK
- Elasticsearch, Logstash, Kibana
- Index as much as you want
- No limit on volume, speed or position of the moon
- Open Source, free to use, commercial support

Configurations
- https://github.com/cvandeplas/elk-forensics
  - Repository with Logstash and Kibana configurations
  - Mactime, BlueCoat, Mail IMSS, IWSVA, IIS, SuperTimeline, Plaso
- http://christophe.vandeplas.com/2014/06/setting-up-single-node-elk-in-20-minutes.html
- Our focus today:
  - Forensics and Incident Handling
  - Batch import

How does it work?
Trick for all = ELK
- Elasticsearch, Logstash, Kibana
- Index as much as you want
- No limit on volume, speed or position-of-the-moon-licensing
- Open Source, free to use, commercial support

Inputs (Logstash pipeline: inputs & codecs, filters, outputs)
- Inputs: collectd, drupal_dblog, elasticsearch, eventlog, exec, file, ganglia, gelf, gemfire, generator, graphite, heroku, imap, invalid_input, irc, jmx, log4j, lumberjack, pipe, puppet_facter, rabbitmq, rackspace, redis, relp, s3, snmptrap, sqlite, sqs, stdin, stomp, syslog, tcp, twitter, udp, unix, varnishlog, websocket, wmi, xmpp, zenoss, zeromq
- Codecs: cloudtrail, collectd, compress_spooler, dots, edn, edn_lines, fluent, graphite, json, json_lines, json_spooler, line, msgpack, multiline, netflow, noop, oldlogstashjson, plain, rubydebug, spool

Input Example
- I usually don't use file as input
  - It keeps a reference to the position in the file
- A TCP socket is the easiest for me
  - ncat log01.lab.internal 18001 < logfile.log

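The ncat one-liner above can also be scripted, which helps when many files have to be batch-imported. What follows is a minimal sketch rather than part of the original deck: the function name `send_log_file` and the chunk size are assumptions, and it presumes a Logstash tcp input is already listening on the given host and port.

```python
import socket


def send_log_file(path, host, port, chunk_size=65536):
    """Stream a log file to a TCP listener; return the number of bytes sent."""
    sent = 0
    # create_connection resolves the host name and opens the TCP connection.
    with socket.create_connection((host, port)) as sock, open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break  # end of file reached
            sock.sendall(chunk)
            sent += len(chunk)
    return sent
```

With the host and port from the slide, usage would be `send_log_file("logfile.log", "log01.lab.internal", 18001)`.
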
Outputs (Logstash pipeline: inputs & codecs, filters, outputs)
- Outputs: boundary, circonus, cloudwatch, csv, datadog, datadog_metrics, elasticsearch, elasticsearch_http, elasticsearch_river, email, exec, file, ganglia, gelf, gemfire, google_bigquery, google_cloud_storage, graphite, graphtastic, hipchat, http, irc, jira, juggernaut, librato, loggly, lumberjack, metriccatcher, mongodb, nagios, nagios_nsca, null, opentsdb, pagerduty, pipe, rabbitmq, rackspace, redis, redmine, riak, riemann, s3, sns, solr_http, sqs, statsd, stdout, stomp, syslog, tcp, udp, websocket, xmpp, zabbix, zeromq

Output Example
Filters (Logstash pipeline: inputs & codecs, filters, outputs)
- Filters: advisor, alter, anonymize, checksum, cidr, cipher, clone, collate, csv, date, dns, drop, elapsed, elasticsearch, environment, extractnumbers, fingerprint, gelfify, geoip, grep, grok, grokdiscovery, i18n, json, json_encode, kv, metaevent, metrics, multiline, mutate, noop, prune, punct, railsparallelrequest, range, ruby, sleep, split, sumnumbers, syslog_pri, throttle, translate, unique, urldecode, useragent, uuid, wms, wmts, xml, zeromq

Filter Example
Grok
- Named regular expressions to match patterns and extract data
- Logstash ships with lots of patterns: https://github.com/elasticsearch/logstash/tree/master/patterns
- Test app: http://grokdebug.herokuapp.com

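To make the "named regular expressions" idea concrete, here is a small Python sketch of how a grok expression such as `%{IP:client} %{WORD:method}` can be expanded into named capture groups. It is an illustration of the concept only, not Logstash's implementation: the `PATTERNS` table is a hypothetical, much looser stand-in for the patterns that ship with Logstash, and the helper names are made up for this example.

```python
import re

# Hypothetical, simplified stand-ins for a few grok patterns.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "NUMBER": r"\d+",
    "URIPATH": r"\S+",
}


def grok_to_regex(expression):
    """Expand every %{NAME:field} reference into a named capture group."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        return "(?P<%s>%s)" % (field, PATTERNS[name])
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, expression))


def parse(expression, line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = grok_to_regex(expression).search(line)
    return match.groupdict() if match else None
```

For example, `parse("%{IP:client} %{WORD:method} %{URIPATH:request} %{NUMBER:bytes}", "10.3.58.6 GET /index.html 1532")` yields a dict with the keys `client`, `method`, `request` and `bytes`.
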
Testing complex Groks
Data Enrichment with Filters
- Extract fields: csv, grok, kv
- Extract date
- Modify using mutate
- Enrich with
  - Geoip
  - User-agent
  - Urldecode
  - Translate

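As a rough picture of what the date, urldecode and translate steps do to an event, the sketch below applies the same three ideas to a plain Python dict. It is not Logstash code: the field names, the timestamp format and the `STATUS_NAMES` lookup table are assumptions chosen for illustration.

```python
from datetime import datetime
from urllib.parse import unquote

# Hypothetical lookup table, in the spirit of the translate filter.
STATUS_NAMES = {"200": "OK", "403": "Forbidden", "404": "Not Found"}


def enrich(event):
    """Return a copy of the event with a parsed date, decoded URL and status name."""
    enriched = dict(event)
    # "Extract date": parse the textual timestamp into an ISO 8601 string.
    parsed = datetime.strptime(event["timestamp"], "%Y-%m-%d %H:%M:%S")
    enriched["@timestamp"] = parsed.isoformat()
    # "Urldecode": turn %20 and friends back into readable characters.
    enriched["url"] = unquote(event["url"])
    # "Translate": map a code to a human-readable value via a dictionary.
    enriched["status_name"] = STATUS_NAMES.get(event["status"], "unknown")
    return enriched
```

The original event is left untouched, so the raw values stay available next to the enriched ones.
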
Geoip
User-Agent
Translate
Ruby as last resort
* There might be a better way to do this, but Ruby and I are not really friends yet

Trick for all = ELK
- Elasticsearch, Logstash, Kibana
- Index as much as you want
- No limit on volume, speed or season-licensing
- Open Source, free to use, commercial support

Elasticsearch
- Wikipedia: "Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License."
- Very, very fast
- Adding a node = easier than extremely easy

Elasticsearch
- Be cautious
  - No security by default
  - Auto-discovery, auto-distribution if another node is present
- Elastic HQ plugin
  - cd /usr/share/elasticsearch/bin
  - ./plugin -install royrusso/elasticsearch-hq

Trick for all = ELK
- Elasticsearch, Logstash, Kibana
- Index as much as you want
- No limit on volume, speed or horoscope-licensing
- Open Source, free to use, commercial support

Kibana
- Fancy GUI
- Extremely easy to build up a dashboard
- Gives a good overview of the data
- Powerful, but limited in capability
- For more: write a Python script or use the REST API

DO NOT PRESS THIS BUTTON
Search syntax
- Apache Lucene search syntax
  - title:foo
  - title:"foo bar"
  - title:"foo bar" AND body:"quick fox"
  - (title:"foo bar" AND body:"quick fox") OR title:fox
  - title:foo -title:bar
  - title:foo*bar
  - time_taken:[10000 TO 999999999]
- http://www.lucenetutorial.com/lucene-query-syntax.html

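The same Lucene query strings can be sent straight to Elasticsearch from a script, wrapped in a `query_string` query. The sketch below only builds the HTTP request, using the standard library. The function name `build_search_request`, the base URL and the `logstash-*` index pattern are assumptions for illustration, and the Content-Type header is what recent Elasticsearch versions expect.

```python
import json
import urllib.request


def build_search_request(base_url, index, lucene_query, size=10):
    """Build a POST request for the _search endpoint using a query_string query."""
    body = {"query": {"query_string": {"query": lucene_query}}, "size": size}
    return urllib.request.Request(
        url="%s/%s/_search" % (base_url.rstrip("/"), index),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen()` would run the search against a reachable node, for example `build_search_request("http://localhost:9200", "logstash-*", "time_taken:[10000 TO 999999999]")`.
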
Load dashboards
Filter
Performance
Performance goals
- Focus: Incident Handling and Forensics
  - Max speed of indexing
  - Max speed of searching
  - While indexing is running, searches may be slow
  - No need for redundancy
- So don't use this advice for live production operations

Performance Logstash
- Memory setting (/etc/default/logstash): LS_HEAP_SIZE="500m"
- Command line flag: -w or --filterworkers AMOUNT_OF_CORES (default: 1)
- Each extra filter slows it down
  - Grok aka regex = slow
  - Prefer csv, kv
  - Use as few wildcards (* or +) as possible
  - Geoip = slow but very practical
  - User-agent = slow, often practical

Performance Elasticsearch
- Memory setting (/etc/default/elasticsearch)
  - ES_HEAP_SIZE=12g => set to half of RAM (max 32 GB)
- Disable redundancy (/etc/elasticsearch/elasticsearch.yml)
  - index.number_of_replicas: 0
- Match the number of shards to the number of nodes (/etc/elasticsearch/elasticsearch.yml)
  - index.number_of_shards: 1
- Increase the memory buffer for indexing
  - indices.memory.index_buffer_size: 50%

Perf. Elasticsearch Indexes
- Open index = memory usage + disk usage
- Closed index = disk usage only, so close an index when it is not needed
- New indexes per case
- Similar logs go in the same index, but use a host field to differentiate investigations
  - system timelines: logstash-%{[case]}-%{[type]}
  - mail logs: logstash-%{[case]}-%{[type]}-%{+yyyy.MM}
  - proxy logs: logstash-%{[case]}-%{[type]}-%{+yyyy.MM.dd}
- curl -XPOST "localhost:9200/logstash-${case}*/_close"
- curl -XPOST "localhost:9200/logstash-${case}*/_open"

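The naming scheme and the close/open calls above can be scripted per case. The following is a sketch under assumptions, not tooling from the elk-forensics repository: the helper names, the `granularity` parameter and the base URL are invented for this example. It only builds the index names and the HTTP requests, leaving it to the caller to send them.

```python
import urllib.request
from datetime import date


def index_name(case, log_type, day=None, granularity=None):
    """Build logstash-<case>-<type>, optionally suffixed with year.month or year.month.day."""
    name = "logstash-%s-%s" % (case, log_type)
    if granularity == "month":
        name += day.strftime("-%Y.%m")
    elif granularity == "day":
        name += day.strftime("-%Y.%m.%d")
    return name


def index_state_request(base_url, case, action):
    """Build the POST request that closes or opens every index of one case."""
    if action not in ("_close", "_open"):
        raise ValueError("action must be '_close' or '_open'")
    url = "%s/logstash-%s*/%s" % (base_url.rstrip("/"), case, action)
    return urllib.request.Request(url, data=b"", method="POST")
```

Low-volume sources such as system timelines can stay in a single index per case, while high-volume proxy logs get one index per day, so old days can be closed individually. Note that Elasticsearch index names must be lowercase.
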
Performance Kibana
- Each block/graph is an extra search
- So 10 graphs equals 10 simultaneous searches
1. First select a small date/time window
2. Test your search on a small data set
3. Add filters
4. Zoom out on date/time
5. Dig deeper

Keep in mind
- Logstash is (relatively) SLOW
- Finished? Close the index, do NOT delete it
- Or save JSON to files (Logstash output plugin) and re-index them later
- Node++ = Speed++

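Re-indexing saved JSON later does not have to go through Logstash again: the Elasticsearch bulk API accepts newline-delimited JSON, with an action line before each document. The sketch below assumes the saved file holds one JSON document per line; the function name `bulk_body` is made up for this example, and depending on the Elasticsearch version the action line may also need a `_type`.

```python
import json


def bulk_body(index, json_lines):
    """Turn an iterable of JSON document lines into a _bulk request body."""
    parts = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        document = json.loads(line)  # fails early on a corrupt line
        parts.append(json.dumps({"index": {"_index": index}}))
        parts.append(json.dumps(document))
    # The bulk API requires every line, including the last, to end with a newline.
    return "\n".join(parts) + "\n" if parts else ""
```

The resulting string can be POSTed to the `_bulk` endpoint in batches, which skips the filter stage that makes the first import slow.
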
Forensic analysis
Plaso
- Plaso = the new log2timeline and more
- log2timeline.py win7-64-nfury-10.3.58.6.dump /path/to/disk/image
- psort.py -o elastic win7-64-nfury-10.3.58.6.dump

ELK-forensics
- https://github.com/cvandeplas/elk-forensics
  - Logstash configs
  - Kibana dashboards
- Mactime, Log2timeline csv, BlueCoat, Mail IMSS, IWSVA, IIS
- More to come

Other interesting projects using Elasticsearch
- Moloch: open source, large-scale IPv4 full PCAP capturing, indexing and database system. https://github.com/aol/moloch
- MozDef: PoC to automate the IH process and facilitate real-time activities. https://github.com/jeffbryner/mozdef
- Suricata: exports data in EVE format (JSON). Great for visualizing malware activity from a sandbox

Places to be?
- https://github.com/cvandeplas/elk-forensics
- http://www.elasticsearch.org/overview/elkdownloads/
- http://logstash.net/
- https://groups.google.com/forum/#!forum/logstash-users