How to Turn Boring Log Files into Gold
About me
Klaus Bild
Senior System Architect
  IBM Connections/Sametime/TDI
  Monitoring/Log Management
  Infrastructure (Cloud, Docker)
Blog: http://kbild.ch
http://linkedin.com/in/kbild
https://www.xing.com/profile/klaus_bild
Log file
A log file (also event log file; German: Logdatei) contains the automatically maintained record of all or selected actions of processes on a computer system. The correct German term is therefore "Protokolldatei" (protocol file). Important applications are found above all in process control and automation. In principle, every action that is or might be needed for a later investigation (audit) is recorded. The flight recorder in aircraft is an example of continuous logging that is rarely evaluated, for instance only after an accident. In the database field, a log file is the file in which changes made to the database by correctly completed (committed) transactions are recorded, so that the current data can be restored after a failure (e.g. a system crash).
Source: https://de.wikipedia.org/wiki/logdatei
When do you consult logs?
Never: you are not an admin or developer
If something went wrong (and a user reported it):
  What happened?
  Where?
  When?
  Why?
But
Multi-tier systems:
  Multiple servers
  Multiple applications
  Multiple databases
  Multiple systems
Log Sources
Infrastructure: Servers, Containers, Web servers, Load balancers, PaaS / IaaS
Databases: Queries, Errors
Appliances: Routers, Switches, Firewalls
Sensors: IoT, Industrie 4.0, Home automation
Tools: Configuration, Automation, Analytics tools, Alerting tools, Chat tools
Front End: Log-ins, Form completions, Important click events
Applications / APIs: Requests, Error handling, Successes, Failed attempts, Privilege changes, Object manipulation
Logs
Log examples:

[01988:00243-3598456576] 18.01.2016 08:49:35 Opened session for WGMob01/WGC/CH (Release 9.0.1FP4)

[41732479.416668] [INT_2_VYATTA-default-D]IN=bond1 OUT=bond1.2036 MAC=00:00:5e:00:01:01:00:08:e3:ff:fd:90:08:00 SRC=95.26.112.172 DST=81.95.156.246 LEN=106 TOS=0x00 PREC=0x00 TTL=55 ID=27102 PROTO=ICMP TYPE=3 CODE=3 [SRC=81.95.156.246 DST=95.26.112.172 LEN=78 TOS=0x08 PREC=0x20 TTL=235 ID=62876 DF PROTO=UDP SPT=15798 DPT=53 LEN=58 ]

220.160.156.109 - - [18/Jan/2016:01:54:22-0600] "POST /savenewsubmit.do HTTP/1.1" 200 6687 "http://www.logfilesarecool.net/createsubmit.do?submitid=4418324" "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; MATP; rv:11.0) like Gecko

[1/18/16 8:46:05:061 CET] 000001b6 IndexBuilderQ I com.ibm.connections.search.admin.index.impl.indexbuilderqueue build CLFRW0285I: Search is starting to build the index for wikis.
Visualization of Logs = Gold
Visualization of Logs
Gives you:
  Operational Visibility: Gain end-to-end visibility across your operations and break down silos across your infrastructure
  Search and Investigation: Find and fix problems, correlate events across multiple data sources and automatically detect patterns across massive sets of data
  Proactive Monitoring: Monitor systems in real time to identify issues, problems and attacks before they impact your customers, services and revenues
  Business Insights: Make better-informed business decisions by understanding trends and patterns and by gaining operational intelligence from machine data
Visualization of Logs
The Solution - ELK Stack
The ELK stack
Elasticsearch:
  Lucene-based search engine (Java stack)
  Distributed by design
  REST API over HTTP
  Data exchanged in JSON format
Logstash:
  Ruby-based agent application
  Collects log data in numerous input formats
  Filters can be applied
  Many output formats supported
Kibana:
  Flexible analytics and visualization platform
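To make "REST API over HTTP" and "JSON format" concrete, here is a minimal Python sketch that only builds a search request and does not send it. The host `http://localhost:9200`, the index name `logstash-2016.03.15`, and the helper `build_search_request` are assumptions for illustration and do not come from the slides; the `_search` endpoint and the `match` query are standard Elasticsearch features.

```python
import json
from urllib.parse import quote


def build_search_request(base_url, index, field, text, size=10):
    """Return (url, json_body) for an Elasticsearch search request.

    Hypothetical helper: it only assembles the request, it sends nothing.
    """
    url = f"{base_url.rstrip('/')}/{quote(index)}/_search"
    body = {
        "size": size,                        # maximum number of hits to return
        "query": {"match": {field: text}},   # full-text match on a single field
    }
    return url, json.dumps(body)


# Assumed host and index name, chosen only for this example.
url, payload = build_search_request(
    "http://localhost:9200", "logstash-2016.03.15", "response", "200"
)
print(url)
print(payload)
```

The resulting URL and JSON body could then be posted with any HTTP client, for example `curl` with a `Content-Type: application/json` header.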
WebGate environment
[Architecture diagram. Labels: Agents/Shipper, Filebeat, Broker, Filter/Indexer, Search/Storage, Web Interface/Visualizer, Docker containers]
Logstash
Input: beats, couchdb_changes, drupal_dblog, elasticsearch, exec, eventlog, file, ganglia, gelf, generator, graphite, github, heartbeat, heroku, http, http_poller, irc, imap, jdbc, jmx, kafka, log4j, lumberjack, meetup, pipe, puppet_facter, relp, rss, rackspace, rabbitmq, redis, snmptrap, stdin, sqlite, s3, sqs, stomp, syslog, tcp, twitter, unix, udp, varnishlog, wmi, websocket, xmpp, zenoss, zeromq

Output: boundary, circonus, csv, cloudwatch, datadog, datadog_metrics, email, elasticsearch, elasticsearch_java, exec, file, google_bigquery, google_cloud_storage, ganglia, gelf, graphtastic, graphite, hipchat, http, irc, influxdb, juggernaut, jira, kafka, lumberjack, librato, loggly, mongodb, metriccatcher, nagios, null, nagios_nsca, opentsdb, pagerduty, pipe, riemann, redmine, rackspace, rabbitmq, redis, riak, s3, sqs, stomp, statsd, solr_http, sns, syslog, stdout, tcp, udp, webhdfs, websocket, xmpp, zabbix, zeromq
Logstash
Filter: aggregate, alter, anonymize, collate, csv, cidr, clone, cipher, checksum, date, de_dot, dns, drop, elasticsearch, extractnumbers, environment, elapsed, fingerprint, geoip, grok, i18n, json, json_encode, kv, mutate, metrics, multiline, metaevent, prune, punct, ruby, range, syslog_pri, sleep, split, throttle, translate, uuid, urldecode, useragent, xml, zeromq

Log entry/message:
84.74.43.46 - - [15/Mar/2016:08:41:00 +0100] "GET /files/basic/api/myfilesync/feed?page=1&pagesize=500&includeconflict=true HTTP/1.1" 200 1323 "-" "IBM-LC-IBM Connections sync/1602.3033.1103 (Mac OS X 10.10.5)"

[Diagram: Log Entry/Message -> Filters -> Document with Field 1 (i.e. Source IP), Field 2, Field 3, Field 4, Field 5]
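The flow from message through filters to a document with fields can be sketched in a few lines of Python. This is only an illustration of the idea, not how Logstash is implemented: `kv_filter` and `convert_ints` are hypothetical stand-ins loosely modeled on the `kv` and `mutate` filters, and the sample message is an abbreviated version of the firewall log line shown earlier.

```python
import re
from typing import Callable, Dict, List

Event = Dict[str, object]


def kv_filter(event: Event) -> Event:
    """Add one field per key=value pair found in the message (kv-like)."""
    out = dict(event)
    for key, value in re.findall(r"(\w+)=(\S+)", str(event["message"])):
        out[key.lower()] = value
    return out


def convert_ints(event: Event) -> Event:
    """Turn purely numeric string fields into integers (mutate/convert-like)."""
    return {
        k: int(v) if isinstance(v, str) and v.isdigit() else v
        for k, v in event.items()
    }


def run_pipeline(message: str, filters: List[Callable[[Event], Event]]) -> Event:
    """Wrap the raw message in an event and pass it through each filter in order."""
    event: Event = {"message": message}
    for apply_filter in filters:
        event = apply_filter(event)
    return event


# Abbreviated firewall log line (shortened for this sketch).
SAMPLE = "IN=bond1 OUT=bond1.2036 SRC=95.26.112.172 DST=81.95.156.246 LEN=106 TTL=55 PROTO=ICMP"

doc = run_pipeline(SAMPLE, [kv_filter, convert_ints])
print(doc)
```

Each filter receives the event and returns an enriched copy, so the order of the filter list matters: the conversion step only sees fields that the key/value step has already extracted.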
Logstash
Example (HTTP access log):

84.74.43.46 - - [15/Mar/2016:08:41:00 +0100] "GET /files/basic/api/myfilesync/feed?page=1&pagesize=500&includeconflict=true HTTP/1.1" 200 1323 "-" "IBM-LC-IBM Connections sync/1602.3033.1103 (Mac OS X 10.10.5)"

filter {
  if [type] == "apache_access" {
    # Split the raw access log line into named fields.
    # The pattern is single-quoted so the literal double quotes inside it need no escaping.
    grok {
      match => { "message" => '%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}' }
    }
    # (the filter block continues on the next slide)

Resulting fields:
  clientip: 84.74.43.46
  timestamp: 15/Mar/2016:08:41:00 +0100
  verb: GET
  request: /files/basic/api/myfilesync/feed?page=1&pagesize=500&includeconflict=true
  httpversion: 1.1
  response: 200
  bytes: 1323
  referrer: "-"
  agent: "IBM-LC-IBM Connections sync/1602.3033.1103 (Mac OS X 10.10.5)"
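For readers who do not have Logstash at hand, the effect of the grok pattern can be approximated with a plain regular expression. The following Python sketch is a simplified stand-in, not the grok implementation: the character classes are looser than the real `IPORHOST`, `HTTPDATE`, and `QS` patterns, the `rawrequest` fallback is omitted, and `parse_access_line` is a hypothetical helper name.

```python
import re

# Simplified regex mirroring the field names used in the grok pattern.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\w+) (?P<request>\S+)(?: HTTP/(?P<httpversion>[\d.]+))?" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'(?P<referrer>"[^"]*") (?P<agent>"[^"]*")'
)

LINE = (
    '84.74.43.46 - - [15/Mar/2016:08:41:00 +0100] '
    '"GET /files/basic/api/myfilesync/feed?page=1&pagesize=500&includeconflict=true HTTP/1.1" '
    '200 1323 "-" "IBM-LC-IBM Connections sync/1602.3033.1103 (Mac OS X 10.10.5)"'
)


def parse_access_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


fields = parse_access_line(LINE)
for name, value in fields.items():
    print(f"{name}: {value}")
```

As with grok, every captured value is a string at this point; converting `response` or `bytes` to numbers would be a separate step.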
Logstash
Example (HTTP access log):

84.74.43.46 - - [15/Mar/2016:08:41:00 +0100] "GET /files/basic/api/myfilesync/feed?page=1&pagesize=500&includeconflict=true HTTP/1.1" 200 1323 "-" "IBM-LC-IBM Connections sync/1602.3033.1103 (Mac OS X 10.10.5)"

    # (continued from the previous slide, inside the filter and if blocks)
    # Parse the access log timestamp; MMM is the month name, HH the 24-hour clock.
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
    # Look up the client IP and build a [longitude, latitude] array.
    geoip {
      source => "clientip"
      target => "geoip"
      database => "/etc/logstash/geolitecity.dat"
      add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
      add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    }
    # Derive browser and operating system fields from the user agent string.
    useragent {
      source => "agent"
      add_tag => [ "browser" ]
    }
  }
}

Resulting fields:
  os_name: Mac OS X
  timestamp: 15/Mar/2016:08:41:00 +0100
  agent: "IBM-LC-IBM Connections sync/1602.3033.1103 (Mac OS X 10.10.5)"
  os_major: 10
  clientip: 84.74.43.46
  geoip.country_code3: CHE
  os_minor: 10
  geoip.location: 8.298599999999993, 47.06030000000001
  name: Other
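The date step is easy to get wrong because the pattern letters are case-sensitive. As a rough Python analogue of what the date filter does with the access log timestamp, the sketch below parses the string and normalizes it to UTC. The helper name `parse_httpdate` is an assumption for illustration, and `%b` relies on an English (or C) locale for the month abbreviation.

```python
from datetime import datetime, timezone


def parse_httpdate(timestamp):
    """Parse an access log timestamp such as '15/Mar/2016:08:41:00 +0100'.

    %b = abbreviated month name, %H = 24-hour clock, %z = numeric UTC offset.
    """
    return datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z")


local = parse_httpdate("15/Mar/2016:08:41:00 +0100")
utc_iso = local.astimezone(timezone.utc).isoformat()
print(utc_iso)
```

Storing the event time in UTC keeps events from servers in different time zones comparable on one timeline.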
Visualization of Logs
Gives you: Operational Visibility, Search and Investigation, Proactive Monitoring, Business Insights
  IBM Solutions Log Management
  Centralized Log Management
  Security Monitoring
  Performance Monitoring
  Data Analysis
Costs
All ELK Stack products are
Installation and configuration: a couple of days