syslog-ng: nyers adatból Big Data



Similar documents
syslog-ng: from log collection to processing and information extraction

syslog-ng 3.0 Monitoring logs with Nagios

The syslog-ng Premium Edition 5F2

The syslog-ng Open Source Edition 3.7 Administrator Guide

The syslog-ng Premium Edition 5LTS

Syslog (Centralized Logging and Analysis) Jason Healy, Director of Networks and Systems

Centralized Logging With syslog ng. Ryan Ma6eson h6p://prefetch.net

Log Management with Open-Source Tools. Risto Vaarandi SEB Estonia

The syslog-ng Open Source Edition 3.6 Administrator Guide

Log Management with Open-Source Tools. Risto Vaarandi rvaarandi 4T Y4H00 D0T C0M

Performance Guideline for syslog-ng Premium Edition 5 LTS

What is new in syslog-ng Premium Edition 4 F1

Log management with Logstash and Elasticsearch. Matteo Dessalvi

Performance measurements of syslog-ng Premium Edition 4 F1

RSA Security Analytics

Log Analysis with the ELK Stack (Elasticsearch, Logstash and Kibana) Gary Smith, Pacific Northwest National Laboratory

PCI DSS compliance and log management

Efficient Management of System Logs using a Cloud Radoslav Bodó, Daniel Kouřil CESNET. ISGC 2013, March 2013

Syslog & xinetd. Stephen Pilon

The syslog-ng Open Source Edition 3.4 Administrator Guide

The syslog-ng Open Source Edition 3.5 Administrator Guide

RSA Event Source Configuration Guide. F5 Big-IP Local Traffic Manager

Ease the rsyslog admin's life... Rainer Gerhards

Logging on a Shoestring Budget

The syslog-ng Open Source Edition 3.5 Administrator Guide

Centralized logging system based on WebSockets protocol

Red Condor Syslog Server Configurations

The syslog-ng Store Box 3 F2

Computer Security DD2395

Comparative Analysis of Open-Source Log Management Solutions for Security Monitoring and Network Forensics

Unless otherwise noted, all references to STRM refer to STRM, STRM Log Manager, and STRM Network Anomaly Detection.

NXLOG Community Edition Reference Manual for v

Challenge 10 - Attack Visualization The Honeynet Project / Forensic Challenge 2011 /

Log managing at PIC. A. Bruno Rodríguez Rodríguez. Port d informació científica Campus UAB, Bellaterra Barcelona. December 3, 2013

1. Stem. Configuration and Use of Stem

RSA Authentication Manager

Processing millions of logs with Logstash

IronKey Enterprise File Audit Admin Guide

The syslog-ng Store Box 4 LTS Administrator Guide

Building a logging pipeline with Open Source tools. Iñigo Ortiz de Urbina Cazenave

The syslog-ng 3.0 Administrator Guide

The syslog-ng Premium Edition 5 F3 Administrator Guide

Andrew Moore Amsterdam 2015

Accellion Secure File Transfer

Distributed syslog architectures with syslog-ng Premium Edition

April 8th - 10th, 2014 LUG14 LUG14. Lustre Log Analyzer. Kalpak Shah. DataDirect Networks. ddn.com DataDirect Networks. All Rights Reserved.

The syslog-ng Premium Edition 5 LTS Administrator Guide

A Plan for the Continued Development of the DNS Statistics Collector

Automated Data Ingestion. Bernhard Disselhoff Enterprise Sales Engineer

The Automatic HTTP Requests Logging and Replaying Subsystem for CMS Plone

Big Data Analytics in LinkedIn. Danielle Aring & William Merritt

XGenPlus Installation Guide

Security Correlation Server Quick Installation Guide

TPAf KTl Pen source. System Monitoring. Zenoss Core 3.x Network and

Hinemos ver.2 Installation manual


The syslog-ng Open Source Edition 3.2 Administrator Guide

Analyzing large flow data sets using. visualization tools. modern open-source data search and. FloCon Max Putas

The syslog-ng Store Box 3 LTS

What is new in syslog-ng Premium Edition 5 F3

Troubleshooting This document outlines some of the potential issues which you may encouter while administering an atech Telecoms installation.

Windows Quick Start Guide for syslog-ng Premium Edition 5 LTS

Developing an Application Tracing Utility for Mule ESB Application on EL (Elastic Search, Log stash) Stack Using AOP

Data Discovery and Systems Diagnostics with the ELK stack. Rittman Mead - BI Forum 2015, Brighton. Robin Moffatt, Principal Consultant Rittman Mead

Guideline for stresstest Page 1 of 6. Stress test

Cloud Control Panel (CCP) Installation Guide

syslog-ng Product Line

Security Correlation Server Quick Installation Guide

Syslog Configuration for Auditing

Information Retrieval Elasticsearch

The Windows Web Platform. Michael Epprecht Microsoft Switzerland twitter: fastflame

Graylog2 Lennart Koopmann, OSDC /

HIPAA Compliance Use Case

Network Monitoring & Management Log Management

Kafka & Redis for Big Data Solutions

VX Search File Search Solution. VX Search FILE SEARCH SOLUTION. User Manual. Version 8.2. Jan Flexense Ltd.

LANGuardian Integration Guide

EventTracker Windows syslog User Guide

Mobile Analytics. mit Elasticsearch und Kibana. Dominik Helleberg

Sentimental Analysis using Hadoop Phase 2: Week 2

Logging with syslog-ng, Part One

CERT-In Indian Computer Emergency Response Team Handling Computer Security Incidents

Network Monitoring & Management Log Management

Volume SYSLOG JUNCTION. User s Guide. User s Guide

HAProxy. Free, Fast High Availability and Load Balancing. Adam Thornton 10 September 2014

F-Secure Messaging Security Gateway. Deployment Guide

Technology Overview. Lawful Intercept Using Flow-Tap. Published: Copyright 2014, Juniper Networks, Inc.

Configuring and Using the TMM with LDAP / Active Directory

the missing log collector Treasure Data, Inc. Muga Nishizawa

MyPBX Security Configuration Guide

Big Data Course Highlights

DiskBoss. File & Disk Manager. Version 2.0. Dec Flexense Ltd. info@flexense.com. File Integrity Monitor

EXPLORER. TFT Filter CONFIGURATION

The HTTP Plug-in. Table of contents

Linux VPS with cpanel. Getting Started Guide

syslog-ng Store Box PRODUCT DESCRIPTION Copyright BalaBit IT Security All rights reserved.

Transcription:

syslog-ng: nyers adatból Big Data 2015. vday, Budapest Czanik Péter / Balabit

About me Peter Czanik from Hungary Community manager at BalaBit: syslog-ng upstream Doing syslog-ng packaging, support, advocating BalaBit is an IT security company with development HQ in Budapest, Hungary Over 200 employees: the majority are engineers 2

syslog-ng Logging: recording events, like this one: Jan 14 11:38:48 linux-0jbu sshd[7716]: Accepted publickey for root from 127.0.0.1 port 48806 ssh2 syslog-ng: enhanced logging daemon, with a focus on central log collection. Not only syslog Parsing, rewriting, filtering messages Storing to a central location or forwarding to a wide variety of destinations 3

C-3PO (Star Wars) 4

syslog-ng and Big Data syslog-ng can facilitate the data pipeline to Big Data in many ways: Data collector Data processor Data filtering 5

syslog-ng: data collector Collect system and application logs together: contextual data for either side Receive syslog messages over the network Legacy or RFC5424, UDP/TCP/TLS A wide variety of platform specific sources: /dev/log & Co Journal, Sun streams Logs or any kind of data from applications: Through files, sockets, pipes, etc. Application output 6

syslog-ng: processing Process messages close to the source: easier filtering, lower load on the consumer side classify, normalize and structure logs with built-in parsers: CSV-parser, DB-parser (PatternDB), JSON parser rewrite messages: for example anonymization Reformatting messages using templates: Enrich data: Destination might need a specific format (ISO date, JSON, etc.) GeoIP, additional fields based on message content 7

Main uses: syslog-ng: data filtering Message routing (login events to SIEM, smtp logs to separate file, etc.) Throw away surplus logs (don't store debug level messages to SQL) Many possibilities: Based on message content, parameters or macros Using comparisons, wildcards, regular expressions and functions Combining all of these with boolean operators 8

syslog-ng Big Data destinations Distributed file systems: Hadoop NoSQL databases: MongoDB Elasticsearch Messaging systems: Kafka 9

Free-form log messages Most log messages are: date + hostname + text Mar 11 13:37:56 linux-6965 sshd[4547]: Accepted keyboardinteractive/pam for root from 127.0.0.1 port 46048 ssh2 Text = English sentence with some variable parts Easy to read by a human Difficult to process them with scripts 10

Solution: structured logging Events represented as name-value pairs Example: an ssh login: source_ip=192.168.123.45 app=sshd user=root syslog-ng: name-value pairs inside Date, facility, priority, program name, pid, etc. Parsers in syslog-ng can turn unstructured and some structured data (csv, JSON) into name value pairs 11

JSON parser Turns JSON based log messages into name-value pairs {"PROGRAM":"prg00000","PRIORITY":"info","PID":"1234","MESSAGE":"s eq: 0000000000, thread: 0000, runid: 1374490607, stamp: 2013-07- 22T12:56:47 MESSAGE... ","HOST":"localhost","FACILITY":"auth","DATE":"Jul 22 12:56:47"} 12

csv parser csv-parser: parses columnar data into fields parser p_apache { csv-parser(columns("apache.client_ip", "APACHE.IDENT_NAME", "APACHE.USER_NAME", "APACHE.TIMESTAMP", "APACHE.REQUEST_URL", "APACHE.REQUEST_STATUS", "APACHE.CONTENT_LENGTH", "APACHE.REFERER", "APACHE.USER_AGENT", "APACHE.PROCESS_TIME", "APACHE.SERVER_NAME") flags(escape-double-char,strip-whitespace) delimiters(" ") quote-pairs('""[]') ); }; destination d_file { file("/var/log/messages-${apache.user_name:-nouser}"); }; log { source(s_local); parser(p_apache); destination(d_file);}; 13

PatternDB parser PatternDB message parser: Can extract useful information from unstructured messages into name-value pairs Add status fields based on message text Message classification (like LogCheck) Needs XML describing log messages Example: an ssh login failure: user=root, source_ip=192.168.123.45, action=login, status=failure classified as violation 14

Anonymizing messages Many regulations about what can be logged PCI-DSS: credit card numbers Europe: IP addresses, user names Locating sensitive information: Regular expressions: slow, works also in unknown logs Patterndb: fast, only in known log messages Anonymizing: Overwrite it with constant Overwrite it with a hash of the original 15

Language bindings in syslog-ng The primary language of syslog-ng is C: High performance: processes a lot more EPS than interpreted languages Not everything is implemented in C Rapid prototyping is easier in interpreted languages Python & Java destinations in syslog-ng, Lua & Perl in incubator Embedded interpreter Message or full range of name value pairs can be passed Proper error handling 16

Java based Big Data destinations Most of Big Data is written in Java C and Python clients exist, but Java is official and maintained together with the server component More effort to get started: Due to missing JARs and build tools (gradle) not yet in distributions libjvm.so needs to be added to LD_LIBRARY_PATH https://czanik.blogs.balabit.com/2015/08/getting-started-with-syslog-ng-3-7-1-and-elasticsearch-hadoop-kafka/ 17

Configuration Don't Panic Simple and logical, even if looks difficult first Pipeline model: Many different building blocks (sources, destinations, filters, parsers, etc.) Connected using log statements into a pipeline 18

@version:3.7 @include "scl.conf" syslog-ng.conf: global options # this is a comment :) options { flush_lines (0); # [...] keep_hostname (yes); }; 19

syslog-ng.conf: sources source s_sys { system(); internal(); }; source s_net { }; udp(ip(0.0.0.0) port(514)); 20

syslog-ng.conf: destinations destination d_mesg { file("/var/log/messages"); }; destination d_es { elasticsearch( index("syslog-ng_${year}.${month}.${day}") type("test") cluster("syslog-ng") template("$(format-json --scope rfc3164 --scope nv-pairs --exclude R_DATE --key ISODATE)\n"); ); }; 21

syslog-ng.conf: filters, parsers filter f_nodebug { level(info..emerg); }; filter f_messages { level(info..emerg) and not (facility(mail) or facility(authpriv) or facility(cron)); }; parser pattern_db { }; db-parser(file("/opt/syslog-ng/etc/patterndb.xml") ); 22

syslog-ng.conf: logpath log { source(s_sys); filter(f_messages); destination(d_mesg); }; log { source(s_net); source(s_sys); filter(f_nodebug); parser(pattern_db); destination(d_es); flags(flow-control); }; 23

Patterndb & ElasticSearch & Kibana 24

Kafka Publish subscribe messaging Data backbone for data driven organizations LinkedIn Spotify Kafka destination is already in syslog-ng Source is planned 25

syslog-ng benefits for Big Data High performance reliable log collection Simplified architecture Single application for both syslog and application data Easier to use data Parsed and presented in a ready to use format Lower load on destinations Efficient message filtering and routing 26

Joining the community syslog-ng: http://syslog-ng.org/ Source on GitHub: https://github.com/balabit/syslog-ng Mailing list: https://lists.balabit.hu/pipermail/syslog-ng/ IRC: #syslog-ng on freenode University students: open trainee positions! syslog-ng universe 27

Questions? Questions? My blog: http://czanik.blogs.balabit.com/ My e-mail: peter.czanik@balabit.com 28

End 29

Sample XML <?xml version='1.0' encoding='utf-8'?> <patterndb version='3' pub_date='2010-07-13'> <ruleset name='opensshd' id='2448293e-6d1c-412c-a418-a80025639511'> <pattern>sshd</pattern> <rules> <rule provider="patterndb" id="4dd5a329-da83-4876-a431-ddcb59c2858c" class="system"> <patterns> <pattern>accepted @ESTRING:usracct.authmethod: @for @ESTRING:usracct.username: @from @ESTRING:usracct.device: @port @ESTRING:: @@ANYSTRING:usracct.service@</pattern> </patterns> <examples> <example> <test_message program="sshd">accepted password for bazsi from 127.0.0.1 port 48650 ssh2</test_message> <test_values> <test_value name="usracct.username">bazsi</test_value> <test_value name="usracct.authmethod">password</test_value> <test_value name="usracct.device">127.0.0.1</test_value> <test_value name="usracct.service">ssh2</test_value> </test_values> </example> </examples> <values> <value name="usracct.type">login</value> <value name="usracct.sessionid">$pid</value> <value name="usracct.application">$program</value> <value name="secevt.verdict">accept</value> </values> </rule> 30