Efficient Management of System Logs using a Cloud
Radoslav Bodó, Daniel Kouřil
CESNET, ISGC 2013, March 2013
Agenda: Introduction, Collecting logs, Log processing, Advanced analysis, Summary
Introduction
Status: NGI MetaCentrum.cz
- approx. 750 worker nodes
- web servers
- support services
Motivation: a central logging service for security operations
Goals
- secure delivery: encrypted, authenticated channel
- reliable delivery: system resilient to a central server outage
- availability: redundant central server -- not this time
Collecting logs
- linux + logging = syslog
- forwarding logs with the syslog protocol: UDP, TCP, RELP; TLS, GSS-API
- NGI MetaCentrum: Debian environment, Kerberized environment
- rsyslogd forwarding logs over a GSS-API protected channel
rsyslogd shipper
- omgssapi.so -- client
- forwarding is an action
- the action queue must be non-direct
- the queue must be limited; a full queue must not block the main queue
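A shipper along these lines might be sketched as follows, in legacy rsyslog directive syntax. The server name and queue limits are illustrative, and the exact omgssapi action syntax varies between rsyslog versions -- consult the omgssapi documentation:

```
# load the GSS-API forwarding output module
$ModLoad omgssapi

# non-direct, size-limited action queue so a dead central server
# cannot block the main message flow
$ActionQueueType LinkedList
$ActionQueueSize 10000
$ActionQueueDiscardMark 9500
$ActionQueueTimeoutEnqueue 0
$ActionResumeRetryCount -1

# forward everything over the GSS-API protected channel
*.* :omgssapi:central.example.org:514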
rsyslogd server
- imgssapi.so -- server
- nothing really special
- listener per IP layout
- service logs
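The server side might look like this; a minimal sketch, with an illustrative port, and directive spelling to be checked against the imgssapi documentation for the rsyslog version in use:

```
# load the GSS-API input module and start a listener
$ModLoad imgssapi
$InputGSSServerServiceName host
$InputGSSServerPermitPlainTCP off
$InputGSSServerRun 514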
rsyslogd GSS patches
- the original GSS-API plugins have not been maintained since 3.x
- the plugin does not reflect internal changes in rsyslogd >> occasional segfaults/asserts
- not quite nice even after the upstream hotfix: no more segfaults, but SYN storms (v5, v6, ?v7)
- a new omgssapi based on the old one + the actual omfwd (tcp forward)
- contributed to the public domain but not merged yet; we'll try to push it again into v7
rsyslogd testbed
- developing a multithreaded application that works with strings and networking is an error-prone process
- a virtual testbed is used every time to test the produced builds
rsyslogd wrapup
- in production for about a year
- approx. 90% node coverage (700 nodes)
- 100GB per month, 2GB compressed with 7zip
- monitoring: nagios, cron scripts
Log processing
Why centralized logging? Having logs in a single place allows us to do centralized do_magic_here
Classic approach: grep, perl, cron, tail -f
- alerting from PBS logs: jobs_too_long
jobs_too_long
Log processing
Classic approach: grep, perl, cron, tail -f; alerting from PBS logs (jobs_too_long)
- perl is fine, but not quite fast enough for 100GB of data
- for analytics a database must be used, but planning comes first...
The size
- the grid scales, logs keep growing
- a scaling DB must be used: clustering, partitioning
- MySQL, PostgreSQL, ...
The structure strikes back
- logs are not just text lines, but rather a nested structure:
  LOG ::= TIMESTAMP DATA
  DATA ::= LOGSOURCE PROGRAM PID MESSAGE
  MESSAGE ::= M1 M2
- logs differ a lot between products: kernel, mta, httpd, ssh, kdc, ...
- and that does not play well with an RDBMS (with fixed data structures)
A new hope?
NoSQL databases
- emerging technology, cloud technology, scaling technology, c00l technology
- focused on ElasticSearch and MongoDB
ElasticSearch
- a full-text search engine built on top of the Lucene library
- meant to be distributed: autodiscovery; automatic sharding/partitioning; dynamic replica (re)allocation
- various clients available already
REST or native protocol
- PUT indexname & data (json documents)
- GET _search?dsl_query...
- an index will speed up the query
- ElasticSearch is not meant to face the public world: no authentication (no problem!!), no encryption
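To illustrate the REST interface, the sketch below only builds the request bodies; the host, index name, and document values are made up, and no ES node is actually contacted:

```python
import json

# Index a log event: the body of a hypothetical
# PUT http://es-node.example.org:9200/logs-2013.01.15/syslog/1
doc = {
    "@timestamp": "2013-01-15T12:34:56Z",
    "host": "wn042.example.org",
    "program": "sshd",
    "message": "Accepted publickey for root",
}
put_body = json.dumps(doc)

# Query with the search DSL: the body of a hypothetical
# GET http://es-node.example.org:9200/logs-2013.01.15/_search
# full-text match on the message field
query = {"query": {"match": {"message": "Accepted publickey"}}}
get_body = json.dumps(query)

print(put_body)
print(get_body)
```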
Private cloud testbed
- a private cloud has to be created in the grid
- cluster members are created as jobs
- the cluster is interconnected by a private VLAN
- a proxy handles traffic in and out
Turning logs into structures
- rsyslogd: omelasticsearch, ommongodb
  LOG ::= TIMESTAMP DATA
  DATA ::= LOGSOURCE PROGRAM PID MESSAGE
  MESSAGE ::= M1 M2...
- Logstash: grok, flexible architecture
logstash -- libgrok
- a language and parsing library of reusable regular expressions
- by Jordan Sissel
Grokked syslog
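A rough Python equivalent of a grok syslog pattern shows what grokking extracts; the pattern is simplified and the field names are chosen here to mirror the LOG ::= structure above, not taken from the actual grok pattern files:

```python
import re

# Simplified stand-in for a grok pattern such as
# %{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{PROG}(\[%{POSINT}\])?: %{GREEDYDATA}
SYSLOG_RE = re.compile(
    r"(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[\s:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)"
)

line = "Jan 15 12:34:56 wn042 sshd[1234]: Accepted publickey for root from 10.0.0.1"
# one flat text line becomes a structured event
event = SYSLOG_RE.match(line).groupdict()
print(event)
```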
logstash -- arch
- event processing pipeline: input | filter | output
- many IO plugins, flexible...
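The pipeline above might be configured along these lines; a minimal sketch, with an illustrative port and ES host, and plugin option names that changed between logstash releases:

```
input {
  tcp { port => 5514 type => "syslog" }
}
filter {
  # parse the raw line into structured fields
  grok { match => [ "message", "%{SYSLOGLINE}" ] }
}
output {
  elasticsearch { host => "es-node.example.org" }
}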
Log processing proxy ES + LS + Kibana... or even simpler (ES embedded in LS)
btw Kibana: a web frontend for LS + ES
Performance
- the proxy parser might not be enough for grid logs; creating a cloud service is easy with LS, all we need is a spooling service >> redis
Speeding things up
- batching, bulk indexing
- rediser: bypassing logstash internals overhead on a hot spot (the proxy)
- Logstash does not implement all necessary features yet: http time flush, synchronized queue...
- custom plugins, working with upstream...
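The redis spool might be wired up like this; a sketch with a hypothetical spool host and list key, using the logstash redis plugins in list mode:

```
# proxy side: spool events into a redis list instead of parsing them
output {
  redis { host => "spool.example.org" data_type => "list" key => "syslog" }
}

# cloud parser side: pop events from the same list and parse in parallel
input {
  redis { host => "spool.example.org" data_type => "list" key => "syslog" }
}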
Cloud parser
LS + ES wrapup
Upload testdata: logs from January 2013, 105GB -- approx. 800M events
- uploaded in 4h on an 8-node ESD cluster with 16 shared parsers (LS on ESD)
- 4-node cluster: 8h
- speed varies because of the data (lots of small msgs)
- during normal operations a large cluster is not needed
LS + ES wrapup
Speed of ES upload depends on:
- size of grokked data and final documents
- batch/flush size of input and output processing
- filters used during processing
- LS outputs share a sized queue, which can block processing (lanes:)
- elasticsearch index (template) settings...
Tuning for top speed is a manual job (graphite, ...)
LS + ES wrapup search speed ~
Advanced log analysis
- ES is a fulltext search engine, not a database, but for analytics a DB is necessary
MongoDB:
- Document-Oriented Storage: schemaless document storage
- Auto-Sharding
- MapReduce and aggregation framework
Advanced log analysis
MongoDB
- can be fed with grokked data by Logstash
- sshd log analysis
MapReduce
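The idea behind such an MR job can be sketched in plain Python; the event fields and values are hypothetical, standing in for grokked sshd messages, and the two phases mirror what MongoDB's map and reduce functions would do:

```python
from collections import defaultdict

# hypothetical grokked sshd events (stand-ins for documents in a collection)
events = [
    {"program": "sshd", "result": "fail", "src_ip": "10.0.0.1"},
    {"program": "sshd", "result": "fail", "src_ip": "10.0.0.1"},
    {"program": "sshd", "result": "ok",   "src_ip": "10.0.0.2"},
]

def map_phase(event):
    # emit (key, value) pairs, as a MongoDB map function would
    if event["program"] == "sshd" and event["result"] == "fail":
        yield event["src_ip"], 1

def reduce_phase(pairs):
    # sum the emitted values per key
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

# failed sshd logins per source IP
fails_per_ip = reduce_phase(p for e in events for p in map_phase(e))
print(fails_per_ip)  # {'10.0.0.1': 2}
```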
Mongomine
- on top of a created collection
- time-based aggregations (profiling, browsing)
- custom views (mapcrackers): mapremoteresultsperday.find( {time= last 14days, result={fail}, count>20} )
- external data (Warden, ...)
Mongomine
- Logstash + MongoDB application: sshd log analysis, security events analysis
- python bottle webapp, Google charts
- automated reporting: successful logins from mapcrackers, Warden...
Mongomine
Mongomine wrapup
Testcase: 20GB -- January 2013
- 1 MongoDB node, 24 CPUs, 20 shards
- 1 parser node, 6 LS parsers
Speed:
- upload -- approx. 8h (no bulk inserts :()
- 1st MR job -- approx. 4h
- incremental MR during normal ops -- approx. 10s
Overall schema rsyslogd + testbed LS + ES LS + Mongomine + Ext
Lessons learned
- the proxy is the most important place; we need to make it redundant
- virtualization is cool: helps with development and operations
- more analytics, more external data, more speed
Summary
It works
- the system scales according to current needs
- custom patches published
- the solution is ready to accept new data with any structure or almost none
Features
- collecting -- rsyslog
- processing -- logstash
- high-interaction interface -- ES, kibana
- analysis and alerting -- mongomine
Questions?
now or...
https://wiki.metacentrum.cz/wiki/user:bodik
mailto:bodik@civ.zcu.cz
mailto:kouril@ics.muni.cz