Using Logstash and Elasticsearch analytics capabilities as a BI tool
Transcription
1 Using Logstash and Elasticsearch analytics capabilities as a BI tool
Pashalis Korosoglou, Pavlos Daoglou, Stefanos Laskaridis, Dimitris Daskopoulos
Aristotle University of Thessaloniki, IT Center
2 Outline
- Technical stuff (Logstash, Elastic, Kibana, Ansible)
- Motivation for monitoring
- Software licenses
- Other use cases
- Summary and next steps
3 Logstash
- Written in JRuby
- Applicable well beyond log files
- Plethora of core and community-contributed plugins
  - I/O plugins
  - Filtering plugins
  - Codecs
- "Take this msg and parse/compute/save stuff on the wire"
4 A very simple pipe example
Input message:
  serviceuri: node03.domain.gr\nhostname: node03.domain.gr\nserviceflavour: service_i\nsitename: SITENAME\nmetricStatus: OK\nmetricName: org.ldap.freshness\nsummarydata: OK: freshness=70s, entries=1\ngatheredat: nagios.domain.gr\ntimestamp: T19:42:31Z\nnagiosName: org.ldap.freshness\nservicetype: service_i\neot\n
Resulting event:
  {
    "@timestamp" => " T19:42:31.000Z",
    "hostname" => "node03.domain.gr",
    "serviceflavour" => "service_i",
    "sitename" => "SITENAME",
    "metricstatus" => "OK",
    "metricname" => "org.ldap.freshness",
    "freshness" => 70,
    "entries" => 1,
    "gatheredat" => "nagios.domain.gr",
    "probe" => "org.ldap.freshness",
    "servicetype" => "service_i"
  }
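The transformation on this slide can be approximated outside Logstash. The following is a minimal Python sketch, not the authors' filter configuration: the function name `parse_message` is hypothetical, it assumes real newlines between the `key: value` pairs, and it does not reproduce the field renames (for example `nagiosName` to `probe`) visible in the slide.

```python
import re


def parse_message(raw: str) -> dict:
    """Turn a newline-delimited 'key: value' message into an event dict."""
    event = {}
    for line in raw.split("\n"):
        # Skip the end-of-transmission marker and anything without a key.
        if line == "eot" or ": " not in line:
            continue
        key, value = line.split(": ", 1)
        event[key.lower()] = value
    # Promote numeric metrics from the summary, e.g. "OK: freshness=70s, entries=1".
    summary = event.pop("summarydata", "")
    for name, number in re.findall(r"(\w+)=(\d+)", summary):
        event[name] = int(number)
    return event


sample = (
    "hostname: node03.domain.gr\n"
    "metricStatus: OK\n"
    "summarydata: OK: freshness=70s, entries=1\n"
    "eot\n"
)
event = parse_message(sample)
```

Converting `freshness` and `entries` to integers is what later allows numeric aggregations on those fields.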
5 Logstash forwarders & Lumberjack
- Logstash-forwarder is a lightweight forwarding service
  - Keeps track of offset within log file
  - Failure resistant
  - Supports multiple file inputs
- Lumberjack is a collection service
  - Basically one of many input plugins available
  - Uses zlib for compression
  - Secure transmission of logs via OpenSSL
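Offset tracking is what lets a forwarder resume after a restart without re-sending or skipping lines. The sketch below illustrates the idea only; it is not logstash-forwarder's implementation, and `read_new_lines` is a hypothetical helper name.

```python
def read_new_lines(path, offset):
    """Return (lines, new_offset) for data appended to `path` since `offset`."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Consume complete lines only; a partial trailing line waits for the next poll.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end
```

A caller would persist the returned offset (for example in a small state file) after each successful send, so that a crash between polls loses nothing.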
6 Architecture Overview
7 Architecture Overview Logstash forwarder(s) configuration
8 Elasticsearch (Elastic)
- Distributed data store with near-real-time search capabilities
- Built on top of Apache Lucene
- Exposes an HTTP RESTful API (e.g. for querying)
- Multitenant architecture
- Highly available
- Shard replication
- Supports third-party plugins (e.g. HQ, head)
- Apache 2.0 license
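Because the API is plain HTTP and JSON, a query needs nothing beyond a standard library. The sketch below builds (but does not send) a `_search` request with a `match` query; the host, index name and field used in the usage line are assumptions for illustration, and `build_search_request` is a hypothetical helper.

```python
import json
from urllib.request import Request


def build_search_request(base_url, index, field, value, size=10):
    """Build a POST request for the _search endpoint with a match query."""
    body = {"query": {"match": {field: value}}, "size": size}
    return Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed host and index name; pass the result to urllib.request.urlopen to run it.
request = build_search_request(
    "http://localhost:9200", "logstash-2015.06.03", "metricstatus", "OK", size=5
)
```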
9 Elastic, RDBMS & Hadoop concepts
- ES document -> table row in an RDB
- ES index -> RDB database
  - A collection of documents
- ES mapping -> RDB schema definition
- ES shards -> Hadoop splits
  - Each shard is actually a Lucene index
  - An ES index splits into shards
10 Elastic, RDBMS & Hadoop concepts
- Replication: 1
- 5 primary shards by default
- 1 replica for each shard
- Replicas can't be assigned to the same node as their primary shard
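The arithmetic behind these defaults is worth spelling out: 5 primaries with 1 replica each make 10 shards, and on a single node the 5 replicas stay unassigned because a copy may not share a node with another copy of the same shard. A small sketch of that calculation follows (`shard_layout` is a hypothetical name, and the model ignores other allocation constraints such as disk thresholds):

```python
def shard_layout(primaries=5, replicas=1, nodes=1):
    """Count total, assignable and unassigned shards for one index."""
    total = primaries * (1 + replicas)
    # Each copy of a shard needs its own node, so at most `nodes` copies fit.
    copies_placed = min(1 + replicas, nodes)
    assigned = primaries * copies_placed
    return {"total": total, "assigned": assigned, "unassigned": total - assigned}


single_node = shard_layout(nodes=1)
two_nodes = shard_layout(nodes=2)
```

With one node the sketch reports 5 unassigned shards; adding a second node lets every replica be placed.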
11 Kibana
- Node.js frontend for Elastic
- Allows (realtime) visualisation of data
- Flexible interface
  - One can add, remove, move, modify rows and graphs
- Allows different search queries
- Allows save, import, export and share operations for dashboards
12 Auth
- More than 20 annual contracts signed (plus a few perpetual)
- The majority rely on the FlexLM service
- Expenditures/year: ~100K
Use cases
- Which departments use software X?
- Which departments use software X for educational or research purposes?
- How often is software X's Y component used?
13 Auth
The problem(s) with flex logs:
  23:29:06 (deamon) TIMESTAMP 6/3/2015
  0:36:51 (deamon) OUT: "feature" someone@somewhere
  0:39:04 (deamon) IN: "feature" someone@somewhere
  0:54:47 (deamon) DENIED: "feature" someone@somewhere (Licensed number of users already reached. (-4,342))
  0:54:47 (deamon) UNSUPPORTED: "feature" (PORT_AT_HOST_PLUS ) someone@somewhere (License server system does not support this feature. (-18,327))
  0:54:47 (deamon) OUT: "feature" someone@somewhere (2 licenses)
  1:08:08 (deamon) IN: "feature" someone@somewhere (2 licenses)
  1:08:31 (deamon) OUT: "feature" someone@somewhere
  1:10:09 (deamon) IN: "feature" someone@somewhere
  1:13:43 (deamon) UNSUPPORTED: "feature" (PORT_AT_HOST_PLUS ) someone@somewhere (License server system does not support this feature. (-18,327))
  3:16:44 (lmgrd) TIMESTAMP 6/4/2015
14 Auth
Our solution (via Logstash filtering):
  {
    "_type": "deamon",
    "_source": {
      "message": "19:07:17 (deamon) IN: \"feature\" someone@somewhere",
      "@version": "1",
      "@timestamp": " T16:07:17.000Z",
      "host": "tracker01",
      "tags": [ "taskterminated", "elapsed", "elapsed.match" ],
      "feature": "\"feature\"",
      "username": "someone",
      "hostname": "somewhere",
      "elapsed.time": 67,
      "elapsed.timestamp_start": " T16:06:10.000Z"
    }
  }
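The two difficulties with these logs are that event lines carry a time but no date, and that a checkout (OUT) must be paired with its later check-in (IN) to obtain a duration. The following Python sketch illustrates one way to handle both; it is an analogue of the idea, not the authors' Logstash configuration. It assumes TIMESTAMP dates are month/day/year and that a clock value going backwards means midnight has passed; `checkout_durations` is a hypothetical name.

```python
import re
from datetime import datetime, timedelta

PREFIX = re.compile(r'^\s*(\d{1,2}:\d{2}:\d{2}) \((\w+)\) (.*)$')
STAMP = re.compile(r'^TIMESTAMP (\d{1,2}/\d{1,2}/\d{4})')
CHECK = re.compile(r'^(OUT|IN): "([^"]+)" ([^@\s]+)@(\S+)')


def checkout_durations(lines):
    """Pair OUT/IN events per (feature, user, host) and return elapsed seconds."""
    current_date = None
    last_clock = None
    open_checkouts = {}
    sessions = []
    for line in lines:
        prefix = PREFIX.match(line)
        if not prefix:
            continue
        clock = datetime.strptime(prefix.group(1), "%H:%M:%S").time()
        rest = prefix.group(3)
        stamp = STAMP.match(rest)
        if stamp:
            # TIMESTAMP lines are the only place the date appears.
            current_date = datetime.strptime(stamp.group(1), "%m/%d/%Y").date()
            last_clock = clock
            continue
        if current_date is None:
            continue  # no date seen yet, so the event cannot be placed in time
        if last_clock is not None and clock < last_clock:
            current_date += timedelta(days=1)  # assumed midnight rollover
        last_clock = clock
        check = CHECK.match(rest)
        if not check:
            continue  # DENIED, UNSUPPORTED and similar lines are ignored here
        action, feature, user, host = check.groups()
        when = datetime.combine(current_date, clock)
        key = (feature, user, host)
        if action == "OUT":
            open_checkouts[key] = when
        elif key in open_checkouts:
            start = open_checkouts.pop(key)
            sessions.append({
                "feature": feature,
                "username": user,
                "hostname": host,
                "start": start,
                "elapsed": int((when - start).total_seconds()),
            })
    return sessions
```

The sketch keeps only one open checkout per key, so simultaneous checkouts of the same feature by the same user on the same host would need a counter or a stack instead.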
15 Auth (screenshots)
16 Auth (screenshots)
17 Auth
- Decisions on which contracts to continue, and with how many seats, depend on our accounting and monitoring data
- Actual scenarios/decisions
  - Renew contract for software X but reduce the number of floating licenses
  - Renew annual license for software X but don't renew component Y
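A decision such as reducing the number of floating licenses rests on knowing the peak number of simultaneous checkouts. One way to derive that from checkout intervals is a sweep over start and end events, sketched below. This is an illustration under the assumption that each interval consumes exactly one license (lines marked "(2 licenses)" would need a weight); `peak_concurrent` is a hypothetical name.

```python
def peak_concurrent(intervals):
    """Return the maximum number of overlapping (start, end) intervals."""
    changes = []
    for start, end in intervals:
        changes.append((start, 1))
        changes.append((end, -1))
    # Tuples sort by time, then by delta, so a check-in at time t is
    # processed before a checkout at the same t and frees its seat first.
    changes.sort()
    in_use = 0
    peak = 0
    for _, delta in changes:
        in_use += delta
        peak = max(peak, in_use)
    return peak
```

If the peak over a full contract year stays well below the purchased seat count, that supports renewing with fewer floating licenses.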
18 Other Use Cases?
- Web services
- Accounting
- Resources usage
- Environmental monitoring
- Logins and brute force attempts
- Performance metrics
- Any log file (?)
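19 Web services
  filter {
    if [type] == "httpd" {
      # Parse the Apache combined log format into named fields.
      grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
      # Add location data based on the client address.
      geoip { source => "clientip" }
      # Store the response size as a number so it can be aggregated.
      mutate { convert => { "bytes" => "integer" } }
      # Joda-Time pattern: MMM = month name, HH = 24-hour clock, mm = minutes.
      date { match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ] }
    }
  }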
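For readers who want to check what this filter extracts, the parsing, type conversion and timestamp steps can be imitated in Python. This is a rough sketch: the regular expression is a simplified stand-in for the `COMBINEDAPACHELOG` grok pattern, the geoip step is omitted, and `parse_access_line` is a hypothetical name. Note that case matters in the date pattern: in the Joda-Time syntax used by the date filter, `MMM` is the month name, `HH` the 24-hour clock and `mm` the minutes, which corresponds to `%d/%b/%Y:%H:%M:%S %z` in Python.

```python
import re
from datetime import datetime

# Simplified approximation of the Apache combined log format.
COMBINED = re.compile(
    r'(?P<clientip>\S+) \S+ (?P<auth>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) \S+" (?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line):
    """Parse one access-log line into a dict, or return None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Convert the response size to an integer; "-" means no body was sent.
    event["bytes"] = None if event["bytes"] == "-" else int(event["bytes"])
    # Use the request time, not the ingestion time, as the event timestamp.
    event["@timestamp"] = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    return event


sample_line = (
    '192.0.2.10 - - [03/Jun/2015:19:42:31 +0300] '
    '"GET /index.html HTTP/1.1" 200 5120 "-" "curl/7.40.0"'
)
parsed = parse_access_line(sample_line)
```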
20 Web services
21 Accounting (local HPC resource)
22 Accounting (local HPC resource)
23 Logins (successful and attacks)
24 Resources Usage
25 Re-playing
- Log files are still kept in central syslog
- Scratch Elastic completely and everything is reproducible ad hoc
  - Filters (via Ansible)
  - Log files (via central syslog)
26 Reporting
- Elastic API not reachable from outside
- What if we want to send reports to our users?
- Using the PhantomJS framework and rasterize.js we can generate custom weekly, monthly or annual reports in PDF format and share them with our users
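A scheduled job for such reports mostly has to assemble the PhantomJS command line and pick an output name. The sketch below assumes the stock `rasterize.js` example script, which takes a URL, an output file and an optional paper format; the dashboard URL, the file-naming scheme and both function names are hypothetical.

```python
from datetime import date


def weekly_report_name(day, prefix="report"):
    """Name a report after the ISO year and week that `day` falls in."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{prefix}-{iso_year}-W{iso_week:02d}.pdf"


def rasterize_command(dashboard_url, output_pdf, paper="A4",
                      phantomjs="phantomjs", script="rasterize.js"):
    """Build the argument list: phantomjs rasterize.js URL output [paper format]."""
    return [phantomjs, script, dashboard_url, output_pdf, paper]


# Hypothetical internal dashboard URL; run the result with subprocess.run(command, check=True).
command = rasterize_command(
    "http://kibana.internal.example/#/dashboard/licenses",
    weekly_report_name(date(2015, 6, 3)),
)
```

Because the job runs inside the network where the Elastic API is reachable, only the rendered PDF has to leave it.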
27 Summary
- The wealth sometimes hidden away in our log files is enormous
- ELK should not be considered a replacement for central logging
  - Rather, it's best to treat it as an addition to an existing stack
- ELK has helped us in
  - Indexing data from log files
  - Searching through log files
  - Visualizing data and gaining useful business insight
28 Next steps
- Performance monitoring via Nagios/Icinga probes & metrics
- Combination with Hadoop stack
  - Safekeeping cold data
  - Performing combined aggregated queries
  - λ architectural prototype
- Upgrade Elastic and Kibana to 1.5.x
- Apply data retention policies and use Elastic's repository features for long-term storage
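A retention policy over time-based indices can be as simple as selecting the indices whose date falls before a cutoff. The sketch below assumes daily indices named `logstash-YYYY.MM.DD` (Logstash's default naming) and only computes the list; `expired_indices` is a hypothetical name, and snapshotting to a repository before deletion is left to the operator.

```python
from datetime import date, datetime, timedelta


def expired_indices(index_names, today, keep_days, prefix="logstash-"):
    """Return the daily indices that are older than `keep_days` days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not a time-based index, leave it alone
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


to_delete = expired_indices(
    ["logstash-2015.06.01", "logstash-2015.03.01", "kibana-int", "logstash-bad"],
    today=date(2015, 6, 3),
    keep_days=30,
)
```

Each returned name could then be removed with an HTTP DELETE on the index, once a snapshot of it exists.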
29 Questions