Splunk for Data Science


1 Copyright 2014 Splunk Inc. Splunk for Data Science. Tom LaGatta, Data Scientist, Splunk. Olivier de Garrigues, Sr Prof Services Consultant, Splunk.

2 Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

3 3 Key Takeaways
- Data Science is about extracting actionable insights from data.
- Splunk is great for doing Data Science!
- Splunk complements other tools in the Data Science toolkit.

4 About Us
- Tom LaGatta, Data Scientist. Tom joined Splunk in Spring 2014 as a Data Scientist specializing in Probability and Statistics. Tom is an expert on the mathematics of inference, and he enjoys functional programming in languages like Clojure, Haskell & R. At Splunk, Tom is helping to develop our internal and external Data Science program and curriculum. Tom has a Ph.D. in Mathematics from the University of Arizona, and until recently was a Courant Instructor at the Courant Institute at New York University. Tom is based in New York City.
- Olivier de Garrigues, Senior Professional Services Consultant. Olivier is based in London on the EMEA Professional Services team and has helped more than 40 customers in 10 countries on various Splunk projects in the past year and a half. Prior to this, he worked as a quantitative analyst with extensive use of MATLAB and R. He developed a keen interest in machine learning and enjoys dreaming about how to make Splunk better for data scientists, and helped develop the R Project App. Olivier holds an MS in Mathematics of Finance from Columbia University.

5 Splunk for Data Science

6 What is Data Science?
Data Science is about extracting actionable insights from data.
- Helps people make better decisions
- Can be used for automated decision-making
- Data Science is cross-disciplinary, and blends techniques & theories from:
  CS / Programming; Math and Statistics; Machine Learning; Data Mining / Databases; Data Visualization; Substantive / Domain Expertise; Social Science; Communication and Presentation; Accounting, Finance and KPIs; Business Analytics
- Don't be afraid of Data Science!

7 Data Science & Teams
There is no one-size-fits-all data scientist. Data Science teams are made up of people with complementary skill sets.
Source: Schutt & O'Neil. Doing Data Science.

8 Splunk for Data Science
Splunk is great for doing Data Science!
- Integrate, query & visualize all the data:
  Platform for machine data
  Connects with any other data source
- Easy to use:
  Powerful algorithms out-of-the-box
- Sharp visualizations and dashboards:
  Deliver results to both IT & Business users
- Complements other Data Science tools (next slide)

9 Splunk and Data Science Tools
Splunk complements other tools in the Data Science toolkit:
- Hadoop: the workhorse of the Data Science world. Using Hunk, you can integrate Hadoop & HDFS seamlessly into Splunk.
- R & Python: the preferred languages of Data Science. Execute R & Python scripts in your Splunk queries using the R Project App & SDK for Python.
- SQL & other RDBMS: valuable stores for customer & product data. Use Splunk's DB Connect App to mash relational data up with machine data.
- External tools: export finalized data from Splunk using the ODBC Driver. Tip: do all your data processing in Splunk/Hunk, and export only the final results!
- D3 Custom Visualizations: sharp dashboards & reports using Splunk.

10 Splunk and Data Science Use Cases
Splunk is a powerful tool for lots of Data Science use cases, shown on the slide as Green Use Cases (easy out of the box) and Yellow Use Cases:
- Trend Forecasting
- D3 Custom Visualizations
- A/B Testing
- Predictive Modeling
- Root Cause Analysis
- Sentiment Analysis
- Anomaly Detection
- Conversion Funnel/Pathing
- Market Segmentation
- More Algorithms via R & Python
- Topic Modeling
- Capacity Planning
- Correlate Data from 2+ Sources
- Data Munging & Normalization
- KPIs & Executive Dashboards

11 Data Science Use Cases

12 Use Case: Trend Forecasting
Trend Forecasting: given past & real-time data, predict future values & events.
- Common applications:
  Forecast revenue & other KPIs
  Web server traffic & product downloads
  Customer conversion rates
  MTTR & server outages
  Resource & capacity planning (AWS App)
  Security threats (Enterprise Security App)
- The true course of events can (and will) take only one of many divergent paths. But which one?
- Be mindful of rare events & black swans!

13 Splunk Solution: predict
- predict command: forecast future trajectories of time series
- Implements a Kalman filter to identify seasonal trends
- Gives an uncertainty envelope as a buffer around the trend
- Tip: Always run the predict command on LOTS of past data. Capture low-frequency and high-frequency trends
- Remember: the future is always uncertain
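To make the idea of a Kalman filter with an uncertainty envelope concrete, here is a minimal Python sketch. It is not Splunk's implementation of predict: it assumes the simplest possible model (a slowly drifting level with no seasonal component), and the function name, the variance parameters and the sample series are all illustrative assumptions.

```python
import math

def local_level_forecast(series, horizon, process_var=1.0, obs_var=1.0, z=1.96):
    """Filter `series` with a 1-D local-level Kalman filter, then forecast.

    Returns one (prediction, lower, upper) tuple per future step.
    process_var and obs_var are assumed noise variances, not fitted values.
    """
    # Start from the first observation, trusting it as much as any single reading.
    level, var = series[0], obs_var
    for y in series[1:]:
        # Predict: the level is assumed to persist, so only uncertainty grows.
        var += process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = var / (var + obs_var)
        level += gain * (y - level)
        var *= (1.0 - gain)

    forecasts = []
    for _ in range(horizon):
        # No new observations arrive, so the envelope widens at every step.
        var += process_var
        half_width = z * math.sqrt(var + obs_var)
        forecasts.append((level, level - half_width, level + half_width))
    return forecasts

# Hypothetical history of a KPI sampled at regular intervals.
history = [10.0, 11.0, 10.5, 11.5, 12.0, 11.8, 12.4]
forecast = local_level_forecast(history, horizon=3)
for step, (pred, lower, upper) in enumerate(forecast, start=1):
    print(f"t+{step}: {pred:.2f} [{lower:.2f}, {upper:.2f}]")
```

The envelope widens with every step into the future, which is the numerical counterpart of the slide's reminder that the future is always uncertain.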

14 Splunk Solution: Predict App
David Carasso's Predict App: forecast future values of individual events. 8 minute walkthrough: https://
- Implements a Naïve Bayes classifier
- You have to train models!
- Train a model to predict any target field using any reference field(s):
  ... | fields ref1, ref2, ..., target | train my_model from target
- Guess target field for incoming events:
  ... | guess my_model into target
- Temporal or non-temporal prediction (include _time among reference fields)
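The train/guess workflow on this slide can be sketched in plain Python. This is a generic categorical Naïve Bayes with add-one smoothing, written as an illustration under assumptions; it is not the Predict App's code, and the function names, the model layout and the toy events are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(events, reference_fields, target_field):
    """Count target labels and per-label field values (the training step)."""
    label_counts = Counter()
    value_counts = defaultdict(Counter)  # (label, field) -> Counter of values
    vocab = defaultdict(set)             # field -> set of values seen in training
    for event in events:
        label = event[target_field]
        label_counts[label] += 1
        for field in reference_fields:
            value_counts[(label, field)][event[field]] += 1
            vocab[field].add(event[field])
    return {"labels": label_counts, "values": value_counts,
            "vocab": vocab, "fields": list(reference_fields)}

def guess(model, event):
    """Return the label with the highest log posterior for one event."""
    total = sum(model["labels"].values())
    best_label, best_score = None, -math.inf
    for label, count in model["labels"].items():
        score = math.log(count / total)  # log prior
        for field in model["fields"]:
            seen = model["values"].get((label, field), Counter())[event.get(field)]
            # Add-one smoothing; the extra +1 leaves room for unseen values.
            score += math.log((seen + 1) / (count + len(model["vocab"][field]) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training events: reference fields Genre and Tag, target field Rating.
training = [
    {"Genre": "Comedy", "Tag": "funny", "Rating": "Like"},
    {"Genre": "Comedy", "Tag": "funny", "Rating": "Like"},
    {"Genre": "Comedy", "Tag": "slow", "Rating": "Neutral"},
    {"Genre": "Horror", "Tag": "gory", "Rating": "Dislike"},
    {"Genre": "Horror", "Tag": "slow", "Rating": "Dislike"},
    {"Genre": "Drama", "Tag": "moving", "Rating": "Like"},
]
model = train(training, ["Genre", "Tag"], "Rating")
print(guess(model, {"Genre": "Comedy", "Tag": "funny"}))
print(guess(model, {"Genre": "Horror", "Tag": "gory"}))
```

Training is only counting, which is why a Naïve Bayes model is cheap to build and why it needs many events before its counts become reliable.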

15 Concept: Supervised Learning & Classification
Supervised learning: use observed training data to classify values of unknown testing data.
- predict command (Kalman filter): Training data = time series of past & real-time values. Testing data = time range for future values
- Predict App (Naïve Bayes classifier): Training data = events with reference & target fields. Testing data = events with reference fields but not target field
- Tip: only deploy models & algorithms after extensive testing & evaluation
- More powerful learning algorithms using R Project App or SDK for Python

16 Demo: Predict App
- Train a model to predict movie Rating based on MovieID, UserID, Genre, Tag:
  index=movielens Timestamp < UserID=593*
  | eval original_rating = case(Rating<3,"Dislike", Rating=3,"Neutral", Rating>3,"Like")
  | fields original_rating MovieID UserID Genre Tag
  | train rating_model from original_rating
- Guess Rating for test data based on trained model:
  index=movielens Timestamp > UserID=593*
  | guess rating_model into guessed_rating
  | top original_rating guessed_rating
- Accuracy of model: correct on 97.6% of values
- Tip: always train on LOTS of training data! Evaluate before deploying
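The evaluation step in this demo boils down to comparing each original label with the guessed one. A minimal Python sketch of that comparison follows; the pairs are made-up sample data, not the MovieLens results quoted on the slide, and the helper names are hypothetical.

```python
from collections import Counter

def accuracy(pairs):
    """Fraction of (original, guessed) pairs where the guess matches."""
    pairs = list(pairs)
    correct = sum(1 for original, guessed in pairs if original == guessed)
    return correct / len(pairs)

def top_pairs(pairs):
    """Most frequent (original, guessed) combinations, most common first."""
    return Counter(pairs).most_common()

# Hypothetical held-out events: (original_rating, guessed_rating).
results = [
    ("Like", "Like"),
    ("Like", "Like"),
    ("Dislike", "Dislike"),
    ("Neutral", "Like"),
]
print(f"accuracy: {accuracy(results):.1%}")
for (original, guessed), count in top_pairs(results):
    print(original, guessed, count)
```

Listing the pair counts alongside the single accuracy number shows which labels the model confuses, which is useful to know before deploying it.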

17 Use Case: Sentiment Analysis
Sentiment Analysis: the assignment of labels to textual data.
- Can be simple +1 vs. -1, or more sophisticated: happy, angry, sad, etc.
- Analyze tweets, emails, news articles, logs or any other textual data
- Social data correlates with other factors
- Typically done via supervised learning:
  Train a model on labeled corpus of text
  Test the model on incoming text data
- Read more about Sentiment Analysis:
  Chapter 14 of Big Data Analytics Using Splunk
  Michael Wilde & David Carasso. Social Media & Sentiment Analysis. .conf2012
[Figure: 2011 Irish General Election, r = .79]

18 Splunk Solution: Sentiment Analysis App
David Carasso's Sentiment Analysis App assigns binary sentiment values to textual data (logs, tweets, emails, etc.)
- Naïve Bayes classifier under the hood
- Twitter & IMDB models out of the box
- Can guess language of authorship, and heat, a measure of emotional charge
- Tip: compare relative sentiment changes & groups
- How to train your own models: http://answers.splunk.com/answers/

19 Demo: Sentiment Analysis App

20 Use Case: Anomaly Detection
An anomaly (or outlier) is an event which is vastly dissimilar to other events.
- Anomaly Detection is one of Splunk's most common use cases. Examples:
  Transactions which occur faster than humanly possible
  DDoS attacks from IP address ranges
  High-value customer purchase patterns
- Quick techniques for finding statistical outliers:
  Non-average outliers: more than 2*stdev from the avg
  Non-typical outliers: more than 1.5*IQR above perc75 or below perc25
- Tip: save these as eventtypes for automated outlier detection
- Once anomalies have been found, dig deeper to discover root causes
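The two quick outlier rules on this slide translate directly into a few lines of Python. This is a sketch using the standard library only; the function names and the sample response times are assumptions, and the quartile method used by Python's statistics module may differ slightly from how Splunk computes perc25 and perc75.

```python
import statistics

def stdev_outliers(values, k=2.0):
    """Non-average outliers: more than k standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > k * spread]

def iqr_outliers(values, k=1.5):
    """Non-typical outliers: more than k*IQR above Q3 or below Q1."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

# Hypothetical response times in milliseconds, with one suspicious value.
response_times = [102, 98, 101, 97, 103, 99, 100, 104, 96, 450]
print(stdev_outliers(response_times))
print(iqr_outliers(response_times))
```

The IQR rule is usually the more robust of the two, because a single extreme value inflates the mean and standard deviation but barely moves the quartiles.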

21 Splunk Solution: cluster
- Anomalies are dissimilar to other events (by definition). We can use clustering algorithms to help us detect anomalies:
  Non-anomalous events typically form a few large clusters
  Anomalous events typically form lots of small clusters
- Cluster your data, sort ascending:
  ... | cluster showcount=true labelonly=true | sort cluster_count cluster_label
- Remember: there is no right way to find all anomalies. Explore your data!
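The "cluster, then sort ascending by cluster size" idea can be illustrated in Python with a deliberately crude stand-in for the cluster command: events are grouped by a template in which digit runs are masked. The masking rule, the function names and the sample messages are all assumptions for illustration; Splunk's cluster command uses its own similarity measure.

```python
import re
from collections import Counter

def template(message):
    """Crude similarity key: mask digit runs so near-identical events collide."""
    return re.sub(r"\d+", "<n>", message.lower())

def clusters_by_size(messages):
    """Return (template, count) pairs sorted ascending, rarest clusters first."""
    counts = Counter(template(m) for m in messages)
    return sorted(counts.items(), key=lambda item: item[1])

# Hypothetical log messages: two routine patterns and one oddity.
messages = [
    "login ok user=101 ms=12",
    "login ok user=102 ms=15",
    "login ok user=103 ms=11",
    "GET /cart status=200",
    "GET /cart status=200",
    "kernel panic at 0xdeadbeef",
]
for key, count in clusters_by_size(messages):
    print(count, key)
```

The smallest clusters surface at the top of the list, so the rare event is the first thing an analyst sees.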

22 Concept: Unsupervised Learning & Clustering
- A clustering algorithm is any process which groups together similar things (events, people, etc.), and separates dissimilar things (events, people, etc.)
- Clustering is unsupervised: choose labels based on patterns in the data
- Clustering is in the eye of the beholder:
  Lots of different clustering algorithms
  Lots of different similarity functions
- Do not confuse with:
  Computer cluster: a group of computers working together as a single system
  Splunk cluster: a group of Splunk indexers replicating indexes & external data

23 Demo: cluster

24 Splunk Solution: Other Commands
- anomalies: Assigns an unexpectedness score to each event
- anomalousvalue: Assigns an anomaly score to events with anomalous values
- outlier: Removes or truncates outliers
- kmeans: Powerful clustering algorithm. You choose k = # of clusters
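For readers who have not met k-means before, here is a short Python sketch of the textbook algorithm (Lloyd's iteration) that the kmeans command is named after. It is a generic illustration, not Splunk's code: the function names and sample points are hypothetical, and the centroids are seeded from the first k points, a simplification chosen to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    """Return (labels, centroids) after running Lloyd's algorithm."""
    # Simplified seeding: use the first k points as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids

# Hypothetical 2-D points forming two well-separated groups.
points = [(1, 1), (9, 9), (1.5, 2), (8, 8), (1, 0.5), (9, 10)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Choosing k is the analyst's job, as the slide notes; running the sketch with different values of k on the same points is a quick way to see how much the grouping depends on that choice.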

25 Splunk Solution: Prelert (Partner App)
- Manages Anomaly Detection directly: pre-built dashboards, alerts, API. Use cases: Security, IT Ops / APM, DevOps
- Godfrey Sullivan: "beautifully adjacent and complementary to what Splunk does"
- Can download from Splunk Apps. May save time with Anomaly Detection. Can also be a good source of inspiration for your own Anomaly Detection dashboards
- Keep in mind Prelert is a paid app: Cost: 5GB

26 Use Case: Market Segmentation
Market Segmentation: group customers according to common needs and priorities, and develop strategies to target them. Market segments are internally homogeneous, and externally heterogeneous, i.e., market segments are clusters of customers.
- Many reasons for Market Segmentation:
  Different market segments require different strategies
  Customers in same segment have similar product preferences. Different segments, different preferences
  Segments should be reasonably stable, to allow for historical analysis (good for Data Science)
- Use Splunk's clustering algorithms to identify and label market segments!

27 Data Visualization

28 Intro to Data Visualization
Data Visualization is the creation and study of the visual representation of data, and is a vital part of Data Science. The goal of data visualization is to communicate information:
- Visualizations communicate complex ideas with clarity, precision, and efficiency
- Transmission speed of the optic nerve is about 9 Mb/sec: fast image processing, pattern matching, edge detection
- Visualizations pack lots of information into small spaces. More than text alone!

29 Telling Stories with Data
We process data linearly: even dashboards go top-to-bottom. Visualizations help pierce the monotony of text, number & data streams. Think about the story you're telling:
- Empathize with the viewer. What's their takeaway?
- A good visualization tells its own story: "Island Nation Obtains Favourable Balance of Trade; Goes On To Rule The World"
- Weave multiple visualizations together to tell more effective stories
[Figure: William Playfair (1786)]

30 Splunk Source: New York Times. May 17,

31 Splunk Source: New York Times. May 17,

32 Tips for Data Visualization
- Plot the most important keys on x & y axes. You choose most important. You might need >1 visualization.
- Manipulate size, color and shape to convey additional information
- Annotate, label and add icons
- Use chart overlay to correlate data sources. Mix histograms & line charts
- Manipulate numerical scale: linear vs. log scales (previous 2 slides)
- Read more about Data Visualization:
  Tableau's whitepaper, Visual Analysis Best Practices (2013)
  Edward Tufte's The Visual Display of Quantitative Information (2001)

33 D3 Custom Visualizations in Splunk
- Splunk now supports D3 with some minor [...]
- Satoshi's talk: "I want that cool viz in Splunk!"
- Resources for Custom Visualizations:
  Splunk Web Framework Toolkit: https://apps.splunk.com/app/1613/
  Splunk 6.x Dashboard Examples: https://apps.splunk.com/app/1603/
  Custom SimpleXML Extensions: http://apps.splunk.com/app/1772/
  Lots more D3 visualizations: https://github.com/mbostock/d3/wiki/gallery

34 Demo: Sankey Chart

35 How-to for Sankey Charts
- Install the Custom SimpleXML Extensions app: http://apps.splunk.com/app/1772/
- Create your own app, and install Sankey chart components:
  Drop autodiscover.js in
  Copy & paste /sankeychart/ subfolder into $SPLUNK_HOME/etc/apps/<YOURAPP>/
  Restart Splunk
- In your dashboard:
  Include script="autodiscover.js" in <form> or <dashboard> opening tag
  Insert XML snippet from 2- or 3-node Sankey dashboard example
  Change 2 instances of custom_simplexml_extensions to <YOURAPP>
  Update search and data-options parameters (nodes) in XML to reflect your data

36 Know Your Audience
- Finally, keep in mind your audience: who are they, what do they care about, and how do they want to consume the data?
  [...]: KPIs, charts, tables with icons
  Analyst: KPIs & metrics. Sharp images for their own reports & decks. Tableau
  Data Scientist: output clean data to organized data stores (Hunk, HDFS, SQL, NoSQL)
  Sysadmin: sparklines, gauges for [...] & MTTR, tables with highlighted anomalies
  Security Ops: maps with detailed overlays, drill down on anomalous events
- Bring it back to the business problem & use case

37 3 Key Takeaways
- Data Science is about extracting actionable insights from data.
- Splunk is great for doing Data Science!
- Splunk complements other tools in the Data Science toolkit.

38 List of References
Good books on Data Science:
- Schutt & O'Neil. Doing Data Science. O'Reilly 2013
- Provost & Fawcett. Data Science for Business. O'Reilly 2013
- Max Shron. Thinking With Data. O'Reilly 2014
- Edward Tufte. The Visual Display of Quantitative Information. Graphics Press 2001
- Zumel & Mount. Practical Data Science with R. Manning 2014
- Hastie et al. Elements of Statistical Learning. Springer-Verlag 2009 (free PDF!)
Using Splunk for Data Science:
- Zadrozny, Kodali (and Stout). Big Data Analytics Using Splunk. Apress 2013
- David Carasso. Exploring Splunk. CITO Research 2012
- David Carasso. Data Mining with Splunk. .conf2012
- Michael Wilde & David Carasso. Social Media & Sentiment Analysis. .conf2012
Good free references:
- Tableau. Visual Analysis Best Practices. Tableau 2013
- King & Magoulas. Data Science Salary Survey. O'Reilly 2013
- DJ Patil. Building Data Science Teams. O'Reilly 2013
- Cathy O'Neil. On Being A Data Skeptic. O'Reilly

39 THANK YOU


More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

April 2016 JPoint Moscow, Russia. How to Apply Big Data Analytics and Machine Learning to Real Time Processing. Kai Wähner. kwaehner@tibco.

April 2016 JPoint Moscow, Russia. How to Apply Big Data Analytics and Machine Learning to Real Time Processing. Kai Wähner. kwaehner@tibco. April 2016 JPoint Moscow, Russia How to Apply Big Data Analytics and Machine Learning to Real Time Processing Kai Wähner [email protected] @KaiWaehner www.kai-waehner.de LinkedIn / Xing Please connect!

More information

Defending Against Web App A0acks Using ModSecurity. Jason Wood Principal Security Consultant Secure Ideas

Defending Against Web App A0acks Using ModSecurity. Jason Wood Principal Security Consultant Secure Ideas Defending Against Web App A0acks Using ModSecurity Jason Wood Principal Security Consultant Secure Ideas Background Info! Penetra?on Tester, Security Engineer & Systems Administrator!!!! Web environments

More information

Using RDBMS, NoSQL or Hadoop?

Using RDBMS, NoSQL or Hadoop? Using RDBMS, NoSQL or Hadoop? DOAG Conference 2015 Jean- Pierre Dijcks Big Data Product Management Server Technologies Copyright 2014 Oracle and/or its affiliates. All rights reserved. Data Ingest 2 Ingest

More information

Big Data Use Cases. At Salesforce.com. Narayan Bharadwaj Director, Product Management Salesforce.com. @nadubharadwaj

Big Data Use Cases. At Salesforce.com. Narayan Bharadwaj Director, Product Management Salesforce.com. @nadubharadwaj Big Data Use Cases At Salesforce.com Narayan Bharadwaj Director, Product Management Salesforce.com @nadubharadwaj Safe harbor Safe harbor statement under the Private Securi9es Li9ga9on Reform Act of 1995:

More information

Machine Learning with MATLAB David Willingham Application Engineer

Machine Learning with MATLAB David Willingham Application Engineer Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the

More information

WHITE PAPER SPLUNK SOFTWARE AS A SIEM

WHITE PAPER SPLUNK SOFTWARE AS A SIEM SPLUNK SOFTWARE AS A SIEM Improve your security posture by using Splunk as your SIEM HIGHLIGHTS Splunk software can be used to operate security operations centers (SOC) of any size (large, med, small)

More information

Big Data Integration: A Buyer's Guide

Big Data Integration: A Buyer's Guide SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology

More information

PROJECT PORTFOLIO SUITE

PROJECT PORTFOLIO SUITE ServiceNow So1ware Development manages Scrum or waterfall development efforts and defines the tasks required for developing and maintaining so[ware throughout the lifecycle, from incep4on to deployment.

More information

DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2

DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2 DATA SCIENCE CURRICULUM Before class even begins, students start an at-home pre-work phase. When they convene in class, students spend the first eight weeks doing iterative, project-centered skill acquisition.

More information

Applying Machine Learning to Network Security Monitoring. Alex Pinto Chief Data Scien2st MLSec Project @alexcpsec @MLSecProject!

Applying Machine Learning to Network Security Monitoring. Alex Pinto Chief Data Scien2st MLSec Project @alexcpsec @MLSecProject! Applying Machine Learning to Network Security Monitoring Alex Pinto Chief Data Scien2st MLSec Project @alexcpsec @MLSecProject! whoami Almost 15 years in Informa2on Security, done a licle bit of everything.

More information

TLD Data Analysis. ICANN Tech Day, Dublin. October 19th 2015 Maarten Wullink, SIDN. Klik om de s+jl te bewerken

TLD Data Analysis. ICANN Tech Day, Dublin. October 19th 2015 Maarten Wullink, SIDN. Klik om de s+jl te bewerken Klik om de s+jl te bewerken Klik om de models+jlen te bewerken Tweede niveau Derde niveau Vierde niveau TLD Data Analysis Vijfde niveau ICANN Tech Day, Dublin October 19th 2015 Maarten Wullink, SIDN Wie

More information

A Tutorial Introduc/on to Big Data. Hands On Data Analy/cs over EMR. Robert Grossman University of Chicago Open Data Group

A Tutorial Introduc/on to Big Data. Hands On Data Analy/cs over EMR. Robert Grossman University of Chicago Open Data Group A Tutorial Introduc/on to Big Data Hands On Data Analy/cs over EMR Robert Grossman University of Chicago Open Data Group Collin BenneE Open Data Group November 12, 2012 1 Amazon AWS Elas/c MapReduce allows

More information

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All

More information

Machine Learning Capacity and Performance Analysis and R

Machine Learning Capacity and Performance Analysis and R Machine Learning and R May 3, 11 30 25 15 10 5 25 15 10 5 30 25 15 10 5 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 100 80 60 40 100 80 60 40 100 80 60 40 30 25 15 10 5 25 15 10

More information

Exploiting IT Log Analytics to Find and Fix Problems Before They Become Outages

Exploiting IT Log Analytics to Find and Fix Problems Before They Become Outages Exploiting IT Log Analytics to Find and Fix Problems Before They Become Outages Session 17595 Paul Smith (Smitty) ([email protected]) IBM z Systems Service Management / zanalytics Architect Anuja Deedwaniya

More information

Data Mining. Supervised Methods. Ciro Donalek [email protected]. Ay/Bi 199ab: Methods of Computa@onal Sciences hcp://esci101.blogspot.

Data Mining. Supervised Methods. Ciro Donalek donalek@astro.caltech.edu. Ay/Bi 199ab: Methods of Computa@onal Sciences hcp://esci101.blogspot. Data Mining Supervised Methods Ciro Donalek [email protected] Supervised Methods Summary Ar@ficial Neural Networks Mul@layer Perceptron Support Vector Machines SoLwares Supervised Models: Supervised

More information

Hadoop & SAS Data Loader for Hadoop

Hadoop & SAS Data Loader for Hadoop Turning Data into Value Hadoop & SAS Data Loader for Hadoop Sebastiaan Schaap Frederik Vandenberghe Agenda What s Hadoop SAS Data management: Traditional In-Database In-Memory The Hadoop analytics lifecycle

More information

KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES

KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES Translating data into business value requires the right data mining and modeling techniques which uncover important patterns within

More information

FRESCO: Modular Composable Security Services for So;ware- Defined Networks

FRESCO: Modular Composable Security Services for So;ware- Defined Networks FRESCO: Modular Composable Security Services for So;ware- Defined Networks Seungwon Shin, Phil Porras, Vinod Yegneswaran, MarIn Fong, Guofei Gu, and Mabry Tyson SUCCESS LAB, Texas A&M and SRI Interna7onal

More information