Enabling R for Big Data with PL/R and PivotalR Real World Examples on Hadoop & MPP Databases
|
|
|
- Bridget Carr
- 10 years ago
- Views:
Transcription
1 Enabling R for Big Data with PL/R and PivotalR Real World Examples on Hadoop & MPP Databases Woo J. Jung Principal Data Scientist Pivotal Labs 1
2 All In On Open Source Still can t believe we did this. Truly exciting.
3 Data Science Toolkit KEY TOOLS KEY LANGUAGES SQL P L A T F O R M Pivotal Big Data Suite 3
4 How Pivotal Data Scientists Select Which Tool to Use Prototype in R or directly in MADlib/PivotalR Optimized for both algorithm efficiency and code overhead Is the algorithm of choice available in MADlib/PivotalR? Yes Build final set of models in MADlib/PivotalR No Do opportunities for explicit parallelization exist? Yes Build final set of models in PL/R No Connect to Pivotal via ODBC 4
5 How Pivotal Data Scientists Select Which Tool to Use Prototype in R or directly in MADlib/PivotalR Optimized for both algorithm efficiency and code overhead Is the algorithm of choice available in MADlib/PivotalR? Yes Build final set of models in MADlib/PivotalR No Do opportunities for explicit parallelization exist? Yes Build final set of models in PL/R No Connect to Pivotal via ODBC 5
6 MADlib: Toolkit for Advanced Big Data Analytics madlib.net Better Parallelism Algorithms designed to leverage MPP or Hadoop architecture Better Scalability Algorithms scale as your data set scales No data movement Better Predictive Accuracy Using all data, not a sample, may improve accuracy Open Source Available for customization and optimization by user 6
7 7
8 PivotalR: Bringing MADlib and HAWQ to a Familiar R Interface Challenge Want to harness the familiarity of R s interface and the performance & scalability benefits of in-db/in-hadoop analytics Simple solution: Translate R code into SQL SQL Code SELECT madlib.linregr_train( 'houses,! 'houses_linregr,! 'price,! 'ARRAY[1, tax, bath, size] );! PivotalR d <- db.data.frame( houses")! houses_linregr <- madlib.lm(price ~ tax!!!!+ bath!!!!+ size!!!!, data=d)!
9 PivotalR Design Overview Call MADlib s in-db machine learning functions directly from R Syntax is analogous to native R function RPostgreSQL PivotalR 2. SQL to execute 1. R à SQL 3. Computation results No data here Data doesn t need to leave the database All heavy lifting, including model estimation & computation, are done in the database Database/Hadoop w/ MADlib Data lives here 9
10 More Piggybacking 10
11 How Pivotal Data Scientists Select Which Tool to Use Prototype in R or directly in MADlib/PivotalR Optimized for both algorithm efficiency and code overhead Is the algorithm of choice available in MADlib/PivotalR? Yes Build final set of models in MADlib/PivotalR No Do opportunities for explicit parallelization exist? Yes Build final set of models in PL/R No Connect to Pivotal via ODBC 11
12 What is Data Parallelism? Little or no effort is required to break up the problem into a number of parallel tasks, and there exists no dependency (or communication) between those parallel tasks Also known as explicit parallelism Examples: Have each person in this room weigh themselves: Measure each person s weight in parallel Count a deck of cards by dividing it up between people in this room: Count in parallel MapReduce apply() family of functions in R 12
13 Procedural Language R (PL/R) Parallelized model building in the R language Originally developed by Joe Conway for PostgreSQL Parallelized by virtue of piggybacking on distributed architectures SQL & R 13
14 Parallelized Analytics in Pivotal via PL/R: An Example SQL & R TN Data CA Data NY Data PA Data TX Data CT Data NJ Data IL Data MA Data WA Data TN Model CA Model NY Model PA Model TX Model CT Model NJ Model IL Model MA Model WA Model Parsimonious R piggy-backs on Pivotal s parallel architecture Minimize data movement Build predictive model for each state in parallel 14
15 Parallelized R via PL/R: One Example of Its Use With placeholders in SQL, write functions in the native R language Accessible, powerful modeling framework
16 Parallelized R via PL/R: One Example of Its Use Execute PL/R function Plain and simple table is returned 16
17 Examples of Usage 17
18 Pivotal Data Science: Areas of Expertise 18
19 Pivotal Data Science: Packaged Services LAB PRIMER (2-Week Roadmapping) DATA JAM (Internal DS Contest) LAB 100 (Analytics Bundle) LAB 600 (6-Week Lab) LAB 1200 (12-Week Lab) Analytics Roadmap Prioritized Opportunities Architectural Recommendati ons Hands-on training Hosted data on Pivotal Data stack Results review & assessment On-site MPP analytics training Analytics toolkit Quick insight (2 weeks) Prof. services Data science model building Ready-todeploy model(s) Prof. services Data science model building Ready-todeploy model(s) 19
20 The Internet of Things: Smart Meter Analytics
21 Engagement Summary Objective Build key foundations of a data-driven framework for anomaly detection to leverage in revenue protection initiatives Results With limited access to limited data, our models (FFT and Time Series Analysis) identified 191K potentially anomalous meters (7% of all meters). High Performance Pivotal Big Data Suite including MADlib and PL/R 90 seconds to compute FFT for over 3.1 million meters (~3.5 billion readings) à ms/meter ~36 minutes to compute time series models for over 3.1 million meters (~3.5 billion readings) à ms/meter 21
22 Anomaly Detection Methodology & Results All Data (4.5 million meters / ~20 billion meter readings) Step 1: Select Data for Advanced Modeling (3.1 million meters / ~3.5 billion meter readings) Step 2: Detect Anomalies From Frequency Domain Analysis (547K) Step 4: Detect Anomalies From Combined Analysis (191K) Step 3: Detect Anomalies From Time Domain Analysis (485K) 22
23 -- create type to store frequency, spec, and max freq! create type fourier_type AS (! freq text, spec text, freq_with_maxspec float8);!! -- create plr function to compute periodogram and return frequency with maximum spectral density! create or replace function pgram_concise(tsval float8[])! RETURNS float8 AS! $$! rpgram <- spec.pgram(tsval,fast=false,plot=false,detrend=true)! freq_with_maxspec <- rpgram$freq[which(rpgram$spec==max(rpgram$spec))]! return(freq_with_maxspec)! $$! LANGUAGE 'plr ;!! -- execute function! create table pg_gram_results! as select geo_id, meter_id, pgram_concise(load_ts) FROM! meter_data distributed by (geo_id,meter_id);!
24 Number of Meters Most Households Use Energy in Daily or Half-Daily Cycles Estimated Periodicity of Meters Meters with Anomalous Usage Patterns (~20% of all meters) Periodicity in Hours 1092 Dominant periodicity (i.e. maximum frequency) of each meter is computed ~80% of all households show daily or half-daily patterns of energy usage ~20% of all households show anomalous patterns of energy usage Flag meters falling into the 20% as potentially anomalous meters Follow-up Items: Event type of the anomalies w.r.t. Revenue Protection to be determined with additional data & models 24
25 Irregular Patterns of Energy Consumption Displayed by Detected Anomalous Meters FFT Analysis: Time Series of an Ordinary Meter Load Elaspsed Hours FFT Analysis: Time Series of an Anomalous Meter Load Elaspsed Hours 25
26 Parallelize the Generation of Visualizations
27 Parallelize Visualization Generation
28 Parallelize Visualization Generation 28
29 Demand Modeling & What-If Scenario Analysis Scalable Algorithm Development Using R Prototyping Dashboards on RShiny
30 Engagement Overview Customer s Business Goal Make data-driven decisions about how to allocate resources for planning & inventory management Compose rich set of reusable data assets from disparate LOBs and make available for ongoing analysis & reporting Build parallelized demand models for 100+ products & locations Develop scalable Hierarchical/Multilevel Bayesian Modeling algorithm (Gibbs Sampling) Construct framework & prototype app for what-if scenario analysis in RShiny 30
31 Overview of Hierarchical Linear Model Likelihood Priors for parameters Priors for hyperparameters Posterior Likelihood x Priors for parameters x Priors for hyperparameters This joint posterior distribution does not take the form of a known probability density, thus it is a challenge to draw samples from it directly à However, the full conditional posterior distributions follow known probability densities (Gibbs Sampling) 31
32 Game Plan 1. Figure out which components of the Gibbs Sampler can be embarrassingly parallelized, i.e. the key building blocks Mostly matrix algebra calculations & draws from full conditional distributions, parallelized by Product-Location 2. Build functions (i.e. in PL/R) for each of the building blocks 3. Build a meta-function that ties together each of the functions in (2) to run a Gibbs Sampler 4. Run functions for K iterations, monitor convergence, summarize results 32
33 Examples of Building Block Functions
34 Meta-Function & Execution 34
35 PivotalR & RShiny RShiny RPostgreSQL PivotalR SQL to execute R à SQL Computation results No data here Data doesn t need to leave the database All heavy lifting, including model estimation & computation, are done in the database Database/Hadoop w/ MADlib RShiny Server Data lives here 35
36 36
37 37
38 Next Steps Continue to build even more PivotalR wrapper functions Identify more areas where core R functions can be releveraged and made scalable via PivotalR Explore, learn, and share notes with other packages like PivotalR Explore closer integration with Spark, MLlib, H20 PL/R wrappers directly from R 38
39 Thank You Have Any Questions? 39
40 Check out the Pivotal Data Science Blog! 40
41 Additional References PivotalR Video Demo PL/R & General Pivotal+R Interoperability MADlib
42 BUILT FOR THE SPEED OF BUSINESS
PivotalR: A Package for Machine Learning on Big Data
PivotalR: A Package for Machine Learning on Big Data Hai Qian Predictive Analytics Team, Pivotal Inc. [email protected] Copyright 2013 Pivotal. All rights reserved. What Can Small Data Scientists Bring
Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.
Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!
How Eastern Bank Uses Big Data to Better Serve & Protect its Customers!
How Eastern Bank Uses Big Data to Better Serve & Protect its Customers! Brian Griffith Principal Data Engineer Agenda! Introduction Eastern Bank & the banking industry Data architecture and our big data
Native Connectivity to Big Data Sources in MSTR 10
Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single
Advanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract
W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the
Databricks. A Primer
Databricks A Primer Who is Databricks? Databricks was founded by the team behind Apache Spark, the most active open source project in the big data ecosystem today. Our mission at Databricks is to dramatically
Interactive data analytics drive insights
Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has
Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.
Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,
From Spark to Ignition:
From Spark to Ignition: Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA What s in Store For This Presentation? 1. MemSQL: A real-time database for
Databricks. A Primer
Databricks A Primer Who is Databricks? Databricks vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache Spark, a powerful
Harnessing the Power of the Microsoft Cloud for Deep Data Analytics
1 Harnessing the Power of the Microsoft Cloud for Deep Data Analytics Today's Focus How you can operate your business more efficiently and effectively by tapping into Cloud based data analytics solutions
The 4 Pillars of Technosoft s Big Data Practice
beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed
Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop
1 Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop 2 Pivotal s Full Approach It s More Than Just Hadoop Pivotal Data Labs 3 Why Pivotal Exists First Movers Solve the Big Data Utility Gap
Big Data Scenario mit Power BI vs. SAP HANA Gerhard Brückl
Big Data Scenario mit Power BI vs. SAP HANA Gerhard Brückl About me Gerhard Brückl Working with Microsoft BI since 2006 Started working with SAP HANA in 2013 focused on Analytics and Reporting Blog: email:
AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW
AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this
Big Data Open Source Stack vs. Traditional Stack for BI and Analytics
Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Part I By Sam Poozhikala, Vice President Customer Solutions at StratApps Inc. 4/4/2014 You may contact Sam Poozhikala at [email protected].
Detecting Anomalous Behavior with the Business Data Lake. Reference Architecture and Enterprise Approaches.
Detecting Anomalous Behavior with the Business Data Lake Reference Architecture and Enterprise Approaches. 2 Detecting Anomalous Behavior with the Business Data Lake Pivotal the way we see it Reference
EMC Greenplum Driving the Future of Data Warehousing and Analytics. Tools and Technologies for Big Data
EMC Greenplum Driving the Future of Data Warehousing and Analytics Tools and Technologies for Big Data Steven Hillion V.P. Analytics EMC Data Computing Division 1 Big Data Size: The Volume Of Data Continues
Azure Data Lake Analytics
Azure Data Lake Analytics Compose and orchestrate data services at scale Fully managed service to support orchestration of data movement and processing Connect to relational or non-relational data
High-Performance Analytics
High-Performance Analytics David Pope January 2012 Principal Solutions Architect High Performance Analytics Practice Saturday, April 21, 2012 Agenda Who Is SAS / SAS Technology Evolution Current Trends
Information Architecture
The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to
Big Data Analytics. Copyright 2011 EMC Corporation. All rights reserved.
Big Data Analytics 1 Priority Discussion Topics What are the most compelling business drivers behind big data analytics? Do you have or expect to have data scientists on your staff, and what will be their
Demonstration of SAP Predictive Analysis 1.0, consumption from SAP BI clients and best practices
September 10-13, 2012 Orlando, Florida Demonstration of SAP Predictive Analysis 1.0, consumption from SAP BI clients and best practices Vishwanath Belur, Product Manager, SAP Predictive Analysis Learning
Harnessing the power of advanced analytics with IBM Netezza
IBM Software Information Management White Paper Harnessing the power of advanced analytics with IBM Netezza How an appliance approach simplifies the use of advanced analytics Harnessing the power of advanced
Bringing Big Data to People
Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process
GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION
GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.
Microsoft SQL Server Business Intelligence and Teradata Database
Microsoft SQL Server Business Intelligence and Teradata Database Help improve customer response rates by using the most sophisticated marketing automation application available. Integrated Marketing Management
How to Drive the Mobile App Evolution
Transformational Technologies Mobility & Analytics ARC Tenth India Forum, Hyderabad 5-7 July, 2012 Transforming Industry and Infrastructure through New Processes and Technologies Ritesh Arora Consulting
WHITE PAPER. Harnessing the Power of Advanced Analytics How an appliance approach simplifies the use of advanced analytics
WHITE PAPER Harnessing the Power of Advanced How an appliance approach simplifies the use of advanced analytics Introduction The Netezza TwinFin i-class advanced analytics appliance pushes the limits of
SAP BusinessObjects BI Clients
SAP BusinessObjects BI Clients April 2015 Customer Use this title slide only with an image BI Use Cases High Level View Agility Data Discovery Analyze and visualize data from multiple sources Data analysis
Big Data and the Data Lake. February 2015
Big Data and the Data Lake February 2015 My Vision: Our Mission Data Intelligence is a broad term that describes the real, meaningful insights that can be extracted from your data truths that you can act
Massive Predictive Modeling using Oracle R Technologies Mark Hornick, Director, Oracle Advanced Analytics
Massive Predictive Modeling using Oracle R Technologies Mark Hornick, Director, Oracle Advanced Analytics Safe Harbor Statement The following is intended to outline our general product direction. It is
Data-intensive HPC: opportunities and challenges. Patrick Valduriez
Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,
SAP and Hortonworks Reference Architecture
SAP and Hortonworks Reference Architecture Hortonworks. We Do Hadoop. June Page 1 2014 Hortonworks Inc. 2011 2014. All Rights Reserved A Modern Data Architecture With SAP DATA SYSTEMS APPLICATIO NS Statistical
Building Your Big Data Team
Building Your Big Data Team With all the buzz around Big Data, many companies have decided they need some sort of Big Data initiative in place to stay current with modern data management requirements.
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
Cisco IT Hadoop Journey
Cisco IT Hadoop Journey Srini Desikan, Program Manager IT 2015 MapR Technologies 1 Agenda Hadoop Platform Timeline Key Decisions / Lessons Learnt Data Lake Hadoop s place in IT Data Platforms Use Cases
Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
Step by Step: Big Data Technology. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 25 August 2015
Step by Step: Big Data Technology Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 25 August 2015 Data Sources IT Infrastructure Analytics 2 B y 2015, 20% of Global 1000 organizations
Data Discovery, Analytics, and the Enterprise Data Hub
Data Discovery, Analytics, and the Enterprise Data Hub Version: 101 Table of Contents Summary 3 Used Data and Limitations of Legacy Analytic Architecture 3 The Meaning of Data Discovery & Analytics 4 Machine
Building Data-Driven Internet of Things (IoT) Applications
Building Data-Driven Internet of Things (IoT) Applications A four-step primer IOT DEMANDS NEW APPLICATIONS Automated homes. Connected cars. Smart cities. The Internet of Things (IoT) will forever change
Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
How Companies are! Using Spark
How Companies are! Using Spark And where the Edge in Big Data will be Matei Zaharia History Decreasing storage costs have led to an explosion of big data Commodity cluster software, like Hadoop, has made
Next-Generation Cloud Analytics with Amazon Redshift
Next-Generation Cloud Analytics with Amazon Redshift What s inside Introduction Why Amazon Redshift is Great for Analytics Cloud Data Warehousing Strategies for Relational Databases Analyzing Fast, Transactional
ORACLE BUSINESS INTELLIGENCE FOUNDATION SUITE 11g WHAT S NEW
ORACLE BUSINESS INTELLIGENCEFOUNDATION SUITE 11g DATA SHEET Disclaimer: This document is for informational purposes. It is not a commitment to deliver any material, code, or functionality, and should not
Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved
Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment
The 3 questions to ask yourself about BIG DATA
The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.
LEARNING SOLUTIONS website milner.com/learning email [email protected] phone 800 875 5042
Course 6451B: Planning, Deploying and Managing Microsoft System Center Configuration Manager 2007 Length: 3 Days Published: June 29, 2012 Language(s): English Audience(s): IT Professionals Level: 300 Technology:
Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014
Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/
Modern Data Architecture for Predictive Analytics
Modern Data Architecture for Predictive Analytics David Smith VP Marketing and Community - Revolution Analytics John Kreisa VP Strategic Marketing- Hortonworks Hortonworks Inc. 2013 Page 1 Your Presenters
The power of IBM SPSS Statistics and R together
IBM Software Business Analytics SPSS Statistics The power of IBM SPSS Statistics and R together 2 Business Analytics Contents 2 Executive summary 2 Why integrate SPSS Statistics and R? 4 Integrating R
Oracle Big Data Discovery Unlock Potential in Big Data Reservoir
Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All
Business Intelligence Competency Partners Untangling the Confusion What SAP BW powered by HANA and HANA LIVE mean to your organization
Business Intelligence Competency Partners Untangling the Confusion What SAP BW powered by HANA and HANA LIVE mean to your organization Sven Jensen Program Director Audience, Objective & Agenda This presentation
IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads
89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads A Competitive Test and Evaluation Report
Big Data Analytics Best Practices
1 Big Data Analytics Best Practices Marshall Presser Federal Field CTO Greenplum 2 Big Data Makes the Mainstream 3 WHAT DOES IT TAKE? 4 1. New Applications MADlib 5 2. New Skill Sets -- Data Science 6
Building Dashboards for Real Business Results. Cindi Howson BIScorecard December 11, 2012
Building Dashboards for Real Business Results Cindi Howson BIScorecard December 11, 2012 Sponsor 2 Speakers Cindi Howson Founder, BIScorecard Mark Gamble Director of Technical Marketing, Actuate Cindi
IBM: An Early Leader across the Big Data Security Analytics Continuum Date: June 2013 Author: Jon Oltsik, Senior Principal Analyst
ESG Brief IBM: An Early Leader across the Big Data Security Analytics Continuum Date: June 2013 Author: Jon Oltsik, Senior Principal Analyst Abstract: Many enterprise organizations claim that they already
PARC and SAP Co-innovation: High-performance Graph Analytics for Big Data Powered by SAP HANA
PARC and SAP Co-innovation: High-performance Graph Analytics for Big Data Powered by SAP HANA Harnessing the combined power of SAP HANA and PARC s HiperGraph graph analytics technology for real-time insights
Microsoft Research Windows Azure for Research Training
Copyright 2013 Microsoft Corporation. All rights reserved. Except where otherwise noted, these materials are licensed under the terms of the Apache License, Version 2.0. You may use it according to the
YOUR APP. OUR CLOUD.
YOUR APP. OUR CLOUD. The Original Mobile APP! Copyright cloudbase.io 2013 2 THE MARKET Mobile cloud market in billions of $ $ 16 $ 14 $ 12 $ 10 $14.5bn The size of the mobile cloud market in 2015 $ 8 $
Microsoft Research Microsoft Azure for Research Training
Copyright 2014 Microsoft Corporation. All rights reserved. Except where otherwise noted, these materials are licensed under the terms of the Apache License, Version 2.0. You may use it according to the
SEIZE THE DATA. 2015 SEIZE THE DATA. 2015
1 Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. BIG DATA CONFERENCE 2015 Boston August 10-13 Predicting and reducing deforestation
Data Analytics for a Secure Smart Grid
Data Analytics for a Secure Smart Grid Dr. Silvio La Porta Senior Research Scientist EMC Research Europe Ireland COE. Agenda APT modus operandi Data Analysis and Security SPARKS Data Analytics Module Anatomy
An In-Depth Look at In-Memory Predictive Analytics for Developers
September 9 11, 2013 Anaheim, California An In-Depth Look at In-Memory Predictive Analytics for Developers Philip Mugglestone SAP Learning Points Understand the SAP HANA Predictive Analysis library (PAL)
Automating Big Data Benchmarking for Different Architectures with ALOJA
www.bsc.es Jan 2016 Automating Big Data Benchmarking for Different Architectures with ALOJA Nicolas Poggi, Postdoc Researcher Agenda 1. Intro on Hadoop performance 1. Current scenario and problematic 2.
Distributed R for Big Data
Distributed R for Big Data Indrajit Roy, HP Labs November 2013 Team: Shivara m Erik Kyungyon g Alvin Rob Vanish A Big Data story Once upon a time, a customer in distress had. 2+ billion rows of financial
Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out
Big Data Challenges and Success Factors Deloitte Analytics Your data, inside out Big Data refers to the set of problems and subsequent technologies developed to solve them that are hard or expensive to
Safe Harbor Statement
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment
SAP Solution Brief SAP HANA. Transform Your Future with Better Business Insight Using Predictive Analytics
SAP Brief SAP HANA Objectives Transform Your Future with Better Business Insight Using Predictive Analytics Dealing with the new reality Dealing with the new reality Organizations like yours can identify
Bayesian networks - Time-series models - Apache Spark & Scala
Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly
Tackling Big Data with MATLAB Adam Filion Application Engineer MathWorks, Inc.
Tackling Big Data with MATLAB Adam Filion Application Engineer MathWorks, Inc. 2015 The MathWorks, Inc. 1 Challenges of Big Data Any collection of data sets so large and complex that it becomes difficult
BIG DATA-AS-A-SERVICE
White Paper BIG DATA-AS-A-SERVICE What Big Data is about What service providers can do with Big Data What EMC can do to help EMC Solutions Group Abstract This white paper looks at what service providers
Booz Allen Cloud Solutions. Our Capability-Based Approach
Booz Allen Cloud Solutions Our Capability-Based Approach Booz Allen Cloud Solutions Our Capability-Based Approach Booz Allen Cloud Solutions Our Capability-Based Approach In today s budget-conscious environment,
Delivering secure, real-time business insights for the Industrial world
Delivering secure, real-time business insights for the Industrial world Arnaud Mathieu: Program Director, Internet of Things Dev., IBM [email protected] @arnomath 1 We are on the threshold of massive
Cisco Data Preparation
Data Sheet Cisco Data Preparation Unleash your business analysts to develop the insights that drive better business outcomes, sooner, from all your data. As self-service business intelligence (BI) and
Up Your R Game. James Taylor, Decision Management Solutions Bill Franks, Teradata
Up Your R Game James Taylor, Decision Management Solutions Bill Franks, Teradata Today s Speakers James Taylor Bill Franks CEO Chief Analytics Officer Decision Management Solutions Teradata 7/28/14 3 Polling
Advanced Big Data Analytics with R and Hadoop
REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional
EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.
EMC Federation Big Data Solutions 1 Introduction to data analytics Federation offering 2 Traditional Analytics! Traditional type of data analysis, sometimes called Business Intelligence! Type of analytics
BIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
Greenplum Database. Getting Started with Big Data Analytics. Ofir Manor Pre Sales Technical Architect, EMC Greenplum
Greenplum Database Getting Started with Big Data Analytics Ofir Manor Pre Sales Technical Architect, EMC Greenplum 1 Agenda Introduction to Greenplum Greenplum Database Architecture Flexible Database Configuration
CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data
Research Report CA Technologies Big Data Infrastructure Management Executive Summary CA Technologies recently exhibited new technology innovations, marking its entry into the Big Data marketplace with
Ø Teaching Evaluations. q Open March 3 through 16. Ø Final Exam. q Thursday, March 19, 4-7PM. Ø 2 flavors: q Public Cloud, available to public
Announcements TIM 50 Teaching Evaluations Open March 3 through 16 Final Exam Thursday, March 19, 4-7PM Lecture 19 20 March 12, 2015 Cloud Computing Cloud Computing: refers to both applications delivered
Predictive Analytics. Noam Zeigerson, CTO
Predictive Analytics Noam Zeigerson, CTO Agenda The Predictive Analytics Need Innovative Technologies Business Solutions The problem: Inconsistent stream of revenue Available Data Sources ERP data Web
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
Solving Big Data Problems in Computer Vision with MATLAB Loren Shure
Solving Big Data Problems in Computer Vision with MATLAB Loren Shure 2015 The MathWorks, Inc. 1 Why Are We Talking About Big Data? 100 hours of video uploaded to YouTube per minute 1 Explosive increase
Integrating a Big Data Platform into Government:
Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government
A Vision for Operational Analytics as the Enabler for Business Focused Hybrid Cloud Operations
A Vision for Operational Analytics as the Enabler for Focused Hybrid Cloud Operations As infrastructure and applications have evolved from legacy to modern technologies with the evolution of Hybrid Cloud
COMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411
The Machine: The future of technology [email protected] Hyperscale Division EMEA
The Machine: The future of technology [email protected] Hyperscale Division EMEA Agenda 1: Vision 2: Core Technologies 3: Systems Moonshot The Machine Tsunami of data on the horizon 202X will be
SURVEY REPORT DATA SCIENCE SOCIETY 2014
SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses
SAP BusinessObjects BI Clients. January 2016
SAP BusinessObjects BI Clients January 2016 SAP Analytics and BI Strategy SAP Analytics Strategy SAP Cloud for Analytics Provide new SaaS Analytics capabilities All analytics capabilities in one product
